CN114913441A - Channel pruning method, target detection method and remote sensing image vehicle detection method - Google Patents


Info

Publication number
CN114913441A
CN114913441A
Authority
CN
China
Prior art keywords
convolution
channel
model
decoupling
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210738608.9A
Other languages
Chinese (zh)
Other versions
CN114913441B (en)
Inventor
方乐缘
朱定舜
吴洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202210738608.9A priority Critical patent/CN114913441B/en
Publication of CN114913441A publication Critical patent/CN114913441A/en
Application granted granted Critical
Publication of CN114913441B publication Critical patent/CN114913441B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/10 - Terrestrial scenes
    • G06V20/17 - Terrestrial scenes taken from planes or by drones
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/54 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 - Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a channel pruning method comprising the steps of determining a target network model; training the target network model to obtain a basic network model; equivalently decoupling the convolution layers of the basic network model to obtain a basic network decoupling model; training the basic network decoupling model to obtain a decoupling model; determining the channels that can finally be compressed and the channels to be reserved; and equivalently merging the decoupling model to obtain the network model after channel pruning, completing the final channel pruning. The invention also discloses a target detection method comprising the channel pruning method, and a remote sensing image vehicle detection method comprising the target detection method. The convolution layers in the model are equivalently decoupled into a cascade of the original convolution and a structural convolution, the two parts are trained separately and equivalently merged back into the original network, and channels are finally cut according to the parameters of the structural convolution. The method therefore retains the original accuracy of the model while achieving a high compression rate and good reliability.

Description

Channel pruning method, target detection method and remote sensing image vehicle detection method
Technical Field
The invention belongs to the field of digital signal processing, and particularly relates to a channel pruning method, a target detection method and a remote sensing image vehicle detection method.
Background
With the development of economic technology and the improvement of people's living standards, target detection technology has been widely applied in production and daily life, bringing great convenience. Ensuring the accuracy and speed of target detection has therefore become a key focus of research on target detection technology.
At present, target detection with unmanned aerial vehicles (UAVs) is already in widespread use. Unlike offline target detection, target detection on edge devices such as UAVs must detect targets in the captured images in real time. However, such platforms are limited in computing power, memory and power consumption, so deep-learning-based target detection methods generally cannot be deployed in real time; high-precision, lightweight target detection is therefore particularly important for edge devices such as UAVs.
To enable real-time deployment of deep neural networks on the device side, researchers have conducted extensive research on model compression methods, whose purpose is to simplify a model so as to reduce its computation and storage requirements without affecting its performance. Channel pruning is an important model compression method: the structure of the model does not need to be redefined, and the model size is reduced by directly deleting redundant channels, which shortens the training time of the deep neural network and accelerates model inference. Channel pruning makes it possible to deploy deep-learning target detection methods on edge devices.
However, the performance of a deep neural network is closely related to the number of convolution channels, and pruning those channels may degrade the performance of the model to some extent, so a trade-off must be made between the degree of pruning and performance. In traditional model pruning, every parameter participates in training and pruning simultaneously, i.e. accuracy training and pruning training are coupled. On the one hand, the weight penalty term introduced in pruning training (such as structured sparsity) changes the optimization objective of the model and can severely degrade the performance of the deep neural network during training; on the other hand, if the pruning constraint is relaxed to preserve model performance, the degree of pruning cannot be guaranteed and a pruned model with a high compression rate cannot be obtained.
Disclosure of Invention
The invention aims to provide a channel pruning method which is high in compression rate and good in reliability and can keep the original precision of a model.
It is a further object of the present invention to provide a method of object detection comprising said method of channel pruning.
The invention also aims to provide a remote sensing image vehicle detection method comprising the target detection method.
The channel pruning method provided by the invention comprises the following steps:
S1, determining a target network model;
S2, acquiring a training data set and a loss function, and training the target network model determined in step S1 with the acquired training data set and loss function to obtain a basic network model;
S3, equivalently decoupling the convolution layers of the basic network model obtained in step S2 to obtain a basic network decoupling model;
S4, training the basic network decoupling model obtained in step S3 with the training data set and loss function obtained in step S2 to obtain a decoupling model;
S5, determining the channels that can finally be compressed and the channels to be reserved according to the decoupling model obtained in step S4;
S6, equivalently merging the decoupling model obtained in step S4 according to the compressible and reserved channels determined in step S5 to obtain the network model after channel pruning, completing the channel pruning of the final target network model.
The acquiring of the training data set in step S2 specifically includes the following steps:
acquiring training pictures; carrying out random multi-scale transformation on the obtained training pictures; after transformation, randomly flipping left and right with a set probability; finally, unifying the picture sizes by padding with gray values;
arranging the labels into a unified format (n, x, y, w, h), where n is the target category, (x, y) are the center coordinates of the target frame normalized by the image width and height, and (w, h) are the normalized width and height of the target frame.
Step S3, performing equivalent decoupling on the convolutional layer of the basic network model obtained in step S2 to obtain a basic network decoupling model, specifically including the following steps:
the c-th convolution layer $w_c$ of the basic network model W obtained in step S2 is equivalently decoupled into a cascade of the original convolution $w_c$ and a structural convolution $w_e$;
where the structural convolution $w_e$ is a convolution layer with 1 × 1 kernels; the initial weight of $w_e$ is the $d_o \times d_o$ identity matrix, $d_o$ being the number of output channels of the original convolution layer $w_c$.
To speed up the data processing flow, the structural convolution $w_e$ is moved after the batch normalization layer that follows the original convolution layer $w_c$.
Step S4, which is to train the basic network decoupling model obtained in step S3 by using the training data set and the loss function obtained in step S2, to obtain a decoupling model, specifically includes the following steps:
A. setting a learning rate by using the training data set and the loss function obtained in step S2, and training the basic network decoupling model obtained in step S3 again;
during training, the first N rounds are trained normally; after the N-th round, the parameters of the structural convolution are sorted by magnitude, the channels to be compressed are selected, and an extra penalty gradient is applied to the corresponding structural-convolution parameters;
B. the parameters of the structural convolution are written as the matrix $Q=\left[q_{d,i}\right]_{D\times D}$, where D is the number of convolution-kernel channels of the structural convolution layer and $q_{d,i}$ is the parameter at position i of the d-th channel of the structural convolution; the channel importance $I_d$ of the d-th channel of the original convolution is then calculated from the structural-convolution parameters as
$$I_d=\sqrt{\sum_{i=1}^{D} q_{d,i}^{2}}$$
C. selecting the number M of channels to be compressed: initially M = 0; from the N-th round onward, M increases by Y after every X batches of training until the preset channel compression rate is reached; meanwhile, when channels are selected, the number of channels of each convolution is not allowed to fall below a set value S; X, Y and S are all preset positive integers.
D. the convolution parameters are updated as
$$\hat{W}=W-l\,G$$
where $\hat{W}$ is the updated convolution parameter, W the convolution parameter before the update, l the learning rate, and G the back-propagated gradient of the loss function with respect to the convolution;
in the structural convolution, for channels that do not need to be compressed, the parameters are updated in the same way as the original convolution parameters; for channels that need to be compressed, the gradient update is changed and an extra penalty gradient is applied, the parameters being updated as
$$\hat{Q}=Q-l\,\bigl(G+\lambda\,\mathrm{sign}(Q)\bigr)$$
where Q is the structural-convolution parameter before the update, $\hat{Q}$ the updated structural-convolution parameter, $\lambda\,\mathrm{sign}(Q)$ the imposed penalty gradient, $\lambda>0$ the penalty factor, and $\mathrm{sign}(\cdot)$ the sign function, equal to 1 for positive arguments, 0 at zero and −1 for negative arguments.
Step S5, determining the channels that can finally be compressed and the channels to be reserved according to the decoupling model obtained in step S4, specifically includes the following steps:
calculating the channel importance of each channel of the original convolution from the parameters of the structural convolution, the importance of the d-th channel being $I_d$;
if the importance $I_d$ of a channel of the original convolution satisfies $I_d<k$, where k is the pruning threshold and k = 0.01, the channel corresponding to the original convolution is determined to be a channel to be cut, and cutting it does not reduce the performance of the model.
The equivalent merging in step S6 of the decoupling model obtained in step S4, according to the compressible and reserved channels determined in step S5, specifically includes the following steps:
a. combining the calculation formulas of the convolution layer and the batch normalization layer gives
$$y=\gamma\,\frac{(w*x+b)-\mu}{\sqrt{\sigma^{2}+\varepsilon}}+\beta$$
where x is the input feature, y the output of the input feature after the convolution layer and the batch normalization layer, w the weight parameter of the convolution layer, b the bias parameter of the convolution layer, $\gamma$ the scaling coefficient of the batch normalization layer, $\mu$ the mean of the batch normalization layer, $\sigma$ the standard deviation of the batch normalization layer, $\varepsilon$ a set minimum value, $\beta$ the offset coefficient of the batch normalization layer, and $*$ the convolution operator;
b. rearranging the combined formula into the format of a convolution gives
$$y=\hat{w}*x+\hat{b}$$
and the corresponding convolution is the new convolution;
c. the weight and the bias of the new convolution obtained in step b are calculated as
$$\hat{w}=\frac{\gamma}{\sqrt{\sigma^{2}+\varepsilon}}\,w,\qquad \hat{b}=\frac{\gamma}{\sqrt{\sigma^{2}+\varepsilon}}\,(b-\mu)+\beta$$
where $\hat{w}$ is the weight parameter of the new convolution and $\hat{b}$ the bias parameter of the new convolution;
d. the new convolution obtained in step b is then merged with the structural convolution, and the weight and the bias of the merged convolution are calculated as
$$w'=w_{Q}*\hat{w},\qquad b'=w_{Q}\,\hat{b}$$
where $w'$ is the weight of the merged convolution layer, $w_Q$ the weight of the structural convolution, $\hat{w}$ the new-convolution weight derived from the weight w of the original convolution, $b'$ the bias of the merged convolution layer, and $\hat{b}$ the new-convolution bias derived from the bias b of the original convolution;
e. in the merged convolution layer of step d, if the layer contains channels to be cut, the parameters of $w'$ and $b'$ on the corresponding channels are deleted simultaneously, completing the cutting of the corresponding channels.
The invention also provides a target detection method comprising the channel pruning method, which comprises the following steps:
(1) constructing an original model of target detection;
(2) performing channel pruning on the target detection original model constructed in the step (1) by adopting the channel pruning method, thereby obtaining a target detection model;
(3) performing actual target detection by using the target detection model obtained in step (2).
The invention also provides a remote sensing image vehicle detection method comprising the target detection method, which comprises the following steps:
1) acquiring a remote sensing image vehicle detection data set;
2) constructing the original target detection model as a YOLOv5 model;
3) performing channel pruning on the target detection original model constructed in the step 2) by adopting the channel pruning method, thereby obtaining a cut target detection model;
4) performing actual vehicle detection on the remote sensing images by using the target detection model obtained in step 3).
In the channel pruning method, the target detection method and the remote sensing image vehicle detection method of the invention, the convolution layers in the model are equivalently decoupled into a cascade of the original convolution and a structural convolution, exploiting the fact that convolution is linear, separable and combinable; precision-related training then follows the normal training procedure, while pruning-related training operates only on the structural convolution; after training, the cascade is equivalently merged back into the original network, and channels are finally pruned according to the parameters of the structural convolution. The method therefore retains the original accuracy of the model while achieving a high compression rate and good reliability.
Drawings
FIG. 1 is a schematic process flow diagram of the channel pruning method of the present invention.
Fig. 2 is a schematic view of the pruning principle of the channel pruning method of the present invention.
FIG. 3 is a schematic method flow chart of the target detection method of the present invention.
FIG. 4 is a schematic method flow diagram of the remote sensing image vehicle detection method of the present invention.
Detailed Description
Fig. 1 is a schematic flow chart of the channel pruning method of the present invention: the channel pruning method provided by the invention comprises the following steps:
S1, determining a target network model, such as the YOLOv5 network;
S2, acquiring a training data set and a loss function, and training the target network model determined in step S1 with the acquired training data set and loss function to obtain a basic network model; this specifically comprises the following steps:
acquiring training pictures; carrying out random multi-scale transformation on the obtained training pictures, the scale parameter preferably being taken from a set range; after transformation, randomly flipping left and right with a set probability (preferably 50%); finally, unifying the picture sizes (preferably to 640 × 640) by padding with gray values;
arranging the labels into a unified format (n, x, y, w, h), where n is the target category, (x, y) are the center coordinates of the target frame normalized by the image width and height, and (w, h) are the normalized width and height of the target frame;
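For concreteness, the following is a minimal Python/OpenCV sketch of the preprocessing just described; the gray padding value 114 and the scale range (0.5, 1.5) are illustrative assumptions, since the preferred scale parameter appears in the original only as an image.

```python
import random

import cv2
import numpy as np

def preprocess(img, labels, size=640, scale_range=(0.5, 1.5), flip_p=0.5):
    """Random multi-scale transform, random left-right flip, then gray
    padding to a unified square size. `labels` is an (m, 5) array whose
    rows are (n, x, y, w, h) with coordinates normalized to the image."""
    s = random.uniform(*scale_range)               # random multi-scale factor
    img = cv2.resize(img, None, fx=s, fy=s)
    if random.random() < flip_p:                   # random left-right flip
        img = np.ascontiguousarray(img[:, ::-1])
        labels[:, 1] = 1.0 - labels[:, 1]          # mirror the normalized x-center
    h, w = img.shape[:2]
    r = min(size / h, size / w)                    # fit into the square canvas
    nh, nw = round(h * r), round(w * r)
    canvas = np.full((size, size, 3), 114, dtype=img.dtype)  # gray padding
    canvas[:nh, :nw] = cv2.resize(img, (nw, nh))
    labels[:, 1::2] *= nw / size                   # re-normalize x and w
    labels[:, 2::2] *= nh / size                   # re-normalize y and h
    return canvas, labels
```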
then, the target network model determined in step S1 is trained with the obtained training data set and loss function; during training, the learning rate is preferably set to 0.01; after training is finished, the basic network model W is obtained;
S3, carrying out equivalent decoupling on the convolution layer of the basic network model obtained in the step S2 to obtain a basic network decoupling model; the method specifically comprises the following steps:
the basic network model obtained in the step S2WTo (1) acA convolution layerw c Equivalent decoupling as cascaded protolayersw c And structural convolutionw e
Wherein the structure is convolutedw e A convolutional layer of 1 x 1 cores; structural convolutionw e Has an initial weight ofd o *d o The unit matrix of (a) is,d o is laminated to the original coilw c The number of output channels.
To speed up the data processing flow, the structure is convolvedw e Translating to the original convolution layerw c The subsequent batch normalization layer;
the specific process of the step is as follows:
for modelWTo (1) acA convolution layerw c Let the input feature map bex i The output characteristic diagram isy c Then the process is represented as
Figure DEST_PATH_IMAGE066
Then, the layers are laminated on the original windingw c Post-join structural convolutionw e Wherein, the input and output channels of the original convolution are respectivelyd i Andd o w e convolution layer of 1 x 1 kernel with initial weight ofd o *d o The unit matrix of (2) is set as an input characteristic diagramx e =y c The feature map is output through structural convolution asy e The process is represented as
Figure DEST_PATH_IMAGE068
(ii) a Since the initial weight of the structural convolution is an identity matrix, the structural convolution is performed by using the identity matrixx e =y e
Laminating the layersw c Equivalent decoupling as cascaded protolayersw c And structural convolutionw e The whole process is
Figure DEST_PATH_IMAGE070
(ii) a Therefore, the decoupling transformation is completely equivalent mathematically, and the performance of the model before and after the structural convolution is added is completely consistent; in order to simplify data processing brought by the merging model in the step 5, in the actual decoupling operation, after the structural convolution is translated to the batch normalization layer, the performances before and after translation are still completely consistent;
S4, training the basic network decoupling model obtained in step S3 with the training data set and loss function obtained in step S2 to obtain a decoupling model; this specifically comprises the following steps:
A. setting the learning rate (the same as in step S2) and training the basic network decoupling model obtained in step S3 again, using the training data set and the loss function obtained in step S2;
during training, the first N rounds are trained normally; after the N-th round the model has finished adapting to the decoupled parameters, so the structural-convolution parameters are sorted by magnitude, the channels to be compressed are selected, and an extra penalty gradient is applied to the corresponding structural-convolution parameters; N is preferably 5;
B. the parameters of the structural convolution are written as the matrix $Q=\left[q_{d,i}\right]_{D\times D}$, where D is the number of convolution-kernel channels of the structural convolution layer and $q_{d,i}$ is the parameter at position i of the d-th channel of the structural convolution; the channel importance $I_d$ of the d-th channel of the original convolution is then calculated from the structural-convolution parameters as
$$I_d=\sqrt{\sum_{i=1}^{D} q_{d,i}^{2}}$$
C. selecting the number M of channels to be compressed: initially M = 0; from the N-th round onward, M increases by Y after every X batches of training until the preset channel compression rate is reached; meanwhile, when channels are selected, the number of channels of each convolution is not allowed to fall below a set value S; X, Y and S are all preset positive integers; X is preferably 256 and Y is preferably 16; when the compression requirement is not high, S is preferably 8, which gives better network performance;
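A sketch of this schedule and of the per-convolution floor S follows, under the stated preferences (X = 256, Y = 16, S = 8); the function names, and applying the floor per convolution, are illustrative choices rather than the patent's wording.

```python
import torch

def update_m(M, batches_since_round_n, X=256, Y=16, M_target=256):
    """M starts at 0 and grows by Y after every X training batches,
    capped at a preset compression target (M_target is illustrative)."""
    if batches_since_round_n > 0 and batches_since_round_n % X == 0:
        M = min(M + Y, M_target)
    return M

def select_to_compress(importance, M, S=8):
    """Pick the M least-important channels of one convolution, but never
    leave the convolution with fewer than S surviving channels."""
    M = min(M, max(importance.numel() - S, 0))
    return torch.argsort(importance)[:M]           # channel indices to penalize
```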
D. the convolution parameters are updated as
$$\hat{W}=W-l\,G$$
where $\hat{W}$ is the updated convolution parameter, W the convolution parameter before the update, l the learning rate, and G the back-propagated gradient of the loss function with respect to the convolution;
in the structural convolution, for channels that do not need to be compressed, the parameters are updated in the same way as the original convolution parameters; for channels that need to be compressed, the gradient update is changed and an extra penalty gradient is applied, the parameters being updated as
$$\hat{Q}=Q-l\,\bigl(G+\lambda\,\mathrm{sign}(Q)\bigr)$$
where Q is the structural-convolution parameter before the update, $\hat{Q}$ the updated structural-convolution parameter, $\lambda\,\mathrm{sign}(Q)$ the imposed penalty gradient, $\lambda>0$ the penalty factor, and $\mathrm{sign}(\cdot)$ the sign function, equal to 1 for positive arguments, 0 at zero and −1 for negative arguments;
under the action of the penalty gradient, the structural-convolution parameters corresponding to a channel to be compressed gradually approach zero; when a row of parameters in the structural convolution approaches zero, the corresponding channel of the preceding convolution layer is deactivated, and that channel can be removed in the subsequent steps;
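A sketch of the two update rules follows; indexing the channels along the output dimension of the structural-convolution weight and the value lam = 1e-4 are assumptions, since the patent gives the penalty factor only as an image.

```python
import torch

@torch.no_grad()
def sgd_step_with_penalty(q, grad, lr, compress_idx, lam=1e-4):
    """Plain step W <- W - l*G for unselected channels; channels selected
    for compression additionally receive the penalty gradient lam*sign(Q)."""
    step = grad.clone()
    step[compress_idx] += lam * torch.sign(q[compress_idx])
    q -= lr * step

# Usage on a structural convolution w_e, after loss.backward():
# sgd_step_with_penalty(w_e.weight, w_e.weight.grad, lr=0.01, compress_idx=idx)
```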
S5, determining the channels that can finally be compressed and the channels to be reserved according to the decoupling model obtained in step S4; this specifically comprises the following steps:
after pruning training, some channels of the structural convolution approach zero, i.e. under the action of the structural convolution the output of the corresponding channel of the original convolution filter can be ignored, so removing these channels does not affect network performance;
calculating the channel importance of each channel of the original convolution from the parameters of the structural convolution, the importance of the d-th channel being $I_d$;
setting a threshold k; if the importance $I_d$ of a channel of the original convolution satisfies $I_d<k$, where k is the pruning threshold and k = 0.01, the channel corresponding to the original convolution is determined to be a channel to be cut, and the performance of the model is not reduced after cutting;
S6, equivalently merging the decoupling model obtained in step S4 according to the compressible and reserved channels determined in step S5 to obtain the network model after channel pruning, completing the channel pruning of the final target network model; this specifically comprises the following steps:
a. the convolution layer is calculated as
$$y_{\mathrm{conv}}=w*x+b$$
and the batch normalization layer is calculated as
$$y_{\mathrm{bn}}=\gamma\,\frac{x-\mu}{\sqrt{\sigma^{2}+\varepsilon}}+\beta$$
combining the calculation formulas of the convolution layer and the batch normalization layer gives
$$y=\gamma\,\frac{(w*x+b)-\mu}{\sqrt{\sigma^{2}+\varepsilon}}+\beta$$
where x is the input feature, y the output of the input feature after the convolution layer and the batch normalization layer, w the weight parameter of the convolution layer, b the bias parameter of the convolution layer, $\gamma$ the scaling coefficient of the batch normalization layer, $\mu$ the mean of the batch normalization layer, $\sigma$ the standard deviation of the batch normalization layer, $\varepsilon$ a set minimum value, $\beta$ the offset coefficient of the batch normalization layer, and $*$ the convolution operator;
b. rearranging the combined formula into the format of a convolution gives
$$y=\hat{w}*x+\hat{b}$$
and the corresponding convolution is the new convolution;
c. the weight and the bias of the new convolution obtained in step b are calculated as
$$\hat{w}=\frac{\gamma}{\sqrt{\sigma^{2}+\varepsilon}}\,w,\qquad \hat{b}=\frac{\gamma}{\sqrt{\sigma^{2}+\varepsilon}}\,(b-\mu)+\beta$$
where $\hat{w}$ is the weight parameter of the new convolution and $\hat{b}$ the bias parameter of the new convolution;
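Steps a to c amount to the standard folding of a batch normalization layer into the preceding convolution; a sketch:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold batch normalization into the preceding convolution:
    w_hat = gamma*w/sqrt(sigma^2+eps), b_hat = gamma*(b-mu)/sqrt(sigma^2+eps)+beta."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      conv.stride, conv.padding, conv.dilation, conv.groups,
                      bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.copy_(conv.weight * scale.view(-1, 1, 1, 1))
    b = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.copy_((b - bn.running_mean) * scale + bn.bias)
    return fused
```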
d. the new convolution is calculated as
$$y=\hat{w}*x+\hat{b}$$
the structural convolution is calculated as
$$y_e=w_Q*x_e$$
and the combined formula in convolution format is
$$y=w'*x+b'$$
the new convolution obtained in step b is then merged with the structural convolution, and the weight and the bias of the merged convolution layer are calculated as
$$w'=w_{Q}*\hat{w},\qquad b'=w_{Q}\,\hat{b}$$
where $w'$ is the weight of the merged convolution layer, $w_Q$ the weight of the structural convolution, $\hat{w}$ the new-convolution weight derived from the weight w of the original convolution, $b'$ the bias of the merged convolution layer, and $\hat{b}$ the new-convolution bias derived from the bias b of the original convolution;
e. in the merged convolution layer of step d, if the layer contains channels to be cut, the parameters of $w'$ and $b'$ on the corresponding channels are deleted simultaneously, completing the cutting of the corresponding channels.
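A sketch of steps d and e, absorbing the 1 × 1 structural convolution into the fused convolution and then deleting the cut output channels; groups = 1 is assumed, and in a full network the input channels of the following layer must be cut to match.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def merge_and_cut(fused: nn.Conv2d, w_e: nn.Conv2d, kept: torch.Tensor) -> nn.Conv2d:
    """Compute w' = w_Q * w_hat and b' = w_Q b_hat, then drop cut channels."""
    q = w_e.weight[:, :, 0, 0]                          # (D, D) structural weights
    w = torch.einsum('oc,ckhw->okhw', q, fused.weight)  # merged weight w'
    b = q @ fused.bias                                  # merged bias b'
    out = nn.Conv2d(fused.in_channels, kept.numel(), fused.kernel_size,
                    fused.stride, fused.padding, fused.dilation, bias=True)
    out.weight.copy_(w[kept])                           # delete cut channels of w'
    out.bias.copy_(b[kept])                             # ... and of b'
    return out
```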
Fig. 2 is a schematic view of the pruning principle of the channel pruning method of the present invention: the structurally decoupled channel pruning method uses the magnitude of the structural-convolution parameters to represent the importance of each channel of the original convolution, i.e. the strength of the information-transmission capability of the corresponding convolution channel. Channels that can be cut are selected gradually and iteratively; the penalty gradient attenuates the unimportant channels toward zero so that they are progressively deactivated during iterative pruning, and when the networks are merged these channels can be cut with almost no loss of performance, achieving performance-lossless pruning.
Fig. 3 is a schematic flow chart of the target detection method of the present invention. The invention provides a target detection method comprising the channel pruning method, which comprises the following steps:
(1) constructing an original model of target detection;
(2) performing channel pruning on the target detection original model constructed in the step (1) by adopting the channel pruning method, thereby obtaining a target detection model;
(3) performing actual target detection by using the target detection model obtained in step (2).
Fig. 4 is a schematic flow chart of the remote sensing image vehicle detection method of the present invention. The invention provides a remote sensing image vehicle detection method comprising the target detection method, which comprises the following steps:
1) acquiring a remote sensing image vehicle detection data set;
2) constructing the original target detection model as a YOLOv5 model;
3) performing channel pruning on the target detection original model constructed in the step 2) by adopting the channel pruning method, thereby obtaining a cut target detection model;
4) performing actual vehicle detection on the remote sensing images by using the target detection model obtained in step 3).
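To illustrate how the decoupling step can be wired across a whole Conv-BN network such as YOLOv5, the following hedged sketch inserts an identity 1 × 1 structural convolution after every batch normalization layer; the traversal strategy is ours, not specified by the patent.

```python
import torch
import torch.nn as nn

def decouple_model(model: nn.Module) -> int:
    """Append an identity 1x1 structural convolution after every
    BatchNorm2d, mirroring the decoupling step of the pruning method."""
    targets = [(parent, name, child) for parent in model.modules()
               for name, child in parent.named_children()
               if isinstance(child, nn.BatchNorm2d)]
    for parent, name, bn in targets:
        d = bn.num_features
        w_e = nn.Conv2d(d, d, kernel_size=1, bias=False)
        with torch.no_grad():
            w_e.weight.copy_(torch.eye(d).view(d, d, 1, 1))
        setattr(parent, name, nn.Sequential(bn, w_e))
    return len(targets)

# Works on any Conv-BN backbone; a YOLOv5 model loaded elsewhere behaves the same.
net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.SiLU())
assert decouple_model(net) == 1
```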

Claims (9)

1. A channel pruning method comprises the following steps:
S1, determining a target network model;
S2, acquiring a training data set and a loss function, and training the target network model determined in step S1 with the acquired training data set and loss function to obtain a basic network model;
S3, equivalently decoupling the convolution layers of the basic network model obtained in step S2 to obtain a basic network decoupling model;
S4, training the basic network decoupling model obtained in step S3 with the training data set and loss function obtained in step S2 to obtain a decoupling model;
S5, determining the channels that can finally be compressed and the channels to be reserved according to the decoupling model obtained in step S4;
S6, equivalently merging the decoupling model obtained in step S4 according to the compressible and reserved channels determined in step S5 to obtain the network model after channel pruning, completing the channel pruning of the final target network model.
2. The channel pruning method according to claim 1, wherein the step of obtaining the training data set in step S2 specifically includes the following steps:
acquiring training pictures; carrying out random multi-scale transformation on the obtained training pictures; after transformation, randomly flipping left and right with a set probability; finally, unifying the picture sizes by padding with gray values;
arranging the labels into a unified format (n, x, y, w, h), where n is the target category, (x, y) are the center coordinates of the target frame normalized by the image width and height, and (w, h) are the normalized width and height of the target frame.
3. The channel pruning method according to claim 2, wherein the step S3 of equivalently decoupling the convolutional layer of the basic network model obtained in the step S2 to obtain a basic network decoupling model specifically comprises the following steps:
the c-th convolution layer $w_c$ of the basic network model W obtained in step S2 is equivalently decoupled into a cascade of the original convolution $w_c$ and a structural convolution $w_e$;
where the structural convolution $w_e$ is a convolution layer with 1 × 1 kernels; the initial weight of $w_e$ is the $d_o \times d_o$ identity matrix, $d_o$ being the number of output channels of the original convolution layer $w_c$.
4. The channel pruning method according to claim 3, characterized in that, to speed up the data processing flow, the structural convolution $w_e$ is moved after the batch normalization layer that follows the original convolution layer $w_c$.
5. The channel pruning method according to claim 4, wherein the step S4 of training the basic network decoupling model obtained in the step S3 by using the training data set and the loss function obtained in the step S2 to obtain the decoupling model specifically comprises the steps of:
A. setting a learning rate by using the training data set and the loss function obtained in step S2, and training the basic network decoupling model obtained in step S3 again;
during training, the first N rounds are trained normally; after the N-th round, the parameters of the structural convolution are sorted by magnitude, the channels to be compressed are selected, and an extra penalty gradient is applied to the corresponding structural-convolution parameters;
B. the parameters of the structural convolution are written as the matrix $Q=\left[q_{d,i}\right]_{D\times D}$, where D is the number of convolution-kernel channels of the structural convolution layer and $q_{d,i}$ is the parameter at position i of the d-th channel of the structural convolution; the channel importance $I_d$ of the d-th channel of the original convolution is then calculated from the structural-convolution parameters as
$$I_d=\sqrt{\sum_{i=1}^{D} q_{d,i}^{2}}$$
C. selecting the number M of channels to be compressed: initially M = 0; from the N-th round onward, M increases by Y after every X batches of training until the preset channel compression rate is reached; meanwhile, when channels are selected, the number of channels of each convolution is not allowed to fall below a set value S; X, Y and S are all preset positive integers;
D. the convolution parameters are updated as
$$\hat{W}=W-l\,G$$
where $\hat{W}$ is the updated convolution parameter, W the convolution parameter before the update, l the learning rate, and G the back-propagated gradient of the loss function with respect to the convolution;
in the structural convolution, for channels that do not need to be compressed, the parameters are updated in the same way as the original convolution parameters; for channels that need to be compressed, the gradient update is changed and an extra penalty gradient is applied, the parameters being updated as
$$\hat{Q}=Q-l\,\bigl(G+\lambda\,\mathrm{sign}(Q)\bigr)$$
where Q is the structural-convolution parameter before the update, $\hat{Q}$ the updated structural-convolution parameter, $\lambda\,\mathrm{sign}(Q)$ the imposed penalty gradient, $\lambda>0$ the penalty factor, and $\mathrm{sign}(\cdot)$ the sign function, equal to 1 for positive arguments, 0 at zero and −1 for negative arguments.
6. the channel pruning method according to claim 5, wherein the step S5 of determining the channels that can be finally compressed and the remaining channels according to the decoupling model obtained in the step S4 specifically comprises the steps of:
calculating the channel importance of each channel of the original convolution from the parameters of the structural convolution, the importance of the d-th channel being $I_d$;
if the importance $I_d$ of a channel of the original convolution satisfies $I_d<k$, where k is the pruning threshold and k = 0.01, the channel corresponding to the original convolution is determined to be a channel to be cut, and cutting it does not reduce the performance of the model.
7. The channel pruning method according to claim 6, wherein the step S6 of equivalently merging the decoupling models obtained in the step S4 according to the channels that can be compressed and the reserved channels determined in the step S5 includes the following steps:
a. combining the calculation formulas of the convolution layer and the batch normalization layer gives
$$y=\gamma\,\frac{(w*x+b)-\mu}{\sqrt{\sigma^{2}+\varepsilon}}+\beta$$
where x is the input feature, y the output of the input feature after the convolution layer and the batch normalization layer, w the weight parameter of the convolution layer, b the bias parameter of the convolution layer, $\gamma$ the scaling coefficient of the batch normalization layer, $\mu$ the mean of the batch normalization layer, $\sigma$ the standard deviation of the batch normalization layer, $\varepsilon$ a set minimum value, $\beta$ the offset coefficient of the batch normalization layer, and $*$ the convolution operator;
b. rearranging the combined formula into the format of a convolution gives
$$y=\hat{w}*x+\hat{b}$$
and the corresponding convolution is the new convolution;
c. the weight and the bias of the new convolution obtained in step b are calculated as
$$\hat{w}=\frac{\gamma}{\sqrt{\sigma^{2}+\varepsilon}}\,w,\qquad \hat{b}=\frac{\gamma}{\sqrt{\sigma^{2}+\varepsilon}}\,(b-\mu)+\beta$$
where $\hat{w}$ is the weight parameter of the new convolution and $\hat{b}$ the bias parameter of the new convolution;
d. the new convolution obtained in step b is then merged with the structural convolution, and the weight and the bias of the merged convolution are calculated as
$$w'=w_{Q}*\hat{w},\qquad b'=w_{Q}\,\hat{b}$$
where $w'$ is the weight of the merged convolution layer, $w_Q$ the weight of the structural convolution, $\hat{w}$ the new-convolution weight derived from the weight w of the original convolution, $b'$ the bias of the merged convolution layer, and $\hat{b}$ the new-convolution bias derived from the bias b of the original convolution;
e. in the merged convolution layer of step d, if the layer contains channels to be cut, the parameters of $w'$ and $b'$ on the corresponding channels are deleted simultaneously, completing the cutting of the corresponding channels.
8. An object detection method comprising the channel pruning method according to any one of claims 1 to 7, characterized by comprising the steps of:
(1) constructing an original model of target detection;
(2) performing channel pruning on the target detection original model constructed in the step (1) by adopting the channel pruning method, thereby obtaining a target detection model;
(3) performing actual target detection by using the target detection model obtained in step (2).
9. A remote sensing image vehicle detection method including the object detection method of claim 8, characterized by comprising the steps of:
1) acquiring a remote sensing image vehicle detection data set;
2) constructing the original target detection model as a YOLOv5 model;
3) performing channel pruning on the target detection original model constructed in the step 2) by adopting the channel pruning method, thereby obtaining a cut target detection model;
4) performing actual vehicle detection on the remote sensing images by using the target detection model obtained in step 3).
CN202210738608.9A 2022-06-28 2022-06-28 Channel pruning method, target detection method and remote sensing image vehicle detection method Active CN114913441B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210738608.9A CN114913441B (en) 2022-06-28 2022-06-28 Channel pruning method, target detection method and remote sensing image vehicle detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210738608.9A CN114913441B (en) 2022-06-28 2022-06-28 Channel pruning method, target detection method and remote sensing image vehicle detection method

Publications (2)

Publication Number Publication Date
CN114913441A true CN114913441A (en) 2022-08-16
CN114913441B CN114913441B (en) 2024-04-16

Family

ID=82772813

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210738608.9A Active CN114913441B (en) 2022-06-28 2022-06-28 Channel pruning method, target detection method and remote sensing image vehicle detection method

Country Status (1)

Country Link
CN (1) CN114913441B (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110009095A (en) * 2019-03-04 2019-07-12 东南大学 Road driving area efficient dividing method based on depth characteristic compression convolutional network
US20210049423A1 (en) * 2019-07-31 2021-02-18 Zhejiang University Efficient image classification method based on structured pruning
WO2021129570A1 (en) * 2019-12-25 2021-07-01 神思电子技术股份有限公司 Network pruning optimization method based on network activation and sparsification
WO2021208151A1 (en) * 2020-04-13 2021-10-21 商汤集团有限公司 Model compression method, image processing method and device
CN111680781A (en) * 2020-04-20 2020-09-18 北京迈格威科技有限公司 Neural network processing method, neural network processing device, electronic equipment and storage medium
CN111967594A (en) * 2020-08-06 2020-11-20 苏州浪潮智能科技有限公司 Neural network compression method, device, equipment and storage medium
CN113222142A (en) * 2021-05-28 2021-08-06 上海天壤智能科技有限公司 Channel pruning and quick connection layer pruning method and system
CN113255892A (en) * 2021-06-01 2021-08-13 上海交通大学烟台信息技术研究院 Method and device for searching decoupled network structure and readable storage medium
CN114065923A (en) * 2021-11-30 2022-02-18 南京航空航天大学 Compression method, system and accelerating device of convolutional neural network
CN114594461A (en) * 2022-03-14 2022-06-07 杭州电子科技大学 Sonar target detection method based on attention perception and zoom factor pruning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XIAOHAN DING et al.: "ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting", arXiv:2007.03260v4, 14 August 2021 (2021-08-14), pages 1-11 *
GUO Qingbei (郭庆北): "Research on Compression and Acceleration Technology of Deep Convolutional Neural Networks", China Doctoral Dissertations Full-text Database (Information Science and Technology), 15 March 2022 (2022-03-15), pages 140-26 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115730654A (en) * 2022-11-23 2023-03-03 湖南大学 Layer pruning method, kitchen garbage detection method and remote sensing image vehicle detection method
CN115730654B (en) * 2022-11-23 2024-05-14 湖南大学 Layer pruning method, kitchen waste detection method and remote sensing image vehicle detection method
CN116579409A (en) * 2023-07-11 2023-08-11 菲特(天津)检测技术有限公司 Intelligent camera model pruning acceleration method and acceleration system based on re-parameterization

Also Published As

Publication number Publication date
CN114913441B (en) 2024-04-16

Similar Documents

Publication Publication Date Title
CN114913441A (en) Channel pruning method, target detection method and remote sensing image vehicle detection method
CN110532859B (en) Remote sensing image target detection method based on deep evolution pruning convolution net
CN107748895B (en) Unmanned aerial vehicle landing landform image classification method based on DCT-CNN model
CN110766063B (en) Image classification method based on compressed excitation and tightly connected convolutional neural network
CN110909667B (en) Lightweight design method for multi-angle SAR target recognition network
CN108288270B (en) Target detection method based on channel pruning and full convolution deep learning
CN111191583B (en) Space target recognition system and method based on convolutional neural network
CN110378381A (en) Object detecting method, device and computer storage medium
CN111340814A (en) Multi-mode adaptive convolution-based RGB-D image semantic segmentation method
CN106203363A (en) Human skeleton motion sequence Activity recognition method
CN113159173A (en) Convolutional neural network model compression method combining pruning and knowledge distillation
CN113870335A (en) Monocular depth estimation method based on multi-scale feature fusion
WO2022062164A1 (en) Image classification method using partial differential operator-based general-equivariant convolutional neural network model
CN116416561A (en) Video image processing method and device
CN116071668A (en) Unmanned aerial vehicle aerial image target detection method based on multi-scale feature fusion
CN110781912A (en) Image classification method based on channel expansion inverse convolution neural network
CN112561041A (en) Neural network model acceleration method and platform based on filter distribution
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN115048870A (en) Target track identification method based on residual error network and attention mechanism
CN113554084A (en) Vehicle re-identification model compression method and system based on pruning and light-weight convolution
CN114154626B (en) Filter pruning method for image classification task
CN113850373B (en) Class-based filter pruning method
CN114972753A (en) Lightweight semantic segmentation method and system based on context information aggregation and assisted learning
CN113850365A (en) Method, device, equipment and storage medium for compressing and transplanting convolutional neural network
CN115620120B (en) Street view image multi-scale high-dimensional feature construction quantization method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant