CN111612145A - Model compression and acceleration method based on heterogeneous separation convolution kernel - Google Patents
- Publication number
- CN111612145A (application CN202010442785.3A)
- Authority
- CN
- China
- Prior art keywords
- convolution
- channel
- convolution kernel
- representative
- spconv
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
Observing that convolutional feature maps contain a large amount of similarity, the invention provides SPConv, a model compression and acceleration method based on heterogeneous separation convolution kernels. SPConv divides the input feature map into "representative channels" and "redundant channels". It extracts the essential information stored in the representative channels with a convolution kernel that is computationally expensive but has strong feature extraction capability, and extracts the fine detail hidden in the redundant channels with a convolution kernel of very low computational cost. The two results are then fused by the parameter-free feature fusion method designed by the invention. SPConv is a plug-and-play convolution module that can directly replace convolutions in existing network architectures. Experiments on image classification and object detection datasets show that model performance and GPU inference speed exceed those of the baseline method, while the parameter count and the number of floating-point operations are greatly reduced.
Description
Technical Field
The invention belongs to the field of basic (backbone) networks in computer vision and is applicable to computer vision sub-fields such as image classification and object detection.
Background
With the development of neural networks, model performance on computer vision tasks has improved dramatically, model sizes keep growing, and the computing power demanded of GPUs keeps rising. Computing resources are limited, however, and there is strong demand for deploying large neural networks on mobile devices. How to compress and accelerate a large neural network model under limited computing resources, without a significant loss in model performance, has therefore become one of the current research hotspots in neural networks.
Designing efficient convolution operations is one of the main research directions in model compression. Many methods, such as Xception, MobileNet, ResNeXt and ShuffleNet, achieve very good compression by making judicious use of group-wise, depth-wise and point-wise convolution. These methods show that the capability of a large convolution kernel can be approximated by a number of separate small convolution kernels. Other methods, such as HetConv, OctConv and GhostConv, also substantially reduce model parameters and floating-point operations by modifying the original convolution kernel. The other main direction of model compression is quantization of the model parameters, chiefly binary networks, ternary networks and the like.
Although model compression and acceleration are now well developed, most methods cannot achieve both at once. On the one hand, methods that design efficient convolution kernels typically approximate a large convolution kernel with several separate small kernels, but such fragmented small kernels are not well suited to computation on a GPU, so the inference speed of the model drops. On the other hand, methods that compress the model by quantizing parameter values lose model performance because parameter precision is greatly reduced. The invention therefore proposes a method that compresses the model while preserving its performance and also accelerates model inference on a GPU.
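As a rough illustration of why these convolution variants compress a model, the weight counts of a standard, a group-wise and a depth-wise separable convolution can be compared directly. The sketch below is not part of the patent; the helper names and the 128-channel layer are assumptions chosen for illustration, and bias terms are ignored.

```python
def conv_params(c_in, c_out, k=3, groups=1):
    """Weight count of a k x k convolution split into `groups` groups (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return k * k * (c_in // groups) * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Depth-wise k x k (one filter per input channel) followed by point-wise 1 x 1."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)
    pointwise = conv_params(c_in, c_out, k=1)
    return depthwise + pointwise


# Hypothetical layer with 128 input and 128 output channels.
standard = conv_params(128, 128)                   # 3*3*128*128
grouped = conv_params(128, 128, groups=4)          # a quarter of the standard count
separable = depthwise_separable_params(128, 128)   # 3*3*128 + 128*128

print(standard, grouped, separable)
```

The counts only measure model size; as the following paragraph of the background notes, fewer weights do not by themselves guarantee faster GPU inference.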
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art. Based on the observation that convolutional feature maps contain a large amount of similarity, it provides SPConv, a model compression and acceleration method that greatly reduces model parameters and floating-point operations while preserving model accuracy and inference speed.
The invention solves this technical problem with the following technical scheme:
SPConv, a model compression and acceleration method based on heterogeneous separation convolution kernels, as shown in FIG. 2, includes the following steps:
Step 1: divide the input feature map into two parts according to a ratio α; one part is called the representative channels and the other the redundant channels;
Step 2: for the representative channels, perform three operations: 1) split the features into several main parts by group-wise convolution; 2) recover the cross-channel information lost by the group-wise convolution through point-wise convolution; 3) add the feature maps produced by the two operations directly, so that they complement each other;
Step 3: for the redundant channels, extract the fine detail hidden in them with a computationally cheap point-wise convolution;
Step 4: fuse the features obtained in steps 2 and 3 with a parameter-free feature fusion.
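The saving implied by steps 1 to 3 can be estimated by counting weights. The sketch below is an illustration under stated assumptions, not the patent's implementation: the function name, the number of groups (the text does not specify one), the 64-channel layer and α = 0.5 are all hypothetical, and bias terms are ignored. Step 4 adds no weights because the fusion is parameter-free.

```python
def spconv_params(c_in, c_out, alpha=0.5, groups=2, k=3):
    """Approximate weight count of one SPConv layer (bias ignored).

    alpha  : fraction of input channels treated as representative (step 1)
    groups : number of groups in the group-wise convolution (assumed value)
    """
    rep = int(c_in * alpha)   # representative channels
    red = c_in - rep          # redundant channels
    if rep % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    gwc = k * k * (rep // groups) * c_out   # step 2.1: k x k group-wise conv
    pwc_rep = rep * c_out                   # step 2.2: 1 x 1 conv on representative part
    pwc_red = red * c_out                   # step 3:   1 x 1 conv on redundant part
    return {"gwc": gwc, "pwc_rep": pwc_rep, "pwc_red": pwc_red,
            "total": gwc + pwc_rep + pwc_red}


# Hypothetical layer: 64 -> 64 channels, compared with a plain 3 x 3 convolution.
counts = spconv_params(64, 64, alpha=0.5, groups=2)
baseline = 3 * 3 * 64 * 64
ratio = counts["total"] / baseline

print(counts, baseline, round(ratio, 3))
```

With these assumed settings the layer keeps roughly a third of the weights of the 3x3 baseline; other choices of α and group count shift that figure.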
The parameter-free feature fusion method of step 4 comprises the following steps:
(1) Assume the features output by step 2 and step 3 each have scale N×C×H×W and are denoted $U_3$ and $U_1$ (the subscript is the convolution kernel size). Global average pooling over the spatial dimensions of each yields the global spatial information as two N×C matrices $S_3$ and $S_1$, called "feature importance matrices";
(2) The importance matrices $S_3$ and $S_1$ are stacked along the channel dimension, and a SoftMax operation along that dimension gives the weight coefficients $\beta$ and $\gamma$ of each channel;
(3) The two feature maps are combined by weighted summation:
$$Y = \beta U_3 + \gamma U_1$$
With the four steps above, the model retains its important information while inference is accelerated.
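The fusion can be sketched in plain Python for a single sample. This is a minimal illustration, not the patent's code: the function names are hypothetical, the batch dimension N is dropped, nested lists stand in for tensors, and the SoftMax is assumed to be taken across the two branch entries of each channel, so that the two weights of a channel sum to 1.

```python
import math


def global_avg_pool(u):
    """u: list of C channels, each an H x W nested list -> list of C spatial means."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in u]


def fuse(u3, u1):
    """Parameter-free fusion of two C x H x W feature maps (the two branch outputs)."""
    s3, s1 = global_avg_pool(u3), global_avg_pool(u1)
    fused, weights = [], []
    for ch3, ch1, a, b in zip(u3, u1, s3, s1):
        m = max(a, b)                       # subtract the max for numerical stability
        e3, e1 = math.exp(a - m), math.exp(b - m)
        beta, gamma = e3 / (e3 + e1), e1 / (e3 + e1)
        weights.append((beta, gamma))
        fused.append([[beta * x + gamma * y for x, y in zip(r3, r1)]
                      for r3, r1 in zip(ch3, ch1)])
    return fused, weights


# Two channels of size 2 x 2; values are made up for illustration.
u3 = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]]]
u1 = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
y, w = fuse(u3, u1)
print(w)
```

In the first channel both branches have the same spatial mean and are weighted equally; in the second the branch with the larger mean receives the larger weight. No learned weights are involved.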
The invention has the advantages and positive effects that:
The method is soundly designed. Starting from the large amount of similarity observed in convolutional feature maps, it provides SPConv, a model compression and acceleration method based on heterogeneous separation convolution kernels. SPConv greatly reduces the parameter count and the number of floating-point operations while preserving model accuracy and GPU inference speed. This is an advantage over other current work, which achieves larger reductions in parameters and floating-point operations but runs much slower on the GPU than the baseline method.
Drawings
FIG. 1 illustrates the large amount of similarity among convolutional feature maps observed by the invention; FIG. 2 is a diagram of the overall SPConv network framework proposed by the invention.
Detailed Description
The invention is further described below with reference to examples:
The SPConv provided by the invention directly replaces the 3x3 convolution kernels in classical networks (such as ResNet and VGGNet). Without adjusting any other hyper-parameters, it improves model performance and GPU inference speed while greatly reducing the parameter count and the number of floating-point operations. The following experiments were conducted according to the method of the invention to demonstrate its effect.
Test environment: Python 3.6; PyTorch framework; Ubuntu 16.04; NVIDIA Tesla V100 GPU.
Test data: the selected datasets are the small classification dataset CIFAR-10, the large classification dataset ImageNet and the object detection dataset MS COCO. CIFAR-10 contains 50,000 training images and 10,000 validation images, ImageNet contains 1.28 million training images and 50,000 validation images, and the MS COCO object detection dataset contains 35,000 training images and 5,000 validation images.
Test metrics: for the image classification task, Top-1 and Top-5 accuracy are used as evaluation metrics; for the object detection task, mAP (mean Average Precision) is used. These metrics were computed for several current mainstream algorithms and the results compared, showing that the method performs better than other current work in image classification and object detection.
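For reference, Top-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes. The sketch below uses a hypothetical helper and made-up scores for three samples and three classes; it does not reproduce any result from the patent.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Made-up class scores for three samples over three classes.
scores = [
    [0.1, 0.7, 0.2],  # best class 1, label 1: correct at k=1
    [0.5, 0.3, 0.2],  # best class 0, label 1: correct only from k=2
    [0.2, 0.3, 0.5],  # best classes 2 then 1, label 0: wrong at k=1 and k=2
]
labels = [1, 1, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```

Top-5 accuracy on ImageNet follows the same rule with k = 5 over 1,000 classes.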
The test results were as follows:
TABLE 1 comparison of Performance of the invention on CIFAR-10 dataset with different feature separation ratios
TABLE 2 comparison of Performance of the invention on ImageNet datasets with different feature separation ratios
TABLE 3 comparison of Performance of the invention on MS COCO data set with different feature separation ratios
The comparison data show that in the image classification task, on both the small dataset CIFAR-10 and the large dataset ImageNet, the SPConv provided by the invention greatly reduces the parameter count and the number of floating-point operations while its accuracy remains slightly better than that of the baseline method and exceeds other current work. Its inference speed on the GPU is slightly better than that of the baseline method and far exceeds other current work.
In object detection, the experimental results show that the method outperforms the baseline method while greatly reducing the parameter count and the number of floating-point operations of the backbone.
It should be emphasized that the embodiments described herein are illustrative rather than restrictive, and thus the present invention is not limited to the embodiments described in the detailed description, and that other embodiments derived from the teachings of the present invention by those skilled in the art are also within the scope of the present invention.
Claims (3)
- 1. SPConv, a model compression and acceleration method based on heterogeneous separation convolution kernels, characterized in that: the conventional 3x3 convolution kernel operation is performed on only a representative portion of the input channels; the input feature map is divided into two parts at a separation ratio α, one part being referred to as the "representative channels" and the other part as the "redundant channels".
- 2. The model compression and acceleration method based on heterogeneous separation convolution kernels as claimed in claim 1, characterized in that: the important essential information in the representative channels is extracted with a computationally expensive convolution kernel, and the hidden fine detail in the redundant channels is extracted with a computationally cheap convolution kernel; the method comprises the following two steps: 2.1 performing group-wise convolution and point-wise convolution separately on the representative channels to extract the main essential features, and then directly adding the features produced by the two convolutions; 2.2 performing a computationally cheap point-wise convolution on the other portion, the "redundant channels", to supplement the hidden detail information.
- 3. The model compression and acceleration method based on heterogeneous separation convolution kernels as claimed in claim 1 or 2, characterized in that: a "parameter-free feature fusion method" is designed to fuse the features generated in claim 2; the method comprises the following three steps: 3.1 performing global average pooling over the spatial dimensions on each of the feature maps obtained by convolving the representative channels and the redundant channels; 3.2 stacking the two resulting matrices along the channel dimension and then performing a SoftMax operation along the channel dimension to obtain a channel importance matrix; 3.3 using the channel importance matrix as per-channel weights to compute a weighted sum of the feature maps obtained by convolving the representative channels and the redundant channels, which gives the final convolution output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010442785.3A CN111612145A (en) | 2020-05-22 | 2020-05-22 | Model compression and acceleration method based on heterogeneous separation convolution kernel |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010442785.3A CN111612145A (en) | 2020-05-22 | 2020-05-22 | Model compression and acceleration method based on heterogeneous separation convolution kernel |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111612145A true CN111612145A (en) | 2020-09-01 |
Family
ID=72199589
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010442785.3A Pending CN111612145A (en) | 2020-05-22 | 2020-05-22 | Model compression and acceleration method based on heterogeneous separation convolution kernel |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111612145A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113411583A (en) * | 2021-05-24 | 2021-09-17 | 西北工业大学 | Image compression method based on dimension splitting |
CN113762200A (en) * | 2021-09-16 | 2021-12-07 | 深圳大学 | Mask detection method based on LFFD |
CN113850368A (en) * | 2021-09-08 | 2021-12-28 | 深圳供电局有限公司 | Lightweight convolutional neural network model suitable for edge-end equipment |
- 2020-05-22: CN application CN202010442785.3A, published as CN111612145A, status: Pending
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113411583A (en) * | 2021-05-24 | 2021-09-17 | 西北工业大学 | Image compression method based on dimension splitting |
CN113411583B (en) * | 2021-05-24 | 2022-09-02 | 西北工业大学 | Image compression method based on dimension splitting |
CN113850368A (en) * | 2021-09-08 | 2021-12-28 | 深圳供电局有限公司 | Lightweight convolutional neural network model suitable for edge-end equipment |
CN113762200A (en) * | 2021-09-16 | 2021-12-07 | 深圳大学 | Mask detection method based on LFFD |
CN113762200B (en) * | 2021-09-16 | 2023-06-30 | 深圳大学 | Mask detection method based on LFD |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liu et al. | FDDWNet: a lightweight convolutional neural network for real-time semantic segmentation | |
Paszke et al. | Enet: A deep neural network architecture for real-time semantic segmentation | |
CN111612145A (en) | Model compression and acceleration method based on heterogeneous separation convolution kernel | |
CN114067153B (en) | Image classification method and system based on parallel double-attention light-weight residual error network | |
CN111639692A (en) | Shadow detection method based on attention mechanism | |
Li et al. | Depth-wise asymmetric bottleneck with point-wise aggregation decoder for real-time semantic segmentation in urban scenes | |
CN111091130A (en) | Real-time image semantic segmentation method and system based on lightweight convolutional neural network | |
CN111612017B (en) | Target detection method based on information enhancement | |
CN111028146A (en) | Image super-resolution method for generating countermeasure network based on double discriminators | |
CN111582044A (en) | Face recognition method based on convolutional neural network and attention model | |
CN111860683B (en) | Target detection method based on feature fusion | |
CN110866938B (en) | Full-automatic video moving object segmentation method | |
CN110598788A (en) | Target detection method and device, electronic equipment and storage medium | |
CN112037228A (en) | Laser radar point cloud target segmentation method based on double attention | |
CN113066089B (en) | Real-time image semantic segmentation method based on attention guide mechanism | |
CN111899203B (en) | Real image generation method based on label graph under unsupervised training and storage medium | |
CN112836651B (en) | Gesture image feature extraction method based on dynamic fusion mechanism | |
CN114155371A (en) | Semantic segmentation method based on channel attention and pyramid convolution fusion | |
CN113065426A (en) | Gesture image feature fusion method based on channel perception | |
CN114529982A (en) | Lightweight human body posture estimation method and system based on stream attention | |
CN114998756A (en) | Yolov 5-based remote sensing image detection method and device and storage medium | |
CN112989919B (en) | Method and system for extracting target object from image | |
CN118247645A (en) | Novel DDCE-YOLOv s model underwater image target detection method | |
CN113327227A (en) | Rapid wheat head detection method based on MobilenetV3 | |
CN113177546A (en) | Target detection method based on sparse attention module |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | |
Application publication date: 20200901 |