CN113780550A - Convolutional neural network pruning method and device for quantizing feature map similarity - Google Patents
Convolutional neural network pruning method and device for quantizing feature map similarity
- Publication number
- CN113780550A (Application CN202110977310.9A)
- Authority
- CN
- China
- Prior art keywords
- pruning
- model
- similarity
- neural network
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a convolutional neural network pruning method and device that quantize feature map similarity, in the technical field of computer science. By quantifying the "information similarity" between feature maps, the convolution kernels corresponding to feature maps carrying similar information are pruned away; the model is then fine-tuned, and the process is iterated layer by layer to obtain a new model, reducing both the number of parameters and the storage size of the model. The accuracy of the final pruned model is essentially unchanged, while the parameter count is greatly reduced after pruning, so the model occupies less memory, requires less computing power, and computes faster; it can therefore be deployed on edge devices with limited computing resources in a more optimized way, and deep neural networks can be better applied in scenarios with limited computing resources, real-time online processing, and the like.
Description
Technical Field
The invention relates to the technical field of computer science, and in particular to a convolutional neural network pruning method and device for quantizing feature map similarity.
Background
As the performance of deep neural network models has grown, so have the depth and width of the networks, bringing the corresponding drawbacks of high storage and high power consumption, which severely restrict the application of deep neural networks in scenarios with limited computing resources or real-time online processing. A deep neural network model with parameters in the millions stores a large amount of redundant information. Compressing the original deep network model, reducing its parameters, and making it lightweight — so that it can be applied to edge devices with limited computing resources without loss of accuracy — has therefore become a focus of current attention.
For making neural networks lightweight, previously proposed methods mainly include parameter pruning, parameter sharing, low-rank decomposition, and knowledge distillation.
The non-patent literature (Mingbao Lin, et al., "HRank: Filter Pruning using High-Rank Feature Map," Proc. CVPR, 2020) uses the idea of statistical averaging to prune the corresponding convolution kernels by calculating the rank of each feature map, but it does not consider the similarity between feature maps.
Reducing the parameters of a deep neural network to make it lightweight, so that the network can be accurately applied to edge devices with limited computing resources, is currently the most important problem to be solved.
Disclosure of Invention
The invention provides a convolutional neural network pruning method and device that quantize feature map similarity, addressing the prior-art problems that a network cannot be accurately deployed on edge devices with limited computing resources and that convolutional neural networks need to be made lightweight.
To solve these technical problems, the invention provides the following technical scheme.
In one aspect, the invention provides a convolutional neural network pruning method for quantizing feature map similarity, comprising the following steps:
S1: According to the pruning compression ratio, perform a pre-pruning calculation on the neural network model to be pruned to obtain the number N_i2 of pruned convolution kernels in each layer of the convolutional neural network.
S2: Input the pictures of the picture data set into the model pruned up to layer L_{i-1}, and prune layer L_i: quantize the similarity between feature maps (Feature_map) by the SSIM method, and determine Delete_i according to the quantized feature-map similarities.
S3: Prune with the resulting Delete_i to obtain the pruned model Model_i, and fine-tune Model_i; increment i and repeat steps S2-S3 until the pruning operation is completed for all convolutional layers L_1…L_n, obtaining the final pruned convolutional neural network Model_n.
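The three steps above form a layer-by-layer loop. A minimal sketch of the control flow, with `prune_layer` and `fine_tune` as hypothetical stand-ins for the per-layer operations of S2-S3 (they are not names used by the patent):

```python
def prune_network(model, layer_filter_counts, compression_rate, prune_layer, fine_tune):
    """Layer-by-layer pruning loop (S1-S3 sketch).

    layer_filter_counts: original number of convolution kernels N_i per layer.
    prune_layer(model, i, k) -> model with k filters removed from layer i (S2).
    fine_tune(model)         -> retrained model (S3).
    """
    # S1: pre-pruning calculation of N_i2 for every layer
    n_prune = [int(compression_rate * n) for n in layer_filter_counts]
    # S2-S3: prune and fine-tune each convolutional layer in turn
    for i, k in enumerate(n_prune):
        model = prune_layer(model, i, k)   # remove the k most redundant filters
        model = fine_tune(model)           # recover accuracy before the next layer
    return model
```

The fine-tuning after every layer, rather than once at the end, is what lets each subsequent similarity measurement run on an already-recovered model.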
Optionally, in step S1, the pre-pruning calculation performed on the neural network model to be pruned according to the pruning compression ratio, to obtain the number N_i2 of pruned convolution kernels in each layer of the convolutional neural network (the well-trained model to be pruned being Model_0), comprises:
S11: Establish a picture data set Train for model pruning, where Train contains M pictures; N_0 denotes the number of channels of a picture, and X_0, Y_0 denote its height and width. Determine the total number of convolutional layers n.
S12: Determine the total number of convolution kernels N_i in each layer, and perform the pruning calculation on N_i according to the compression ratio to obtain the number N_i2 of pruned convolution kernels per layer.
Optionally, in step S12, performing the pruning calculation on the original number of convolution kernels N_i according to the compression ratio, to obtain the number N_i2 of pruned convolution kernels per layer, comprises:
Pre-prune the layers of the convolutional neural network in order 1 to n; each layer originally has N_i filters, of which N_i2 are cut after pruning. The set of N_i filters is {Filter_(i,1), …, Filter_(i,N_i)}, where K_i denotes the height and width of the convolution kernels.
Optionally, in step S2, inputting the pictures of the picture data set into the model pruned up to layer L_{i-1}, pruning layer L_i, quantizing the similarity between feature maps (Feature_map) by the SSIM method, and determining Delete_i according to the quantized feature-map similarities comprises:
S21: When convolutional layer L_i is to be pruned, the pruned model Model_{i-1} is available. Input the k-th picture of the picture data set into Model_{i-1}; the input and output of convolutional layer L_i are I_(i,k) and O_(i,k) respectively.
S23: Starting from i = 1, judge whether i ≤ n. If not, the iteration is finished and the lightweight model is output. If yes, define the set Delete_i as empty, set k = 1, and input the M pictures of Train into the pruned model.
S24: Take out the M Feature_map sets generated at the i-th layer for the M pictures of Train. Judge whether k ≤ M; if yes, calculate the similarity of any two Feature_maps in the k-th Feature_map set and the rank of each Feature_map in the k-th set, then increment k.
S25: Judge whether k > M; if not, repeat S24-S25. If yes, compute the statistical average similarity and the average rank, and arrange them from high to low to form the set SSIM_i and the set Rank_i.
S27: Judge whether the number of elements in Delete_i is less than N_i2. If yes, apply the preset screening conditions: when, in a highly similar pair, the feature map of Filter_(i,n) has the lower average rank, put Filter_(i,n) into Delete_i; when the feature map of Filter_(i,m) has the lower average rank, put Filter_(i,m) into Delete_i.
If not, the set Delete_i is obtained.
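The selection in S24-S27 can be sketched as follows — a simplified reading, assuming the rule "for the most similar pairs, delete the member whose feature map has the lower average rank" (the exact threshold conditions are given as formulas in the patent and are not reproduced here):

```python
import numpy as np

def select_filters_to_delete(avg_ssim, avg_rank, n_prune):
    """Pick filters to prune for one layer.

    avg_ssim: (N, N) symmetric matrix of average pairwise feature-map SSIM.
    avg_rank: (N,) vector of average feature-map ranks.
    """
    n = len(avg_rank)
    # enumerate all filter pairs, most similar first (the sorted SSIM_i set)
    pairs = [(a, b) for a in range(n) for b in range(a + 1, n)]
    pairs.sort(key=lambda p: avg_ssim[p], reverse=True)
    delete = []
    for a, b in pairs:
        if len(delete) >= n_prune:          # Delete_i has reached N_i2 elements
            break
        victim = a if avg_rank[a] <= avg_rank[b] else b  # lower rank = more redundant
        if victim not in delete:
            delete.append(victim)
    return delete
```

Because similar pairs are visited from highest SSIM downward, the most redundant filters are collected first, and the loop stops as soon as N_i2 filters have been gathered.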
Optionally, in step S3, pruning with the obtained Delete_i to obtain the pruned model Model_i and fine-tuning Model_i comprises:
S31: Prune away the filters corresponding to the Delete_i set, obtaining the model Model_i.
S32: Retrain and fine-tune the obtained Model_i, i.e. Model_i = FineTune(Model_{i-1} - Delete_i).
In one aspect, the invention provides a convolutional neural network pruning device for quantizing feature map similarity, applied to the method described above and comprising:
A pruning pre-judging module, for performing the pre-pruning calculation on the neural network model to be pruned according to the pruning compression ratio, obtaining the number N_i2 of pruned convolution kernels in each layer of the convolutional neural network.
A similarity calculation module, for inputting the pictures of the picture data set into the model pruned up to layer L_{i-1}, pruning layer L_i, quantizing the similarity between feature maps (Feature_map) by the SSIM method, and determining Delete_i according to the quantized feature-map similarities.
A pruning result calculation module, for pruning with the obtained Delete_i to obtain the pruned model Model_i, fine-tuning Model_i, incrementing i, and repeating the calculation until the pruning operation is completed for all convolutional layers L_1…L_n, obtaining the final pruned convolutional neural network Model_n.
Optionally, the similarity calculation module deletes redundant information in the convolutional neural network through the similarity between Feature_maps.
Optionally, the similarity between Feature_maps is determined by inputting the picture data set into the convolutional network to obtain data and calculating the statistical average of that data.
Optionally, an average rank calculation module clips the corresponding filter for feature maps with higher similarity and lower rank.
Optionally, the preset condition is: Filter_(i,m) and Filter_(i,n) both belong to Delete_i.
The technical scheme of the embodiments of the invention has at least the following beneficial effects:
In this scheme, the invention provides a convolutional neural network pruning method and device that quantize feature map similarity. By quantifying the information similarity between feature maps, the convolution kernels corresponding to feature maps with similar information are pruned away; fine tuning is then performed and the process iterated layer by layer. The accuracy of the resulting new model differs from that of the original model by less than 1%, while the parameter count and computation can be greatly reduced as required, shrinking the storage size of the model, making the deep neural network lightweight, and allowing deployment on edge devices with limited computing resources.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed for the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a general flowchart of a convolutional neural network pruning method to quantify feature map similarity according to an embodiment of the present invention;
FIG. 2 is a detailed flowchart of a convolutional neural network pruning method for quantifying similarity of feature maps according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a visualization result of convolutional layer output provided by an embodiment of the present invention;
fig. 4 is a schematic diagram of a process for deleting and optimizing redundant convolution kernels in a neural network convolution layer according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1, an embodiment of the present invention provides a convolutional neural network pruning method for quantizing feature map similarity, including:
S1: According to the pruning compression ratio, perform a pre-pruning calculation on the neural network model to be pruned to obtain the number N_i2 of pruned convolution kernels in each layer of the convolutional neural network.
S2: Input the pictures of the picture data set into the model pruned up to layer L_{i-1}, and prune layer L_i: quantize the similarity between feature maps (Feature_map) by the SSIM method, and determine Delete_i according to the quantized feature-map similarities.
S3: Prune with the resulting Delete_i to obtain the pruned model Model_i, and fine-tune Model_i; increment i and repeat steps S2-S3 until the pruning operation is completed for all convolutional layers L_1…L_n, obtaining the final pruned convolutional neural network Model_n.
Take the VGG16 (Visual Geometry Group 16, a network with 16 weight layers) neural network commonly used in image classification as an example: VGG16 contains 13 convolutional layers and 3 fully connected layers. The local features of the image to be classified are extracted by the parameters of the 13 convolutional layers, and the image is classified by the parameters of the 3 fully connected layers.
When extracting features — for layer Conv1-1 of VGG16, for example — the input dimension is 224×224×3; with 64 filters of dimension 3×3, the set of output feature maps has dimension 224×224×64.
All the extracted local features are then reassembled through the weight matrix of the fully connected layers of the VGG network; each position in the resulting output represents the predicted probability of the corresponding class, and classification is completed by judging which class has the highest probability. If the last convolution yields a feature map of 7×7×512, the fully connected layers flatten it into a one-dimensional vector and map it to a 1×4096 vector — the feature matrix T that provides the input to the classifier — and, based on the probabilities derived from this matrix, image classification is completed. The invention quantizes the information similarity between feature maps, prunes away the convolution kernels corresponding to feature maps with similar information, then fine-tunes and iterates layer by layer to obtain a new model, reducing the storage size of model parameters and of the model itself, so that the network can be accurately applied to edge devices with limited computing resources.
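As a quick sanity check on the dimensions above (a sketch; the layer shapes used are the standard VGG16 ones):

```python
import numpy as np

# Final VGG16 convolutional output: a 7 x 7 spatial grid with 512 channels.
feature_map = np.zeros((7, 7, 512))

# The fully connected stage first flattens this tensor into one long vector,
# which the first FC layer then maps down to 4096 values.
flat = feature_map.reshape(-1)
print(flat.shape[0])  # 7 * 7 * 512 = 25088
```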
As shown in fig. 2, in step S1 the pre-pruning calculation is performed on the neural network model to be pruned according to the pruning compression ratio, obtaining the number N_i2 of pruned convolution kernels in each layer of the convolutional neural network; the well-trained model to be pruned is Model_0. The method comprises:
S11: Establish a picture data set Train for model pruning, where Train contains M pictures; N_0 denotes the number of channels of a picture, and X_0, Y_0 denote its height and width. Determine the total number of convolutional layers n.
S12: Determine the total number of convolution kernels N_i in each layer, and perform the pruning calculation on N_i according to the compression ratio to obtain the number N_i2 of pruned convolution kernels per layer.
In step S12, performing the pruning calculation on the original number of convolution kernels N_i according to the compression ratio, to obtain the number N_i2 of pruned convolution kernels per layer, comprises:
Pre-prune the layers of the convolutional neural network in order 1 to n; each layer originally has N_i filters, of which N_i2 are cut after pruning. The set of N_i filters is {Filter_(i,1), …, Filter_(i,N_i)}, where K_i denotes the height and width of the convolution kernels.
In this embodiment, the variables are first declared. Take the VGG16 neural network Model_0 as an example, pruned with a compression rate of 0.3, using the 50,000 pictures of the cifar10 data set as the training set, each picture being 32×32 with 3 channels: Train = {Image_1, Image_2, …, Image_50000} ∈ R^{50000×3×32×32}.
The number of convolution kernels pruned per layer of VGG16 at a compression ratio of 0.3 is shown in Table 1:
TABLE 1

| Layer name | Original convolution kernel number | Number of pruned convolution kernels (N_i2) |
| --- | --- | --- |
| Block1_conv1 | 64 | 0 |
| Block1_conv2 | 64 | 19 |
| Block2_conv1 | 128 | 38 |
| Block2_conv2 | 128 | 38 |
| Block3_conv1 | 256 | 76 |
| Block3_conv2 | 256 | 76 |
| Block3_conv3 | 256 | 76 |
| Block4_conv1 | 512 | 153 |
| Block4_conv2 | 512 | 153 |
| Block4_conv3 | 512 | 153 |
| Block5_conv1 | 512 | 153 |
| Block5_conv2 | 512 | 153 |
| Block5_conv3 | 512 | 153 |
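The per-layer counts in Table 1 are consistent with simply truncating 0.3·N_i to an integer, with the first layer left unpruned — a sketch under that assumed rounding rule (the patent does not state the rule explicitly):

```python
vgg16_filters = {
    "Block1_conv1": 64, "Block1_conv2": 64,
    "Block2_conv1": 128, "Block2_conv2": 128,
    "Block3_conv1": 256, "Block3_conv2": 256, "Block3_conv3": 256,
    "Block4_conv1": 512, "Block4_conv2": 512, "Block4_conv3": 512,
    "Block5_conv1": 512, "Block5_conv2": 512, "Block5_conv3": 512,
}

def pruned_counts(filters, rate=0.3, skip_first=True):
    """N_i2 = floor(rate * N_i) per layer; the first layer is kept intact."""
    names = list(filters)
    return {name: 0 if (skip_first and name == names[0]) else int(rate * filters[name])
            for name in names}

counts = pruned_counts(vgg16_filters)
print(counts["Block1_conv2"], counts["Block2_conv1"], counts["Block4_conv1"])  # 19 38 153
```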
In step S2, inputting the pictures of the picture data set into the model pruned up to layer L_{i-1}, pruning layer L_i, quantizing the similarity by the SSIM (structural similarity index) method to obtain the similarity between feature maps (Feature_map), and determining Delete_i according to the quantized feature-map similarities comprises:
S21: When convolutional layer L_i is to be pruned, the pruned model Model_{i-1} is available. Input the k-th picture of the picture data set into Model_{i-1}; the input and output of convolutional layer L_i are I_(i,k) and O_(i,k) respectively.
S23: Starting from i = 1, judge whether i ≤ n. If not, the iteration is finished and the lightweight model is output. If yes, define the set Delete_i as empty, set k = 1, and input the M pictures of Train into the pruned model.
S24: Take out the M Feature_map sets generated at the i-th layer for the M pictures of Train. Judge whether k ≤ M; if yes, calculate the similarity of any two Feature_maps in the k-th Feature_map set and the rank of each Feature_map in the k-th set, then increment k.
S25: Judge whether k > M; if not, repeat S24-S25. If yes, compute the statistical average similarity and the average rank, and arrange them from high to low to form the set SSIM_i and the set Rank_i.
S27: Judge whether the number of elements in Delete_i is less than N_i2. If yes, apply the preset screening conditions: when, in a highly similar pair, the feature map of Filter_(i,m) has the lower average rank, put Filter_(i,m) into Delete_i; when the feature map of Filter_(i,n) has the lower average rank, put Filter_(i,n) into Delete_i.
If not, the set Delete_i is obtained.
In this example, pruning is performed in the order Block1_conv1 to Block5_conv3, according to Table 1.
For clarity, the example here starts with convolutional layer Block1_conv2, i.e. with the 2nd pruning. Set up Delete_2 for storing the cut filters, with N_22 = 19.
First calculate the statistical average of the structural similarity. The model at this point is Model_1, i.e. the model obtained after pruning convolutional layer Block1_conv1. Each time a picture Image_k is input into Model_1, the output of convolutional layer Block1_conv2 is O_(2,k); calculate the SSIM similarity of any two of its 64 Feature_maps x and y, with the calculation formula:

SSIM(x, y) = ((2·μ_x·μ_y + C_1)·(2·σ_xy + C_2)) / ((μ_x² + μ_y² + C_1)·(σ_x² + σ_y² + C_2))

where μ_x and μ_y are the means of the two feature maps, σ_x and σ_y are their standard deviations, σ_xy is the covariance between the two feature maps, and C_1, C_2 are small constants that stabilize the division. The statistical average of the structural similarity is then taken over the M input pictures for each pair of feature maps.
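A minimal whole-map version of this SSIM computation (a sketch: reference SSIM implementations usually work over sliding windows, and the constants C1, C2 used here are the conventional stabilizers, not values specified by the patent):

```python
import numpy as np

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM between two feature maps, computed from whole-map statistics."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()                 # sigma_x^2, sigma_y^2
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()       # sigma_xy
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den
```

Two identical feature maps give an SSIM of 1.0; feature maps with unrelated content score much lower, which is exactly the redundancy signal the pruning step relies on.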
Next calculate the statistical average of the rank: compute the rank of each feature map in O_(2,k); the statistical average rank of the m-th Feature_map_(2,m) in O_2 is then

Rank_(2,m) = (1/M) · Σ_{k=1}^{M} rank(Feature_map_(2,m,k)).
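The average-rank statistic can be sketched with `numpy.linalg.matrix_rank` as the rank estimator (an assumption — the patent does not name a specific routine):

```python
import numpy as np

def average_ranks(feature_maps):
    """feature_maps: array of shape (M, N, H, W) -- M pictures, N filters per layer.

    Returns the rank of each H x W feature map, averaged over the M pictures.
    """
    m_pics, n_filters = feature_maps.shape[:2]
    ranks = np.empty((m_pics, n_filters))
    for k in range(m_pics):
        for m in range(n_filters):
            ranks[k, m] = np.linalg.matrix_rank(feature_maps[k, m])
    return ranks.mean(axis=0)  # statistical average rank per filter
```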
Then prune. First arrange SSIM_2 from high to low, then combine it with Rank_2 to determine the filters to be pruned, with the conditions set as follows:
Condition 1: Filter_(2,m) and Filter_(2,n) both belong to Delete_2; conditions 2-4 compare the quantized similarity of the pair and the average ranks of the two feature maps.
If conditions 1, 2 and 3 are met simultaneously, put Filter_(2,n) into Delete_2; if conditions 1, 2 and 4 are met simultaneously, put Filter_(2,m) into Delete_2. Continue this operation until the number of filters in the Delete_2 set reaches 19, at which point Delete_2 is determined.
As shown in fig. 3, a schematic diagram of the visualization of the output of the second convolutional layer of the VGG16 convolutional neural network: when the input picture size is 32×32×3, the visualization of the 64 feature maps (after one pooling operation, the Feature_map dimension is 16×16) shows that the 31st and 51st Feature_maps are relatively similar, the 7th and 24th Feature_maps are relatively similar, the 27th and 45th Feature_maps are relatively similar, and so on.
As shown in fig. 4, the pruning process removes the convolution kernels corresponding to similar feature maps. Specifically, for the pruning of the second convolutional layer of VGG16, the layer has 64 convolution kernels before pruning and 45 convolution kernels remaining after pruning. The parameters of the pruned network model are reduced, and the amount of computation is reduced.
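The parameter saving for that single layer can be checked directly (a sketch; it assumes the standard count for a k×k convolution — C_in·k·k weights plus one bias per output kernel):

```python
def conv_params(out_channels, in_channels, k=3):
    """Weights + biases of a k x k convolutional layer."""
    return out_channels * (in_channels * k * k + 1)

before = conv_params(64, 64)   # Block1_conv2 before pruning: 36,928 parameters
after = conv_params(45, 64)    # 19 kernels removed, 45 remain: 25,965 parameters
print(before, after)
```

Note that pruning this layer also shrinks the input-channel count of the following layer, so the total saving across the whole network is larger than this per-layer figure.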
In the specific implementation, the similarity between feature maps (Feature_maps) used in the present invention may be calculated by SSIM, or by another method for quantizing similarity, such as PSNR (Peak Signal-to-Noise Ratio).
In step S3, pruning with the obtained Delete_i to obtain the pruned model Model_i and fine-tuning Model_i comprises:
S31: Prune away the filters corresponding to the Delete_i set, obtaining the model Model_i.
S32: Retrain and fine-tune the obtained Model_i, i.e. Model_i = FineTune(Model_{i-1} - Delete_i).
In this embodiment, the output model is fine-tuned. The model output after the 2nd pruning is Model_2, i.e.
Model_2 = FineTune(Model_1 - Delete_2).
After fine-tuning, increment i and repeat steps S2-S3, i.e. prune convolutional layer Block2_conv1: the third pruning.
Set up Delete_3 for storing the cut filters, with N_32 = 38.
Calculate the statistical average of the structural similarity. The model at this point is Model_2; each time a picture Image_k is input into Model_2, the output of convolutional layer Block2_conv1 is O_(3,k). Calculate the SSIM similarity of any two of the 128 Feature_maps, and take the statistical average of the structural similarity over the M pictures.
Calculate the statistical average of the rank: compute the rank of each feature map in O_(3,k); the statistical average rank of the m-th Feature_map_(3,m) in O_3 is obtained by averaging over the M pictures.
Then prune. First arrange SSIM_3 from high to low, then combine it with Rank_3 to determine the filters to be pruned, with the conditions set as follows:
Condition 1: Filter_(3,m) and Filter_(3,n) both belong to Delete_3;
If conditions 1, 2 and 3 are met simultaneously, put Filter_(3,n) into Delete_3; if conditions 1, 2 and 4 are met simultaneously, put Filter_(3,m) into Delete_3. Continue this operation until the number of filters in the Delete_3 set reaches 38, at which point Delete_3 is determined.
Fine-tune. The model output after the 3rd pruning is Model_3, i.e.
Model_3 = FineTune(Model_2 - Delete_3).
The above operations are repeated until Block5_conv3 pruning is completed, i.e. 13 prunings in all; the output model is the compressed model.
The accuracy and parameter count before and after pruning are shown in Table 2:

TABLE 2

| | Before pruning | After pruning |
| --- | --- | --- |
| Accuracy | 93.17% | 92.42% |
| Parameter count | 15,001,418 | 7,453,636 |
As can be seen from Table 2, the accuracy of the pruning model finally obtained by the proposed pruning method is essentially unchanged, while the parameter count after pruning is greatly reduced, so the model occupies less memory and requires less computing power; deployment on edge devices with limited computing resources can therefore be realized more optimally, and the deep neural network can be better applied in scenarios with limited computing resources, real-time online processing, and the like.
In this embodiment, for the task of classifying 3-channel color pictures of 32×32 pixels, the original model and the pruned model were deployed on a computer to predict picture categories. The parameters of the computer equipment and the software environment used are shown in Table 3, and the average execution time per picture is shown in Table 4; the results show that the method can be effectively applied to scenarios for accelerating convolutional neural network inference.
TABLE 3
TABLE 4
The method for accelerating the forward propagation of a neural network provided by the invention is applicable to application scenarios that accelerate convolutional neural network inference, in particular to the parts of technical processes such as image classification, target detection, and face recognition that involve inference acceleration of a convolutional neural network. The invention can reduce time delay while preserving the original effect, so that the results of image classification, target detection, face recognition, and the like are obtained more quickly.
The invention also provides a convolutional neural network pruning device for quantizing the similarity of the characteristic graphs, which is applied to the method and comprises the following steps:
a pruning pre-judging module for performing pre-pruning calculation on the image data set to be pruned according to the pruning compression ratio to obtain the number N of the pruned convolution kernels of each layer in the convolution neural networki2(ii) a The well-trained pruning Model is a Model0;
A similarity calculation module forBy inputting picture pairs in a picture dataset into the convolutional layer LiPruning is carried out; quantizing the similarity through an SSIM method to obtain the similarity between the quantized Feature maps Feature _ map; determining Delete according to similarity of quantized Feature map Feature _ mapi;
A pruning result calculation module for obtaining Delete through the obtained pruning resultiPruning is carried out to obtain a Model after pruningiAnd for ModeliFine tuning is carried out, i is increased automatically, and the pruning model is obtained by repeated calculation until the convolutional layer L is subjected to fine tuning1…LnPruning operation is completed; obtaining the final pruned convolutional neural network Modeln。
The similarity calculation module deletes redundant information in the convolutional neural network based on the similarity between Feature_maps.
The similarity between Feature_maps is determined by inputting the image data set into the convolutional network to obtain data and calculating the statistical average of the data.
The average rank calculation module prunes the Filter corresponding to a feature map with higher similarity and lower rank.
The preset conditions are as follows: Filter_(i,m) and Filter_(i,n) both belong to Delete_i.
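As a hedged sketch of this screening step — the exact preset condition formulas are omitted from this text, so the rule used below (in each highly similar pair, prune the filter whose average rank is lower) is an assumption standing in for them:

```python
import numpy as np

def select_delete_set(avg_ssim, ranks, n_prune):
    # avg_ssim[m][n]: average SSIM between feature maps m and n over the data set
    # ranks[m]: average rank of feature map m
    # Assumption: for each highly similar pair, the lower-rank (less informative)
    # filter is placed into Delete_i first, until N_i2 filters are selected.
    pairs = sorted(
        ((avg_ssim[m][n], m, n)
         for m in range(len(ranks)) for n in range(m + 1, len(ranks))),
        reverse=True,
    )
    delete = []
    for sim, m, n in pairs:
        if len(delete) >= n_prune:
            break
        victim = n if ranks[m] >= ranks[n] else m
        if victim not in delete:
            delete.append(victim)
    return sorted(delete)

sim = np.array([[1.0, 0.9, 0.2],
                [0.9, 1.0, 0.3],
                [0.2, 0.3, 1.0]])
print(select_delete_set(sim, ranks=[5, 2, 4], n_prune=1))  # -> [1]
```

Here feature maps 0 and 1 are the most similar pair, and map 1 has the lower average rank, so its Filter is the one selected for Delete_i.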
In a specific implementation, the similarity between the quantized feature maps (Feature_maps) used in the present invention may be calculated by SSIM, or by other similarity quantization methods such as Euclidean distance or PSNR (Peak Signal-to-Noise Ratio).
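For illustration, the three measures mentioned here — SSIM, Euclidean distance, and PSNR — can each be computed on a pair of feature maps. The sketch below uses a single-window (global) SSIM rather than the usual sliding-window form; that simplification, and the 8-bit data range, are assumptions made for brevity.

```python
import numpy as np

C1, C2 = (0.01 * 255) ** 2, (0.03 * 255) ** 2  # standard SSIM stabilizers for 8-bit range

def ssim_global(x, y):
    # single-window SSIM over the whole feature map (full SSIM slides a window)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))

def psnr(x, y, peak=255.0):
    mse = np.mean((x - y) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

def euclidean(x, y):
    return np.linalg.norm(x - y)

a = np.random.rand(8, 8) * 255      # a toy "feature map"
b = a + np.random.randn(8, 8)       # a slightly perturbed copy
print(ssim_global(a, a))            # identical maps -> 1.0
```

Higher SSIM/PSNR or lower Euclidean distance all indicate a more redundant feature-map pair, so any of them can drive the Delete_i selection.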
By quantifying the information similarity between feature maps, the present invention prunes the convolution kernels corresponding to feature maps carrying similar information, then performs fine-tuning, and obtains a new model through layer-by-layer iteration, thereby reducing the number of parameters and the storage size of the model, so that the network can be applied to edge devices with limited computing resources.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (10)
1. A convolutional neural network pruning method for quantizing feature map similarity is characterized by comprising the following steps:
S1: according to the pruning compression ratio, performing pre-pruning calculation on the neural network model to be pruned to obtain the number N_i2 of pruned convolution kernels in each layer of the convolutional neural network;
S2: inputting the pictures in the picture data set into the model pruned through layer L_i-1, and pruning layer L_i; quantizing similarity by the SSIM method to obtain the similarity between quantized feature maps Feature_map; determining Delete_i according to the similarity of the quantized feature maps;
S3: pruning according to the obtained Delete_i to obtain the pruned Model_i; fine-tuning Model_i and incrementing i; repeating steps S2-S3 until the pruning of convolutional layers L_1…L_n is completed, obtaining the final pruned convolutional neural network Model_n.
2. The convolutional neural network pruning method for quantizing feature map similarity according to claim 1, wherein in step S1, performing pre-pruning calculation on the neural network model to be pruned according to the pruning compression ratio to obtain the number N_i2 of pruned convolution kernels in each layer of the convolutional neural network, with the well-trained model to be pruned denoted Model_0, comprises:
S11: establishing a picture data set Train for model pruning, wherein Train comprises M pictures;
wherein R is the set of real numbers, N_0 represents the number of channels of a picture, and X_0, Y_0 represent the height and width of the picture, respectively; determining the total number of convolutional layers n;
S12: determining the total number N_i of convolution kernels in each layer, and performing pruning calculation on N_i according to the compression ratio to obtain the number N_i2 of pruned convolution kernels in each layer, wherein the initial value of i is 1.
3. The convolutional neural network pruning method for quantizing feature map similarity according to claim 2, wherein in step S12, performing pruning calculation on the original number N_i of convolution kernels according to the compression ratio to obtain the number N_i2 of pruned convolution kernels in each layer comprises:
pre-pruning the layers of the convolutional neural network in order from 1 to n, so that the original N_i Filters of each layer yield N_i2 pruned Filters after pruning; the N_i Filters form a set, wherein K_i represents the height and width of the convolution kernel.
4. The convolutional neural network pruning method for quantizing feature map similarity according to claim 1, wherein in step S2, inputting the pictures in the picture data set into the model pruned through layer L_i-1, pruning layer L_i, quantizing similarity by the SSIM method to obtain the similarity between quantized feature maps Feature_map, and determining Delete_i according to the similarity of the quantized feature maps comprises:
S21: when pruning the convolutional layer L_i, the pruned Model_i-1 is available; inputting the kth picture in the picture data set into the pruned Model_i-1, whereby the input and output corresponding to the convolutional layer L_i are obtained, respectively;
S23: with i initially equal to 1, judging whether i ≤ n; if not, ending the iteration and outputting the lightweight model; if yes, defining the Delete_i set as empty, defining k as 1, and inputting the M pictures in Train into the pruning model;
S24: taking out the M Feature_maps sets corresponding to the M pictures in Train generated by the ith layer; judging whether k ≤ M; if yes, calculating the similarity of any two Feature_maps in the kth Feature_maps set and the rank of each Feature_map in the kth Feature_maps set, and incrementing k, wherein m, n = 1, 2, …, N_i and m ≠ n;
S25: judging whether k > M; if not, repeating S24-S25; if yes, computing the average similarity and the average rank, sorted from high to low, to form the SSIM_i set and the Rank_i set;
S27: judging whether the number of elements in Delete_i is less than N_i2; if yes, screening by the preset conditions: when the first condition is satisfied, putting Filter_(i,n) into Delete_i; when the second condition is satisfied, putting Filter_(i,m) into Delete_i;
if not, the Delete_i set is obtained.
5. The convolutional neural network pruning method for quantizing feature map similarity according to claim 4, wherein in step S3, pruning according to the obtained Delete_i to obtain the pruned Model_i and fine-tuning Model_i comprises:
S31: pruning the Filters corresponding to the Delete_i set to obtain Model_i;
S32: for obtaining ModeliPerforming retraining trimming, i.e. ModeliFine tuning (Model)i-1-Deltei)。
6. A convolutional neural network pruning device for quantizing feature map similarity, applied to the method of any one of claims 1 to 5, comprising:
a pruning pre-judgment module, for performing pre-pruning calculation on the neural network model to be pruned according to the pruning compression ratio to obtain the number N_i2 of pruned convolution kernels in each layer of the convolutional neural network;
a similarity calculation module, for inputting the pictures in the picture data set into the model pruned through layer L_i-1 and pruning layer L_i; quantizing similarity by the SSIM method to obtain the similarity between quantized feature maps Feature_map; and determining Delete_i according to the similarity of the quantized feature maps;
a pruning result calculation module, for pruning according to the obtained Delete_i to obtain the pruned Model_i, fine-tuning Model_i and incrementing i, and repeating the calculation until the pruning of convolutional layers L_1…L_n is completed, obtaining the final pruned convolutional neural network Model_n.
7. The convolutional neural network pruning device for quantizing feature map similarity according to claim 6,
wherein the similarity calculation module deletes redundant information in the convolutional neural network based on the similarity between the Feature_maps.
8. The convolutional neural network pruning device for quantizing Feature map similarity according to claim 7, wherein between the Feature _ maps, the similarity is determined by inputting a picture data set into a convolutional network to obtain data and calculating a statistical average of the data.
9. The convolutional neural network pruning device for quantizing feature map similarity according to claim 6, wherein the average rank calculation module prunes the Filter corresponding to a feature map with higher similarity and lower rank.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110977310.9A CN113780550A (en) | 2021-08-24 | 2021-08-24 | Convolutional neural network pruning method and device for quantizing feature map similarity |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113780550A true CN113780550A (en) | 2021-12-10 |
Family
ID=78839044
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110977310.9A Withdrawn CN113780550A (en) | 2021-08-24 | 2021-08-24 | Convolutional neural network pruning method and device for quantizing feature map similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113780550A (en) |
Non-Patent Citations (1)
Title |
---|
WANG Z et al.: "Model pruning based on quantified similarity of feature maps", arXiv:2105.06052, page 3 |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114154589A (en) * | 2021-12-13 | 2022-03-08 | 成都索贝数码科技股份有限公司 | Similarity-based module branch reduction method |
CN114154589B (en) * | 2021-12-13 | 2023-09-29 | 成都索贝数码科技股份有限公司 | Module branch reduction method based on similarity |
CN114677545A (en) * | 2022-03-29 | 2022-06-28 | 电子科技大学 | Lightweight image classification method based on similarity pruning and efficient module |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110516596B (en) | Octave convolution-based spatial spectrum attention hyperspectral image classification method | |
Singh et al. | Play and prune: Adaptive filter pruning for deep model compression | |
CN109614979B (en) | Data augmentation method and image classification method based on selection and generation | |
CN112308158A (en) | Multi-source field self-adaptive model and method based on partial feature alignment | |
CN110728224A (en) | Remote sensing image classification method based on attention mechanism depth Contourlet network | |
CN109272500B (en) | Fabric classification method based on adaptive convolutional neural network | |
CN111696101A (en) | Light-weight solanaceae disease identification method based on SE-Inception | |
CN111325165B (en) | Urban remote sensing image scene classification method considering spatial relationship information | |
CN110175628A (en) | A kind of compression algorithm based on automatic search with the neural networks pruning of knowledge distillation | |
CN106845529A (en) | Image feature recognition methods based on many visual field convolutional neural networks | |
CN110533022B (en) | Target detection method, system, device and storage medium | |
CN114118402A (en) | Self-adaptive pruning model compression algorithm based on grouping attention mechanism | |
CN113780550A (en) | Convolutional neural network pruning method and device for quantizing feature map similarity | |
CN110008853B (en) | Pedestrian detection network and model training method, detection method, medium and equipment | |
CN111833322B (en) | Garbage multi-target detection method based on improved YOLOv3 | |
CN112101364B (en) | Semantic segmentation method based on parameter importance increment learning | |
CN112233129A (en) | Deep learning-based parallel multi-scale attention mechanism semantic segmentation method and device | |
CN113420651B (en) | Light weight method, system and target detection method for deep convolutional neural network | |
CN114742997A (en) | Full convolution neural network density peak pruning method for image segmentation | |
Zhang et al. | A channel pruning algorithm based on depth-wise separable convolution unit | |
CN112967296B (en) | Point cloud dynamic region graph convolution method, classification method and segmentation method | |
CN114882234A (en) | Construction method of multi-scale lightweight dense connected target detection network | |
CN114882278A (en) | Tire pattern classification method and device based on attention mechanism and transfer learning | |
CN112263224B (en) | Medical information processing method based on FPGA edge calculation | |
Geng et al. | Pruning convolutional neural networks via filter similarity analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication |
Application publication date: 20211210 |