CN113392970A - Automatic neural network pruning method based on feature map activation rate - Google Patents
Automatic neural network pruning method based on feature map activation rate
- Publication number
- CN113392970A (application CN202110528976.6A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- feature map
- layer
- convolution kernel
- activation rate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Abstract
The invention provides an automatic neural network pruning method based on the feature map activation rate, in the field of neural networks, as a novel automatic pruning algorithm. With the provided technical scheme, model compression can be performed for different tasks; the algorithm maximally compresses the model while ensuring accuracy, and the computational cost of the algorithm is reduced.
Description
Technical Field
The invention belongs to the field of neural networks, and particularly relates to an automatic neural network pruning method based on the feature map activation rate.
Background
Early neural network pruning algorithms usually required manually presetting the pruning rate of each layer and could not reach an optimal solution. With an automatic pruning algorithm, the optimal pruning rate of each layer can be searched by setting a single hyper-parameter, compressing the model size and accelerating inference as much as possible while preserving model accuracy.
Disclosure of Invention
In view of the above shortcomings of the prior art, the technical problem solved by the present invention is how to compress the model size and increase inference speed as much as possible while ensuring model accuracy.
To this end, the invention provides an automatic neural network pruning method based on the feature map activation rate: a novel automatic pruning algorithm that adaptively prunes the redundant filters of each layer according to a single hyper-parameter, thereby reducing the model's redundant information and greatly improving the model's inference speed and its operating efficiency on edge devices.
Further, the hyper-parameter is the activation rate A.
Further, A ranges from 0 to 1.
Further, when the activation rate of a feature map channel is smaller than A, the corresponding convolution kernel is judged to be a redundant convolution kernel.
Further, the redundant convolution kernels may be pruned from the network.
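The redundancy criterion above can be sketched in a few lines. The function names and the NumPy representation are illustrative assumptions, not part of the patent text:

```python
import numpy as np

def activation_rate(feature_map):
    """Fraction of non-zero entries in one channel's post-activation feature map."""
    return np.count_nonzero(feature_map) / feature_map.size

def is_redundant(feature_map, A):
    """A convolution kernel is judged redundant when its output channel's
    activation rate falls below the hyper-parameter A (0 < A < 1)."""
    return activation_rate(feature_map) < A

# Example: a channel that is mostly zero after the activation layer
fm = np.array([[0.0, 0.0],
               [0.0, 1.2]])
print(activation_rate(fm))      # 0.25: one of four entries is non-zero
print(is_redundant(fm, A=0.5))  # True: 0.25 < 0.5
```

With A acting as the single search knob, raising it prunes more aggressively and lowering it is more conservative.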
Further, the method comprises the steps of:
step 1, acquiring the characteristic diagram;
step 2, judging the redundant convolution kernel;
step 3, pruning.
Further, in step 1, data are input into the neural network model to be pruned for forward inference, obtaining the feature maps output by the convolutional layers.
Further, in step 2, each layer's feature map is passed through an activation layer, from top to bottom; the non-zero ratio of each channel of the output feature map is then calculated, and if the non-zero ratio of the ith channel is smaller than A, the ith convolution kernel of that convolutional layer is a redundant convolution kernel.
Further, the activation layer is a ReLU layer.
Further, in step 3, the redundant convolution kernels of each convolutional layer are pruned from the neural network model, and fine-tuning is then performed with the original training data, finally yielding the pruned model.
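Steps 1 to 3 can be sketched end to end as follows. The layer shapes, helper names, and toy data are assumptions for illustration, and the fine-tuning stage is omitted:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def find_redundant_kernels(feature_maps, A):
    """feature_maps: post-activation maps, shape (channels, H, W).
    Returns indices of channels whose non-zero ratio falls below A."""
    return [i for i, fm in enumerate(feature_maps)
            if np.count_nonzero(fm) / fm.size < A]

def prune_filters(weights, redundant):
    """weights: conv kernels, shape (out_channels, in_channels, k, k).
    Drops the output channels judged redundant."""
    keep = [i for i in range(weights.shape[0]) if i not in redundant]
    return weights[keep]

# Toy feature maps for a 3-filter layer: channel 1 never activates,
# channel 2 activates on half of its positions.
acts = np.zeros((3, 4, 4))
acts[0] = 1.0
acts[2, :, :2] = 1.0
redundant = find_redundant_kernels(relu(acts), A=0.3)  # [1]
weights = np.ones((3, 2, 3, 3))
pruned = prune_filters(weights, redundant)
print(redundant, pruned.shape)  # [1] (2, 2, 3, 3)
```

In a real model the same scan would run over every convolutional layer's activations, and the pruned network would then be fine-tuned on the original training data as the description states.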
Compared with the prior art, the invention has the following beneficial effects:
(1) The provided scheme searches the pruning rate of each layer through a single hyper-parameter, avoiding the limitation of traditional pruning techniques that the compression rate must be set manually, and thus compresses the model more thoroughly.
(2) While ensuring model accuracy, the algorithm greatly compresses the model parameters, reduces the amount of computation, and lowers the computational overhead at inference time.
The conception, specific structure, and technical effects of the present invention are further described below with reference to the accompanying drawings, so that its objects, features, and effects can be fully understood.
Drawings
FIG. 1 is a flowchart of an algorithm in an embodiment of the present invention.
Detailed Description
The present invention is described in detail below with reference to specific examples. The following examples will help those skilled in the art to further understand the invention, but do not limit it in any way. It should be noted that various changes and modifications can be made by those skilled in the art without departing from the spirit of the invention; all such variants fall within the scope of the invention.
As shown in FIG. 1, the algorithm first sets a hyper-parameter, the activation rate A, which ranges from 0 to 1. When the activation rate of a feature map channel is smaller than this value, the corresponding convolution kernel is judged to be redundant and can be cut from the network.
The specific operation is as follows:
Data are input into the neural network model to be pruned for forward inference, obtaining the feature maps output by the convolutional layers. From top to bottom, each layer's feature map is passed through an activation layer (typically a ReLU layer); the non-zero ratio of each channel of the output feature map is then calculated, and if the non-zero ratio of the ith channel is less than A, the ith convolution kernel of that convolutional layer is a redundant convolution kernel.
Finally, the redundant convolution kernels of each convolutional layer are pruned from the network model, and fine-tuning with the original training data yields the final pruned model.
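One implementation detail the description leaves implicit: when the ith kernel of a convolutional layer is removed, the ith input channel of the next layer's kernels must also be removed so that the layer shapes stay consistent. A minimal sketch, assuming NumPy weight tensors in (out_channels, in_channels, k, k) layout; the function name and shapes are illustrative assumptions:

```python
import numpy as np

def prune_consecutive_layers(w_curr, w_next, redundant):
    """w_curr: (out_c, in_c, k, k) weights of the layer being pruned.
    w_next: (out_c2, out_c, k, k) weights of the following layer.
    Removing output channel i from w_curr also removes input channel i
    from w_next, keeping the two layers shape-compatible."""
    keep = [i for i in range(w_curr.shape[0]) if i not in redundant]
    return w_curr[keep], w_next[:, keep]

w1 = np.ones((4, 3, 3, 3))  # 4 filters over 3 input channels
w2 = np.ones((8, 4, 3, 3))  # next layer consumes those 4 channels
w1p, w2p = prune_consecutive_layers(w1, w2, redundant=[1, 3])
print(w1p.shape, w2p.shape)  # (2, 3, 3, 3) (8, 2, 3, 3)
```

This channel propagation is why pruning is typically applied layer by layer from top to bottom, as the description specifies.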
The foregoing describes specific embodiments of the present invention. The invention is not limited to these embodiments; various changes or modifications may be made by those skilled in the art within the scope of the appended claims without departing from the spirit of the invention. Where no conflict arises, the embodiments of the present application and their features may be combined with one another.
Claims (10)
1. An automatic neural network pruning method based on feature map activation rate, relating to the field of neural networks and providing a novel automatic neural network pruning algorithm.
2. The method of claim 1, wherein the hyper-parameter is activation rate A.
3. The method of claim 2, wherein A ranges from 0 to 1.
4. The method of claim 3, wherein when the activation rate of a feature map is less than A, the corresponding convolution kernel is determined to be a redundant convolution kernel.
5. The method of claim 4, wherein the redundant convolution kernels are pruned from the network.
6. The method of claim 5, wherein the method comprises the steps of:
step 1, acquiring the characteristic diagram;
step 2, judging the redundant convolution kernel;
step 3, pruning.
7. The method as claimed in claim 6, wherein step 1 is to input data into the neural network model to be pruned for forward inference, obtaining the feature maps output by the convolutional layers.
8. The method according to claim 7, wherein step 2 is to pass each layer's feature map through the activation layer from top to bottom, then calculate the non-zero ratio of each channel of the output feature map; if the non-zero ratio of the ith channel is smaller than A, the ith convolution kernel of the convolutional layer is the redundant convolution kernel.
9. The method of claim 8, wherein the activation layer is a ReLU layer.
10. The method according to claim 9, wherein in step 3 the redundant convolution kernels of each convolutional layer are pruned from the neural network model, and fine-tuning is then performed with the original training data to obtain the pruned model.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110528976.6A | 2021-05-14 | 2021-05-14 | Automatic neural network pruning method based on feature map activation rate |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN113392970A | 2021-09-14 |
Family
ID=77617108
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202110528976.6A | Automatic neural network pruning method based on feature map activation rate | 2021-05-14 | 2021-05-14 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN113392970A |

2021-05-14: application CN202110528976.6A filed; status: Pending.
Legal Events

| Code | Title | Description |
|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210914 |