CN113392970A - Automatic neural network pruning method based on feature map activation rate - Google Patents

Automatic neural network pruning method based on feature map activation rate

Info

Publication number
CN113392970A
Authority
CN
China
Prior art keywords
neural network
feature map
layer
convolution kernel
activation rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110528976.6A
Other languages
Chinese (zh)
Inventor
张晋侨
姜晓栋
顾成飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Panchip Microelectronics Co ltd
Original Assignee
Shanghai Panchip Microelectronics Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Panchip Microelectronics Co ltd filed Critical Shanghai Panchip Microelectronics Co ltd
Priority to CN202110528976.6A priority Critical patent/CN113392970A/en
Publication of CN113392970A publication Critical patent/CN113392970A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention provides an automatic neural network pruning method based on the feature map activation rate, which relates to the field of neural networks and provides a novel automatic pruning algorithm. With the technical scheme provided by the invention, model compression can be carried out for different tasks; the algorithm compresses the model as much as possible while guaranteeing accuracy, and reduces the computational cost at inference time.

Description

Automatic neural network pruning method based on feature map activation rate
Technical Field
The invention belongs to the field of neural networks, and particularly relates to an automatic neural network pruning method based on the feature map activation rate.
Background
Early neural network pruning algorithms usually require the pruning rate of each layer to be preset manually, and therefore cannot reach an optimal solution. With an automatic pruning algorithm, a suitable pruning rate for each layer can be searched by setting a single hyper-parameter, so that the model size is compressed and the inference speed is increased as much as possible while the model accuracy is guaranteed.
Disclosure of Invention
In view of the above defects of the prior art, the technical problem to be solved by the present invention is how to compress the model size and increase the inference speed as much as possible while ensuring the model accuracy.
To achieve this purpose, the invention provides an automatic neural network pruning method based on the feature map activation rate. The method, which relates to the field of neural networks, is a novel automatic pruning algorithm that adaptively prunes the redundant filters of each layer according to a single hyper-parameter, thereby reducing the redundant information in the model and greatly improving the inference speed of the model and its operating efficiency on edge devices.
Further, the hyper-parameter is the activation rate A.
Further, A is in the range of 0 to 1.
Further, when the activation rate of a feature map is smaller than A, the corresponding convolution kernel is judged to be a redundant convolution kernel.
Further, the redundant convolution kernels may be pruned from the network.
Further, the method comprises the steps of:
step 1, acquiring the feature maps;
step 2, judging the redundant convolution kernels;
step 3, pruning.
Further, in step 1, data are input into the neural network model to be pruned for forward inference, and the feature maps output by the convolutional layers are obtained.
Further, in step 2, the feature maps of each layer are passed through an activation layer from top to bottom, and the non-zero ratio of each channel of the output feature map is calculated; if the non-zero ratio of the i-th channel is smaller than A, the i-th convolution kernel of the convolutional layer is the redundant convolution kernel.
Further, the activation layer is a ReLU layer.
Further, in step 3, the redundant convolution kernels of each convolutional layer are pruned from the neural network model, and fine-tuning is then performed with the original training data to finally obtain the pruned model.
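As a concrete illustration of the redundancy test described above, the following minimal NumPy sketch (not part of the patent; function and variable names are illustrative) computes the post-ReLU non-zero ratio of each channel of a single feature map and flags the channels whose ratio falls below the activation rate A:

```python
import numpy as np

def redundant_channels(feature_map: np.ndarray, A: float) -> list:
    """Return indices of channels whose post-ReLU non-zero ratio is below A.

    feature_map: array of shape (C, H, W), the output of one convolutional layer.
    A: the activation-rate hyper-parameter, with 0 < A < 1.
    """
    activated = np.maximum(feature_map, 0.0)  # ReLU activation layer
    # Non-zero ratio per channel: fraction of activated entries that are > 0.
    ratios = (activated > 0).reshape(activated.shape[0], -1).mean(axis=1)
    return [i for i, r in enumerate(ratios) if r < A]

# Example: channel 0 is fully positive, channel 1 is fully negative (dies under ReLU).
fm = np.stack([np.ones((4, 4)), -np.ones((4, 4))])
print(redundant_channels(fm, A=0.5))  # → [1]
```

The i-th output channel corresponds one-to-one with the i-th convolution kernel, which is why a low-activation channel identifies a prunable kernel.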
Compared with the prior art, the invention has the following beneficial effects:
(1) With the technical scheme provided by this proposal, the pruning rate of each layer can be searched through a single hyper-parameter, which avoids the limitation of traditional pruning techniques that the compression rate must be set manually, and compresses the model more thoroughly.
(2) In addition, while guaranteeing the model accuracy, the algorithm can greatly compress the model parameters, reduce the amount of computation, and lower the computational overhead during inference.
The conception, specific structure and technical effects of the present invention are further described below with reference to the accompanying drawings, so that the objects, features and effects of the invention can be fully understood.
Drawings
FIG. 1 is a flowchart of an algorithm in an embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but do not limit the invention in any way. It should be noted that various changes and modifications that are obvious to those skilled in the art can be made without departing from the spirit of the invention; all of these fall within the scope of protection of the present invention.
As shown in FIG. 1, the algorithm first sets a hyper-parameter, the activation rate A, which ranges from 0 to 1. When the activation rate of a feature map is smaller than this value, the corresponding convolution kernel is judged to be a redundant convolution kernel that can be pruned from the network.
The specific operation is as follows:
and inputting the data into a neural network model to be pruned for forward reasoning to obtain a characteristic diagram output by the convolutional layer. From top to bottom, each layer of feature map is passed through an active layer (typically a Relu layer), then a non-zero ratio of each channel of the output feature map is calculated, and if the non-zero ratio of the ith channel is less than A, the ith convolution kernel of the convolutional layer is a redundant convolution kernel.
Finally, the redundant convolution kernels of each convolutional layer are pruned from the network model, and fine-tuning is then performed with the original training data to finally obtain the pruned model.
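The pruning step can be viewed as weight surgery on consecutive convolutional layers: removing the i-th kernel of one layer also removes the i-th input channel of the next. The patent does not prescribe an implementation, so the following NumPy sketch is an assumption, with illustrative names and shapes:

```python
import numpy as np

def prune_conv_pair(w_cur, b_cur, w_next, redundant):
    """Remove redundant output kernels from the current conv layer and the
    matching input channels from the following layer.

    w_cur:  (C_out, C_in, k, k) weights of the layer being pruned.
    b_cur:  (C_out,) biases of that layer.
    w_next: (C_next, C_out, k, k) weights of the following conv layer.
    redundant: indices of kernels whose activation rate fell below A.
    """
    keep = [i for i in range(w_cur.shape[0]) if i not in set(redundant)]
    # Slice out the kept kernels, and the matching input channels downstream.
    return w_cur[keep], b_cur[keep], w_next[:, keep]

w1 = np.zeros((4, 3, 3, 3))   # 4 kernels over 3 input channels
b1 = np.zeros(4)
w2 = np.zeros((8, 4, 3, 3))   # next layer consumes 4 channels
w1p, b1p, w2p = prune_conv_pair(w1, b1, w2, redundant=[1, 3])
print(w1p.shape, b1p.shape, w2p.shape)  # → (2, 3, 3, 3) (2,) (8, 2, 3, 3)
```

After this surgery the network is a genuinely smaller model; fine-tuning on the original training data then recovers any accuracy lost by the removal, as step 3 describes.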
The foregoing describes specific embodiments of the present invention. It is to be understood that the invention is not limited to the specific embodiments described above; various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments of the present application, and the features of those embodiments, may be combined with one another arbitrarily provided they do not conflict.

Claims (10)

1. An automatic neural network pruning method based on feature map activation rate relates to the field of neural networks, and provides a novel automatic neural network pruning algorithm.
2. The method of claim 1, wherein the hyper-parameter is activation rate A.
3. The method of claim 2, wherein A is in the range of 0 to 1.
4. The method of claim 3, wherein when the activation rate of a feature map is less than A, the convolution kernel is determined to be a redundant convolution kernel.
5. The method of claim 4, wherein the redundant convolution kernels are pruned from the network.
6. The method of claim 5, wherein the method comprises the steps of:
step 1, acquiring the characteristic diagram;
step 2, judging the redundant convolution kernel;
step 3, pruning.
7. The method as claimed in claim 6, wherein in step 1, data are input into the neural network model to be pruned for forward inference, and the feature maps output by the convolutional layers are obtained.
8. The method according to claim 7, wherein in step 2, the feature maps of each layer are passed through the activation layer from top to bottom, and the non-zero ratio of each channel of the output feature map is calculated; if the non-zero ratio of the i-th channel is smaller than A, the i-th convolution kernel of the convolutional layer is the redundant convolution kernel.
9. The method of claim 8, wherein the activation layer is a ReLU layer.
10. The method according to claim 9, wherein in step 3, the redundant convolution kernels of each convolutional layer are pruned from the neural network model, and fine-tuning is then performed with the original training data to obtain the pruned model.
CN202110528976.6A 2021-05-14 2021-05-14 Automatic neural network pruning method based on feature map activation rate Pending CN113392970A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110528976.6A CN113392970A (en) 2021-05-14 2021-05-14 Automatic neural network pruning method based on feature map activation rate

Publications (1)

Publication Number Publication Date
CN113392970A (en) 2021-09-14

Family

ID=77617108

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110528976.6A Pending CN113392970A (en) 2021-05-14 2021-05-14 Automatic neural network pruning method based on feature map activation rate

Country Status (1)

Country Link
CN (1) CN113392970A (en)

Similar Documents

Publication Publication Date Title
CN110874631B (en) Convolutional neural network pruning method based on feature map sparsification
CN111489364B (en) Medical image segmentation method based on lightweight full convolution neural network
CN111738401A (en) Model optimization method, grouping compression method, corresponding device and equipment
CN110363297A (en) Neural metwork training and image processing method, device, equipment and medium
CN107292458B (en) Prediction method and prediction device applied to neural network chip
CN109214353B (en) Training method and device for rapid detection of face image based on pruning model
CN111814973B (en) Memory computing system suitable for neural ordinary differential equation network computing
CN112733964B (en) Convolutional neural network quantization method for reinforcement learning automatic perception weight distribution
CN112052951A (en) Pruning neural network method, system, equipment and readable storage medium
CN110909874A (en) Convolution operation optimization method and device of neural network model
CN114154646A (en) Efficiency optimization method for federal learning in mobile edge network
CN113177580A (en) Image classification system based on channel importance pruning and binary quantization
CN112861996A (en) Deep neural network model compression method and device, electronic equipment and storage medium
CN110555518A (en) Channel pruning method and system based on feature map importance score
CN113392970A (en) Automatic neural network pruning method based on feature map activation rate
CN109389216A (en) The dynamic tailor method, apparatus and storage medium of neural network
CN112766397A (en) Classification network and implementation method and device thereof
CN114004327A (en) Adaptive quantization method of neural network accelerator suitable for running on FPGA
CN112949814A (en) Compression and acceleration method and device of convolutional neural network and embedded equipment
CN116187420A (en) Training method, system, equipment and medium for lightweight deep neural network
WO2023045297A1 (en) Image super-resolution method and apparatus, and computer device and readable medium
CN113516240A (en) Neural network structured progressive pruning method and system
CN114372565A (en) Target detection network compression method for edge device
CN112381206A (en) Deep neural network compression method, system, storage medium and computer equipment
CN113033628A (en) Self-adaptive neural network compression method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210914