CN111401523A - Deep learning network model compression method based on network layer pruning - Google Patents
- Publication number: CN111401523A
- Application number: CN202010177912.1A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/045—Combinations of networks (G—Physics; G06—Computing; G06N—Computing arrangements based on specific computational models; G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks; G06N3/04—Architecture, e.g. interconnection topology)
- G06N3/08—Learning methods (G—Physics; G06—Computing; G06N—Computing arrangements based on specific computational models; G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks)
Abstract
The invention discloses a deep learning network model compression method based on network layer pruning, which comprises the following steps: for a trained convolutional neural network, import the trained weights and perform sparsity training on the BN (batch normalization) layers, each of which has two trainable parameters, Gamma and Beta; the sparsification is completed over multiple iterations. Obtain the Gamma parameters of each BN layer of the network, set a global channel pruning ratio, compute the Gamma threshold corresponding to that ratio, and set all Gamma parameters below the threshold to zero. Set the number I of shortcut network layers to be cut, compute the POZ (percent of zeros) from the Gamma of BN layer_2 in each shortcut, and delete the I shortcut structures with the largest POZ. Delete the convolution channels associated with the zeroed Gamma parameters of the remaining BN layers, and save the pruned network structure and parameters.
Description
Technical Field
The invention relates to the technical field of deep learning convolutional neural network model compression and acceleration, and in particular to a deep learning network model compression method based on network layer pruning.
Background
Compared with traditional image processing algorithms, deep convolutional neural networks achieve markedly higher accuracy. However, as the number of network layers grows, the model becomes increasingly complex; when deployed to edge computing devices, network inference is slow and cannot meet real-time requirements.
Current pruning-based inference acceleration algorithms fall into unstructured and structured pruning. Unstructured pruning does not change the network structure: the weights of the deep convolutional neural network are evaluated and nodes with small weights are removed, but the pruned network is sparse, so acceleration is only achievable on specially designed hardware. Structured pruning operates at the level of convolution kernels (filters), so it does change the network structure and yields a measurable speedup on different platforms in actual use. However, when the network is very deep, the speedup obtainable from channel pruning alone is limited, because the hardware IO caused by each network layer's inputs and outputs dominates the computation.
Disclosure of Invention
To address the problems in the prior art, the invention discloses a deep learning network model compression method based on network layer pruning:
for the convolutional neural network that has completed training, import the trained weights and perform sparsity training on the BN layers;
each BN layer has two trainable parameters, Gamma and Beta;
the Gamma coefficients of the BN layers are constrained during training so that they approach 0, and the sparsification is completed over multiple iterations;
obtain the Gamma parameters of each BN layer of the network, set a global channel pruning ratio, compute the Gamma threshold corresponding to that ratio, and set all Gamma parameters below the threshold to zero;
set the number I of shortcut network layers to be cut, and compute the POZ of the Gamma parameters of BN layer_2 in each shortcut structure;
sort the shortcut structures by the POZ of their BN layer_2 in descending order, and delete the shortcut structures corresponding to the first I POZ values;
delete the convolution channels associated with the zeroed Gamma parameters of the other BN layers;
and save the structure and parameters of the pruned network, performing fine-tuning training on the saved network if necessary.
Further, the POZ of the Gamma parameters is calculated as

POZ = (1/M) * Σ_{i=1}^{M} f(Gamma_i)

where M represents the Gamma coefficient dimension of the BN layer, f(x) = 1 when x = 0, and f(x) = 0 when x is non-zero.
By adopting the above technical scheme, the deep learning network model compression method based on network layer pruning reduces the inter-layer IO time during model inference and improves computation speed, while having little impact on inference accuracy.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments described in the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a convolutional neural network with the shortcut structure to be cut;
FIG. 2 is a flowchart of the deep learning network model compression method of the present invention.
Detailed Description
In order to make the technical solutions and advantages of the present invention clearer, the following describes the technical solutions in the embodiments of the present invention clearly and completely with reference to the drawings in the embodiments of the present invention:
as shown in fig. 1, for a network layer with a shortcut structure, the importance of the network layer is measured, and the corresponding network layer with the shortcut structure is deleted through an importance comparison algorithm. The general convolutional neural network is an input layer, a hidden layer and an output layer, most structures in the hidden layer are a convolutional layer-BN layer-an activation function layer, a shortcut structure in the network layer is shown in FIG. 1, namely an input _0, a convolutional layer _1, a BN layer _1, an activation layer _1, a convolutional layer _2, a BN layer _2, an activation layer _2, an ADD layer, and an ADD layer characteristic diagram is obtained by adding the activation layer _2 and the input _0, and pruning of the network layer is carried out on the shortcut structure in the network.
As shown in fig. 2, a deep learning network model compression method based on network layer pruning specifically includes the following steps:
the training of the deep learning network is completed aiming at the network task of the user, the trained weight is introduced to carry out the sparse training aiming at the BN layer of the convolutional neural network,
the BN layer has two trainable parameters which are Gamma and Beta respectively,
namely, the Gamma coefficient in the BN layer is restrained in the training process, so that the Gamma coefficient approaches 0. And completing the thinning process through multiple iterations.
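One such sparsification step can be sketched in plain Python: an L1 subgradient term shrinks each Gamma coefficient toward zero on every iteration. The learning rate `lr` and penalty strength `lam` are assumed hyperparameters, not values given in the patent:

```python
def sign(x):
    """Sign of x: -1, 0 or 1."""
    return (x > 0) - (x < 0)

def l1_sparsity_step(gammas, lr=0.1, lam=0.01):
    """One sparsity update, gamma <- gamma - lr * lam * sign(gamma),
    applied in addition to the normal gradient step (omitted here)."""
    return [g - lr * lam * sign(g) for g in gammas]

gammas = [0.5, -0.3, 0.0005, -0.0002]
for _ in range(100):  # the sparsification completes over multiple iterations
    gammas = l1_sparsity_step(gammas)
```

After many iterations, coefficients that started near zero stay pinned within one step size of zero, while larger coefficients shrink steadily, which is what makes the later thresholding effective.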
Obtain the Gamma parameters of each BN layer of the network, set a global channel pruning ratio, compute the Gamma threshold corresponding to that ratio, and set all Gamma parameters below the threshold to zero.
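A sketch of this global-threshold step in plain Python. The Gamma values appear here as one flat list; in practice they would be gathered from every BN layer of the network:

```python
def global_gamma_threshold(all_gammas, prune_ratio):
    """Sort |Gamma| across the whole network and take the value at the
    prune_ratio quantile as the global pruning threshold."""
    mags = sorted(abs(g) for g in all_gammas)
    k = int(len(mags) * prune_ratio)
    return mags[k] if k < len(mags) else float("inf")

def zero_below(gammas, threshold):
    """Set every Gamma whose magnitude is below the threshold to zero."""
    return [0.0 if abs(g) < threshold else g for g in gammas]

gammas = [0.9, 0.05, 0.5, 0.01, 0.3, 0.001, 0.7, 0.02]
thr = global_gamma_threshold(gammas, 0.5)  # prune the smallest 50%
pruned = zero_below(gammas, thr)
```

Because the threshold is computed over all layers at once, heavily sparsified layers lose more channels than layers whose Gammas stayed large.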
Set the number I of shortcut network layers to be cut, and compute the POZ (percent of zeros) of the Gamma parameters of BN layer_2 in each shortcut structure:

POZ = (1/M) * Σ_{i=1}^{M} f(Gamma_i)

where M represents the Gamma coefficient dimension of the BN layer, f(x) = 1 when x = 0, and f(x) = 0 when x is non-zero.
Sort the shortcut structures by the POZ of their BN layer_2 in descending order and delete the I shortcut structures with the largest POZ, thereby achieving pruning at the network layer level.
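The POZ computation and the ranking of shortcut structures can be sketched as follows (pure Python; the per-shortcut BN layer_2 Gamma vectors are illustrative values, not data from the patent):

```python
def poz(gammas):
    """Percent of zeros: POZ = (1/M) * sum_i f(gamma_i),
    where M is the Gamma dimension and f(x) = 1 if x == 0, else 0."""
    return sum(1 for g in gammas if g == 0) / len(gammas)

def shortcuts_to_prune(bn2_gammas_per_shortcut, num_to_cut):
    """Sort shortcut structures by the POZ of their BN layer_2 (descending)
    and return the indices of the I structures to delete."""
    ranked = sorted(range(len(bn2_gammas_per_shortcut)),
                    key=lambda i: -poz(bn2_gammas_per_shortcut[i]))
    return ranked[:num_to_cut]

bn2_gammas = [[0.0, 0.0, 0.4],   # POZ = 2/3
              [0.2, 0.5, 0.1],   # POZ = 0
              [0.0, 0.0, 0.0],   # POZ = 1
              [0.0, 0.3, 0.6]]   # POZ = 1/3
cut = shortcuts_to_prune(bn2_gammas, num_to_cut=2)
```

A high POZ means most output channels of that shortcut's BN layer_2 were zeroed by the sparsity training, so the whole shortcut structure contributes little and is the best candidate for deletion.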
Delete the convolution channels associated with the zeroed Gamma parameters of the other BN layers, save the structure and parameters of the pruned network, and perform fine-tuning training on the saved network if necessary.
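Finally, deleting the convolution channels tied to zeroed Gammas can be sketched on plain nested lists. Each inner list stands in for one output-channel filter; a real implementation would also have to adjust the input channels of the following layer:

```python
def prune_conv_channels(filters, gammas):
    """Keep only the output-channel filters whose BN Gamma is non-zero."""
    return [f for f, g in zip(filters, gammas) if g != 0]

filters = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]  # 3 illustrative filters
kept = prune_conv_channels(filters, gammas=[0.5, 0.0, 0.2])
```

The surviving network is then saved and, if accuracy dropped, fine-tuned for a few epochs.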
The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention, based on its technical solutions and inventive concept, shall be covered by the scope of protection of the present invention.
Claims (2)
1. A deep learning network model compression method based on network layer pruning is characterized by comprising the following steps:
for the convolutional neural network that has completed training, importing the trained weights and performing sparsity training on the BN layers;
each BN layer has two trainable parameters, Gamma and Beta;
the Gamma coefficients of the BN layers are constrained during training so that they approach 0, and the sparsification is completed over multiple iterations;
obtaining the Gamma parameters of each BN layer of the network, setting a global channel pruning ratio, computing the Gamma threshold corresponding to that ratio, and setting all Gamma parameters below the threshold to zero;
setting the number I of shortcut network layers to be cut, and computing the POZ of the Gamma parameters of BN layer_2 in each shortcut structure;
sorting the shortcut structures by the POZ of their BN layer_2 in descending order, and deleting the shortcut structures corresponding to the first I POZ values;
deleting the convolution channels associated with the zeroed Gamma parameters of the other BN layers;
and saving the structure and parameters of the pruned network, and performing fine-tuning training on the saved network if necessary.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010177912.1A CN111401523A (en) | 2020-03-13 | 2020-03-13 | Deep learning network model compression method based on network layer pruning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111401523A true CN111401523A (en) | 2020-07-10 |
Family
ID=71428721
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010177912.1A Pending CN111401523A (en) | 2020-03-13 | 2020-03-13 | Deep learning network model compression method based on network layer pruning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111401523A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111882810A (en) * | 2020-07-31 | 2020-11-03 | 广州市微智联科技有限公司 | Fire identification and early warning method and system |
CN111882810B (en) * | 2020-07-31 | 2022-07-01 | 广州市微智联科技有限公司 | Fire identification and early warning method and system |
CN111931914A (en) * | 2020-08-10 | 2020-11-13 | 北京计算机技术及应用研究所 | Convolutional neural network channel pruning method based on model fine tuning |
CN112132005A (en) * | 2020-09-21 | 2020-12-25 | 福州大学 | Face detection method based on cluster analysis and model compression |
WO2022160856A1 (en) * | 2021-01-27 | 2022-08-04 | 歌尔股份有限公司 | Classification network, and method and apparatus for implementing same |
CN114627342A (en) * | 2022-03-03 | 2022-06-14 | 北京百度网讯科技有限公司 | Training method, device and equipment of image recognition model based on sparsity |
CN114841931A (en) * | 2022-04-18 | 2022-08-02 | 西南交通大学 | Real-time sleeper defect detection method based on pruning algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
RJ01 | Rejection of invention patent application after publication ||
Application publication date: 20200710 |