CN111401523A - Deep learning network model compression method based on network layer pruning - Google Patents

Deep learning network model compression method based on network layer pruning

Info

Publication number
CN111401523A
Authority
CN
China
Prior art keywords
layer
gamma
network
parameters
pruning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010177912.1A
Other languages
Chinese (zh)
Inventor
郭烈
高建东
赵剑
刘蓬勃
石振周
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology filed Critical Dalian University of Technology
Priority to CN202010177912.1A priority Critical patent/CN111401523A/en
Publication of CN111401523A publication Critical patent/CN111401523A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a deep learning network model compression method based on network layer pruning, which comprises the following steps: for a convolutional neural network that has completed training, import the trained weights and perform sparsity training on the BN layers, each of which has two trainable parameters, Gamma and Beta; the sparsification is completed over multiple iterations; obtain the Gamma parameters of each BN layer of the network, set a global channel pruning proportion, calculate a threshold for the Gamma parameters from the global pruning proportion, and set all Gamma parameters smaller than the threshold to zero; set the number I of shortcut network layers to be cut off, calculate the POZ of the Gamma parameters of BN layer_2 in each shortcut structure, and delete the I shortcut structures with the largest POZ; delete the convolution channels associated with the zeroed Gamma parameters of the remaining BN layers; and store the pruned network structure and parameters.

Description

Deep learning network model compression method based on network layer pruning
Technical Field
The invention relates to the technical field of compression and acceleration of deep learning convolutional neural network models, and in particular to a deep learning network model compression method based on network layer pruning.
Background
Compared with traditional image processing algorithms, deep convolutional neural networks achieve significantly higher accuracy. However, as the number of network layers grows, models become increasingly complex; when deployed on edge computing devices, network inference is slow and cannot meet real-time requirements.
Current model pruning and inference acceleration algorithms fall into unstructured pruning and structured pruning. Unstructured pruning does not change the structure of the network: the weights of the deep convolutional neural network are evaluated and nodes with smaller weights are pruned, but the pruned network is sparse, so acceleration can only be realized on specially designed hardware. Structured pruning operates at the level of convolution kernels (filters); it changes the structure of the network and obtains a measurable acceleration effect on different platforms in practical use. However, when the deep learning network is very deep, the acceleration obtained by channel pruning alone is limited, because the input and output of each network layer during computation incur hardware IO overhead.
Disclosure of Invention
To address the problems existing in the prior art, the invention discloses a deep learning network model compression method based on network layer pruning:
for a convolutional neural network that has completed training, import the trained weights and perform sparsity training on the BN layers;
two trainable parameters are set in each BN layer, namely Gamma and Beta:
y = Gamma · (x − μ) / √(σ² + ε) + Beta, where μ and σ² are the mini-batch mean and variance of the layer input x and ε is a small constant;
the Gamma coefficient in the BN layer is restrained in the training process, so that the Gamma coefficient approaches to 0, and the process of sparsification is completed through multiple iterations;
obtaining Gamma parameters of each BN layer of the network, setting a global channel pruning proportion, calculating a threshold value of the Gamma parameters according to the global pruning proportion, and setting all the Gamma parameters smaller than the threshold value to zero;
setting the number I of the pre-cut shortcut network layers, calculating the Gamma parameter POZ of the BN layer _2 in each shortcut structure,
sorting each short structure from large to small according to the POZ of the BN layer-2, deleting the short structure corresponding to the POZ sorted at the first I,
deleting convolution channels related to Gamma with other BN layers set to zero;
and storing the structure and parameters of the pruned network, and performing fine tuning training on the stored network if necessary.
Further, the Gamma parameter POZ is calculated in the following way:
POZ = (1/M) · Σ_{i=1}^{M} f(Gamma_i)
where M is the dimension of the BN layer's Gamma coefficient vector, f(x) = 1 when x = 0, and f(x) = 0 when x is non-zero.
By adopting the above technical scheme, the deep learning network model compression method based on network layer pruning reduces the IO time between network layers during model inference and improves computation speed, while having little impact on the accuracy of model inference.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the following drawings depict only some embodiments described in the present application, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the shortcut structure of a convolutional neural network to be pruned;
FIG. 2 is a flowchart of the deep learning network model compression method of the present invention.
Detailed Description
In order to make the technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings:
as shown in fig. 1, for a network layer with a shortcut structure, the importance of the network layer is measured, and the corresponding network layer with the shortcut structure is deleted through an importance comparison algorithm. The general convolutional neural network is an input layer, a hidden layer and an output layer, most structures in the hidden layer are a convolutional layer-BN layer-an activation function layer, a shortcut structure in the network layer is shown in FIG. 1, namely an input _0, a convolutional layer _1, a BN layer _1, an activation layer _1, a convolutional layer _2, a BN layer _2, an activation layer _2, an ADD layer, and an ADD layer characteristic diagram is obtained by adding the activation layer _2 and the input _0, and pruning of the network layer is carried out on the shortcut structure in the network.
As shown in fig. 2, a deep learning network model compression method based on network layer pruning specifically includes the following steps:
the training of the deep learning network is completed aiming at the network task of the user, the trained weight is introduced to carry out the sparse training aiming at the BN layer of the convolutional neural network,
the BN layer has two trainable parameters which are Gamma and Beta respectively,
y = Gamma · (x − μ) / √(σ² + ε) + Beta, where μ and σ² are the mini-batch mean and variance of the layer input x and ε is a small constant.
That is, the Gamma coefficients of the BN layers are constrained during training so that they approach 0. The sparsification process is completed over multiple iterations.
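The patent does not spell out the exact form of the constraint on Gamma. A common assumption for this kind of sparsity training (as in network-slimming-style methods) is an L1 penalty λ·Σ|Gamma| added to the task loss, whose subgradient λ·sign(Gamma) shrinks each coefficient toward zero at every iteration. The sketch below illustrates one such update step; the function name and the `lr`/`lam` values are illustrative.

```python
def l1_sparsify_step(gammas, task_grads, lr=0.01, lam=1e-4):
    """One gradient step that shrinks BN Gamma coefficients toward 0.

    Assumes an L1 penalty lam * sum(|gamma|) on top of the task loss;
    its subgradient is lam * sign(gamma). Repeated over many iterations,
    this drives unimportant Gammas to (near) zero.
    """
    def sign(g):
        return (g > 0) - (g < 0)
    return [g - lr * (dg + lam * sign(g)) for g, dg in zip(gammas, task_grads)]
```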
Obtain the Gamma parameters of each BN layer of the network, set a global channel pruning proportion, calculate a threshold for the Gamma parameters from the global pruning proportion, and set all Gamma parameters smaller than the threshold to zero.
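The step above can be sketched as follows. The helper names are illustrative; the sketch assumes the threshold is the value below which the chosen fraction of all |Gamma| values across the network falls, which is one natural reading of "calculating a threshold according to the global pruning proportion".

```python
def gamma_threshold(all_gammas, prune_ratio):
    """Global threshold: a fraction `prune_ratio` of all |Gamma| values
    collected from every BN layer falls below the returned value."""
    ranked = sorted(abs(g) for g in all_gammas)
    k = int(len(ranked) * prune_ratio)
    return ranked[k] if k < len(ranked) else float("inf")

def zero_small_gammas(gammas, threshold):
    # Gamma parameters under the global threshold are set to zero,
    # marking their channels as prunable.
    return [0.0 if abs(g) < threshold else g for g in gammas]
```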
Set the number I of shortcut network layers to be cut off, and calculate the POZ (percent of zeros) of the Gamma parameters of BN layer_2 in each shortcut structure:
POZ = (1/M) · Σ_{i=1}^{M} f(Gamma_i)
where M is the dimension of the BN layer's Gamma coefficient vector, f(x) = 1 when x = 0, and f(x) = 0 when x is non-zero.
Sort the shortcut structures by the POZ of their BN layer_2 from large to small, and delete the I shortcut structures with the largest POZ, thereby achieving network layer pruning.
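The POZ formula and the ranking step can be sketched together as follows; the function names are illustrative, and each inner list stands for the Gamma vector of one shortcut's BN layer_2.

```python
def poz(gammas):
    """POZ = (1/M) * sum(f(gamma_i)), with f(x) = 1 iff x == 0:
    the fraction of zeroed Gamma coefficients in a BN layer."""
    return sum(1 for g in gammas if g == 0) / len(gammas)

def shortcuts_to_delete(bn2_gammas_per_shortcut, num_to_delete):
    """Indices of the shortcut structures whose BN layer_2 Gamma vectors
    have the largest POZ; these I structures are the ones deleted."""
    ranked = sorted(range(len(bn2_gammas_per_shortcut)),
                    key=lambda i: poz(bn2_gammas_per_shortcut[i]),
                    reverse=True)
    return ranked[:num_to_delete]
```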
Delete the convolution channels associated with the zeroed Gamma parameters of the remaining BN layers, store the structure and parameters of the pruned network, and perform fine-tuning training on the stored network if necessary.
The above description is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, according to the technical solutions and inventive concept of the present invention and within the technical scope disclosed by the present invention, shall fall within the scope of the present invention.

Claims (2)

1. A deep learning network model compression method based on network layer pruning is characterized by comprising the following steps:
for a convolutional neural network that has completed training, importing the trained weights and performing sparsity training on the BN layers;
two trainable parameters are set in each BN layer, namely Gamma and Beta:
y = Gamma · (x − μ) / √(σ² + ε) + Beta, where μ and σ² are the mini-batch mean and variance of the layer input x and ε is a small constant;
the Gamma coefficients of the BN layers are constrained during training so that they approach 0, and the sparsification is completed over multiple iterations;
obtaining the Gamma parameters of each BN layer of the network, setting a global channel pruning proportion, calculating a threshold for the Gamma parameters from the global pruning proportion, and setting all Gamma parameters smaller than the threshold to zero;
setting the number I of shortcut network layers to be pruned, and calculating the POZ of the Gamma parameters of BN layer_2 in each shortcut structure;
sorting the shortcut structures by the POZ of their BN layer_2 from large to small, and deleting the shortcut structures corresponding to the first I POZ values;
deleting the convolution channels associated with the zeroed Gamma parameters of the remaining BN layers;
and storing the structure and parameters of the pruned network, and performing fine-tuning training on the stored network if necessary.
2. The deep learning network model compression method of claim 1, wherein the Gamma parameter POZ is calculated as follows:
POZ = (1/M) · Σ_{i=1}^{M} f(Gamma_i)
where M is the dimension of the BN layer's Gamma coefficient vector, f(x) = 1 when x = 0, and f(x) = 0 when x is non-zero.
CN202010177912.1A 2020-03-13 2020-03-13 Deep learning network model compression method based on network layer pruning Pending CN111401523A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010177912.1A CN111401523A (en) 2020-03-13 2020-03-13 Deep learning network model compression method based on network layer pruning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010177912.1A CN111401523A (en) 2020-03-13 2020-03-13 Deep learning network model compression method based on network layer pruning

Publications (1)

Publication Number Publication Date
CN111401523A 2020-07-10

Family

ID=71428721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010177912.1A Pending CN111401523A (en) 2020-03-13 2020-03-13 Deep learning network model compression method based on network layer pruning

Country Status (1)

Country Link
CN (1) CN111401523A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111882810A (en) * 2020-07-31 2020-11-03 广州市微智联科技有限公司 Fire identification and early warning method and system
CN111882810B (en) * 2020-07-31 2022-07-01 广州市微智联科技有限公司 Fire identification and early warning method and system
CN111931914A (en) * 2020-08-10 2020-11-13 北京计算机技术及应用研究所 Convolutional neural network channel pruning method based on model fine tuning
CN112132005A (en) * 2020-09-21 2020-12-25 福州大学 Face detection method based on cluster analysis and model compression
WO2022160856A1 (en) * 2021-01-27 2022-08-04 歌尔股份有限公司 Classification network, and method and apparatus for implementing same
CN114627342A (en) * 2022-03-03 2022-06-14 北京百度网讯科技有限公司 Training method, device and equipment of image recognition model based on sparsity
CN114841931A (en) * 2022-04-18 2022-08-02 西南交通大学 Real-time sleeper defect detection method based on pruning algorithm

Similar Documents

Publication Publication Date Title
CN111401523A (en) Deep learning network model compression method based on network layer pruning
CN110135580B (en) Convolution network full integer quantization method and application method thereof
CN112232476B (en) Method and device for updating test sample set
CN105389349B (en) Dictionary update method and device
CN109145898A (en) A kind of object detecting method based on convolutional neural networks and iterator mechanism
CN111460958B (en) Object detector construction and object detection method and system
CN112052951A (en) Pruning neural network method, system, equipment and readable storage medium
CN111222629A (en) Neural network model pruning method and system based on adaptive batch normalization
CN113283473B (en) CNN feature mapping pruning-based rapid underwater target identification method
CN110705708A (en) Compression method and device of convolutional neural network model and computer storage medium
CN112488313A (en) Convolutional neural network model compression method based on explicit weight
US20220245465A1 (en) Picture searching method and apparatus, electronic device and computer readable storage medium
CN112686376A (en) Node representation method based on timing diagram neural network and incremental learning method
CN111582229A (en) Network self-adaptive semi-precision quantized image processing method and system
CN103699612A (en) Image retrieval ranking method and device
CN111932690B (en) Pruning method and device based on 3D point cloud neural network model
CN110874635A (en) Deep neural network model compression method and device
CN114511731A (en) Training method and device of target detector, storage medium and electronic equipment
CN106776751A (en) The clustering method and clustering apparatus of a kind of data
CN116959477A (en) Convolutional neural network-based noise source classification method and device
CN112132062A (en) Remote sensing image classification method based on pruning compression neural network
CN111461324A (en) Hierarchical pruning method based on layer recovery sensitivity
CN116187416A (en) Iterative retraining method based on layer pruning sensitivity and image processor
CN111582442A (en) Image identification method based on optimized deep neural network model
CN116051961A (en) Target detection model training method, target detection method, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200710