CN110689113A - Deep neural network compression method based on brain consensus initiative - Google Patents
Deep neural network compression method based on brain consensus initiative
- Publication number: CN110689113A
- Application number: CN201910885350.3A
- Authority: CN (China)
- Prior art keywords: channel, channels, neural network, layer, deep neural
- Prior art date: 2019-09-19
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/045: Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology; combinations of networks
- G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Abstract
The invention provides a deep neural network compression method based on brain consensus initiative. In the forward pass of each training iteration, the method screens a subset of important channels layer by layer in the convolutional layers and sets the activation values of the other channels to zero. Consequently, during back-propagation of the error, the gradients of the convolution kernels that generate these unimportant channels are zero, so those kernels are neither updated nor trained. Meanwhile, the update of the channel utility is embedded in the back-propagation of the error, and the link between channel utility and error is strengthened through a "consensus initiative" method. In each iteration the network selectively trains the convolution kernels corresponding to the effective channels, so that when training finishes the channels with high channel utility are retained, achieving channel pruning and deep neural network compression. The method greatly simplifies the general pipeline of existing deep neural network compression methods and is highly efficient.
Description
Technical Field
The invention relates to the field of artificial intelligence and neural network computing, and in particular to a deep neural network compression method based on brain consensus initiative.
Background
Over the years, the development of deep neural networks has driven a great revolution in the field of artificial intelligence. It is generally accepted that the performance of a deep neural network depends on its depth. However, a deep neural network tends to incur a large overhead in computation and storage. To make deep neural networks applicable to low-power devices such as mobile phones, their complexity must be reduced. Among the many model compression algorithms, channel pruning is a compression algorithm designed specifically for the convolutional layers of deep neural networks.
Channel pruning refers to a model compression algorithm that prunes the channels of the convolutional layers of a deep neural network. Through different strategies or methods, the channels that best express the input images are screened out, and the remaining channels are cut away, compressing the deep neural network model. A general channel pruning algorithm comprises three basic steps: training a redundant neural network; pruning it according to a certain rule; and retraining the pruned neural network to recover model performance. This pipeline is quite redundant, and current channel pruning algorithms focus on the significance or importance of each channel in isolation, ignoring the inherent links between channels.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a deep neural network compression method based on brain consensus initiative: a model compression method that performs deep neural network training and pruning simultaneously. Through consensus initiative, the channels with the best cooperativity and strongest expressiveness are selected from all the channels of a layer, and the remaining channels are pruned, realizing network compression.
The invention is realized by the following technical scheme: a deep neural network compression method based on brain consensus initiative specifically comprises the following steps:
(1) In each forward pass of deep neural network training, the channels of each layer are sorted from high to low according to the initialized channel utility value $u_k^l$; according to the set pruning rate, the activation values of the channels whose utility ranks above the pruning threshold are retained, and the activation values of the remaining channels of the layer are set to zero. Here $u_k^l$ is a long-term evaluation, maintained during training, of how important each channel of each layer is to the error of the deep neural network, where $l$ is the layer index and $k$ is the channel index within the layer.
(2) The normalized significance evaluation $\hat{s}_k^l$ is determined in the back-propagation process of deep neural network training, specifically as follows:
(2.1) In the back-propagation process of deep neural network training, the activation values of each channel are multiplied elementwise by their gradients, then accumulated and averaged to determine the significance evaluation $s_k^l$ of each channel:

$$s_k^l = \frac{1}{M} \sum_{m=1}^{M} \frac{\partial J}{\partial z_{k,m}^{l}}\, z_{k,m}^{l} \qquad (1)$$

where $J$ is the error function of the network, $z_{k,m}^{l}$ is the $m$-th activation value of the $k$-th channel of the $l$-th layer, and $M$ is the number of activation values in one channel of the $l$-th layer.
(2.2) The channel significance evaluation $s_k^l$ is normalized by the $L_2$ norm to obtain the normalized significance evaluation $\hat{s}_k^l$:

$$\hat{s}_k^{l} = \frac{\left| s_k^{l} \right|}{\lVert s^{l} \rVert_2} \qquad (2)$$
(3) Through a consensus initiative algorithm, the normalized significance evaluations of different channels are fused, taking the interaction between different channels into account.
(3.1) The product of the normalized significance evaluations $\hat{s}_i^l$ and $\hat{s}_j^l$ of each pair of channels is computed and averaged over the number of iterations to obtain the correlation between the two channels:

$$r_{ij}^{l} = \frac{1}{n_{ij}^{l}} \sum \hat{s}_i^{l}\, \hat{s}_j^{l} \qquad (3)$$

where $r_{ij}^{l}$ is the correlation between the $i$-th and $j$-th channels of the $l$-th layer, with value range 0 to 1, and $n_{ij}^{l}$ is the number of iterations of deep neural network training in which both channels participate.
(3.2) The normalized significance evaluations $\hat{s}_j^l$ of the other channels of the same layer are multiplied by the correlations $r_{kj}^l$ computed in (3.1) and summed, obtaining the fused significance evaluation value $\theta_k^l$:

$$\theta_k^{l} = \sum_{j} r_{kj}^{l}\, \hat{s}_j^{l} \qquad (4)$$
(3.3) $\theta_k^l$ is folded into the channel utility value $u_k^l$ initialized in step 1 through a moving-average strategy:

$$u_k^{l} \leftarrow \lambda\, u_k^{l} + (1-\lambda)\, \theta_k^{l} \qquad (5)$$

where $\lambda$ is an attenuation factor with value range 0 to 1, and $n$ is the number of iterations in which the channel participates in deep neural network training.
(4) Step 3 is performed cyclically, updating the channel utility values $u_k^l$ of all channels, until the deep neural network converges.
(5) After the deep neural network converges, the channels are sorted layer by layer according to the channel utility values $u_k^l$; under the preset pruning rate, the channels whose utility falls below the pruning threshold are pruned together with the convolution kernels that generate them, realizing model compression and acceleration.
Compared with the prior art, the invention has the following beneficial effects. During deep neural network training, the channels with strong expressive power for the input images are selectively identified and trained, and the learning and pruning processes of the network are combined, which greatly simplifies the pipeline of traditional neural network pruning algorithms and improves the efficiency of the compression algorithm. By introducing the consensus-initiative phenomenon observed among neurons in the brain, the internal relations among the neurons of the same layer are taken into account, so the pruned neural network retains high accuracy and its performance exceeds that of existing algorithms. The compression method is simple to implement and efficient, and the compressed model has high accuracy.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
As shown in fig. 1, the deep neural network compression method based on brain consensus initiative of the present invention specifically includes the following steps:
(1) The channel utility $u_k^l$ is a long-term evaluation, maintained during training, of how important each channel of each layer is to the error of the deep neural network, where $l$ is the layer index and $k$ is the channel index within the layer. Channels with high utility values are important to the neural network model; pruning them would strongly affect the training error and thereby degrade model performance. Therefore, in each forward pass of deep neural network training, the channels of each layer are sorted according to the initialized channel utility $u_k^l$; under the set pruning rate, the activation values of the channels whose utility ranks above the pruning threshold are retained, and the activation values of the remaining channels of the layer are set to zero. The pruning rate is the proportion of channels to be pruned among all channels; it ranges from 0 to 1 and is determined by jointly considering the performance loss and the compression yield of the deep neural network.
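The forward-pass selection can be sketched as follows in PyTorch-style Python. This is a minimal illustration, not the patented implementation: the tensor layout (`[N, C, H, W]`), the function name, and the `prune_rate` argument are assumptions.

```python
import torch

def select_channels(z: torch.Tensor, u: torch.Tensor, prune_rate: float):
    """Zero out the activations of low-utility channels.

    z: activations of one conv layer, shape [N, C, H, W]
    u: running channel-utility values, shape [C]
    prune_rate: fraction of channels to silence (0..1)
    """
    C = z.shape[1]
    n_keep = int(round(C * (1.0 - prune_rate)))
    # Indices of the n_keep channels with the highest utility.
    keep = torch.topk(u, n_keep).indices
    mask = torch.zeros(C, device=z.device)
    mask[keep] = 1.0
    # Broadcast the per-channel mask over the batch and spatial dimensions.
    return z * mask.view(1, C, 1, 1), mask
```

Because the silenced channels output exact zeros, the kernels that produce them receive zero gradient and are skipped by the weight update, which is the selective-training effect described above.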
(2) The normalized significance evaluation $\hat{s}_k^l$ is obtained in the back-propagation process of deep neural network training, specifically as follows:
(2.1) In the back-propagation process of deep neural network training, the activation values of each channel are multiplied elementwise by their gradients, then accumulated and averaged to determine the significance evaluation $s_k^l$ of each channel:

$$s_k^l = \frac{1}{M} \sum_{m=1}^{M} \frac{\partial J}{\partial z_{k,m}^{l}}\, z_{k,m}^{l} \qquad (1)$$

where $J$ is the error function of the network, $z_{k,m}^{l}$ is the $m$-th activation value of the $k$-th channel of the $l$-th layer, and $M$ is the number of activation values in one channel of the $l$-th layer.
(2.2) The channel significance evaluation $s_k^l$ is normalized by the $L_2$ norm to obtain the normalized significance evaluation $\hat{s}_k^l$:

$$\hat{s}_k^{l} = \frac{\left| s_k^{l} \right|}{\lVert s^{l} \rVert_2} \qquad (2)$$

where $\hat{s}_k^l$ is in the range 0 to 1.
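A minimal sketch of equations (1) and (2) under the reconstruction above. Averaging over the batch dimension together with the spatial positions is an assumption; the text only specifies averaging over a channel's $M$ activation values.

```python
import torch

def significance(z: torch.Tensor, grad_z: torch.Tensor, eps: float = 1e-12):
    """z, grad_z: [N, C, H, W] activations and their gradients dJ/dz for one layer."""
    # Mean of elementwise gradient-activation products per channel (eq. 1).
    s = (grad_z * z).mean(dim=(0, 2, 3))          # shape [C]
    # L2 normalization across the layer's channels; the absolute value
    # keeps the result in [0, 1] as stated after eq. (2).
    s_hat = s.abs() / (s.norm(p=2) + eps)
    return s_hat
```

Channels silenced in the forward pass contribute zero activations, so their significance evaluates to zero automatically.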
(3) Through a consensus initiative algorithm, the normalized significance evaluations of different channels are fused and the interactions between different channels are taken into account, which achieves cooperative selection of effective channels and improves the accuracy of the compressed neural network.
(3.1) The product of the normalized significance evaluations $\hat{s}_i^l$ and $\hat{s}_j^l$ of each pair of channels is computed and averaged over the number of iterations to obtain the correlation between the two channels:

$$r_{ij}^{l} = \frac{1}{n_{ij}^{l}} \sum \hat{s}_i^{l}\, \hat{s}_j^{l} \qquad (3)$$

where $r_{ij}^{l}$ is the correlation between the $i$-th and $j$-th channels of the $l$-th layer, with value range 0 to 1, and $n_{ij}^{l}$ is the number of iterations of deep neural network training in which both channels participate.
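The pairwise correlation of equation (3) can be maintained as an incremental mean, for example as below. The per-pair counter `n` and the definition of "participation" (a channel with nonzero normalized significance) are assumptions about how the averaging over iterations is realized.

```python
import torch

def update_correlation(R: torch.Tensor, n: torch.Tensor, s_hat: torch.Tensor):
    """R, n: [C, C] correlation matrix and per-pair iteration counters; s_hat: [C]."""
    active = (s_hat > 0).float()
    pair_active = torch.outer(active, active)     # 1 where both channels participated
    n_new = n + pair_active
    # Incremental mean: R <- R + (s_i * s_j - R) / n, applied only to active pairs.
    delta = (torch.outer(s_hat, s_hat) - R) * pair_active / n_new.clamp(min=1)
    return R + delta, n_new
```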
(3.2) The normalized significance evaluations $\hat{s}_j^l$ of the other channels of the same layer are multiplied by the correlations $r_{kj}^l$ computed in (3.1) and summed, obtaining the fused significance evaluation value $\theta_k^l$:

$$\theta_k^{l} = \sum_{j} r_{kj}^{l}\, \hat{s}_j^{l} \qquad (4)$$

The fused significance evaluation value $\theta_k^l$ takes into account the influence of the other channels of the same layer on the current channel and is the core of the consensus initiative algorithm.
(3.3) $\theta_k^l$ is folded into the channel utility value $u_k^l$ initialized in step 1 through a moving-average strategy:

$$u_k^{l} \leftarrow \lambda\, u_k^{l} + (1-\lambda)\, \theta_k^{l} \qquad (5)$$

where $\lambda$ is an attenuation factor with value range 0 to 1, and $n$ is the number of iterations in which the channel participates in deep neural network training. The attenuation factor causes every channel utility value to decay continuously as the number of iterations grows. If, in an update such as (3.3), the decay of a channel's utility due to the attenuation factor outweighs its increase from the last term of equation (5), the channel may not participate in training in the next iteration (its activation values are set to zero in step 1), which achieves the effect of channel "screening".
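Equations (4) and (5) then reduce to a matrix-vector product followed by an exponential moving average, as in this sketch; zeroing the diagonal of $R$ to restrict the fusion to the "other channels" of the layer is an interpretation of step (3.2).

```python
import torch

def update_utility(u: torch.Tensor, R: torch.Tensor, s_hat: torch.Tensor,
                   lam: float = 0.8):
    """u, s_hat: [C]; R: [C, C]; lam is the attenuation factor of eq. (5)."""
    # "Other channels" of the same layer: drop each channel's self-correlation.
    R_off = R - torch.diag(torch.diag(R))
    theta = R_off @ s_hat                  # eq. (4): consensus fusion
    return lam * u + (1.0 - lam) * theta   # eq. (5): moving average
```

With this form, a channel whose layer-mates stop agreeing with it receives a small $\theta_k^l$, and the $\lambda u_k^l$ term then shrinks its utility each iteration, producing the screening behavior described above.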
(4) Step 3 is performed cyclically, updating the channel utility values $u_k^l$ of all channels and thereby continually selecting the effective channels, until the deep neural network converges.
(5) After the deep neural network converges, the channels are sorted layer by layer according to the channel utility values $u_k^l$; under the preset pruning rate, the channels whose utility falls below the pruning threshold are pruned together with the convolution kernels that generate them, realizing model compression and acceleration. Throughout training, the method continually computes and updates the channel utility of each channel, meaning the criterion on which pruning depends is obtained during neural network training itself. The network can therefore be pruned directly when training finishes, which greatly simplifies the pipeline of general pruning methods and makes the method highly efficient.
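After convergence, the surviving kernels can be copied into a smaller convolution, for instance as follows; the `nn.Conv2d` reconstruction details are assumptions, and in a real network the input channels of each following layer must be reduced accordingly.

```python
import torch
import torch.nn as nn

def prune_conv(conv: nn.Conv2d, u: torch.Tensor, prune_rate: float) -> nn.Conv2d:
    """Build a compact conv keeping only the highest-utility output channels."""
    n_keep = int(round(conv.out_channels * (1.0 - prune_rate)))
    keep = torch.topk(u, n_keep).indices.sort().values
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        # Copy the kernels that generate the retained channels.
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned
```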
Examples
An example of the method is given below, compressing a VGG-16 deep neural network comprising 13 convolutional layers with channel counts [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512].
1. Given: an input data set or input picture $z_0$; the pruning rate of each layer $\{p_l = 0.5,\ 1 \le l \le 13\}$; the initialization model $\{\mathrm{conv}_l,\ 1 \le l \le 13\}$; the attenuation constant $\lambda \leftarrow 0.8$; and the maximum number of training iterations $I_{max}$. Since the method aims at compressing the convolutional-layer parameters of a deep neural network, the notation "conv" denotes only convolutional layers.
2. Initialize the iteration counter of neural network training $i \leftarrow 0$, the channel utility values of each layer $\{u_l \leftarrow 0,\ 1 \le l \le 13\}$, and the correlation matrix of each layer $\{R_l \leftarrow 0,\ 1 \le l \le 13\}$.
3. While the iteration number $i$ is less than the maximum number of iterations $I_{max}$, the method performs neural network training. One forward pass of the neural network executes the following steps layer by layer:
(3.1) Compute the activation values of the output channels of each layer: $z_l \leftarrow \mathrm{conv}_l(z_{l-1})$.
(3.2) Initialize a binary mask $m_l \leftarrow 0$; the mask indicates which channels are selected.
(3.3) First, sort $u_l$ from high to low. Among all the output channels of the current layer (their number is denoted $C_l$), the method preserves the activation values of the $C_l(1-p_l) = 0.5\,C_l$ channels with the highest channel utility; specifically, the mask values at the positions of these channels are set to 1.
(3.4) Multiply the channel mask and the channel activation values channel by channel, $z_l \leftarrow z_l \cdot m_l$, and feed the result to the next layer.
4. Compute the error $J$ of the final neural network output.
5. Perform the back-propagation process of the neural network; specifically, the following steps are executed layer by layer.
(5.2) Compute and normalize the significance evaluations described by equations (1) and (2).
(5.3) Calculate and update the counter appearing in equation (3): if both channels participated in this iteration, $n_{ij}^{l} \leftarrow n_{ij}^{l} + 1$; otherwise it remains unchanged.
(5.4) Update the correlation matrix $R_l$ according to equation (3).
(5.5) Update the fused importance evaluation of the channels using equation (4): $\theta_l \leftarrow R_l\, \hat{s}_l$.
(5.6) Update the channel utility described in equation (5): $u_l \leftarrow \lambda\, u_l + (1-\lambda)\, \theta_l$.
6. When the maximum number of training steps is reached or the neural network converges, prune, layer by layer according to the channel utility $u_l$, the lower-utility half of the channels (the pruning rate of each layer is 0.5) together with the convolution kernels that generate them. The remaining parameters are copied into a more compact model, thereby realizing joint training and pruning of the neural network.
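For illustration, the whole example can be condensed into the following loop, reusing the helper functions sketched in the Detailed Description. The data pipeline, the `convs`/`head` modules, the `criterion`/`optimizer` objects, and the omission of pooling and nonlinearities are all simplifying assumptions.

```python
import torch

channels = [64, 64, 128, 128, 256, 256, 256,
            512, 512, 512, 512, 512, 512]      # VGG-16 conv widths
L = 13
p = [0.5] * L                                  # per-layer pruning rates
u = [torch.zeros(C) for C in channels]         # channel utilities u_l
R = [torch.zeros(C, C) for C in channels]      # correlation matrices R_l
n = [torch.zeros(C, C) for C in channels]      # pair counters n_l

for i in range(I_max):                         # I_max, data_iter, convs, head,
    x, y = next(data_iter)                     # criterion, optimizer: assumed
    acts = [None] * L                          # to exist in surrounding code
    z = x
    for l, conv in enumerate(convs):           # steps (3.1)-(3.4): masked forward
        z, _ = select_channels(conv(z), u[l], p[l])
        z.retain_grad()
        acts[l] = z
    J = criterion(head(z), y)                  # step 4: final error J
    J.backward()                               # step 5: back-propagation
    for l in range(L):
        s_hat = significance(acts[l], acts[l].grad)         # eqs. (1)-(2)
        R[l], n[l] = update_correlation(R[l], n[l], s_hat)  # eq. (3)
        u[l] = update_utility(u[l], R[l], s_hat)            # eqs. (4)-(5)
    optimizer.step()
    optimizer.zero_grad()

pruned = [prune_conv(conv, u[l], p[l])         # step 6: prune after convergence
          for l, conv in enumerate(convs)]
```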
The table below shows the accuracy achieved by the method at different pruning (compression) rates, compared with other methods. When the floating-point operations (FLOPs) are compressed by about 35%, the method still reaches 93.78% accuracy, exceeding the common norm-based pruning methods. When the compression rate reaches 49.6%, the method maintains 93.68% accuracy, again exceeding the structured Bayesian pruning method, which reaches only 92.50%. When the compression rate reaches 75.2%, the compressed neural network still attains 92.72% recognition accuracy, with only 1.28% accuracy loss. The compression method can therefore directly convert a redundant and unwieldy neural network into a compact neural network with rich expressive power.
Comparison of accuracy of different methods
Claims (1)
1. A deep neural network compression method based on brain consensus initiative is characterized by comprising the following steps:
(1) In each forward pass of deep neural network training, the channels of each layer are sorted from high to low according to the initialized channel utility value $u_k^l$; according to the set pruning rate, the activation values of the channels whose utility ranks above the pruning threshold are retained, and the activation values of the remaining channels of the layer are set to zero. Here $u_k^l$ is a long-term evaluation, maintained during training, of how important each channel of each layer is to the error of the deep neural network, where $l$ is the layer index and $k$ is the channel index within the layer.
(2) The normalized significance evaluation $\hat{s}_k^l$ is determined in the back-propagation process of deep neural network training, specifically as follows:
(2.1) In the back-propagation process of deep neural network training, the activation values of each channel are multiplied elementwise by their gradients, then accumulated and averaged to determine the significance evaluation $s_k^l$ of each channel:

$$s_k^l = \frac{1}{M} \sum_{m=1}^{M} \frac{\partial J}{\partial z_{k,m}^{l}}\, z_{k,m}^{l} \qquad (1)$$

where $J$ is the error function of the network, $z_{k,m}^{l}$ is the $m$-th activation value of the $k$-th channel of the $l$-th layer, and $M$ is the number of activation values in one channel of the $l$-th layer.
(2.2) The channel significance evaluation $s_k^l$ is normalized by the $L_2$ norm to obtain the normalized significance evaluation $\hat{s}_k^l$:

$$\hat{s}_k^{l} = \frac{\left| s_k^{l} \right|}{\lVert s^{l} \rVert_2} \qquad (2)$$

where $\hat{s}_k^l$ is in the range 0 to 1.
(3) Through a consensus initiative algorithm, the normalized significance evaluations of different channels are fused, taking the interaction between different channels into account.
(3.1) The product of the normalized significance evaluations $\hat{s}_i^l$ and $\hat{s}_j^l$ of each pair of channels is computed and averaged over the number of iterations to obtain the correlation between the two channels:

$$r_{ij}^{l} = \frac{1}{n_{ij}^{l}} \sum \hat{s}_i^{l}\, \hat{s}_j^{l} \qquad (3)$$

where $r_{ij}^{l}$ is the correlation between the $i$-th and $j$-th channels of the $l$-th layer, with value range 0 to 1, and $n_{ij}^{l}$ is the number of iterations of deep neural network training in which both channels participate.
(3.2) The normalized significance evaluations $\hat{s}_j^l$ of the other channels of the same layer are multiplied by the correlations $r_{kj}^l$ computed in (3.1) and summed, obtaining the fused significance evaluation value $\theta_k^l$:

$$\theta_k^{l} = \sum_{j} r_{kj}^{l}\, \hat{s}_j^{l} \qquad (4)$$
(3.3) $\theta_k^l$ is folded into the channel utility value $u_k^l$ initialized in step 1 through a moving-average strategy:

$$u_k^{l} \leftarrow \lambda\, u_k^{l} + (1-\lambda)\, \theta_k^{l} \qquad (5)$$

where $\lambda$ is an attenuation factor with value range 0 to 1, and $n$ is the number of iterations in which the channel participates in deep neural network training.
(4) Step 3 is performed cyclically, updating the channel utility values $u_k^l$ of all channels, until the deep neural network converges.
(5) After the deep neural network converges, the channels are sorted layer by layer according to the channel utility values $u_k^l$; under the preset pruning rate, the channels whose utility falls below the pruning threshold are pruned together with the convolution kernels that generate them, realizing model compression and acceleration.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910885350.3A CN110689113A (en) | 2019-09-19 | 2019-09-19 | Deep neural network compression method based on brain consensus initiative |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110689113A (en) | 2020-01-14
Family
ID=69109619
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910885350.3A Pending CN110689113A (en) | 2019-09-19 | 2019-09-19 | Deep neural network compression method based on brain consensus initiative |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110689113A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021164752A1 (en) * | 2020-02-21 | 2021-08-26 | 华为技术有限公司 | Neural network channel parameter searching method, and related apparatus |
WO2022022625A1 (en) * | 2020-07-29 | 2022-02-03 | 北京智行者科技有限公司 | Acceleration method and device for deep learning model |
CN111931914A (en) * | 2020-08-10 | 2020-11-13 | 北京计算机技术及应用研究所 | Convolutional neural network channel pruning method based on model fine tuning |
WO2022178908A1 (en) * | 2021-02-26 | 2022-09-01 | 中国科学院深圳先进技术研究院 | Neural network pruning method and apparatus, and storage medium |
CN113283473A (en) * | 2021-04-20 | 2021-08-20 | 中国海洋大学 | Rapid underwater target identification method based on CNN feature mapping pruning |
CN113283473B (en) * | 2021-04-20 | 2023-10-13 | 中国海洋大学 | CNN feature mapping pruning-based rapid underwater target identification method |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200114 |