CN110689113A - Deep neural network compression method based on brain consensus initiative - Google Patents

Deep neural network compression method based on brain consensus initiative

Info

Publication number
CN110689113A
CN110689113A (application CN201910885350.3A)
Authority
CN
China
Prior art keywords
channel
channels
neural network
layer
deep neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910885350.3A
Other languages
Chinese (zh)
Inventor
申世博
李荣鹏
张宏纲
赵志峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201910885350.3A
Publication of CN110689113A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The invention provides a deep neural network compression method based on brain consensus initiative. In each forward pass of neural network training, it screens a subset of important channels layer by layer in the convolutional layers and sets the activation values of the other channels to zero. During back propagation of the error, the gradients of the convolution kernels that generate these unimportant channels are therefore zero, so those kernels are neither updated nor trained. Meanwhile, the update of the channel utility is embedded in the back propagation of the error, and the link between channel utility and error is strengthened through the "consensus initiative" method. Each training iteration selectively trains only the convolution kernels corresponding to the effective channels, so that when training finishes, the channels with high channel utility are retained, realizing channel pruning and deep neural network compression. The method greatly simplifies the general flow of existing deep neural network compression methods and is highly efficient.

Description

Deep neural network compression method based on brain consensus initiative
Technical Field
The invention relates to the field of artificial intelligence and neural network computing, in particular to a deep neural network compression method based on brain consensus initiative.
Background
Over the years, the development of deep neural networks has driven a great revolution in the field of artificial intelligence. It is generally accepted that the performance of a deep neural network depends on its depth. However, a deep neural network tends to incur large computation and storage overheads. To make deep neural networks applicable to low-power devices such as mobile phones, their complexity must be reduced. Among the many model compression algorithms, channel pruning is a compression algorithm designed specifically for the convolutional layers of deep neural networks.
Channel pruning is a model compression algorithm that prunes the channels of the convolutional layers of a deep neural network. Through different strategies or methods, the channels that best express the input images are screened out, and the remaining channels are cut away, compressing the deep neural network model. A general channel pruning algorithm comprises three basic steps: train a redundant neural network; prune it according to some rule; and retrain the pruned neural network to recover model performance. This pipeline is quite redundant, and current channel pruning algorithms focus on the significance or importance of each channel in isolation, ignoring the inherent links between channels.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a deep neural network compression method based on brain consensus initiative: a model compression method that performs deep neural network training and pruning simultaneously. Through consensus initiative, the channels with the best cooperativity and strongest expressiveness are selected from all channels of a layer, and the remaining channels are pruned, realizing network compression.
The invention is realized by the following technical scheme: a deep neural network compression method based on brain consensus initiative specifically comprises the following steps:
(1) In each forward pass of deep neural network training, for the channels of each layer, arrange the channels from high to low according to the initialized channel utility value $u_k^l$; according to the set pruning rate, retain the activation values of the channels whose utility values rank above the pruning threshold, and set the channel activation values of the remaining channels of the layer to zero. Here $u_k^l$ is a long-term evaluation, maintained throughout deep neural network training, of how important the k-th channel of the l-th layer is to the error of the deep neural network, where l is the layer index and k is the channel index within the layer.
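To make step (1) concrete, the following minimal NumPy sketch shows one way such utility-based channel selection could be implemented; the function and variable names are ours, not the patent's, and a single sample of shape (C, H, W) is assumed:

```python
import numpy as np

def select_channels(z, u, prune_rate):
    """Zero out the activations of low-utility channels (step 1, sketched).

    z: activations of one layer, shape (C, H, W)
    u: channel utility values, shape (C,)
    prune_rate: fraction of channels whose activations are zeroed
    """
    C = u.shape[0]
    n_keep = int(np.ceil(C * (1.0 - prune_rate)))
    keep = np.argsort(u)[::-1][:n_keep]     # indices of the highest-utility channels
    mask = np.zeros(C, dtype=z.dtype)
    mask[keep] = 1.0
    return z * mask[:, None, None], mask    # broadcast the mask over H and W
```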
(2) Determine the normalized significance evaluation $\hat{s}_k^l$ in the back-propagation process of deep neural network training, specifically:
(2.1) During back propagation, multiply every activation value of a channel by its gradient, accumulate the products, and average them to determine the significance evaluation $s_k^l$ of each channel:
$$s_k^l = \frac{1}{M}\sum_{m=1}^{M} \frac{\partial J}{\partial z_{k,m}^{l}}\, z_{k,m}^{l} \qquad (1)$$
where J is the error function of the network, $z_{k,m}^{l}$ is the m-th activation value of the k-th channel of the l-th layer, and M is the number of activation values in one channel of the l-th layer.
(2.2) Normalize the channel significance evaluation $s_k^l$ by the L2 norm to obtain the normalized significance evaluation $\hat{s}_k^l$:
$$\hat{s}_k^l = \frac{\left|s_k^l\right|}{\left\|s^l\right\|_2} \qquad (2)$$
where $\hat{s}_k^l$ ranges from 0 to 1.
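Equations (1) and (2) can be sketched under the same assumptions (per-channel activations and gradients of shape (C, H, W); names are ours; a small epsilon guards against division by zero):

```python
import numpy as np

def channel_significance(z, dJ_dz, eps=1e-12):
    """Equations (1)-(2): per-channel significance and its L2-normalized form.

    z, dJ_dz: activations and their gradients, each of shape (C, H, W)
    returns s of shape (C,) and s_hat of shape (C,) with values in [0, 1]
    """
    C = z.shape[0]
    s = (dJ_dz * z).reshape(C, -1).mean(axis=1)     # eq. (1): mean of gradient*activation
    s_hat = np.abs(s) / (np.linalg.norm(s) + eps)   # eq. (2): L2 normalization
    return s, s_hat
```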
(3) Through the consensus initiative algorithm, fuse the normalized significance evaluations of different channels, taking the interaction between different channels into account.
(3.1) Compute the correlation $R_{i,j}^l$ between two channels by multiplying their normalized significance evaluations $\hat{s}_i^l$ and $\hat{s}_j^l$ and averaging the product over the iterations in which both participate:
$$R_{i,j}^l = \frac{1}{n_{i,j}^l}\sum_{t=1}^{n_{i,j}^l} \hat{s}_i^l(t)\,\hat{s}_j^l(t) \qquad (3)$$
where $R_{i,j}^l$ is the correlation between the i-th and j-th channels of the l-th layer, ranging from 0 to 1, and $n_{i,j}^l$ is the number of deep neural network training iterations in which the two channels have participated.
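A hedged sketch of the running-average update behind equation (3), assuming the per-pair counter only advances when both channels were selected in the current iteration (as the counter update in step (5.3) of the example below suggests); all names are ours:

```python
import numpy as np

def update_correlation(R, n, s_hat, mask):
    """Equation (3): running average of the product of normalized significances.

    R: (C, C) float correlation matrix; n: (C, C) float per-pair iteration counters
    s_hat: (C,) normalized significances of this iteration
    mask: (C,) 1.0 for channels selected this iteration, 0.0 otherwise
    """
    active = np.outer(mask, mask).astype(bool)   # pairs where both channels participated
    prod = np.outer(s_hat, s_hat)
    n[active] += 1
    # incremental mean on active pairs: R <- R + (prod - R) / n
    R[active] += (prod[active] - R[active]) / n[active]
    return R, n
```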
(3.2) Multiply the normalized significance evaluations $\hat{s}_j^l$ of the other channels of the same layer by the correlations $R_{k,j}^l$ computed in (3.1) and sum, obtaining the fused significance evaluation $\theta_k^l$:
$$\theta_k^l = \sum_{j=1}^{C_l} R_{k,j}^l\, \hat{s}_j^l \qquad (4)$$
where $C_l$ is the number of channels of the l-th layer.
(3.3) Incorporate $\theta_k^l$ into the initial channel utility value $u_k^l$ of step 1 through a moving-average strategy:
$$u_k^l(n) = \lambda\, u_k^l(n-1) + (1-\lambda)\, \theta_k^l(n) \qquad (5)$$
where λ is an attenuation factor ranging from 0 to 1, and n is the number of iterations in which the channel has participated in deep neural network training.
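Equations (4) and (5) reduce to a matrix-vector product followed by an exponential moving average; a minimal sketch (names ours):

```python
import numpy as np

def update_utility(u, R, s_hat, lam=0.8):
    """Equations (4)-(5): fuse significances across the layer, then update utility.

    u: (C,) channel utilities; R: (C, C) correlations; s_hat: (C,) normalized significances
    lam: attenuation factor in (0, 1)
    """
    theta = R @ s_hat                      # eq. (4): correlation-weighted sum over channels
    return lam * u + (1.0 - lam) * theta   # eq. (5): exponential moving average
```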
(4) Perform the above steps cyclically, updating the channel utility values $u_k^l$ of all channels, until the deep neural network converges.
(5) After the deep neural network converges, arrange the channels layer by layer according to the channel utility value $u_k^l$; according to the preset pruning rate, prune the channels whose utility values fall below the pruning threshold, together with the convolution kernels that generate them, realizing model compression and acceleration.
Compared with the prior art, the invention has the following beneficial effects. In the deep neural network training process, channels with strong expressive power for the input images are selectively identified and trained, and the learning and pruning processes of the deep neural network are combined, which greatly simplifies the flow of traditional neural network pruning algorithms and improves the efficiency of the compression algorithm. By introducing the consensus initiative phenomenon observed among neurons in the brain, the internal relations among neurons in the same layer of the neural network are taken into account, so the pruned neural network retains high accuracy and outperforms existing algorithms. The compression method is simple to implement, highly efficient, and yields a compressed model of high accuracy.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
As shown in fig. 1, the deep neural network compression method based on brain consensus initiative of the present invention specifically comprises the following steps:
(1) The channel utility $u_k^l$ is a long-term evaluation, maintained during deep neural network training, of how important the k-th channel of the l-th layer is to the error of the deep neural network, where l is the layer index and k is the channel index within the layer. Channels with high channel utility values are important to the neural network model; pruning them would strongly affect the training error and degrade model performance. Therefore, in each forward pass of deep neural network training, for the channels of each layer, arrange the channels from high to low according to the initialized channel utility value $u_k^l$; according to the set pruning rate, retain the activation values of the channels whose utility values rank above the pruning threshold, and set the channel activation values of the remaining channels of the layer to zero. The pruning rate is the proportion of pruned channels among all channels; it ranges from 0 to 1 and is chosen by weighing the performance loss of the deep neural network against the compression gain.
(2) Obtain the normalized significance evaluation $\hat{s}_k^l$ in the back-propagation process of deep neural network training, specifically:
(2.1) During back propagation, multiply every activation value of a channel by its gradient, accumulate the products, and average them to determine the significance evaluation $s_k^l$ of each channel:
$$s_k^l = \frac{1}{M}\sum_{m=1}^{M} \frac{\partial J}{\partial z_{k,m}^{l}}\, z_{k,m}^{l} \qquad (1)$$
where J is the error function of the network, $z_{k,m}^{l}$ is the m-th activation value of the k-th channel of the l-th layer, and M is the number of activation values in one channel of the l-th layer.
(2.2) Normalize the channel significance evaluation $s_k^l$ by the L2 norm to obtain the normalized significance evaluation $\hat{s}_k^l$:
$$\hat{s}_k^l = \frac{\left|s_k^l\right|}{\left\|s^l\right\|_2} \qquad (2)$$
where $\hat{s}_k^l$ ranges from 0 to 1.
(3) Through the consensus initiative algorithm, fuse the normalized significance evaluations of different channels and take the interaction between different channels into account; this achieves a cooperative selection of effective channels and improves the accuracy of the compressed neural network.
(3.1) Compute the correlation $R_{i,j}^l$ between two channels by multiplying their normalized significance evaluations $\hat{s}_i^l$ and $\hat{s}_j^l$ and averaging the product over the iterations in which both participate:
$$R_{i,j}^l = \frac{1}{n_{i,j}^l}\sum_{t=1}^{n_{i,j}^l} \hat{s}_i^l(t)\,\hat{s}_j^l(t) \qquad (3)$$
where $R_{i,j}^l$ is the correlation between the i-th and j-th channels of the l-th layer, ranging from 0 to 1, and $n_{i,j}^l$ is the number of deep neural network training iterations in which the two channels have participated.
(3.2) Multiply the normalized significance evaluations $\hat{s}_j^l$ of the other channels of the same layer by the correlations $R_{k,j}^l$ computed in (3.1) and sum, obtaining the fused significance evaluation $\theta_k^l$:
$$\theta_k^l = \sum_{j=1}^{C_l} R_{k,j}^l\, \hat{s}_j^l \qquad (4)$$
The fused significance evaluation $\theta_k^l$ takes into account the influence of the other channels of the same layer on the current channel and is the core of the consensus initiative algorithm.
(3.3) Incorporate $\theta_k^l$ into the initial channel utility value $u_k^l$ of step 1 through a moving-average strategy:
$$u_k^l(n) = \lambda\, u_k^l(n-1) + (1-\lambda)\, \theta_k^l(n) \qquad (5)$$
where λ is an attenuation factor ranging from 0 to 1, and n is the number of iterations in which the channel has participated in the deep neural network. The effect of the attenuation factor is that every channel utility value decays continuously as the number of iterations grows. If, during an update such as (3.3), the increase contributed by the last term of equation (5) is smaller than the decay caused by the attenuation factor, the channel's utility falls, and the channel may not participate in training during the next iteration (its activation values are set to zero in step 1), achieving the effect of channel "screening".
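As a small numeric illustration of this screening effect (values chosen arbitrarily, not from the patent):

```python
lam, u, theta = 0.8, 0.50, 0.10
u_next = lam * u + (1 - lam) * theta
# u_next == 0.42 < 0.50: the gain (0.02) is smaller than the decay (0.10),
# so the channel's utility falls and it may drop out of the selection in step 1.
```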
(4) Perform the above steps cyclically, updating the channel utility values $u_k^l$ of all channels and thereby continuously selecting the effective channels, until the deep neural network converges.
(5) After the deep neural network converges, arrange the channels layer by layer according to the channel utility value $u_k^l$; according to the preset pruning rate, prune the channels whose utility values fall below the pruning threshold, together with the convolution kernels that generate them, realizing model compression and acceleration. During training, the method continuously computes and updates the channel utility value of each channel; that is, the criterion on which network pruning depends is obtained during neural network training itself. Network pruning can therefore be carried out directly when training finishes, which greatly simplifies the flow of general pruning methods and makes the method highly efficient.
Examples
An example of the method is given below. The network to be compressed is VGG-16, which comprises 13 convolutional layers with channel counts [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512].
1. Given: an input data set or input picture $z_0$; the pruning rate of each layer $\{p_l \le 0.5,\ 1 \le l \le 13\}$; the initialized model $\{\mathrm{conv}_l,\ 1 \le l \le 13\}$; the attenuation constant $\lambda \leftarrow 0.8$; and the maximum number of training iterations $I_{max}$. Since the method aims at compressing the convolutional-layer parameters of a deep neural network, the notation "conv" denotes only convolutional layers.
2. Initialize the iteration counter of neural network training $i \leftarrow 0$, the channel utility values of each layer $\{u^l \leftarrow 0,\ 1 \le l \le 13\}$, and the correlation matrix of each layer $\{R^l \leftarrow 0,\ 1 \le l \le 13\}$.
3. While the iteration number $i$ is less than the maximum number of iterations $I_{max}$, the method trains the neural network. One forward pass of the neural network executes the following steps layer by layer:
(3.1) Compute the activation values of the output channels of each layer: $z^l \leftarrow \mathrm{conv}_l(z^{l-1})$.
(3.2) Initialize a binary mask $m^l \leftarrow 0$; the mask indicates which channels are selected.
(3.3) Sort $u^l$ from high to low. Among all output channels of the current layer ($C_l$ denotes their number), retain the activation values of the $C_l(1-p_l) = 0.5\,C_l$ channels with the highest channel utility; concretely, set the mask values at the positions of these channels to 1.
(3.4) Multiply the channel mask and the channel activation values channel by channel, $z^l \leftarrow z^l \cdot m^l$, and feed the result to the next layer. A consolidated sketch of steps (3.1)-(3.4) follows.
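A hypothetical PyTorch sketch of the masked forward pass; the layer list, the ReLU nonlinearity, and the omission of VGG's pooling and classifier layers are our simplifications, and all names are ours:

```python
import math
import torch

def forward_with_selection(convs, utils, z, prune_rates):
    """Steps (3.1)-(3.4): a forward pass keeping only the highest-utility channels.

    convs: list of torch.nn.Conv2d layers; utils: list of (C_l,) utility tensors
    z: input batch of shape (N, C_0, H, W); prune_rates: per-layer pruning rates p_l
    """
    masks = []
    for conv, u, p in zip(convs, utils, prune_rates):
        z = torch.relu(conv(z))                  # (3.1) output activations of this layer
        C = u.numel()
        n_keep = math.ceil(C * (1.0 - p))        # e.g. 0.5*C at pruning rate 0.5
        keep = torch.topk(u, n_keep).indices     # (3.3) highest-utility channels
        m = torch.zeros(C, device=z.device)      # (3.2) binary mask
        m[keep] = 1.0
        z = z * m.view(1, -1, 1, 1)              # (3.4) zero unselected channels, pass on
        masks.append(m)
    return z, masks
```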
4. Compute the error J of the final neural network output.
5. Perform the back-propagation pass of the neural network, executing the following steps layer by layer.
(5.1) Compute the channel gradients $\partial J / \partial z^l$ of each layer's channels.
(5.2) Calculate and normalize the significance evaluations described by equations (1) and (2).
(5.3) Update the counter appearing in equation (3): if both channels of a pair were selected in this iteration, $n_{i,j}^l \leftarrow n_{i,j}^l + 1$; otherwise it remains unchanged.
(5.4) Update the correlation matrix $R^l$ according to equation (3).
(5.5) Update the fused significance evaluation of the channels using equation (4): $\theta^l \leftarrow R^l \hat{s}^l$.
(5.6) Update the channel utility described in equation (5): $u^l \leftarrow \lambda u^l + (1-\lambda)\theta^l$. A consolidated sketch of steps (5.2)-(5.6) follows.
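The per-layer bookkeeping can be sketched by reusing the helper functions above (channel_significance, update_correlation, update_utility); per-sample shapes and all names remain our assumptions:

```python
import numpy as np

def backward_bookkeeping(z_list, g_list, masks, R_list, n_list, u_list, lam=0.8):
    """Steps (5.2)-(5.6) for every layer, reusing the helpers sketched earlier.

    z_list, g_list: per-layer activations and gradients dJ/dz, each of shape (C_l, H, W)
    masks: per-layer binary selection masks from the forward pass
    """
    for l, (z, g, m) in enumerate(zip(z_list, g_list, masks)):
        _, s_hat = channel_significance(z, g)                                      # eqs. (1)-(2)
        R_list[l], n_list[l] = update_correlation(R_list[l], n_list[l], s_hat, m)  # eq. (3)
        u_list[l] = update_utility(u_list[l], R_list[l], s_hat, lam)               # eqs. (4)-(5)
    return R_list, n_list, u_list
```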
6. When the maximum number of training steps is reached or the neural network converges, prune, layer by layer according to the channel utility $u^l$ of each layer, the lower-ranked half of the channels (the pruning rate of each layer is 0.5) together with the convolution kernels that generate them. Copy the remaining parameters into a more compact model, completing the joint training and pruning of the neural network.
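The final channel selection amounts to ranking the converged utilities; a minimal sketch (names ours):

```python
import numpy as np

def kept_channels(u, prune_rate=0.5):
    """Step 6: indices of the channels retained after training, ranked by utility."""
    n_keep = u.shape[0] - int(u.shape[0] * prune_rate)
    return np.sort(np.argsort(u)[::-1][:n_keep])

# The compact model keeps, for each retained channel, the convolution kernel that
# produces it and the matching input slices of the next layer's kernels.
```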
The table below shows the accuracy achieved by the method at different pruning (compression) rates, compared with other methods. When the floating-point operations (FLOPs) are compressed by about 35%, the method still achieves 93.78% accuracy, exceeding common norm-based pruning methods; when the compression rate reaches 49.6%, the method still maintains 93.68% accuracy, also exceeding the structured Bayesian pruning method, which achieves only 92.50%; when the compression rate reaches 75.2%, the compressed neural network still reaches 92.72% recognition accuracy, a loss of only 1.28%. The compression method can therefore directly convert a redundant, oversized neural network into a compact neural network with rich expressive power.
Comparison of accuracy of different methods
[Table rendered as an image in the original; the accuracy figures at each compression rate are quoted in the paragraph above.]

Claims (1)

1. A deep neural network compression method based on brain consensus initiative is characterized by comprising the following steps:
(1) In each forward pass of deep neural network training, for the channels of each layer, arrange the channels from high to low according to the initialized channel utility value $u_k^l$; according to the set pruning rate, retain the activation values of the channels whose utility values rank above the pruning threshold, and set the channel activation values of the remaining channels of the layer to zero. Here $u_k^l$ is a long-term evaluation, maintained during deep neural network training, of how important the k-th channel of the l-th layer is to the error of the deep neural network, where l is the layer index and k is the channel index within the layer.
(2) Determine the normalized significance evaluation $\hat{s}_k^l$ in the back-propagation process of deep neural network training, specifically:
(2.1) During back propagation, multiply every activation value of a channel by its gradient, accumulate the products, and average them to determine the significance evaluation $s_k^l$ of each channel:
$$s_k^l = \frac{1}{M}\sum_{m=1}^{M} \frac{\partial J}{\partial z_{k,m}^{l}}\, z_{k,m}^{l} \qquad (1)$$
where J is the error function of the network, $z_{k,m}^{l}$ is the m-th activation value of the k-th channel of the l-th layer, and M is the number of activation values in one channel of the l-th layer.
(2.2) Normalize the channel significance evaluation $s_k^l$ by the L2 norm to obtain the normalized significance evaluation $\hat{s}_k^l$:
$$\hat{s}_k^l = \frac{\left|s_k^l\right|}{\left\|s^l\right\|_2} \qquad (2)$$
where $\hat{s}_k^l$ ranges from 0 to 1.
(3) Through the consensus initiative algorithm, fuse the normalized significance evaluations of different channels, taking the interaction between different channels into account.
(3.1) Compute the correlation $R_{i,j}^l$ between two channels by multiplying their normalized significance evaluations $\hat{s}_i^l$ and $\hat{s}_j^l$ and averaging the product over the iterations in which both participate:
$$R_{i,j}^l = \frac{1}{n_{i,j}^l}\sum_{t=1}^{n_{i,j}^l} \hat{s}_i^l(t)\,\hat{s}_j^l(t) \qquad (3)$$
where $R_{i,j}^l$ is the correlation between the i-th and j-th channels of the l-th layer, ranging from 0 to 1, and $n_{i,j}^l$ is the number of deep neural network training iterations in which the two channels have participated.
(3.2) Multiply the normalized significance evaluations $\hat{s}_j^l$ of the other channels of the same layer by the correlations $R_{k,j}^l$ computed in (3.1) and sum, obtaining the fused significance evaluation $\theta_k^l$:
$$\theta_k^l = \sum_{j=1}^{C_l} R_{k,j}^l\, \hat{s}_j^l \qquad (4)$$
(3.3) Incorporate $\theta_k^l$ into the initial channel utility value $u_k^l$ of step 1 through a moving-average strategy:
$$u_k^l(n) = \lambda\, u_k^l(n-1) + (1-\lambda)\, \theta_k^l(n) \qquad (5)$$
where λ is an attenuation factor ranging from 0 to 1, and n is the number of iterations in which the channel has participated in deep neural network training.
(4) Perform the above steps cyclically, updating the channel utility values $u_k^l$ of all channels, until the deep neural network converges.
(5) After the deep neural network converges, arrange the channels layer by layer according to the channel utility value $u_k^l$; according to the preset pruning rate, prune the channels whose utility values fall below the pruning threshold, together with the convolution kernels that generate them, realizing model compression and acceleration.
CN201910885350.3A 2019-09-19 2019-09-19 Deep neural network compression method based on brain consensus initiative Pending CN110689113A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910885350.3A CN110689113A (en) 2019-09-19 2019-09-19 Deep neural network compression method based on brain consensus initiative


Publications (1)

Publication Number Publication Date
CN110689113A (en) 2020-01-14

Family

ID=69109619

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910885350.3A Pending CN110689113A (en) 2019-09-19 2019-09-19 Deep neural network compression method based on brain consensus initiative

Country Status (1)

Country Link
CN (1) CN110689113A (en)


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021164752A1 (en) * 2020-02-21 2021-08-26 华为技术有限公司 Neural network channel parameter searching method, and related apparatus
WO2022022625A1 (en) * 2020-07-29 2022-02-03 北京智行者科技有限公司 Acceleration method and device for deep learning model
CN111931914A (en) * 2020-08-10 2020-11-13 北京计算机技术及应用研究所 Convolutional neural network channel pruning method based on model fine tuning
WO2022178908A1 (en) * 2021-02-26 2022-09-01 中国科学院深圳先进技术研究院 Neural network pruning method and apparatus, and storage medium
CN113283473A (en) * 2021-04-20 2021-08-20 中国海洋大学 Rapid underwater target identification method based on CNN feature mapping pruning
CN113283473B (en) * 2021-04-20 2023-10-13 中国海洋大学 CNN feature mapping pruning-based rapid underwater target identification method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200114)