CN112613610A - Deep neural network compression method based on joint dynamic pruning - Google Patents

Deep neural network compression method based on joint dynamic pruning

Info

Publication number
CN112613610A
Authority
CN
China
Prior art keywords
pruning
channel
dynamic
neural network
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011561741.9A
Other languages
Chinese (zh)
Inventor
张明明
宋浒
卢庆宁
俞俊
温磊
刘文盼
范江
查易艺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NARI Group Corp
Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
NARI Group Corp
Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NARI Group Corp, Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd filed Critical NARI Group Corp
Priority to CN202011561741.9A
Publication of CN112613610A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep neural network compression method based on joint dynamic pruning, which comprises the following steps. Step 1: acquiring two hyperparameters, a convolution-kernel dynamic pruning rate β and a channel dynamic compression rate α. Step 2: pruning M_i(1 - β) convolution kernels in the convolutional layer with the convolution-kernel dynamic pruning method. Step 3: selecting N_i·α channels to participate in training with the channel dynamic compression method. Step 4: updating the model parameters during training so that the convolution-kernel dynamic pruning converges to a subset of the channel dynamic compression. The invention accelerates training and inference, maintains model capacity, and effectively reduces the floating-point operation count and parameter scale of the model.

Description

Deep neural network compression method based on joint dynamic pruning
Technical Field
The invention relates to a neural network compression method, in particular to a deep neural network compression method based on joint dynamic pruning.
Background
While deep learning has driven advances in computer vision, natural language processing, and other fields, model complexity, high storage requirements, and computational resource consumption make deep models difficult to deploy on many hardware platforms. For example, the classic image-classification network VGG16 has about 130 million parameters, occupies roughly 500 MB of memory, and requires about 30.9 billion floating-point operations to complete a single image-recognition task. In fact, deep neural networks contain a great deal of redundancy: the remaining weights can be predicted from only a small fraction of the weights. Model compression is therefore theoretically feasible and practically necessary.
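For scale, the figures above can be sanity-checked with a short calculation. The parameter count and operation count used here are the commonly cited approximate values for VGG16 and are assumptions of this note, not taken from the patent:

```python
# Back-of-the-envelope check of the VGG16 figures cited above (assumed values).
params = 138_000_000                       # approximate VGG16 parameter count
bytes_per_param = 4                        # 32-bit floating-point weights
storage_mb = params * bytes_per_param / (1024 ** 2)
print(f"weights: ~{storage_mb:.0f} MB")    # ~526 MB, on the order of the ~500 MB cited

flops = 30.9e9                             # ~30.9 billion floating-point operations per image
print(f"compute: ~{flops / 1e9:.1f} GFLOPs per forward pass")
```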
Network pruning is a popular direction in the field of model compression. Its principle is to remove the less important weights in the network and then fine-tune the network until it converges again; how to measure the importance of weights therefore becomes the core problem of network pruning. Whatever the differences in parameter-selection criteria and pruning/training procedures, static pruning methods share one trait: pruned parameters are permanently removed from the model and no longer participate in inference or training. Although most parameters in a network are redundant, static pruning still permanently removes some critical parameters; no matter which judgment criterion is adopted, mis-pruning is hard to avoid, which inevitably causes a loss of network capacity.
Compared with static pruning, the purpose of dynamic pruning is to retain the capacity of the pruned part and avoid the reduction of model capacity caused by permanent pruning. The idea is similar to dropout, which prevents over-fitting in deep learning, except that dynamic pruning designs criteria to measure the relationship between the input image and the convolution kernels rather than discarding randomly: for a particular input image, the convolution kernels that can be activated exist and are limited. However, the dynamic selection that makes dynamic pruning unique is also what restricts its ability to compress the network. In theory, the activated convolution kernels are fixed for a particular input image, but for an unknown input image the activated kernels cannot be determined. Since the feature distribution of the input images is difficult to know while the neural network is constructed and trained, permanently removing some convolution kernels would still cause a small loss of model capacity whenever those kernels are activated by a particular input. Dynamic pruning algorithms therefore find it difficult to remove convolution kernels permanently, and their network compression ratios are significantly smaller than those of static pruning algorithms.
Disclosure of Invention
Purpose of the invention: the invention aims to overcome the defects of the prior art and provides a deep neural network compression method based on joint dynamic pruning, which addresses both the loss of network capacity caused by static pruning and the problem that the network compression ratio of dynamic pruning is significantly smaller than that of static pruning.
Technical scheme: the deep neural network compression method based on joint dynamic pruning of the invention comprises the following steps:
Step 1: acquiring two hyperparameters, the convolution-kernel dynamic pruning rate β and the channel dynamic compression rate α;
Step 2: pruning M_i(1 - β) convolution kernels in the convolutional layer by using the convolution-kernel dynamic pruning method, where M_i is the number of convolution kernels in the i-th layer;
Step 3: selecting N_i·α channels to participate in training by using the channel dynamic compression method, where N_i is the number of input channels of the i-th layer;
Step 4: updating the model parameters during training so that the convolution-kernel dynamic pruning converges to a subset of the channel dynamic compression.
The step 2 comprises the following steps:
Step 21: obtaining the L1 norm of each convolution kernel in the same convolutional layer;
Step 22: zeroing the M_i(1 - β) convolution kernels with the smallest L1 norms in the convolutional layer;
Step 23: updating the model parameters in back-propagation, and then returning to step 21 for the next iteration of the model;
Step 24: pruning the convolution kernels that are still zero when iteration is complete.
In step 2, following the principle of deleting model parameters to reduce complexity, the convolution kernels that have little influence on the convolution operation after iterative convergence are removed, as sketched in the code below.
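The following PyTorch-style sketch illustrates steps 21-24 (zeroing, rather than deleting, the kernels with the smallest L1 norms). It is a minimal illustration only: the function name dynamic_kernel_prune, the training-loop outline, and the use of torch/nn.Conv2d are assumptions of this sketch, not part of the patent.

```python
import torch
import torch.nn as nn

def dynamic_kernel_prune(conv: nn.Conv2d, beta: float) -> None:
    """Zero the M_i * (1 - beta) convolution kernels with the smallest L1 norm.

    Kernels are only zeroed, not deleted, so a mistakenly pruned kernel can
    regain weight through back-propagation in later iterations (step 23).
    """
    weight = conv.weight.data                    # shape: (M_i, N_i, k, k)
    num_to_zero = int(weight.size(0) * (1.0 - beta))
    if num_to_zero == 0:
        return
    l1 = weight.abs().sum(dim=(1, 2, 3))         # step 21: L1 norm of each kernel
    _, idx = torch.topk(l1, num_to_zero, largest=False)
    weight[idx] = 0.0                            # step 22: zero, do not delete

# Outline of the iteration (model, loader, loss_fn, optimizer are assumed to exist):
# for inputs, labels in loader:
#     for conv in (m for m in model.modules() if isinstance(m, nn.Conv2d)):
#         dynamic_kernel_prune(conv, beta=0.75)
#     loss = loss_fn(model(inputs), labels)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()   # step 23
# After convergence (step 24), kernels that are still zero are pruned permanently.
```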
The step 3 comprises the following steps:
Step 31: sampling the input feature map with the global average pooling method;
Step 32: taking the sampled input feature map as the input of a prediction network and computing the channel importance function;
Step 33: zeroing, following the winner-takes-all principle, the N_i(1 - α) channels with the smallest channel-importance weights;
Step 34: replacing the scaling factor γ of the BN layer corresponding to the convolutional layer with the channel-importance weights;
Step 35: updating the model parameters in back-propagation. The prediction network in step 32 is a fully connected neural network and is independent of the training network.
In step 3, following the principle of reducing complexity without deleting model parameters, N_i·α channels are selected to participate in training; a code sketch of these sub-steps is given below.
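A minimal PyTorch-style sketch of steps 31-35 follows. The class name, the choice of a single fully connected layer, and the way the gated scores multiply the BN output are illustrative assumptions; the patent only specifies that the prediction network is a fully connected network separate from the training network.

```python
import torch
import torch.nn as nn

class ChannelImportancePredictor(nn.Module):
    """Prediction network for one convolutional layer: a fully connected layer,
    separate from the backbone, that maps the pooled input feature map to one
    importance weight per channel (steps 31-33)."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.fc = nn.Linear(in_channels, out_channels)

    def forward(self, x: torch.Tensor, alpha: float) -> torch.Tensor:
        # Step 31: global average pooling, (B, N_i, H, W) -> (B, N_i)
        pooled = x.mean(dim=(2, 3))
        # Step 32: channel importance weights
        scores = self.fc(pooled)
        # Step 33: winner-takes-all - zero the (1 - alpha) fraction of channels
        # with the smallest importance weights
        k = int(scores.size(1) * (1.0 - alpha))
        if k > 0:
            _, idx = torch.topk(scores, k, dim=1, largest=False)
            scores = scores.scatter(1, idx, 0.0)
        return scores

# Step 34 (sketch): the returned scores stand in for the BN scaling factor gamma,
# so zeroed channels effectively skip their convolutions for this input, e.g.
#     y = bn_without_gamma(conv(x)) * scores.unsqueeze(-1).unsqueeze(-1)
# Step 35: the predictor is trained jointly with the backbone by back-propagation.
```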
Beneficial effects: compared with the prior art, the invention has the notable advantages of accelerating training and inference, maintaining model capacity, and effectively reducing the floating-point operation count and parameter scale of the model.
Drawings
FIG. 1 is a schematic diagram of the present invention;
FIG. 2 is a schematic diagram of convolution kernel dynamic pruning in accordance with the present invention;
FIG. 3 is a schematic diagram of dynamic compression of a channel according to the present invention.
Detailed Description
The technical scheme of the invention is further explained below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of the present invention, which mainly includes a convolution kernel dynamic pruning module and a channel dynamic compression module. The invention relates to a deep neural network compression method based on joint dynamic pruning, which comprises the following steps:
Step 1: acquiring two hyperparameters, the convolution-kernel dynamic pruning rate β and the channel dynamic compression rate α;
Step 2: pruning M_i(1 - β) convolution kernels in the convolutional layer by using the convolution-kernel dynamic pruning method, where M_i is the number of convolution kernels in the i-th layer;
Step 3: selecting N_i·α channels to participate in training by using the channel dynamic compression method, where N_i is the number of input channels of the i-th layer;
Step 4: updating the model parameters during training so that the convolution-kernel dynamic pruning converges to a subset of the channel dynamic compression.
In step 2, following the principle of deleting model parameters to reduce complexity, the convolution kernels that have the least influence on the convolution operation after iterative convergence are pruned. Step 2 comprises the following steps:
Step 21: obtaining the L1 norm of each convolution kernel in the same convolutional layer;
Step 22: zeroing the M_i(1 - β) convolution kernels with the smallest L1 norms in the convolutional layer;
Step 23: updating the model parameters in back-propagation, and then returning to step 21 for the next iteration of the model;
Step 24: pruning the convolution kernels that are still zero when iteration is complete.
In step 3, following the principle of reducing complexity without deleting model parameters, N_i·α channels are selected to participate in training. Step 3 comprises the following steps:
Step 31: sampling the input feature map with the global average pooling method. Global average pooling averages all elements of the same channel of the feature map, i.e., a W × H × N feature map is converted into a 1 × 1 × N tensor, which is further reduced to a 1 × N feature vector, where W is the image width, H is the image height, and N is the number of channels.
Step 32: taking the sampled input feature map as the input of a prediction network, and computing the channel importance function of the prediction network;
Step 33: zeroing, following the winner-takes-all principle, the N_i(1 - α) channels with the smallest weights in the channel importance function. Winner-takes-all means that the final winner of the competition obtains all or most of the share, while the losers are eliminated.
Step 34: replacing the scaling factor γ of the BN layer corresponding to the convolutional layer with the channel-importance weights;
Step 35: updating the model parameters in back-propagation.
The prediction network in step 32 is a fully connected neural network and is independent of the training network.
In this embodiment, each step is described using the i-th convolutional layer of a simple convolutional neural network. The input feature map of the i-th convolutional layer is X_i, and the number of input channels N_i is 4; the output feature map is X_{i+1}, and the number of output channels N_{i+1} is 4. The number of convolution kernels M_i of the i-th convolutional layer equals the number of output channels N_{i+1}, i.e., 4. Let the convolution-kernel dynamic pruning rate β be 0.75 and the channel dynamic compression rate α be 0.5.
In step 2, the schematic diagram of convolution-kernel dynamic pruning is shown in FIG. 2. First, the L1 norm of each convolution kernel in the i-th convolutional layer is calculated; the L1 norm is the sum of the absolute values of the weights in the convolution-kernel tensor. A convolution kernel with smaller weights produces smaller values in its output feature map, i.e., its influence is smaller than that of the other feature maps in the same layer. The L1 norms of this layer's convolution kernels are [0.457, 0.813, 0.345, 0.136], so the 4th convolution kernel falls within the smallest 25% of L1 norms. These convolution kernels are zeroed, but they are not pruned from the network. If a convolution kernel is mistakenly pruned, it will be updated to larger weights in back-propagation and therefore obtain higher importance in subsequent judgments, so the mis-pruning does not persist. During continuous iterative training, accidentally mis-pruned convolution kernels are recovered, so the pruned part of each layer tends to converge to N_{i+1}(1 - β) fixed convolution kernels, i.e., 1 fixed convolution kernel for this convolutional layer. Finally, the least important convolution kernels are permanently pruned, completing the convolution-kernel dynamic pruning.
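As a quick numeric check of this example (the L1 norms are the values quoted above; the use of torch and the variable names are illustrative):

```python
import torch

l1 = torch.tensor([0.457, 0.813, 0.345, 0.136])   # L1 norms of the 4 kernels of layer i
beta = 0.75
num_to_zero = int(l1.numel() * (1 - beta))         # M_i * (1 - beta) = 1 kernel
_, idx = torch.topk(l1, num_to_zero, largest=False)
print(idx)    # tensor([3]) -> the 4th kernel is zeroed (smallest 25% of L1 norms)
```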
In step 3, the schematic diagram of channel dynamic compression is shown in FIG. 3. Judging the importance of each channel from the input image requires a function whose input is the input image and whose output is a scalar importance value for each channel; this is implemented with a prediction network. Because the complexity of this additional operation must be kept as low as possible, the input feature map is sampled first. Global average pooling is used as the down-sampling method to reduce the image size and compress each channel of the input feature map into a scalar. For the input image X_i of the i-th convolutional layer, whose size is W_i × H_i × N_i, where W_i is the image width and H_i is the image height, global average pooling yields a 1 × 1 × 4 tensor, which is further reduced to a 1 × 4 tensor, for example [0.621, 1.846, 0.743, 0.543]. The sampled result is then used as the input of the prediction network.
The prediction network is a fully connected neural network layer separate from the training network; its number of input channels is N_i and its number of output channels is N_{i+1}. In this embodiment, the prediction network of the i-th layer has 4 input channels and 4 output channels. The prediction network learns the relationship between the input feature map and the channels to obtain a channel importance function. In this embodiment the weights output by the channel importance function are [0.842, 1.313, 0.594, 0.218]. Following the winner-takes-all principle, the output results of the N_{i+1}(1 - α) less important channels, i.e., 2 channels, are zeroed, giving [0.842, 1.313, 0, 0]. Then, in the BN layer corresponding to the i-th convolutional layer, the value of the scaling factor γ is replaced with the weights of the channel importance function, which is equivalent to skipping the convolution operations of the 1 - α fraction of channels. The prediction network and the classification network are trained together, so no manual intervention or statistics are needed.
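The same winner-takes-all gating, replayed with the numbers quoted in this embodiment (the tensors below simply reproduce those values; in training the scores would come from the learned fully connected layer):

```python
import torch

# 1 x 4 vector produced by global average pooling of the layer-i input (from above);
# in training it would be fed to the fully connected prediction layer.
pooled = torch.tensor([[0.621, 1.846, 0.743, 0.543]])

# Importance weights that the prediction network (N_i = 4 inputs, N_{i+1} = 4 outputs)
# is described as producing for this input.
scores = torch.tensor([[0.842, 1.313, 0.594, 0.218]])

alpha = 0.5
k = int(scores.size(1) * (1 - alpha))              # N_{i+1} * (1 - alpha) = 2 channels
_, idx = torch.topk(scores, k, dim=1, largest=False)
gated = scores.scatter(1, idx, 0.0)                # winner-takes-all gating
print(gated)                                       # tensor([[0.8420, 1.3130, 0.0000, 0.0000]])

# Step 34: `gated` replaces the BN scaling factor gamma of this layer, so the two
# zeroed channels effectively skip their convolutions for this particular input.
```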
In step 4, when training finishes, the convolution-kernel dynamic pruning finally converges to a subset of the channel dynamic compression: the convolution kernels zeroed by convolution-kernel dynamic pruning fall within the N_{i+1}(1 - α) channels discarded by channel dynamic compression, which does not converge onto these zeroed-out kernels during learning.
In this example, the network skips a 1 - α fraction of the convolution kernels, i.e., the least important 50% of the convolution kernels in each convolutional layer. The fraction of convolution kernels actually permanently removed from the model is 1 - β, i.e., 25%, but thanks to channel dynamic compression the complexity of the model is equivalent to that of a model with 50% of its convolution kernels permanently removed.
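Restating this complexity argument with the embodiment's numbers (illustrative only):

```python
m_i, beta, alpha = 4, 0.75, 0.5
permanently_pruned = int(m_i * (1 - beta))   # 1 of 4 kernels (25%) physically removed
skipped_per_input = int(m_i * (1 - alpha))   # 2 of 4 kernels (50%) skipped for each input
print(permanently_pruned, skipped_per_input) # 1 2 -> effective complexity of a 50%-pruned model
```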
The invention completes convolution-kernel pruning and channel pruning simultaneously; the two can be freely combined, which reduces the complexity of model training and inference and increases the processing speed of the feature maps while preserving model capacity.

Claims (6)

1. A deep neural network compression method based on joint dynamic pruning, characterized by comprising the following steps:
Step 1: acquiring two hyperparameters, a convolution-kernel dynamic pruning rate β and a channel dynamic compression rate α;
Step 2: pruning M_i(1 - β) convolution kernels in the convolutional layer by using the convolution-kernel dynamic pruning method, where M_i is the number of convolution kernels in the i-th layer;
Step 3: selecting N_i·α channels to participate in training by using the channel dynamic compression method, where N_i is the number of input channels of the i-th layer;
Step 4: updating the model parameters during training so that the convolution-kernel dynamic pruning converges to a subset of the channel dynamic compression.
2. The deep neural network compression method based on joint dynamic pruning according to claim 1, wherein: the step 2 comprises the following steps:
Step 21: obtaining the L1 norm of each convolution kernel in the same convolutional layer;
Step 22: zeroing the M_i(1 - β) convolution kernels with the smallest L1 norms in the convolutional layer;
Step 23: updating the model parameters in back-propagation, and then returning to step 21 for the next iteration of the model;
Step 24: pruning the convolution kernels that are still zero when iteration is complete.
3. The deep neural network compression method based on joint dynamic pruning according to claim 1, wherein in step 2, following the principle of deleting model parameters to reduce complexity, convolution kernels whose influence on the convolution operation after iterative convergence is less than a set threshold are pruned.
4. The deep neural network compression method based on joint dynamic pruning according to claim 1, wherein: the step 3 comprises the following steps:
Step 31: sampling the input feature map with a global average pooling method;
Step 32: taking the sampled input feature map as the input of a prediction network, and computing the channel importance function of the prediction network;
Step 33: zeroing, following the winner-takes-all principle, the N_i(1 - α) channels with the smallest weights of the channel importance function;
Step 34: replacing the scaling factor γ of the BN layer corresponding to the convolutional layer with the channel-importance weights;
Step 35: updating the model parameters in back-propagation.
5. The deep neural network compression method based on joint dynamic pruning of claim 4, wherein: the prediction network in step 32 is a fully connected neural network and is independent of the training network.
6. The deep neural network compression method based on joint dynamic pruning according to claim 1, wherein in step 3, following the principle of reducing complexity without deleting model parameters, N_i·α channels are selected to participate in training.
CN202011561741.9A 2020-12-25 2020-12-25 Deep neural network compression method based on joint dynamic pruning Pending CN112613610A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011561741.9A CN112613610A (en) 2020-12-25 2020-12-25 Deep neural network compression method based on joint dynamic pruning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011561741.9A CN112613610A (en) 2020-12-25 2020-12-25 Deep neural network compression method based on joint dynamic pruning

Publications (1)

Publication Number Publication Date
CN112613610A true CN112613610A (en) 2021-04-06

Family

ID=75245247

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011561741.9A Pending CN112613610A (en) 2020-12-25 2020-12-25 Deep neural network compression method based on joint dynamic pruning

Country Status (1)

Country Link
CN (1) CN112613610A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200082268A1 (en) * 2018-09-11 2020-03-12 National Tsing Hua University Electronic apparatus and compression method for artificial neural network
CN109886397A (en) * 2019-03-21 2019-06-14 西安交通大学 A kind of neural network structure beta pruning compression optimization method for convolutional layer
US20200364573A1 (en) * 2019-05-15 2020-11-19 Advanced Micro Devices, Inc. Accelerating neural networks with one shot skip layer pruning
CN110263841A (en) * 2019-06-14 2019-09-20 南京信息工程大学 A kind of dynamic, structured network pruning method based on filter attention mechanism and BN layers of zoom factor
CN111242287A (en) * 2020-01-15 2020-06-05 东南大学 Neural network compression method based on channel L1 norm pruning
CN111340225A (en) * 2020-02-28 2020-06-26 中云智慧(北京)科技有限公司 Deep convolution neural network model compression and acceleration method
CN111652366A (en) * 2020-05-09 2020-09-11 哈尔滨工业大学 Combined neural network model compression method based on channel pruning and quantitative training

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHANG CHILIANG; HU TAO; GUAN YINGDA; YE ZUOCHANG: "Accelerating Convolutional Neural Networks with Dynamic Channel Pruning", 2019 DATA COMPRESSION CONFERENCE (DCC), 29 March 2019 (2019-03-29) *
靳丽蕾; 杨文柱; 王思乐; 崔振超; 陈向阳; 陈丽萍: "A Hybrid Pruning Method for Convolutional Neural Network Compression" (一种用于卷积神经网络压缩的混合剪枝方法), Journal of Chinese Computer Systems (小型微型计算机系统), no. 12, 11 December 2018 (2018-12-11) *
韩冰冰: "Research on Model Compression and Acceleration Algorithms Based on Channel Pruning" (基于通道剪枝的模型压缩和加速算法研究), China Master's Theses Full-text Database, Information Science and Technology (中国优秀硕士学位论文全文数据库 信息科技辑), no. 07, 15 July 2019 (2019-07-15) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949840B (en) * 2021-04-20 2024-02-02 中国人民解放军国防科技大学 Channel attention-guided convolutional neural network dynamic channel pruning method and device
CN113065644A (en) * 2021-04-26 2021-07-02 上海哔哩哔哩科技有限公司 Method, apparatus, device and medium for compressing neural network models
CN113065644B (en) * 2021-04-26 2023-09-29 上海哔哩哔哩科技有限公司 Method, apparatus, device and medium for compressing neural network model
CN113570035A (en) * 2021-07-07 2021-10-29 浙江工业大学 Attention mechanism method using multilayer convolution layer information
CN113570035B (en) * 2021-07-07 2024-04-16 浙江工业大学 Attention mechanism method utilizing multi-layer convolution layer information

Similar Documents

Publication Publication Date Title
CN112613610A (en) Deep neural network compression method based on joint dynamic pruning
KR102640237B1 (en) Image processing methods, apparatus, electronic devices, and computer-readable storage media
CN109271933B (en) Method for estimating three-dimensional human body posture based on video stream
CN111583135B (en) Nuclear prediction neural network Monte Carlo rendering image denoising method
CN110349103A (en) It is a kind of based on deep neural network and jump connection without clean label image denoising method
CN108288270B (en) Target detection method based on channel pruning and full convolution deep learning
CN109934826A (en) A kind of characteristics of image dividing method based on figure convolutional network
CN109146813B (en) Multitask image reconstruction method, device, equipment and medium
CN113420651B (en) Light weight method, system and target detection method for deep convolutional neural network
CN109191411B (en) Multitask image reconstruction method, device, equipment and medium
CN112488070A (en) Neural network compression method for remote sensing image target detection
CN113554084B (en) Vehicle re-identification model compression method and system based on pruning and light convolution
CN115205147A (en) Multi-scale optimization low-illumination image enhancement method based on Transformer
CN113420794B (en) Binaryzation Faster R-CNN citrus disease and pest identification method based on deep learning
CN111860771A (en) Convolutional neural network computing method applied to edge computing
CN116740119A (en) Tobacco leaf image active contour segmentation method based on deep learning
CN112101364A (en) Semantic segmentation method based on parameter importance incremental learning
CN111325222A (en) Image normalization processing method and device and storage medium
CN115969329A (en) Sleep staging method, system, device and medium
CN111462090A (en) Multi-scale image target detection method
CN113393476B (en) Lightweight multi-path mesh image segmentation method and system and electronic equipment
CN117575915A (en) Image super-resolution reconstruction method, terminal equipment and storage medium
CN113850365A (en) Method, device, equipment and storage medium for compressing and transplanting convolutional neural network
CN113657392A (en) Small target semantic segmentation method and system based on low-rank mixed attention mechanism
CN113780550A (en) Convolutional neural network pruning method and device for quantizing feature map similarity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination