CN114937186A - Neural network data-free quantification method based on heterogeneous generated data - Google Patents
- Publication number
- CN114937186A CN114937186A CN202210673423.4A CN202210673423A CN114937186A CN 114937186 A CN114937186 A CN 114937186A CN 202210673423 A CN202210673423 A CN 202210673423A CN 114937186 A CN114937186 A CN 114937186A
- Authority
- CN
- China
- Prior art keywords
- network
- quantization
- loss
- false
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
Abstract
A neural network data-free quantization method based on heterogeneous generated data, relating to the compression and acceleration of artificial neural networks. The method comprises the following steps: 1) randomly initialize false pictures using a standard Gaussian distribution; 2) optimize the initialized false pictures until the iteration count reaches its limit, updating them with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss; 3) quantize the neural network, then train the quantized network on the optimized false pictures using the distillation loss and the cross-entropy loss until a preset number of training rounds is reached; 4) after training, keep the weights of the quantized network to obtain the final quantized network. No real data is needed, the quantized network can be trained from scratch, and network compression and acceleration can be achieved on general-purpose hardware platforms without specific hardware support.
Description
Technical Field
The invention relates to compression and acceleration of an artificial neural network, in particular to a neural network data-free quantization method based on heterogeneous generated data.
Background
In recent years, Deep Neural Networks (DNNs) have been widely used in many fields such as computer vision and natural language processing. Despite the tremendous success of DNNs, their ever-increasing size hinders deployment on many resource-limited platforms, such as mobile phones and embedded devices. To overcome this dilemma, academia and industry have explored a variety of ways to reduce the complexity of DNNs, and network quantization, which represents full-precision DNNs in a low-precision format, is a promising direction.
Most existing methods fall under quantization-aware training, where quantization assumes that the original, complete training dataset is available. This dependence on training data is also their main drawback. In many practical situations, access to the original training data is prohibited due to growing privacy and security concerns: people may not want their medical records disclosed to others, and businesses do not want confidential materials disseminated over the internet. In such cases, quantization-aware training is no longer applicable.
How to obtain a quantized DNN without data has therefore attracted wide attention from academia and industry. Existing data-free quantization studies can be broadly divided into two categories:
The first class of data-free quantization methods does not utilize any data at all and instead focuses on calibrating parameters. For example, DFQ (Nagel M, Baalen M, Blankevoort T, et al. Data-free quantization through weight equalization and bias correction [C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 1325-1334.) calibrates the network without data via weight equalization and bias correction. However, simple parameter calibration tends to cause severe performance degradation, and the problem is magnified in ultra-low-precision cases. For example, when ResNet-18 (He K, Zhang X, Ren S, et al. Deep residual learning for image recognition [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.) is quantized to 4 bits, the appendix of GDFQ (Xu S, Li H, Zhuang B, et al. Generative low-bitwidth data free quantization [C]//European Conference on Computer Vision. Springer, Cham, 2020: 1-17.) reports a top-1 accuracy of only 0.10% for DFQ on ImageNet.
The second category trains the quantized network with synthesized false images. An intuitive solution is to deploy a generator to synthesize the training data, but generator-based approaches carry a large computational overhead because the introduced generator must be trained from scratch for every bit setting. ZeroQ (Cai Y, Yao Z, Dong Z, et al. ZeroQ: A novel zero shot quantization framework [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 13169-13178.) and DSG (Zhang X, Qin H, Ding Y, et al. Diversifying sample generation for accurate data-free quantization [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 15658-15667.) instead cast data synthesis as an optimization problem in which random inputs drawn from a standard Gaussian distribution are iteratively updated to fit the real data distribution. The benefit of this line of research is that the synthesized images can be reused to calibrate or fine-tune the network for different bit widths, enabling resource-friendly quantization. However, feature visualizations of ZeroQ and DSG show that a non-negligible quality gap to real data remains, since conventional Gaussian synthesis fits the entire dataset and ignores the finer class decision boundaries. Consequently, the quantized model usually suffers a large performance drop. To preserve class decision boundaries in the false images, the inception loss IL (Haroush M, Hubara I, Hoffer E, et al. The knowledge within: Methods for data-free model compression [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 8494-8502.) was introduced. With it, the false data exhibits a separable distribution, but such false pictures still fail to capture intra-class heterogeneity.
Images from the same class often contain different content: features extracted from real pictures of one class are scattered and heterogeneous. Feature clustering of ZeroQ+IL and DSG+IL, by contrast, shows that synthetic images of the same class are mostly homogeneous. Quantized models fine-tuned with such false pictures therefore generalize poorly to real, heterogeneous test datasets.
Disclosure of Invention
The invention aims to provide a neural network data-free quantization method based on heterogeneous generated data, addressing the performance degradation caused by current neural network data-free quantization methods. No real data is needed, the quantized network can be trained from scratch with higher performance, especially when quantizing small networks, and compression and acceleration of the network can be achieved on general-purpose hardware platforms without specific hardware support.
The invention comprises the following steps:
1) randomly initializing a false picture by using a standard Gaussian distribution;
2) optimizing the false pictures until the iteration count reaches the limit, updating them with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss;
3) quantizing the neural network, and then training the quantized network by using the optimized false picture in the step 2) and using distillation loss and cross entropy loss until a preset training round number is reached;
4) after training is finished, keeping the weights of the quantized network to obtain the final quantized network.
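Under stated assumptions, these four steps can be sketched as a minimal Python loop; every function name below (`update_fn`, `quantize_fn`, `train_fn`) is a hypothetical placeholder rather than the patent's actual implementation:

```python
import numpy as np

def generate_false_images(n, shape, iters, update_fn, rng):
    # Step 1: initialize false images from a standard Gaussian distribution
    x = rng.standard_normal((n, *shape))
    # Step 2: iteratively update them until the iteration budget is reached
    # (local object reinforcement, boundary distance limit, soft perception
    # loss and BN loss would all be applied inside update_fn)
    for _ in range(iters):
        x = update_fn(x)
    return x

def data_free_quantization(n, shape, iters, update_fn, quantize_fn, train_fn, rng):
    false_images = generate_false_images(n, shape, iters, update_fn, rng)
    q_net = quantize_fn()                  # Step 3a: quantize the pretrained network
    q_net = train_fn(q_net, false_images)  # Step 3b: distillation + cross-entropy
    return q_net                           # Step 4: keep the trained quantized weights
```

The placeholders are filled in by the loss terms and the quantizer described in the following sections.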
In step 1), randomly initializing the false pictures with a standard Gaussian distribution means sampling from a standard Gaussian distribution to generate initial false pictures of the same size as the real pictures.
In step 2), the specific method of local object reinforcement may be: before a false picture is input into the pre-trained network, it is randomly cropped (crop) and resized (resize) with probability p = 50%:

x̃_f = resize(crop_η(x_f))

where crop_η denotes cropping with a scale sampled from the uniform distribution U(η, 1), and x̃_f denotes the false picture after local object reinforcement;
The specific method of the boundary distance limit may be: the false pictures are constrained to keep a certain distribution in the feature space of the pre-trained network, where M_c denotes the feature set of all false pictures belonging to the same class as the i-th false picture;
The soft perception loss provides a soft target for the false picture, where U(ε, 1) denotes the uniform distribution on [ε, 1] from which the soft target is sampled, and mse denotes the mean squared error;
The BN loss:

L_BN = Σ_l ( ||μ_l(x_f) − μ_l||² + ||σ_l(x_f) − σ_l||² )

where μ_l(x_f), σ_l(x_f) denote the mean and variance of the false picture x_f at the l-th layer of the pre-trained network, and μ_l, σ_l denote the BN parameters of the l-th layer stored during training;

Combining the above BN loss, boundary distance limit loss and soft perception loss, the total loss used to update the false pictures is obtained as their sum.
in the step 3), the quantization neural network quantizes the pre-trained full-precision network to obtain a quantization network Q; the quantization is as follows:
wherein clip (F, l, u) ═ min (max (F, l), u), l, u denote the upper and lower clipping boundaries; representing a full precision input, which may be a network weight or an activation value; round denotes rounding its input to the nearest integer;is a scaling factor for interconverting a full precision number and an integer, b denotes the quantization bit width; for the weight, a channel-by-channel quantization mode is used, and for the activation value, a layer-by-layer quantization mode is used; after the quantization value q is obtained, it is dequantized back by the scaling factor
The quantized network is trained with the distillation loss and the cross-entropy loss, where the cross-entropy loss is

L_CE = −(1/N) Σ_{i=1}^{N} log p_{i,y_i}

with p_{i,y_i} the predicted value that the i-th input picture belongs to the y-th class (the class label being given by the pre-trained full-precision network) and N the total number of input pictures;
The distillation loss:

L_KD = (1/N) Σ_{i=1}^{N} Σ_{c=1}^{C} p^F_{i,c} log( p^F_{i,c} / p^Q_{i,c} )

where p^Q_{i,c} denotes the quantized network's predicted value that the i-th input picture belongs to class c, p^F_{i,c} the corresponding prediction of the pre-trained full-precision network, C denotes the number of dataset classes, and N the total number of input pictures.
Compared with the prior art, the invention has the following outstanding advantages:
1) the heterogeneity of the false picture can be preserved, and the quality of the false picture is greatly improved.
2) Extensive experiments show that the neural network data-free quantization method based on heterogeneous generated data is simple to implement and outperforms various mainstream data-free quantization methods, especially when all layers are quantized to very low bits or when quantizing smaller neural networks.
3) No real data is needed, the quantized network can be trained from scratch with higher performance, especially when quantizing small networks, and compression and acceleration of the network can be achieved on general-purpose hardware platforms without specific hardware support. The method can be applied to convolutional neural networks in the field of image classification.
Drawings
FIG. 1 is a method block diagram of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments will be further described with reference to the accompanying drawings.
A method block diagram of an embodiment of the invention is shown in fig. 1.
1. Description of the symbols
F(W_1, W_2, …, W_L) denotes an L-layer full-precision Convolutional Neural Network (CNN), where W_i denotes the i-th convolutional layer. If the i-th layer has out_i convolution kernels, its weights can be expressed as W_i = {W_i^1, W_i^2, …, W_i^{out_i}}, where W_i^j denotes the j-th convolution kernel of the i-th layer and each kernel satisfies W_i^j ∈ R^{in_i × width_i × height_i}, with in_i, width_i, height_i being the number of input channels of the i-th layer and the width and height of the kernel, respectively. Given the input A_{i−1} of the i-th convolutional layer (i.e., the output of the previous layer), the convolution result of the i-th layer can be expressed as O_i^j = W_i^j ⊛ A_{i−1}, where O_i^j is the j-th channel of the convolution result; collecting all channels gives O_i, and ⊛ denotes the convolution operation. The convolution result is then passed through an activation function to obtain the final output activation of the layer:

A_i = σ(O_i)

where σ denotes the activation function.
The goal of the quantization algorithm is to obtain a neural network that can operate with low-bit values; the convolution operation is then expressed as O_i^j = Q(W_i^j) ⊛ Q(A_{i−1}), where Q(W_i^j) denotes the quantized j-th convolution kernel of the i-th layer and Q(A_{i−1}) the quantized input of the i-th layer. The quantization algorithm thus yields an L-layer low-precision convolutional neural network F(Q(W_1), Q(W_2), …, Q(W_L)), where Q(W_i) denotes the i-th convolutional layer after quantization.
To obtain the quantized network, the pre-trained full-precision network is quantized as follows:

q = round( (clip(m, l, u) − l) / s ),  s = (u − l) / (2^b − 1)

where clip(m, l, u) = min(max(m, l), u), and l, u denote the lower and upper clipping bounds; m represents a full-precision input, which may be a network weight W or an activation value A; round rounds its input to the nearest integer; s is the scaling factor used to convert between full-precision numbers and integers, and b denotes the quantization bit width. For the weights, channel-wise quantization is used, i.e., each output channel has its own clipping bounds and scaling factor. For the activation values, layer-wise quantization is used, i.e., each layer shares the same clipping bounds and scaling factor. After the quantized value q is obtained, it can be dequantized back with the scaling factor before further operations. For the convolution of two quantized values one can use

conv(m_1, m_2) ≈ s_1 · s_2 · conv(q_1, q_2)

where s_1, s_2 can be precomputed and stored, and q_1, q_2 are both low-precision values, so that the original full-precision operation can be replaced by a purely low-precision convolution.
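The replacement of a full-precision operation by an integer one can be checked numerically. The sketch below uses symmetric per-tensor quantization (a simplifying assumption; the patent's scheme uses clipping bounds) and shows that s_1·s_2·(q_1·q_2) approximates the full-precision dot product:

```python
import numpy as np

def symmetric_quantize(x, bits):
    # Map x to signed integer codes in [-(2^(b-1)-1), 2^(b-1)-1] with scale s
    qmax = 2 ** (bits - 1) - 1
    s = np.abs(x).max() / qmax
    q = np.clip(np.round(x / s), -qmax, qmax).astype(np.int64)
    return q, s

rng = np.random.default_rng(0)
w = rng.standard_normal(64)   # stand-in for a flattened convolution kernel
a = rng.standard_normal(64)   # stand-in for the corresponding input patch
q1, s1 = symmetric_quantize(w, 8)
q2, s2 = symmetric_quantize(a, 8)
# Only the integer dot product q1·q2 needs low-precision hardware;
# the two precomputed scales rescale it back to the full-precision range.
approx = s1 * s2 * np.dot(q1, q2)
exact = np.dot(w, a)
```

At 8 bits the rescaled integer product tracks the full-precision result closely, which is exactly why the scales can be folded out of the convolution.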
2. Heterogeneous data analysis
Existing neural network data-free quantization methods are limited by the poor quality of their false pictures, and performance drops markedly when all network layers are quantized to low bits. To improve the performance of the quantized network, the invention provides a neural network data-free quantization method based on heterogeneous generated data. Based on the observation that real data is heterogeneous, the false pictures are updated with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss, so that pictures with heterogeneity are generated.
3. Description of the training
The embodiment of the invention comprises the following steps:
1) randomly initializing false pictures by using standard Gaussian distribution;
2) optimizing the false pictures until the iteration count reaches the limit, updating them with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss;
3) quantizing the neural network, and then training the quantized network by using distillation loss and cross entropy loss by using an optimized false picture until a preset number of training rounds is reached;
4) after training is finished, keeping the weights of the quantized network to obtain the final quantized network.
In step 1), an initialization dummy picture with the same size as the real picture is generated from the standard gaussian distribution samples.
In step 2), the initialized false pictures are optimized until the iteration count reaches the limit, and they are updated with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss.
For local object reinforcement, a false picture is randomly cropped (crop) and resized (resize) with probability p = 50% before being input into the pre-trained network:

x̃_f = resize(crop_η(x_f))

where crop_η denotes cropping with a scale sampled from the uniform distribution U(η, 1), and x̃_f denotes the false picture after local object reinforcement.
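A minimal sketch of this augmentation, assuming nearest-neighbour interpolation for the resize (the patent does not fix the interpolation mode; in a PyTorch implementation `torchvision.transforms.RandomResizedCrop` would typically play this role):

```python
import numpy as np

def local_object_reinforcement(img, eta, rng):
    """With probability 0.5, crop a patch whose relative size is drawn from
    U(eta, 1), then resize it back to the original resolution.  The
    nearest-neighbour resize keeps this sketch dependency-free."""
    if rng.random() >= 0.5:
        return img
    h, w = img.shape[-2:]
    scale = rng.uniform(eta, 1.0)
    ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    patch = img[..., top:top + ch, left:left + cw]
    # nearest-neighbour resize back to (h, w)
    rows = (np.arange(h) * ch / h).astype(int)
    cols = (np.arange(w) * cw / w).astype(int)
    return patch[..., rows, :][..., :, cols]
```

Zooming into a random local region this way forces the false picture to carry class evidence at multiple scales, which is the heterogeneity the method aims for.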
For the boundary distance limit, the false pictures are constrained to keep a certain distribution in the feature space of the pre-trained network, where M_c denotes the feature set of all false pictures belonging to the same class as the i-th false picture.
The soft perception loss provides a soft target for the false picture, where U(ε, 1) denotes the uniform distribution on [ε, 1] from which the soft target is sampled, and mse denotes the mean squared error.
The BN loss:

L_BN = Σ_l ( ||μ_l(x_f) − μ_l||² + ||σ_l(x_f) − σ_l||² )

where μ_l(x_f), σ_l(x_f) denote the mean and variance of the false picture x_f at the l-th layer of the pre-trained network, and μ_l, σ_l denote the BN parameters of the l-th layer stored during training.
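A sketch of this statistic-matching term for one layer, assuming features of shape (batch, channels); the symbols follow the text above:

```python
import numpy as np

def bn_stat_loss(feats, bn_mean, bn_var):
    # Batch statistics of the false-image features at this layer
    mu = feats.mean(axis=0)
    var = feats.var(axis=0)
    # L2 distance to the BN running statistics stored during pretraining
    return np.sum((mu - bn_mean) ** 2) + np.sum((var - bn_var) ** 2)
```

Summing this term over all BN layers of the pre-trained network gives the BN loss; it is zero exactly when the false batch reproduces the stored statistics.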
Combining the above BN loss, boundary distance limit loss and soft perception loss, the total loss used to update the false pictures is obtained as their sum.
and 3), quantizing the neural network, and training the quantized network by using distillation loss and cross entropy loss by using the optimized false picture until a preset training round number is reached.
And in the quantization neural network, quantizing the pre-trained full-precision network to obtain a quantization network Q. The quantization is as follows:
where clip (F, l, u) ═ min (max (F, l), u), and l, u denote the upper and lower clipping boundaries. F represents a full precision input, which may be a network weight or an activation value. round means rounding its input to the nearest integer.Is a scaling factor for interconverting a full precision number and an integer, b denotes the quantization bit width. For the weights, a channel-by-channel quantization approach is used, and for the activation values, a layer-by-layer quantization approach is used. After the quantized value q is obtained, it can be dequantized back with a scaling factor
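A sketch of the quantize-dequantize ("fake quantization") step as reconstructed here; the exact formula is an assumption consistent with the symbols in the text (clipping bounds l, u and scale s = (u − l)/(2^b − 1)):

```python
import numpy as np

def fake_quantize(x, lower, upper, bits):
    # Scale chosen so that [lower, upper] maps onto the 2^b integer levels
    s = (upper - lower) / (2 ** bits - 1)
    clipped = np.clip(x, lower, upper)
    q = np.round((clipped - lower) / s)   # integer code in [0, 2^b - 1]
    return q * s + lower                  # dequantized value

x = np.array([-1.2, 0.0, 0.37, 2.5])
xq = fake_quantize(x, -1.0, 1.0, 4)
```

Channel-wise weight quantization simply applies this with separate (lower, upper, s) per output channel, while layer-wise activation quantization shares one triple across the layer.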
The quantized network is trained with the distillation loss and the cross-entropy loss, where the cross-entropy loss is

L_CE = −(1/N) Σ_{i=1}^{N} log p_{i,y_i}

with p_{i,y_i} the predicted value that the i-th input picture belongs to the y-th class (the class label being given by the pre-trained full-precision network) and N the total number of input pictures.
The distillation loss:

L_KD = (1/N) Σ_{i=1}^{N} Σ_{c=1}^{C} p^F_{i,c} log( p^F_{i,c} / p^Q_{i,c} )

where p^Q_{i,c} denotes the quantized network's predicted value that the i-th input picture belongs to class c, p^F_{i,c} the corresponding prediction of the pre-trained full-precision network, C denotes the number of dataset classes, and N the total number of input pictures.
4. Implementation details
The neural network data-free quantization method based on heterogeneous generated data is evaluated on the ImageNet dataset and implemented with the PyTorch deep learning framework on an NVIDIA GTX 3090 graphics card. For false data generation, the optimizer is Adam with momentum 0.9; whenever the loss has not decreased for 50 iterations, the learning rate is multiplied by 0.1; the total number of iterations is 1000; the batch size is set to 256; and η, λ_l, λ_u, ε are set to 0.5, 0.3, 0.8, 0.9, respectively. A total of 5120 pictures are generated. For training the quantized network, the optimizer is stochastic gradient descent (SGD) with momentum 0.9 and weight decay 10e-4. The batch size is set to 16, the initial learning rate to 10e-6, the learning rate is multiplied by 0.1 every 100 rounds, and the total number of training rounds is set to 150.
5. Field of application
The method can be applied to deep convolutional neural networks (CNNs) to achieve their compression and acceleration. Table 1 compares the results of this method on the ImageNet dataset with other neural network post-training quantization methods for ResNet-18 (He K, Zhang X, Ren S, et al. Deep residual learning for image recognition [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.), where 'Gner' indicates that a method generates false pictures with a generator, and WbAb denotes quantizing the model weights and activation values to b bits.
TABLE 1
As can be seen from Table 1, when all layers of the model are quantized to 5 bits, both the method of the present invention (IntraQ) and the latest neural network data-free quantization methods maintain high performance. In addition, compared with BRECQ, when ResNet-18 is quantized to 4 bits, the method of the present invention achieves a large performance improvement of 1.97%.
Table 2 compares the results of this method on the ImageNet dataset with other neural network post-training quantization methods for MobileNetV1 (Howard A G, Zhu M, Chen B, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications [J]. arXiv preprint arXiv:1704.04861, 2017.).
TABLE 2
From table 2, it can be seen that the present invention can improve the previous highest performance by 9.17% when quantifying MobileNetV1 to 4 bits.
Table 3 compares the results of this method on the ImageNet dataset with other neural network post-training quantization methods for MobileNetV2 (Sandler M, Howard A, Zhu M, et al. MobileNetV2: Inverted residuals and linear bottlenecks [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4510-4520.).
TABLE 3
From table 3, it can be seen that the present invention can improve the previous maximum performance by 4.65% when quantifying MobileNetV2 to 4 bits.
Thus, the performance advantage of the present invention is more pronounced when quantifying lightweight models such as MobileNet, especially at lower precision (e.g., 4 bits). Moreover, the invention achieves the best results on all networks and bits, which proves the effectiveness of the invention.
The above-described embodiments are merely preferred embodiments of the present invention, and should not be construed as limiting the scope of the invention. All equivalent changes and modifications made within the scope of the present invention shall fall within the scope of the present invention.
Claims (8)
1. The neural network data-free quantification method based on heterogeneous generated data is characterized by comprising the following steps of:
1) randomly initializing a false picture by using a standard Gaussian distribution;
2) optimizing the false pictures until the iteration count reaches the limit, updating them with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss;
3) firstly quantizing the neural network, and then training the quantized network by using distillation loss and cross entropy loss by using an optimized false picture until a preset number of training rounds is reached;
4) after training is finished, keeping the weights of the quantized network to obtain the final quantized network.
2. The method according to claim 1, wherein in step 1), the randomly initializing false pictures using the standard gaussian distribution is an initializing false picture generated from a sampling of the standard gaussian distribution and having a size consistent with a real picture.
3. The neural network data-free quantization method based on heterogeneous generated data according to claim 1, wherein in step 2), the specific method of local object reinforcement is: the false picture is randomly cropped (crop) and resized (resize) with probability p = 50% before being input into the pre-trained network:
4. The neural network data-free quantification method based on heterogeneous generated data as claimed in claim 1, wherein in step 2), the specific method of the boundary distance limitation is: the false pictures are limited to keep a certain distribution in the feature space of the pre-trained network:
where M_c denotes the feature set of all false pictures belonging to the same class as the i-th false picture.
5. The neural network data-free quantization method based on heterogeneous generated data of claim 1, wherein in step 2), the soft perceptual loss is to provide a soft target for the false picture:
where U(ε, 1) denotes the uniform distribution on [ε, 1] and mse denotes the mean squared error.
6. The heterogeneous data generation-based neural network data-free quantization method of claim 1, wherein in step 2), the BN loss:
where μ_l(x_f), σ_l(x_f) denote the mean and variance of the false picture x_f at the l-th layer of the pre-trained network, and μ_l, σ_l denote the BN parameters of the l-th layer stored during training;
7. the neural network data-free quantization method based on heterogeneous generated data according to claim 1, wherein in step 3), the quantization neural network quantizes a pre-trained full-precision network to obtain a quantization network Q; the quantization mode is as follows:
where clip(F, l, u) = min(max(F, l), u), and l, u denote the lower and upper clipping bounds; F represents a full-precision input, which may be a network weight or an activation value; round rounds its input to the nearest integer; s is the scaling factor used to convert between full-precision numbers and integers, and b denotes the quantization bit width; for the weights, channel-wise quantization is used, and for the activation values, layer-wise quantization is used; after the quantized value q is obtained, it is dequantized back via the scaling factor.
8. The neural network data-free quantization method based on heterogeneous generated data according to claim 1, wherein in step 3), the quantization network is trained by using distillation loss and cross-entropy loss, wherein the cross-entropy loss is:
wherein P_f(y_i | x_i) denotes the predicted probability, given by the pre-trained full-precision network, that the ith input picture belongs to its class y_i, and N represents the total number of input pictures;
distillation loss:
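A sketch of the two training losses; the distillation term is assumed to be a KL divergence from the full-precision teacher's outputs to the quantized network's, which is a common choice the claim does not spell out:

```python
import numpy as np

def cross_entropy(probs, labels):
    """-1/N * sum_i log P(y_i | x_i): negative log-probability assigned
    to each false picture's class y_i, averaged over N pictures."""
    n = len(labels)
    return -np.mean(np.log(probs[np.arange(n), labels]))

def distillation_loss(p_full, p_quant, eps=1e-12):
    """Assumed KL(teacher || student): pushes the quantized network's
    output distribution toward the full-precision network's."""
    p = np.clip(p_full, eps, 1.0)
    q = np.clip(p_quant, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=1).mean()
```

Training the quantized network Q on a weighted sum of these two terms, with the false pictures as inputs, completes step 3) of the method.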
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210673423.4A CN114937186B (en) | 2022-06-14 | Neural network data-free quantization method based on heterogeneous generated data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114937186A true CN114937186A (en) | 2022-08-23 |
CN114937186B CN114937186B (en) | 2024-06-07 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117689044A (en) * | 2024-02-01 | 2024-03-12 | 厦门大学 | Quantification method suitable for vision self-attention model |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200104721A1 (en) * | 2018-09-27 | 2020-04-02 | Scopemedia Inc. | Neural network image search |
CN112686367A (en) * | 2020-12-01 | 2021-04-20 | 广东石油化工学院 | Novel normalization mechanism |
CN112861602A (en) * | 2020-12-10 | 2021-05-28 | 华南理工大学 | Face living body recognition model compression and transplantation method based on depth separable convolution |
US20210192352A1 (en) * | 2019-12-19 | 2021-06-24 | Northeastern University | Computer-implemented methods and systems for compressing deep neural network models using alternating direction method of multipliers (admm) |
CN113850385A (en) * | 2021-10-12 | 2021-12-28 | 北京航空航天大学 | Coarse and fine granularity combined neural network pruning method |
CN114037714A (en) * | 2021-11-02 | 2022-02-11 | 大连理工大学人工智能大连研究院 | 3D MR and TRUS image segmentation method for prostate system puncture |
CN114429209A (en) * | 2022-01-27 | 2022-05-03 | 厦门大学 | Neural network post-training quantification method based on fine-grained data distribution alignment |
CN114581552A (en) * | 2022-03-15 | 2022-06-03 | 南京邮电大学 | Gray level image colorizing method based on generation countermeasure network |
Non-Patent Citations (2)
Title |
---|
YUNSHAN ZHONG et al.: "IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization", 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, 27 September 2022 (2022-09-27) * |
尹文枫; 梁玲燕; 彭慧民; 曹其春; 赵健; 董刚; 赵雅倩; 赵坤: "Research Progress on Convolutional Neural Network Compression and Acceleration Techniques", Computer Systems & Applications, no. 09, 15 September 2020 (2020-09-15) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230342616A1 (en) | Systems and Methods for Contrastive Learning of Visual Representations | |
Xu et al. | Accelerating federated learning for iot in big data analytics with pruning, quantization and selective updating | |
Barnes et al. | rTop-k: A statistical estimation approach to distributed SGD | |
CN108614992B (en) | Hyperspectral remote sensing image classification method and device and storage device | |
CN108717512B (en) | Malicious code classification method based on convolutional neural network | |
Dodge et al. | Quality robust mixtures of deep neural networks | |
Luo et al. | Anti-forensics of JPEG compression using generative adversarial networks | |
JPH1055444A (en) | Face recognition using DCT-based feature vectors | |
CN113674334B (en) | Texture recognition method based on depth self-attention network and local feature coding | |
CN109949200B (en) | Filter subset selection and CNN-based steganalysis framework construction method | |
Gao et al. | Hyperspectral image classification using joint sparse model and discontinuity preserving relaxation | |
CN111539444A (en) | Gaussian mixture model method for modified mode recognition and statistical modeling | |
CN111935487B (en) | Image compression method and system based on video stream detection | |
CN116258874A (en) | SAR recognition database sample gesture expansion method based on depth condition diffusion network | |
Huang et al. | Compressing multidimensional weather and climate data into neural networks | |
Li et al. | Incoherent dictionary learning with log-regularizer based on proximal operators | |
CN114429209A (en) | Neural network post-training quantification method based on fine-grained data distribution alignment | |
Chang et al. | Randnet: Deep learning with compressed measurements of images | |
An et al. | RBDN: Residual bottleneck dense network for image super-resolution | |
CN104463922A (en) | Image feature coding and recognizing method based on integrated learning | |
US20080252499A1 (en) | Method and system for the compression of probability tables | |
CN114937186A (en) | Neural network data-free quantification method based on heterogeneous generated data | |
CN117172301A (en) | Distribution flexible subset quantization method suitable for super-division network | |
CN113554047A (en) | Training method of image processing model, image processing method and corresponding device | |
CN109558819B (en) | Depth network lightweight method for remote sensing image target detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |