CN114937186A - Neural network data-free quantization method based on heterogeneous generated data - Google Patents

Neural network data-free quantization method based on heterogeneous generated data

Info

Publication number: CN114937186A (granted as CN114937186B)
Application number: CN202210673423.4A
Authority: CN (China)
Prior art keywords: network, quantization, loss, false, data
Inventors: 纪荣嵘 (Rongrong Ji), 钟云山 (Yunshan Zhong), 林明宝 (Mingbao Lin), 南宫瑞 (Gongrui Nan)
Original and current assignee: Xiamen University
Filing / priority date: 2022-06-14
Publication date: 2022-08-23 (CN114937186A); grant published 2024-06-07 (CN114937186B)
Other languages: Chinese (zh)
Legal status: Granted; Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

A neural network data-free quantization method based on heterogeneous generated data, relating to the compression and acceleration of artificial neural networks. The method comprises the following steps: 1) randomly initialize fake pictures from a standard Gaussian distribution; 2) optimize the initialized fake pictures until the iteration limit is reached, updating them with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss; 3) quantize the neural network, then train the quantized network on the optimized fake pictures with a distillation loss and a cross-entropy loss until a preset number of training epochs is reached; 4) after training, keep the weights of the quantization network to obtain the final quantized network. No real data is required, a quantized network can be trained from scratch, and network compression and acceleration can be achieved on general-purpose hardware platforms without specific hardware support.

Description

Neural network data-free quantization method based on heterogeneous generated data
Technical Field
The invention relates to the compression and acceleration of artificial neural networks, and in particular to a neural network data-free quantization method based on heterogeneous generated data.
Background
In recent years, deep neural networks (DNNs) have been widely used in many fields such as computer vision and natural language processing. Despite their tremendous success, the ever-increasing size of DNNs hinders their deployment on resource-limited platforms such as mobile phones and embedded devices. To overcome this dilemma, academia and industry have explored a variety of ways to reduce the complexity of DNNs; network quantization, which represents full-precision DNNs in a low-precision format, is a promising direction.
Most existing methods belong to quantization-aware training, where quantization is performed on the premise that the original complete training dataset is available. This dependence on training data, however, is also their weakness. In many practical situations the original training data cannot be accessed because of worsening privacy and security concerns: people may not want their medical records disclosed to others, and companies do not want business materials disseminated over the Internet. In such cases, quantization-aware training is no longer applicable.
How to obtain quantized DNNs without data has therefore attracted great attention from academia and industry. Existing data-free quantization studies can be broadly divided into two categories:
the first class of dataless quantization methods does not utilize any data at all, but instead focuses on the calibration parameters. For example, DFQ (Nagel M, Baalen M, Blankovort T, et al. data-free quantization and bias correction [ C ]// Proceedings of the IEEE/CVF International Conference on Computer Vision.2019: 1325-. Simple parameter calibration tends to result in severe performance degradation. This problem is even magnified for ultra low precision cases. For example, when ResNet-18(He K, Zhang X, Ren S, et al. deep residual learning for image recognition [ C ]// Proceedings of the IEEE Conference on Computer vision and pattern recognition.2016:770-778.) is quantized to 4 bits, only the 0.10% top-1 precision of DFQ on image is reported in the GDFQ' S (Xu S, Li H, Zhuang B, et al. genetic low-bit data free quantization [ C ]// European Conference on Computer vision. Springer, Cham 2020:1-17.) appendix.
The second category helps to train the quantized network with synthetic fake images. An intuitive solution is to deploy a generator to synthesize training data, but generator-based approaches incur a large computational overhead because the introduced generator must be trained from scratch for every bit setting. ZeroQ (Cai Y, Yao Z, Dong Z, et al. ZeroQ: A novel zero shot quantization framework [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 13169-13178.) and DSG (Zhang X, Qin H, Ding Y, et al. Diversifying sample generation for accurate data-free quantization [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 15658-15667.) instead cast data synthesis as an optimization problem, in which random inputs drawn from a standard Gaussian distribution are iteratively updated to fit the real data distribution. The benefit of this research route is that the synthetic images can be reused to calibrate or fine-tune networks at different bit widths, enabling resource-friendly quantization. However, comparing the feature visualizations of ZeroQ and DSG with real data shows a non-negligible quality gap in the synthetic images, since conventional Gaussian synthesis fits the whole dataset and ignores the finer class decision boundaries; the quantized model therefore usually suffers a large performance drop. To preserve class decision boundaries in the fake images, the inception loss IL (Haroush M, Hubara I, Hoffer E, et al. The knowledge within: Methods for data-free model compression [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 8494-8502.) optimizes each fake image toward a given class label. The fake data then shows a separable distribution, but such fake pictures do not capture intra-class heterogeneity well: images from the same class often contain different content, and features extracted from real pictures of the same class are scattered and heterogeneous. Feature clustering of ZeroQ+IL and DSG+IL shows that synthetic images of the same class are mostly homogeneous, so quantized models fine-tuned with these fake pictures do not generalize well to real, heterogeneous test datasets.
Disclosure of Invention
The invention aims to provide a neural network data-free quantization method based on heterogeneous generated data, addressing the performance degradation caused by current data-free quantization methods. No real data is required, a quantized network can be trained from scratch with higher performance, especially when quantizing small networks, and compression and acceleration of the network can be achieved on general-purpose hardware platforms without specific hardware support.
The invention comprises the following steps (a high-level sketch follows the list):
1) randomly initialize fake pictures from a standard Gaussian distribution;
2) optimize the fake pictures until the iteration limit is reached, updating them with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss;
3) quantize the neural network, then train the quantized network on the fake pictures optimized in step 2) with a distillation loss and a cross-entropy loss until a preset number of training epochs is reached;
4) after training, keep the weights of the quantization network to obtain the final quantized network.
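The four steps can be sketched end to end as follows. This is a minimal PyTorch sketch; all helper names (generation_loss, step_image_optimizer, quantize_network, fine_tune_epoch) are hypothetical stand-ins for the components detailed below, not functions defined by the patent, and the input shape and label assignment are illustrative assumptions.

```python
import torch

def data_free_quantization(fp_net, num_iters=1000, num_epochs=150):
    """High-level sketch of steps 1)-4); helper names are hypothetical."""
    # 1) fake pictures initialized from a standard Gaussian (shape assumed 3x224x224)
    fake_images = torch.randn(256, 3, 224, 224, requires_grad=True)
    labels = torch.randint(0, 1000, (256,))          # assigned target classes (assumed)
    # 2) optimize the fake pictures with the four generation losses
    for _ in range(num_iters):
        loss = generation_loss(fp_net, fake_images, labels)  # BN + boundary + soft perception
        loss.backward()
        step_image_optimizer(fake_images)
    # 3) quantize the network, then fine-tune it on the optimized fake pictures
    q_net = quantize_network(fp_net)
    for _ in range(num_epochs):
        fine_tune_epoch(q_net, fp_net, fake_images, labels)  # cross-entropy + distillation
    # 4) the trained weights of q_net are kept as the final quantized network
    return q_net
```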
In step 1), randomly initializing fake pictures with a standard Gaussian distribution means sampling, from the standard Gaussian distribution, initial fake pictures of the same size as the real pictures.
In step 2), the specific method of local object reinforcement may be: before the fake pictures are input into the pre-trained network, randomly crop and resize them with probability p = 50%:

$$\tilde{x}_f = \mathrm{resize}\big(\mathrm{crop}_\eta(x_f)\big)$$

where the crop scale of $\mathrm{crop}_\eta$ is sampled from the uniform distribution $U(\eta, 1)$ and $\tilde{x}_f$ denotes the fake picture after local object reinforcement;
the specific method of the boundary distance limit may be: the fake pictures are constrained to keep a certain distribution in the feature space of the pre-trained network, the distance between each feature and its class center being held between a lower bound $\lambda_l$ and an upper bound $\lambda_u$ (the original formula is an image; a hinge constraint of roughly this form):

$$\mathcal{L}_{BD} = \max\big(\lambda_l - d(v_F, \bar{v}_c),\, 0\big) + \max\big(d(v_F, \bar{v}_c) - \lambda_u,\, 0\big)$$

where $v_F$ denotes the feature extracted with the pre-trained network and the class center is

$$\bar{v}_c = \frac{1}{|M_c|} \sum_{v \in M_c} v$$

where $M_c$ is the set of features of all fake pictures in the same category as the $i$-th fake picture;
the soft perceptual loss is to provide a soft target for the false picture:
Figure BDA0003693989130000036
wherein U (e, 1) represents a uniform distribution from e to 1, and mes represents an average square error;
the BN loss:
Figure BDA0003693989130000037
wherein, mu l (x f ),σ l (x f ) Representing a false picture x f At the mean and variance of the l-th layer of the pre-training network,
Figure BDA0003693989130000038
representing BN parameters stored in the first layer of the pre-training network during training;
BN losses combining the above
Figure BDA0003693989130000039
Boundary distance limitation
Figure BDA00036939891300000310
Loss of soft feel
Figure BDA00036939891300000311
The total losses that can be obtained are:
Figure BDA00036939891300000312
in the step 3), the quantization neural network quantizes the pre-trained full-precision network to obtain a quantization network Q; the quantization is as follows:
Figure BDA00036939891300000313
Figure BDA0003693989130000041
wherein clip (F, l, u) ═ min (max (F, l), u), l, u denote the upper and lower clipping boundaries; representing a full precision input, which may be a network weight or an activation value; round denotes rounding its input to the nearest integer;
Figure BDA0003693989130000042
is a scaling factor for interconverting a full precision number and an integer, b denotes the quantization bit width; for the weight, a channel-by-channel quantization mode is used, and for the activation value, a layer-by-layer quantization mode is used; after the quantization value q is obtained, it is dequantized back by the scaling factor
Figure BDA0003693989130000043
The quantization network is trained with a distillation loss and a cross-entropy loss. The cross-entropy loss:

$$\mathcal{L}_{CE} = -\frac{1}{N} \sum_{i=1}^{N} \log \hat{p}^{\,i,y_i}$$

where $\hat{p}^{\,i,y_i}$ denotes the predicted value of the pre-trained full-precision network for the $i$-th input picture belonging to its class $y_i$, and $N$ denotes the total number of input pictures.

The distillation loss:

$$\mathcal{L}_{KD} = \frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} \hat{p}_F^{\,i,c} \log \frac{\hat{p}_F^{\,i,c}}{\hat{p}_Q^{\,i,c}}$$

where $\hat{p}_Q^{\,i,c}$ denotes the quantization network's predicted value for the $i$-th input picture belonging to class $c$, $\hat{p}_F^{\,i,c}$ the corresponding prediction of the full-precision network, $C$ the number of dataset categories, and $N$ the total number of input pictures.
Compared with the prior art, the invention has the following outstanding advantages:
1) The heterogeneity of the fake pictures is preserved, which greatly improves their quality.
2) Extensive experiments show that the neural network data-free quantization method based on heterogeneous generated data is simple to implement and outperforms various mainstream data-free quantization methods, especially when all layers are quantized to very low bit widths or when smaller networks are quantized.
3) No real data is required, a quantized network can be trained from scratch with higher performance, especially when quantizing small networks, and compression and acceleration of the network can be achieved on general-purpose hardware platforms without specific hardware support. The method can be applied to convolutional neural networks in the field of image classification.
Drawings
FIG. 1 is a block diagram of the method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the invention is further described below through embodiments with reference to the accompanying drawings.
A method block diagram of an embodiment of the invention is shown in fig. 1.
1. Description of the symbols
$F(W_1, W_2, \ldots, W_L)$ denotes an L-layer full-precision convolutional neural network (CNN), where $W_i$ denotes the $i$-th convolutional layer. The number of convolution kernels of the $i$-th convolutional layer is $out_i$, and the kernel weights of this layer can be expressed as:

$$W_i = \{W_i^1, W_i^2, \ldots, W_i^{out_i}\}$$

where $W_i^j$ denotes the $j$-th convolution kernel of the $i$-th convolutional layer, and each kernel satisfies

$$W_i^j \in \mathbb{R}^{in_i \times width_i \times height_i}$$

where $in_i$, $width_i$ and $height_i$ are the number of input channels of the $i$-th layer and the width and height of the convolution kernel, respectively. Given the input $A_{i-1}$ of the $i$-th convolutional layer (i.e., the output of the previous layer), the convolution result of the $i$-th convolutional layer can be expressed as:

$$O_i^j = W_i^j \circledast A_{i-1}$$

where $O_i^j$ is the $j$-th channel of the convolution result, all channels together give $O_i$, and $\circledast$ denotes the convolution operation. The convolution result is then passed through an activation function to obtain the final output activation value of the layer:

$$A_i = \sigma(O_i)$$

where $\sigma$ denotes the activation function.
The goal of the quantization algorithm is to obtain a neural network that can operate at low bit width, in which case the convolution is expressed as:

$$O_i^j = \hat{W}_i^j \circledast \hat{A}_{i-1}$$

where $\hat{W}_i^j$ and $\hat{A}_{i-1}$ denote the quantized $j$-th convolution kernel of the $i$-th layer and the quantized input of the $i$-th layer. The quantization algorithm thus yields an L-layer low-precision convolutional neural network

$$Q(\hat{W}_1, \hat{W}_2, \ldots, \hat{W}_L)$$

where $\hat{W}_i$ denotes the $i$-th convolutional layer after quantization.
To obtain the quantized network, the pre-trained full-precision network is quantized as follows:

$$q = \mathrm{round}\!\left(\frac{\mathrm{clip}(m, l, u)}{s}\right), \qquad s = \frac{u - l}{2^b - 1}$$

where $\mathrm{clip}(m, l, u) = \min(\max(m, l), u)$ and $l, u$ denote the lower and upper clipping boundaries; $m$ represents a full-precision input, which may be a network weight $W$ or an activation value $A$; round rounds its input to the nearest integer; $s$ is a scaling factor for interconverting full-precision numbers and integers; and $b$ denotes the quantization bit width. For the weights, channel-by-channel quantization is used, i.e., each output channel has its own clipping bounds and scaling factor. For the activation values, layer-by-layer quantization is used, i.e., each layer shares the same clipping bounds and scaling factor. After the quantized value $q$ is obtained, it can be dequantized back with the scaling factor as $\hat{m} = s \cdot q$ before further operations. The convolution of two quantized values can be computed as:

$$\hat{O} = s_1 s_2 \, (q_1 \circledast q_2)$$

where $s_1, s_2$ can be pre-computed and stored, and $q_1, q_2$ are low-precision values, so the original full-precision operation can be replaced by a purely low-precision convolution.
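Under these definitions, a minimal sketch of the quantize/dequantize pair is given below. The function names and the b = 4 default are illustrative assumptions; in practice the clipping bounds l, u would be calibrated or learned per channel (weights) or per layer (activations), as described above.

```python
import torch

def quantize(x: torch.Tensor, lower: float, upper: float, bits: int = 4):
    """Uniform quantization as described above: clip to [lower, upper],
    divide by the scaling factor, round to the nearest integer."""
    s = (upper - lower) / (2 ** bits - 1)   # scaling factor s = (u - l) / (2^b - 1)
    q = torch.round(torch.clamp(x, lower, upper) / s)
    return q, s

def dequantize(q: torch.Tensor, s: float) -> torch.Tensor:
    """Convert the integer representation back to (approximate) full precision."""
    return s * q

# For two quantized operands the convolution can run in low precision and be
# rescaled once at the end:  O_hat = s1 * s2 * conv(q1, q2).
```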
2. Heterogeneous data analysis
Existing neural network data-free quantization methods are limited by the poor quality of their fake pictures, and their performance drops markedly when all network layers are quantized to low bit widths. To improve the performance of the quantized network, the invention provides a neural network data-free quantization method based on heterogeneous generated data. Based on the observation that real data is heterogeneous, the fake pictures are updated with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss, so that heterogeneous pictures are generated.
3. Description of the training
The embodiment of the invention comprises the following steps:
1) randomly initialize fake pictures from a standard Gaussian distribution;
2) optimize the fake pictures until the iteration limit is reached, updating them with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss;
3) quantize the neural network, then train the quantized network on the optimized fake pictures with a distillation loss and a cross-entropy loss until a preset number of training epochs is reached;
4) after training, keep the weights of the quantization network to obtain the final quantized network.
In step 1), initial fake pictures with the same size as the real pictures are sampled from the standard Gaussian distribution (a minimal sketch follows).
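In the sketch below, the batch size, the 3x224x224 ImageNet-style shape and the class-label assignment are assumptions made for illustration, not values fixed by this step.

```python
import torch

batch_size, num_classes = 256, 1000
# Fake pictures drawn from a standard Gaussian, same size as the real inputs
fake_images = torch.randn(batch_size, 3, 224, 224, requires_grad=True)
# One target class per fake picture, used by the class-dependent losses below
# (the label-assignment scheme is an assumption, not stated in this step)
fake_labels = torch.randint(0, num_classes, (batch_size,))
```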
In step 2), the initialized fake pictures are optimized until the iteration limit is reached, updated with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss.

Local object reinforcement: before a fake picture is input into the pre-trained network, it is randomly cropped and resized with probability p = 50%:

$$\tilde{x}_f = \mathrm{resize}\big(\mathrm{crop}_\eta(x_f)\big)$$

where the crop scale of $\mathrm{crop}_\eta$ is sampled from the uniform distribution $U(\eta, 1)$ and $\tilde{x}_f$ denotes the fake picture after local object reinforcement.
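One plausible implementation of this augmentation is sketched below; using torchvision's RandomResizedCrop with the crop-area fraction drawn from scale=(η, 1) and a fixed aspect ratio is an assumption that matches the description, not a construction mandated by the patent.

```python
import random
import torch
import torchvision.transforms as T

def local_object_reinforcement(x_f: torch.Tensor, eta: float = 0.5,
                               p: float = 0.5) -> torch.Tensor:
    """With probability p, crop a region whose scale is drawn from U(eta, 1)
    and resize it back to the original resolution (sketch)."""
    if random.random() >= p:
        return x_f                       # unchanged with probability 1 - p
    h, w = x_f.shape[-2:]
    crop = T.RandomResizedCrop((h, w), scale=(eta, 1.0), ratio=(1.0, 1.0))
    return crop(x_f)
```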
Boundary distance limit: the fake pictures are constrained to keep a certain distribution in the feature space of the pre-trained network, the distance between each feature and its class center being held between the lower bound $\lambda_l$ and the upper bound $\lambda_u$ (the original formula is an image; a hinge constraint of roughly this form):

$$\mathcal{L}_{BD} = \max\big(\lambda_l - d(v_F, \bar{v}_c),\, 0\big) + \max\big(d(v_F, \bar{v}_c) - \lambda_u,\, 0\big)$$

where $v_F$ denotes the feature extracted with the pre-trained network and the class center is

$$\bar{v}_c = \frac{1}{|M_c|} \sum_{v \in M_c} v$$

where $M_c$ is the set of features of all fake pictures in the same category as the $i$-th fake picture.
Soft perception loss: a soft target is provided for the fake picture, whose target-class probability is sampled from the uniform distribution $U(\epsilon, 1)$; the loss is the mean squared error (MSE) between the network's prediction and this soft target:

$$\mathcal{L}_{SP} = \mathrm{MSE}\big(p(\tilde{x}_f),\, \tilde{y}\big), \qquad \tilde{y}_{y_i} \sim U(\epsilon, 1)$$
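A sketch of one way to realize this loss follows; spreading the remaining probability mass uniformly over the other classes is an assumption made for illustration, since only the U(ε, 1) sampling and the MSE are stated.

```python
import torch
import torch.nn.functional as F

def soft_perception_loss(logits: torch.Tensor, labels: torch.Tensor,
                         eps: float = 0.9) -> torch.Tensor:
    """MSE between predicted class probabilities and a soft target whose
    true-class probability is sampled from U(eps, 1) (assumed construction)."""
    n, c = logits.shape
    target_prob = torch.empty(n, device=logits.device).uniform_(eps, 1.0)
    soft_target = ((1.0 - target_prob) / (c - 1)).unsqueeze(1).expand(n, c).clone()
    soft_target[torch.arange(n), labels] = target_prob
    return F.mse_loss(logits.softmax(dim=1), soft_target)
```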
BN loss:

$$\mathcal{L}_{BN} = \sum_{l=1}^{L} \Big( \big\|\mu_l(x_f) - \hat{\mu}_l\big\|_2^2 + \big\|\sigma_l(x_f) - \hat{\sigma}_l\big\|_2^2 \Big)$$

where $\mu_l(x_f)$ and $\sigma_l(x_f)$ denote the mean and variance of the fake picture $x_f$ at the $l$-th layer of the pre-trained network, and $\hat{\mu}_l$ and $\hat{\sigma}_l$ denote the BN parameters of the $l$-th layer stored during training.

Combining the above losses, the total loss is:

$$\mathcal{L}_{\mathrm{total}} = \mathcal{L}_{BN} + \mathcal{L}_{BD} + \mathcal{L}_{SP}$$
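The BN term can be sketched as below. Collecting per-layer batch statistics with forward hooks is an implementation choice rather than something the patent prescribes, and matching the standard deviation against the square root of the stored running variance is likewise an assumption.

```python
import torch
import torch.nn as nn

def bn_statistics_loss(model: nn.Module, fake_images: torch.Tensor) -> torch.Tensor:
    """Match mean/std of fake-image activations against the running statistics
    stored in every BatchNorm2d layer of the pre-trained network (sketch)."""
    model.eval()                                   # keep the stored BN statistics fixed
    losses, hooks = [], []

    def make_hook(bn: nn.BatchNorm2d):
        def hook(module, inputs, output):
            x = inputs[0]                          # activations entering the BN layer
            mu = x.mean(dim=(0, 2, 3))
            sigma = x.std(dim=(0, 2, 3))
            losses.append(((mu - bn.running_mean) ** 2).sum()
                          + ((sigma - (bn.running_var + bn.eps).sqrt()) ** 2).sum())
        return hook

    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(make_hook(m)))
    model(fake_images)                             # one forward pass fills `losses`
    for h in hooks:
        h.remove()
    return torch.stack(losses).sum()
```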
and 3), quantizing the neural network, and training the quantized network by using distillation loss and cross entropy loss by using the optimized false picture until a preset training round number is reached.
And in the quantization neural network, quantizing the pre-trained full-precision network to obtain a quantization network Q. The quantization is as follows:
Figure BDA0003693989130000076
Figure BDA0003693989130000077
where clip (F, l, u) ═ min (max (F, l), u), and l, u denote the upper and lower clipping boundaries. F represents a full precision input, which may be a network weight or an activation value. round means rounding its input to the nearest integer.
Figure BDA0003693989130000078
Is a scaling factor for interconverting a full precision number and an integer, b denotes the quantization bit width. For the weights, a channel-by-channel quantization approach is used, and for the activation values, a layer-by-layer quantization approach is used. After the quantized value q is obtained, it can be dequantized back with a scaling factor
Figure BDA0003693989130000079
The quantization network is trained with the distillation loss and the cross-entropy loss, where the cross-entropy loss is:

$$\mathcal{L}_{CE} = -\frac{1}{N} \sum_{i=1}^{N} \log \hat{p}^{\,i,y_i}$$

where $\hat{p}^{\,i,y_i}$ denotes the predicted value of the pre-trained full-precision network for the $i$-th input picture belonging to its class $y_i$, and $N$ denotes the total number of input pictures.

The distillation loss:

$$\mathcal{L}_{KD} = \frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} \hat{p}_F^{\,i,c} \log \frac{\hat{p}_F^{\,i,c}}{\hat{p}_Q^{\,i,c}}$$

where $\hat{p}_Q^{\,i,c}$ denotes the quantization network's predicted value for the $i$-th input picture belonging to class $c$, $\hat{p}_F^{\,i,c}$ the corresponding prediction of the full-precision network, $C$ the number of dataset categories, and $N$ the total number of input pictures.
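A hedged sketch of the two fine-tuning losses follows. The KL form of the distillation term matches the symbols above; taking the cross-entropy on the quantized network's outputs is an assumption (the text attributes the predicted value to the full-precision network, but the quantized network is the one being trained), and the unweighted sum of the two terms is likewise assumed.

```python
import torch
import torch.nn.functional as F

def fine_tune_loss(fp_logits: torch.Tensor, q_logits: torch.Tensor,
                   labels: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on the assigned labels plus KL distillation from the
    full-precision teacher to the quantized student (assumed combination)."""
    ce = F.cross_entropy(q_logits, labels)
    kd = F.kl_div(F.log_softmax(q_logits, dim=1),
                  F.softmax(fp_logits, dim=1),
                  reduction="batchmean")
    return ce + kd
```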
4. Implementation details
The neural network data-free quantization method based on heterogeneous generated data is evaluated on the ImageNet dataset and implemented with the PyTorch deep learning framework on an NVIDIA RTX 3090 GPU. For the generation of fake data, the optimizer is Adam with momentum 0.9; whenever the loss does not decrease within 50 iterations, the learning rate is multiplied by 0.1; the total number of iterations is 1000; the batch size is set to 256; and η, λ_l, λ_u and ε are set to 0.5, 0.3, 0.8 and 0.9, respectively. A total of 5120 pictures are generated. For training the quantization network, the optimizer is stochastic gradient descent (SGD) with momentum 0.9 and weight decay 10^-4. The batch size is set to 16, the initial learning rate is 10^-6 and is multiplied by 0.1 every 100 epochs, and the total number of training epochs is set to 150. A sketch of this setup follows.
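These hyper-parameters translate roughly into the setup below; the scheduler classes are assumptions chosen to match the described behavior, the Adam learning rate is a placeholder (its initial value is not stated), and fake_images / q_net are assumed to exist from the earlier steps.

```python
import torch

# Fake-data generation: Adam with momentum 0.9; lr x0.1 whenever the loss
# has not decreased for 50 iterations (initial lr not given; placeholder value)
image_opt = torch.optim.Adam([fake_images], lr=0.5, betas=(0.9, 0.999))
image_sched = torch.optim.lr_scheduler.ReduceLROnPlateau(
    image_opt, mode="min", factor=0.1, patience=50)

# Quantized-network fine-tuning: SGD, momentum 0.9, weight decay 1e-4,
# initial lr 1e-6, lr x0.1 every 100 epochs, 150 epochs in total
net_opt = torch.optim.SGD(q_net.parameters(), lr=1e-6,
                          momentum=0.9, weight_decay=1e-4)
net_sched = torch.optim.lr_scheduler.StepLR(net_opt, step_size=100, gamma=0.1)
```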
5. Field of application
The method can be applied to deep convolutional neural networks (CNNs) to achieve their compression and acceleration. Table 1 compares the results of the method with other neural network post-training quantization methods for ResNet-18 (He K, Zhang X, Ren S, et al. Deep residual learning for image recognition [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.) on the ImageNet dataset, where "G" marks methods that generate fake pictures with a generator, and WbAb denotes quantizing the model weights and activation values to b bits.
TABLE 1
[Table 1 is rendered as an image in the original and is not reproduced here.]
As can be seen from Table 1, when all layers of the model are quantized to 5 bits, the method of the present invention (IntraQ) and the latest neural network data-free quantization methods all maintain high performance. In addition, compared with BRECQ, when ResNet-18 is quantized to 4 bits the method of the present invention achieves a large improvement of 1.97%.
Table 2 compares the results of the method with other neural network post-training quantization methods for MobileNetV1 (Howard A G, Zhu M, Chen B, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications [J]. arXiv preprint arXiv:1704.04861, 2017.) on the ImageNet dataset.
TABLE 2
[Table 2 is rendered as an image in the original and is not reproduced here.]
From Table 2, it can be seen that the present invention improves the previous best performance by 9.17% when quantizing MobileNetV1 to 4 bits.
Table 3 compares the results of the method with other neural network post-training quantization methods for MobileNetV2 (Sandler M, Howard A, Zhu M, et al. MobileNetV2: Inverted residuals and linear bottlenecks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4510-4520.) on the ImageNet dataset.
TABLE 3
[Table 3 is rendered as an image in the original and is not reproduced here.]
From Table 3, it can be seen that the present invention improves the previous best performance by 4.65% when quantizing MobileNetV2 to 4 bits.
Thus, the performance advantage of the present invention is more pronounced when quantizing lightweight models such as MobileNet, especially at lower precision (e.g., 4 bits). Moreover, the invention achieves the best results on all networks and bit widths, which demonstrates its effectiveness.
The above-described embodiments are merely preferred embodiments of the present invention and should not be construed as limiting its scope. All equivalent changes and modifications made within the scope of the present invention shall fall within its protection scope.

Claims (8)

1. A neural network data-free quantization method based on heterogeneous generated data, characterized by comprising the following steps:
1) randomly initializing fake pictures from a standard Gaussian distribution;
2) optimizing the fake pictures until the iteration limit is reached, updating them with local object reinforcement, the boundary distance limit, the soft perception loss and the BN loss;
3) first quantizing the neural network, and then training the quantized network on the optimized fake pictures with a distillation loss and a cross-entropy loss until a preset number of training epochs is reached;
4) after training, keeping the weights of the quantization network to obtain the final quantized network.
2. The method according to claim 1, wherein in step 1), randomly initializing fake pictures with the standard Gaussian distribution means sampling, from the standard Gaussian distribution, initial fake pictures of the same size as the real pictures.
3. The neural network data-free quantization method based on heterogeneous generated data according to claim 1, wherein in step 2), the specific method of local object reinforcement is: before the fake pictures are input into the pre-trained network, randomly crop and resize them with probability p = 50%:

$$\tilde{x}_f = \mathrm{resize}\big(\mathrm{crop}_\eta(x_f)\big)$$

where the crop scale of $\mathrm{crop}_\eta$ is sampled from the uniform distribution $U(\eta, 1)$ and $\tilde{x}_f$ denotes the fake picture after local object reinforcement.
4. The neural network data-free quantization method based on heterogeneous generated data according to claim 1, wherein in step 2), the specific method of the boundary distance limit is: the fake pictures are constrained to keep a certain distribution in the feature space of the pre-trained network, the distance between each feature and its class center being held between the lower bound $\lambda_l$ and the upper bound $\lambda_u$ (the original formula is an image; a hinge constraint of roughly this form):

$$\mathcal{L}_{BD} = \max\big(\lambda_l - d(v_F, \bar{v}_c),\, 0\big) + \max\big(d(v_F, \bar{v}_c) - \lambda_u,\, 0\big)$$

where $v_F$ denotes the feature extracted with the pre-trained network and the class center is

$$\bar{v}_c = \frac{1}{|M_c|} \sum_{v \in M_c} v$$

where $M_c$ is the set of features of all fake pictures in the same category as the $i$-th fake picture.
5. The neural network data-free quantization method based on heterogeneous generated data according to claim 1, wherein in step 2), the soft perception loss provides a soft target for the fake picture, whose target-class probability is sampled from the uniform distribution $U(\epsilon, 1)$; the loss is the mean squared error (MSE) between the network's prediction and this soft target:

$$\mathcal{L}_{SP} = \mathrm{MSE}\big(p(\tilde{x}_f),\, \tilde{y}\big), \qquad \tilde{y}_{y_i} \sim U(\epsilon, 1)$$
6. The neural network data-free quantization method based on heterogeneous generated data according to claim 1, wherein in step 2), the BN loss is:

$$\mathcal{L}_{BN} = \sum_{l=1}^{L} \Big( \big\|\mu_l(x_f) - \hat{\mu}_l\big\|_2^2 + \big\|\sigma_l(x_f) - \hat{\sigma}_l\big\|_2^2 \Big)$$

where $\mu_l(x_f)$ and $\sigma_l(x_f)$ denote the mean and variance of the fake picture $x_f$ at the $l$-th layer of the pre-trained network, and $\hat{\mu}_l$ and $\hat{\sigma}_l$ denote the BN parameters of the $l$-th layer stored during training;

combining the BN loss $\mathcal{L}_{BN}$, the boundary distance limit $\mathcal{L}_{BD}$ and the soft perception loss $\mathcal{L}_{SP}$ above, the total loss is:

$$\mathcal{L}_{\mathrm{total}} = \mathcal{L}_{BN} + \mathcal{L}_{BD} + \mathcal{L}_{SP}$$
7. The neural network data-free quantization method based on heterogeneous generated data according to claim 1, wherein in step 3), quantizing the neural network means quantizing the pre-trained full-precision network to obtain the quantization network Q; the quantization is as follows:

$$q = \mathrm{round}\!\left(\frac{\mathrm{clip}(F, l, u)}{s}\right), \qquad s = \frac{u - l}{2^b - 1}$$

where $\mathrm{clip}(F, l, u) = \min(\max(F, l), u)$ and $l, u$ denote the lower and upper clipping boundaries; $F$ represents a full-precision input, which may be a network weight or an activation value; round rounds its input to the nearest integer; $s$ is a scaling factor for interconverting full-precision numbers and integers; and $b$ denotes the quantization bit width; channel-by-channel quantization is used for the weights and layer-by-layer quantization for the activation values; after the quantized value $q$ is obtained, it is dequantized back with the scaling factor as $\hat{F} = s \cdot q$.
8. The neural network data-free quantization method based on heterogeneous generated data according to claim 1, wherein in step 3), the quantization network is trained with the distillation loss and the cross-entropy loss, the cross-entropy loss being:

$$\mathcal{L}_{CE} = -\frac{1}{N} \sum_{i=1}^{N} \log \hat{p}^{\,i,y_i}$$

where $\hat{p}^{\,i,y_i}$ denotes the predicted value of the pre-trained full-precision network for the $i$-th input picture belonging to its class $y_i$, and $N$ denotes the total number of input pictures;

the distillation loss being:

$$\mathcal{L}_{KD} = \frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} \hat{p}_F^{\,i,c} \log \frac{\hat{p}_F^{\,i,c}}{\hat{p}_Q^{\,i,c}}$$

where $\hat{p}_Q^{\,i,c}$ denotes the quantization network's predicted value for the $i$-th input picture belonging to class $c$, $\hat{p}_F^{\,i,c}$ the corresponding prediction of the full-precision network, $C$ the number of dataset categories, and $N$ the total number of input pictures.
CN202210673423.4A 2022-06-14 Neural network data-free quantization method based on heterogeneous generated data Active CN114937186B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210673423.4A CN114937186B (en) 2022-06-14 Neural network data-free quantization method based on heterogeneous generated data


Publications (2)

Publication Number Publication Date
CN114937186A 2022-08-23
CN114937186B (granted) 2024-06-07




Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200104721A1 (en) * 2018-09-27 2020-04-02 Scopemedia Inc. Neural network image search
US20210192352A1 (en) * 2019-12-19 2021-06-24 Northeastern University Computer-implemented methods and systems for compressing deep neural network models using alternating direction method of multipliers (admm)
CN112686367A (en) * 2020-12-01 2021-04-20 广东石油化工学院 Novel normalization mechanism
CN112861602A (en) * 2020-12-10 2021-05-28 华南理工大学 Face living body recognition model compression and transplantation method based on depth separable convolution
CN113850385A (en) * 2021-10-12 2021-12-28 北京航空航天大学 Coarse and fine granularity combined neural network pruning method
CN114037714A (en) * 2021-11-02 2022-02-11 大连理工大学人工智能大连研究院 3D MR and TRUS image segmentation method for prostate system puncture
CN114429209A (en) * 2022-01-27 2022-05-03 厦门大学 Neural network post-training quantification method based on fine-grained data distribution alignment
CN114581552A (en) * 2022-03-15 2022-06-03 南京邮电大学 Gray level image colorizing method based on generation countermeasure network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YUNSHAN ZHONG et al.: "IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization", 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 27 September 2022 (2022-09-27) *
尹文枫; 梁玲燕; 彭慧民; 曹其春; 赵健; 董刚; 赵雅倩; 赵坤: "Research progress on convolutional neural network compression and acceleration techniques" (卷积神经网络压缩与加速技术研究进展), 计算机系统应用 (Computer Systems & Applications), no. 09, 15 September 2020 (2020-09-15) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117689044A (en) * 2024-02-01 2024-03-12 厦门大学 Quantification method suitable for vision self-attention model

Similar Documents

Publication Publication Date Title
US20230342616A1 (en) Systems and Methods for Contrastive Learning of Visual Representations
Xu et al. Accelerating federated learning for iot in big data analytics with pruning, quantization and selective updating
Barnes et al. rTop-k: A statistical estimation approach to distributed SGD
CN108614992B (en) Hyperspectral remote sensing image classification method and device and storage device
CN108717512B (en) Malicious code classification method based on convolutional neural network
Dodge et al. Quality robust mixtures of deep neural networks
Luo et al. Anti-forensics of JPEG compression using generative adversarial networks
JPH1055444A (en) Recognition of face using feature vector with dct as base
CN113674334B (en) Texture recognition method based on depth self-attention network and local feature coding
CN109949200B (en) Filter subset selection and CNN-based steganalysis framework construction method
Gao et al. Hyperspectral image classification using joint sparse model and discontinuity preserving relaxation
CN111539444A (en) Gaussian mixture model method for modified mode recognition and statistical modeling
CN111935487B (en) Image compression method and system based on video stream detection
CN116258874A (en) SAR recognition database sample gesture expansion method based on depth condition diffusion network
Huang et al. Compressing multidimensional weather and climate data into neural networks
Li et al. Incoherent dictionary learning with log-regularizer based on proximal operators
CN114429209A (en) Neural network post-training quantification method based on fine-grained data distribution alignment
Chang et al. Randnet: Deep learning with compressed measurements of images
An et al. RBDN: Residual bottleneck dense network for image super-resolution
CN104463922A (en) Image feature coding and recognizing method based on integrated learning
US20080252499A1 (en) Method and system for the compression of probability tables
CN114937186A (en) Neural network data-free quantization method based on heterogeneous generated data
CN117172301A (en) Distribution flexible subset quantization method suitable for super-division network
CN113554047A (en) Training method of image processing model, image processing method and corresponding device
CN109558819B (en) Depth network lightweight method for remote sensing image target detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant