CN113095472A - Method for reducing precision loss of convolutional neural network through forward reasoning in quantization process - Google Patents
- Publication number
- CN113095472A (application CN202010020803.9A)
- Authority
- CN
- China
- Prior art keywords
- value
- quantization
- batchnorm
- weight
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention provides a method for reducing precision loss during forward inference of a convolutional neural network in the quantization process, comprising the following steps: during weight quantization and dequantization, extracting the batchnorm values, so that outliers in the batchnorm values do not affect the weight quantization; after convolution, multiplying the 32-bit dequantized output by the extracted batchnorm values, thereby avoiding precision loss. In the quantization process the weights are quantized directly, and no batchnorm parameter values are merged into the weights.
Description
Technical Field
The invention relates to the technical field of 8-bit quantization and dequantization of convolutional neural networks, and in particular to a method for reducing precision loss during forward inference of a convolutional neural network in the quantization process.
Background
A convolutional neural network (CNN) is a feed-forward neural network with a deep structure that performs convolution operations; it is widely used for image classification, image recognition, and related image-processing tasks. In recent years, with the rapid development of science and technology, the era of big data has arrived. Deep learning, represented by deep neural networks (DNNs), has achieved remarkable results in many key areas of artificial intelligence. The CNN is a typical DNN structure that can effectively extract hidden-layer features of an image and classify the image accurately, and it has been widely applied to image recognition and detection in recent years.
The initial motivation for quantizing a network was to reduce the model file size: with 8-bit quantization the file shrinks to roughly one quarter, and the weights are still converted back to floating point after the model is loaded. Concretely, when the network weights are written to a file, the minimum and maximum values of each layer are stored, and each floating-point value is then represented by an 8-bit integer (the range between the minimum and the maximum is divided linearly into 256 intervals, and each interval is represented by a unique 8-bit integer). Porting the computation itself to 8-bit can also make the model run faster with lower power consumption, which is especially important on mobile devices. Google's quantization-dequantization scheme first merges the batchnorm values into the weights, and the merged weights are then quantized and dequantized as a whole.
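For illustration, a minimal NumPy sketch (not part of the patent; the function names are hypothetical) of the min/max linear 8-bit quantization and dequantization described above:

```python
import numpy as np

def quantize_minmax(w):
    # Store the layer's minimum and the step of each of the 256 intervals,
    # then map every floating-point value to its nearest 8-bit code.
    w_min, w_max = float(w.min()), float(w.max())
    step = max((w_max - w_min) / 255.0, 1e-12)   # guard against a constant layer
    q = np.round((w - w_min) / step).astype(np.uint8)
    return q, w_min, step

def dequantize_minmax(q, w_min, step):
    # Map the 8-bit codes back to approximate floating-point values.
    return q.astype(np.float32) * step + w_min

w = np.random.randn(64, 3, 3, 3).astype(np.float32)   # toy conv weights
q, w_min, step = quantize_minmax(w)
w_hat = dequantize_minmax(q, w_min, step)
print("max abs round-trip error:", np.abs(w - w_hat).max())   # at most about step/2
```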
Terms and explanations used in the prior art:
Convolutional neural network (CNN): a type of feed-forward neural network that performs convolution operations and has a deep structure.
Detection model: a model that locates the position of a target object in an image according to the target task.
Quantization and dequantization: quantization refers to storing the weights of a full-precision model as discrete 8-bit values; dequantization refers to mapping the 8-bit discrete values back to full precision.
Batchnorm: a method that normalizes each layer of a neural network during training, which effectively accelerates convergence and improves the stability of the model.
Forward propagation: running a forward pass with the trained, frozen neural network to obtain a prediction result.
Inference: in deep learning, inference refers to deploying a pre-trained neural network model in an actual service scenario, such as image classification, object detection, or online translation. Because inference directly faces the user, inference performance is critical, especially for enterprise-grade products.
In the prior art, batchnorm is merged into the weights channel by channel; some channel values are inevitably much larger than others, which degrades the overall quantization precision and leads to serious precision loss for the whole model.
Disclosure of Invention
In order to solve the above problems, in particular the precision loss caused by merging abnormal per-channel batchnorm values into the weights, the present invention aims to reduce the precision loss of the quantized model during forward inference and thereby preserve the overall accuracy of the detection model.
Specifically, the invention provides a method for reducing precision loss during forward inference of a convolutional neural network in the quantization process, comprising the following steps: during weight quantization and dequantization, extracting the batchnorm values, so that outliers in the batchnorm values do not affect the weight quantization; after convolution, multiplying the 32-bit dequantized output by the extracted batchnorm values, thereby avoiding precision loss.
In the quantization process the weights are quantized directly, and no batchnorm parameter values are merged into the weights.
Further comprising:
assume that the quantization calculation for the i-th layer is as follows:
Xo = scale × D(Xi × Q(W)) + b
where Xi is the input of the i-th layer, Q(W) is the weight quantization operation, D(·) is the dequantization operation applied to the whole product, and scale and b are the corresponding per-channel coefficients obtained after the batchnorm extraction.
The scale value is multiplied by the dequantized output, and the merged bias b is added at the same time.
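As an illustration of the computation Xo = scale × D(Xi × Q(W)) + b with the batchnorm coefficients extracted rather than merged into W, a minimal NumPy sketch (hypothetical names, a matrix product standing in for the convolution, and weight-side dequantization used for simplicity):

```python
import numpy as np

def fold_out_batchnorm(gamma, beta, mean, var, eps=1e-5):
    # Per-channel coefficients kept OUTSIDE the weights (assumed form):
    # scale = gamma / sqrt(var + eps), b = beta - scale * mean
    scale = gamma / np.sqrt(var + eps)
    return scale, beta - scale * mean

def quantized_layer_forward(x, w, w_min, w_step, bn_scale, bn_b):
    # Q(W): quantize the raw weights (batchnorm NOT merged in) to 8-bit codes
    q_w = np.clip(np.round((w - w_min) / w_step), 0, 255)
    # D(Xi x Q(W)): approximated here by dequantizing the codes before the product
    acc = x @ (q_w * w_step + w_min)
    # apply the extracted batchnorm scale and bias AFTER dequantization
    return bn_scale * acc + bn_b

# toy example: one 4-dim input, a 4x2 "weight" standing in for a conv kernel
x = np.random.randn(1, 4).astype(np.float32)
w = np.random.randn(4, 2).astype(np.float32)
w_min, w_step = w.min(), (w.max() - w.min()) / 255.0
bn_scale, bn_b = fold_out_batchnorm(np.ones(2), np.zeros(2), np.zeros(2), np.ones(2))
print(quantized_layer_forward(x, w, w_min, w_step, bn_scale, bn_b))
```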
Extracting the batchnorm values avoids the influence of batchnorm outliers on the weight quantization: the per-channel batchnorm values are processed separately, so the channel-outlier problem that arises when the weights are merged with batchnorm does not occur.
Before the weights are merged with batchnorm, the overall weight distribution conforms to a normal distribution, and the maximum and minimum values are as expected.
Once the weights are merged with batchnorm, the overall maximum and minimum values of the weights no longer meet expectations; the distribution is distorted, which causes quantization loss.
The advantage of the present application is therefore that it provides a technique for local quantization and dequantization of a deep neural network based on a full-precision model, which effectively reduces the influence of abnormal batchnorm parameter values on the weights, improves the forward-propagation precision under 8-bit quantization, and reduces the precision loss.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention.
FIG. 1 is a schematic diagram of the method of the present invention.
FIG. 2 is a schematic diagram for the case where the weights are not merged with batchnorm.
FIG. 3 is a schematic diagram for the case where the weights are merged with batchnorm.
Detailed Description
In order that the technical contents and advantages of the present invention can be more clearly understood, the present invention will now be described in further detail with reference to the accompanying drawings.
The invention relates to a method for reducing precision loss during forward inference of a convolutional neural network in the quantization process, which has the following advantages:
1. The weights are quantized directly, no batchnorm parameter values are merged into the weights, and the influence of the batchnorm parameter values on the weight distribution is eliminated;
2. The scale value is multiplied by the dequantized output, and the merged bias is added at the same time.
As shown in FIG. 1, a method for reducing precision loss during forward inference of a convolutional neural network in the quantization process comprises: during weight quantization and dequantization, extracting the batchnorm values, so that outliers in the batchnorm values do not affect the weight quantization; after convolution, multiplying the 32-bit dequantized output by the extracted batchnorm values, thereby avoiding precision loss.
In the quantization process the weights are quantized directly, and no batchnorm parameter values are merged into the weights.
Further comprising:
assume that the quantization calculation for the i-th layer is as follows:
Xo = scale × D(Xi × Q(W)) + b
where Xi is the input of the i-th layer, Q(W) is the weight quantization operation, D(·) is the dequantization operation applied to the whole product, and scale and b are the corresponding per-channel coefficients obtained after the batchnorm extraction.
The scale value is multiplied by the dequantized output, and the merged bias b is added at the same time.
Extracting the batchnorm values avoids the influence of batchnorm outliers on the weight quantization: the per-channel batchnorm values are processed separately, so the channel-outlier problem that arises when the weights are merged with batchnorm does not occur.
As shown in FIG. 2, before the weights are merged with batchnorm, the overall weight distribution conforms to a normal distribution, and the maximum and minimum values are as expected.
As shown in FIG. 3, after the weights are merged with batchnorm, the overall maximum and minimum values no longer meet expectations; the distribution is distorted, resulting in quantization loss.
This patent provides a method, built on Google's quantization-dequantization scheme, for improving model precision and reducing the precision loss incurred during computation: by processing the per-channel batchnorm values separately, the channel-outlier problem that arises when the weights are merged with batchnorm is avoided. The technical scheme of the invention is realized as follows:
In the original Google quantization-dequantization operation, the scale coefficients obtained after merging batchnorm are folded into the weights, and the full range of the merged weights, from its minimum to its maximum value, is mapped onto 8-bit discrete values. Once the scale has been merged into the weights, the overall normal-distribution characteristic of the weights is disturbed, the quantization loss grows, and the quantization quality of the whole model degrades.
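The effect can be checked numerically. A hypothetical NumPy sketch (not from the patent), assuming one channel carries an abnormally large batchnorm scale, compares the round-trip quantization error of the folded and the extracted schemes:

```python
import numpy as np

def quant_dequant_8bit(w):
    # Round-trip through layer-wise 8-bit min/max quantization.
    w_min, w_max = w.min(), w.max()
    step = (w_max - w_min) / 255.0
    return np.round((w - w_min) / step) * step + w_min

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 128)).astype(np.float32)   # 64 output channels
bn_scale = np.ones(64)
bn_scale[0] = 50.0                                  # one abnormal channel

# Folded scheme: merge the batchnorm scale into the weights, then quantize.
w_folded = bn_scale[:, None] * w
err_folded = np.abs(quant_dequant_8bit(w_folded) - w_folded).mean()

# Extracted scheme: quantize the raw weights, apply the scale after dequantization.
err_extracted = np.abs(bn_scale[:, None] * quant_dequant_8bit(w) - w_folded).mean()

print(f"mean abs error, scale folded into weights: {err_folded:.4f}")
print(f"mean abs error, scale extracted:           {err_extracted:.4f}")  # markedly smaller
```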
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; those skilled in the art may make various modifications and changes to the embodiments of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (7)
1. A method for reducing precision loss during forward inference of a convolutional neural network in the quantization process, characterized by comprising the following steps:
S1, during weight quantization and dequantization, extracting the batchnorm values, so that outliers in the batchnorm values do not affect the weight quantization;
S2, after convolution, multiplying the 32-bit dequantized output by the extracted batchnorm values, thereby avoiding precision loss.
2. The method of claim 1, wherein in the quantization process the weights are quantized directly and no batchnorm parameter values are merged into the weights.
3. The method of claim 1, wherein the method further comprises:
assuming that the quantization calculation for the i-th layer is as follows:
Xo = scale × D(Xi × Q(W)) + b
wherein Xi is the input of the i-th layer, Q(W) is the weight quantization operation, D(·) is the dequantization operation applied to the whole product, and scale and b are the corresponding per-channel coefficients obtained after the batchnorm extraction.
4. The method of claim 3, wherein the scale value is multiplied by the dequantized value, and the merged bias is added at the same time.
5. The method of claim 1, wherein the extracting of the batchnorm values and the avoiding of the influence of batchnorm outliers on the weight quantization are realized by processing the per-channel batchnorm values separately, thereby avoiding the channel-outlier problem that arises when the weights are merged with batchnorm.
6. The method of claim 5, wherein before the weights are merged with batchnorm, the overall weight distribution conforms to a normal distribution, and the maximum and minimum values conform to expectations.
7. The method of claim 5, wherein after the weights are merged with batchnorm, the maximum and minimum values of the overall weights no longer meet expectations, the distribution is destroyed, and quantization loss results.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010020803.9A CN113095472B (en) | 2020-01-09 | 2020-01-09 | Method for reducing precision loss by forward reasoning of convolutional neural network in quantization process |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010020803.9A CN113095472B (en) | 2020-01-09 | 2020-01-09 | Method for reducing precision loss by forward reasoning of convolutional neural network in quantization process |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113095472A true CN113095472A (en) | 2021-07-09 |
CN113095472B CN113095472B (en) | 2024-06-28 |
Family
ID=76664065
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010020803.9A Active CN113095472B (en) | 2020-01-09 | 2020-01-09 | Method for reducing precision loss by forward reasoning of convolutional neural network in quantization process |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113095472B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030115051A1 (en) * | 2001-12-14 | 2003-06-19 | Microsoft Corporation | Quantization matrices for digital audio |
JP2008042662A (en) * | 2006-08-08 | 2008-02-21 | Nec Corp | Quantization system, encoding system, portable device, quantization method, and quantization program |
CN110073371A (en) * | 2017-05-05 | 2019-07-30 | 辉达公司 | For to reduce the loss scaling that precision carries out deep neural network training |
CN110413255A (en) * | 2018-04-28 | 2019-11-05 | 北京深鉴智能科技有限公司 | Artificial neural network method of adjustment and device |
CN109902745A (en) * | 2019-03-01 | 2019-06-18 | 成都康乔电子有限责任公司 | A kind of low precision training based on CNN and 8 integers quantization inference methods |
Also Published As
Publication number | Publication date |
---|---|
CN113095472B (en) | 2024-06-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109918663B (en) | Semantic matching method, device and storage medium | |
TW202004658A (en) | Self-tuning incremental model compression method in deep neural network | |
US20210089904A1 (en) | Learning method of neural network model for language generation and apparatus for performing the learning method | |
CN115238893B (en) | Neural network model quantification method and device for natural language processing | |
Xu et al. | Efficient subsampling for training complex language models | |
CN116235187A (en) | Compression and decompression data for language models | |
Wang et al. | Energynet: Energy-efficient dynamic inference | |
CN110188877A (en) | A kind of neural network compression method and device | |
Zhu et al. | Binary neural network for speaker verification | |
US20240104342A1 (en) | Methods, systems, and media for low-bit neural networks using bit shift operations | |
CN117151178A (en) | FPGA-oriented CNN customized network quantification acceleration method | |
Song et al. | A channel-level pruning strategy for convolutional layers in cnns | |
CN113095472A (en) | Method for reducing precision loss of convolutional neural network through forward reasoning in quantization process | |
Kang et al. | Weight partitioning for dynamic fixed-point neuromorphic computing systems | |
CN115292033A (en) | Model operation method and device, storage medium and electronic equipment | |
Hirose et al. | Quantization error-based regularization for hardware-aware neural network training | |
CN114998661A (en) | Target detection method based on fixed point number quantization | |
Kiyama et al. | Deep learning framework with arbitrary numerical precision | |
CN114372565A (en) | Target detection network compression method for edge device | |
Wan et al. | Study of posit numeric in speech recognition neural inference | |
Pereira et al. | Evaluating robustness to noise and compression of deep neural networks for keyword spotting | |
Guo et al. | A Chinese Speech Recognition System Based on Binary Neural Network and Pre-processing | |
CN113761834A (en) | Method, device and storage medium for acquiring word vector of natural language processing model | |
Li et al. | Designing Efficient Shortcut Architecture for Improving the Accuracy of Fully Quantized Neural Networks Accelerator | |
Zhou et al. | Short-spoken language intent classification with conditional sequence generative adversarial network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |