CN113095472A - Method for reducing precision loss of convolutional neural network through forward reasoning in quantization process - Google Patents
- Publication number
- CN113095472A (application CN202010020803.9A)
- Authority
- CN
- China
- Prior art keywords
- value
- quantization
- batchnorm
- weight
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention provides a method for reducing precision loss during forward inference of a convolutional neural network in the quantization process, comprising the following steps: during weight quantization and dequantization, extracting the batchnorm values, so that outliers in the batchnorm values do not affect the weight quantization; after convolution, multiplying the 32-bit dequantized output by the extracted batchnorm values, thereby avoiding precision loss. In the quantization process the weights are quantized directly, and no batchnorm parameter values are merged into the weights.
Description
Technical Field
The invention relates to the technical field of 8-bit quantization and dequantization of convolutional neural networks, and in particular to a method for reducing precision loss during forward inference of a convolutional neural network in the quantization process.
Background
A convolutional neural network (CNN) is a feed-forward neural network with a deep structure that performs convolution operations; it is widely used for image classification, image recognition, and related image-processing tasks. In recent years, with the rapid development of science and technology, the era of big data has arrived. Deep learning, represented by deep neural networks (DNNs), has achieved remarkable results in many key areas of artificial intelligence. The CNN is a typical DNN structure that can effectively extract hidden-layer features of an image and classify the image accurately, and it has been widely applied to image recognition and detection in recent years.
The initial motivation for quantizing a network was to reduce the model file size: with 8-bit quantization the file shrinks to roughly one quarter, and the weights are still converted back to floating point after the model is loaded. Concretely, when the network weights are written to a file, the minimum and maximum values of each layer are stored, and each floating-point value is then represented by an 8-bit integer (the range between the minimum and the maximum is divided linearly into 256 intervals, and each interval is represented by a unique 8-bit integer). Porting the computation itself to 8-bit can also make the model run faster with lower power consumption, which is especially important on mobile devices. Google's quantization-dequantization scheme first merges the batchnorm values into the weights, and the merged weights are then quantized and dequantized as a whole.
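For illustration, a minimal NumPy sketch (not part of the patent; the function names are hypothetical) of the min/max linear 8-bit quantization and dequantization described above:

```python
import numpy as np

def quantize_minmax(w):
    # Store the layer's minimum and the step of each of the 256 intervals,
    # then map every floating-point value to its nearest 8-bit code.
    w_min, w_max = float(w.min()), float(w.max())
    step = max((w_max - w_min) / 255.0, 1e-12)   # guard against a constant layer
    q = np.round((w - w_min) / step).astype(np.uint8)
    return q, w_min, step

def dequantize_minmax(q, w_min, step):
    # Map the 8-bit codes back to approximate floating-point values.
    return q.astype(np.float32) * step + w_min

w = np.random.randn(64, 3, 3, 3).astype(np.float32)   # toy conv weights
q, w_min, step = quantize_minmax(w)
w_hat = dequantize_minmax(q, w_min, step)
print("max abs round-trip error:", np.abs(w - w_hat).max())   # at most about step/2
```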
Terms and explanations used in the prior art:
Convolutional neural network (CNN): a type of feed-forward neural network that performs convolution operations and has a deep structure.
Detection model: a model that locates the position of a target object in an image according to the target task.
Quantization and dequantization: quantization refers to storing the weights of a full-precision model as discrete 8-bit values; dequantization refers to mapping the 8-bit discrete values back to full precision.
Batchnorm: a method that normalizes each layer of a neural network during training, which effectively accelerates convergence and improves the stability of the model.
Forward propagation: running a forward pass with the trained, frozen neural network to obtain a prediction result.
Inference: in deep learning, inference refers to deploying a pre-trained neural network model in an actual service scenario, such as image classification, object detection, or online translation. Because inference directly faces the user, inference performance is critical, especially for enterprise-grade products.
In the prior art, batchnorm is merged into the weights channel by channel; some channel values are inevitably much larger than others, which degrades the overall quantization precision and leads to serious precision loss for the whole model.
Disclosure of Invention
In order to solve the above problems, in particular the precision loss caused by merging abnormal per-channel batchnorm values into the weights, the present invention aims to reduce the precision loss of the quantized model during forward inference and thereby preserve the overall accuracy of the detection model.
Specifically, the invention provides a method for reducing precision loss during forward inference of a convolutional neural network in the quantization process, comprising the following steps: during weight quantization and dequantization, extracting the batchnorm values, so that outliers in the batchnorm values do not affect the weight quantization; after convolution, multiplying the 32-bit dequantized output by the extracted batchnorm values, thereby avoiding precision loss.
In the quantization process the weights are quantized directly, and no batchnorm parameter values are merged into the weights.
Further comprising:
assume that the quantization calculation for the i-th layer is as follows:
Xo = scale × D(Xi × Q(W)) + b
where Xi is the input of the i-th layer, Q(W) is the weight quantization operation, D(·) is the dequantization operation applied to the whole product, and scale and b are the corresponding per-channel coefficients obtained after the batchnorm extraction.
The scale value is multiplied by the dequantized output, and the merged bias b is added at the same time.
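As an illustration of the computation Xo = scale × D(Xi × Q(W)) + b with the batchnorm coefficients extracted rather than merged into W, a minimal NumPy sketch (hypothetical names, a matrix product standing in for the convolution, and weight-side dequantization used for simplicity):

```python
import numpy as np

def fold_out_batchnorm(gamma, beta, mean, var, eps=1e-5):
    # Per-channel coefficients kept OUTSIDE the weights (assumed form):
    # scale = gamma / sqrt(var + eps), b = beta - scale * mean
    scale = gamma / np.sqrt(var + eps)
    return scale, beta - scale * mean

def quantized_layer_forward(x, w, w_min, w_step, bn_scale, bn_b):
    # Q(W): quantize the raw weights (batchnorm NOT merged in) to 8-bit codes
    q_w = np.clip(np.round((w - w_min) / w_step), 0, 255)
    # D(Xi x Q(W)): approximated here by dequantizing the codes before the product
    acc = x @ (q_w * w_step + w_min)
    # apply the extracted batchnorm scale and bias AFTER dequantization
    return bn_scale * acc + bn_b

# toy example: one 4-dim input, a 4x2 "weight" standing in for a conv kernel
x = np.random.randn(1, 4).astype(np.float32)
w = np.random.randn(4, 2).astype(np.float32)
w_min, w_step = w.min(), (w.max() - w.min()) / 255.0
bn_scale, bn_b = fold_out_batchnorm(np.ones(2), np.zeros(2), np.zeros(2), np.ones(2))
print(quantized_layer_forward(x, w, w_min, w_step, bn_scale, bn_b))
```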
Extracting the batchnorm values avoids the influence of batchnorm outliers on the weight quantization: the per-channel batchnorm values are processed separately, so the channel-outlier problem that arises when the weights are merged with batchnorm does not occur.
Before the weights are merged with batchnorm, the overall weight distribution conforms to a normal distribution, and the maximum and minimum values are as expected.
Once the weights are merged with batchnorm, the overall maximum and minimum values of the weights no longer meet expectations; the distribution is distorted, which causes quantization loss.
The advantage of the present application is therefore that it provides a technique for local quantization and dequantization of a deep neural network based on a full-precision model, which effectively reduces the influence of abnormal batchnorm parameter values on the weights, improves the forward-propagation precision under 8-bit quantization, and reduces the precision loss.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention.
FIG. 1 is a schematic diagram of the method of the present invention.
FIG. 2 is a schematic diagram for the case where the weights are not merged with batchnorm.
FIG. 3 is a schematic diagram for the case where the weights are merged with batchnorm.
Detailed Description
In order that the technical contents and advantages of the present invention can be more clearly understood, the present invention will now be described in further detail with reference to the accompanying drawings.
The invention relates to a method for reducing precision loss during forward inference of a convolutional neural network in the quantization process, which has the following advantages:
1. The weights are quantized directly, no batchnorm parameter values are merged into the weights, and the influence of the batchnorm parameter values on the weight distribution is eliminated;
2. The scale value is multiplied by the dequantized output, and the merged bias is added at the same time.
As shown in FIG. 1, a method for reducing precision loss during forward inference of a convolutional neural network in the quantization process comprises: during weight quantization and dequantization, extracting the batchnorm values, so that outliers in the batchnorm values do not affect the weight quantization; after convolution, multiplying the 32-bit dequantized output by the extracted batchnorm values, thereby avoiding precision loss.
In the quantization process the weights are quantized directly, and no batchnorm parameter values are merged into the weights.
Further comprising:
assume that the quantization calculation for the i-th layer is as follows:
Xo = scale × D(Xi × Q(W)) + b
where Xi is the input of the i-th layer, Q(W) is the weight quantization operation, D(·) is the dequantization operation applied to the whole product, and scale and b are the corresponding per-channel coefficients obtained after the batchnorm extraction.
The scale value is multiplied by the dequantized output, and the merged bias b is added at the same time.
Extracting the batchnorm values avoids the influence of batchnorm outliers on the weight quantization: the per-channel batchnorm values are processed separately, so the channel-outlier problem that arises when the weights are merged with batchnorm does not occur.
As shown in FIG. 2, before the weights are merged with batchnorm, the overall weight distribution conforms to a normal distribution, and the maximum and minimum values are as expected.
As shown in FIG. 3, after the weights are merged with batchnorm, the overall maximum and minimum values no longer meet expectations; the distribution is distorted, resulting in quantization loss.
This patent provides a method, built on Google's quantization-dequantization scheme, for improving model precision and reducing the precision loss incurred during computation: by processing the per-channel batchnorm values separately, the channel-outlier problem that arises when the weights are merged with batchnorm is avoided. The technical scheme of the invention is realized as follows:
In the original Google quantization-dequantization operation, the scale coefficients obtained after merging batchnorm are folded into the weights, and the full range of the merged weights, from its minimum to its maximum value, is mapped onto 8-bit discrete values. Once the scale has been merged into the weights, the overall normal-distribution characteristic of the weights is disturbed, the quantization loss grows, and the quantization quality of the whole model degrades.
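The effect can be checked numerically. A hypothetical NumPy sketch (not from the patent), assuming one channel carries an abnormally large batchnorm scale, compares the round-trip quantization error of the folded and the extracted schemes:

```python
import numpy as np

def quant_dequant_8bit(w):
    # Round-trip through layer-wise 8-bit min/max quantization.
    w_min, w_max = w.min(), w.max()
    step = (w_max - w_min) / 255.0
    return np.round((w - w_min) / step) * step + w_min

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 128)).astype(np.float32)   # 64 output channels
bn_scale = np.ones(64)
bn_scale[0] = 50.0                                  # one abnormal channel

# Folded scheme: merge the batchnorm scale into the weights, then quantize.
w_folded = bn_scale[:, None] * w
err_folded = np.abs(quant_dequant_8bit(w_folded) - w_folded).mean()

# Extracted scheme: quantize the raw weights, apply the scale after dequantization.
err_extracted = np.abs(bn_scale[:, None] * quant_dequant_8bit(w) - w_folded).mean()

print(f"mean abs error, scale folded into weights: {err_folded:.4f}")
print(f"mean abs error, scale extracted:           {err_extracted:.4f}")  # markedly smaller
```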
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; those skilled in the art may make various modifications and changes to the embodiments of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (7)
1. A method for reducing precision loss during forward inference of a convolutional neural network in the quantization process, characterized by comprising the following steps:
S1, during weight quantization and dequantization, extracting the batchnorm values, so that outliers in the batchnorm values do not affect the weight quantization;
S2, after convolution, multiplying the 32-bit dequantized output by the extracted batchnorm values, thereby avoiding precision loss.
2. The method of claim 1, wherein in the quantization process the weights are quantized directly and no batchnorm parameter values are merged into the weights.
3. The method of claim 1, wherein the method further comprises:
assuming that the quantization calculation for the i-th layer is as follows:
Xo = scale × D(Xi × Q(W)) + b
wherein Xi is the input of the i-th layer, Q(W) is the weight quantization operation, D(·) is the dequantization operation applied to the whole product, and scale and b are the corresponding per-channel coefficients obtained after the batchnorm extraction.
4. The method of claim 3, wherein the scale value is multiplied by the dequantized value, and the merged bias is added at the same time.
5. The method of claim 1, wherein the extracting of the batchnorm values and the avoiding of the influence of batchnorm outliers on the weight quantization are realized by processing the per-channel batchnorm values separately, thereby avoiding the channel-outlier problem that arises when the weights are merged with batchnorm.
6. The method of claim 5, wherein before the weights are merged with batchnorm, the overall weight distribution conforms to a normal distribution, and the maximum and minimum values conform to expectations.
7. The method of claim 5, wherein after the weights are merged with batchnorm, the maximum and minimum values of the overall weights no longer meet expectations, the distribution is destroyed, and quantization loss results.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010020803.9A CN113095472B (en) | 2020-01-09 | 2020-01-09 | Method for reducing precision loss by forward reasoning of convolutional neural network in quantization process |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010020803.9A CN113095472B (en) | 2020-01-09 | 2020-01-09 | Method for reducing precision loss by forward reasoning of convolutional neural network in quantization process |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113095472A true CN113095472A (en) | 2021-07-09 |
CN113095472B CN113095472B (en) | 2024-06-28 |
Family
ID=76664065
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010020803.9A Active CN113095472B (en) | 2020-01-09 | 2020-01-09 | Method for reducing precision loss by forward reasoning of convolutional neural network in quantization process |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113095472B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030115051A1 (en) * | 2001-12-14 | 2003-06-19 | Microsoft Corporation | Quantization matrices for digital audio |
JP2008042662A (en) * | 2006-08-08 | 2008-02-21 | Nec Corp | Quantization system, encoding system, portable device, quantization method, and quantization program |
CN110073371A (en) * | 2017-05-05 | 2019-07-30 | 辉达公司 | For to reduce the loss scaling that precision carries out deep neural network training |
CN110413255A (en) * | 2018-04-28 | 2019-11-05 | 北京深鉴智能科技有限公司 | Artificial neural network method of adjustment and device |
CN109902745A (en) * | 2019-03-01 | 2019-06-18 | 成都康乔电子有限责任公司 | A kind of low precision training based on CNN and 8 integers quantization inference methods |
Also Published As
Publication number | Publication date |
---|---|
CN113095472B (en) | 2024-06-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109918663B (en) | Semantic matching method, device and storage medium | |
TW202004658A (en) | Self-tuning incremental model compression method in deep neural network | |
US20210089904A1 (en) | Learning method of neural network model for language generation and apparatus for performing the learning method | |
CN115238893B (en) | Neural network model quantification method and device for natural language processing | |
Xu et al. | Efficient subsampling for training complex language models | |
CN116235187A (en) | Compression and decompression data for language models | |
Wang et al. | Energynet: Energy-efficient dynamic inference | |
CN110188877A (en) | A kind of neural network compression method and device | |
Zhu et al. | Binary neural network for speaker verification | |
US20240104342A1 (en) | Methods, systems, and media for low-bit neural networks using bit shift operations | |
CN117151178A (en) | FPGA-oriented CNN customized network quantification acceleration method | |
Song et al. | A channel-level pruning strategy for convolutional layers in cnns | |
CN113095472A (en) | Method for reducing precision loss of convolutional neural network through forward reasoning in quantization process | |
Kang et al. | Weight partitioning for dynamic fixed-point neuromorphic computing systems | |
CN115292033A (en) | Model operation method and device, storage medium and electronic equipment | |
Hirose et al. | Quantization error-based regularization for hardware-aware neural network training | |
CN114998661A (en) | Target detection method based on fixed point number quantization | |
Kiyama et al. | Deep learning framework with arbitrary numerical precision | |
CN114372565A (en) | Target detection network compression method for edge device | |
Wan et al. | Study of posit numeric in speech recognition neural inference | |
Pereira et al. | Evaluating robustness to noise and compression of deep neural networks for keyword spotting | |
Guo et al. | A Chinese Speech Recognition System Based on Binary Neural Network and Pre-processing | |
CN113761834A (en) | Method, device and storage medium for acquiring word vector of natural language processing model | |
Li et al. | Designing Efficient Shortcut Architecture for Improving the Accuracy of Fully Quantized Neural Networks Accelerator | |
Zhou et al. | Short-spoken language intent classification with conditional sequence generative adversarial network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |