CN113177634A - Image analysis system, method and equipment based on neural network input and output quantification - Google Patents
- Publication number: CN113177634A (application number CN202110469141.8A)
- Authority
- CN
- China
- Legal status: Granted (the legal status is an assumption by Google Patents, not a legal conclusion)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
Abstract
The invention belongs to the field of convolutional neural networks, and in particular relates to an image analysis system, method and device based on neural network input and output quantization. It aims to solve the problem in the prior art that, when a network is made lightweight, the input and output layers are left unquantized, so that the quantization of the neural network as a whole is inconsistent. The invention comprises: performing thermometer coding on an input picture to generate a low-bit channel-expanded image with n times the number of channels, and correspondingly expanding the convolution-kernel channels of the input layer to n times; performing the convolution operation through the quantized input layer and passing the result to the hidden layers to obtain a characteristic image; and performing branch quantization on the output layer and taking a weighted sum of the branch outputs to obtain the final output of the neural network. By thermometer-coding the input data and expanding the input layer accordingly, and by replacing the single convolution structure of the output layer with a multi-branch structure, the invention quantizes the input and output layers while keeping the quantization of the whole network consistent, with almost no loss of accuracy.
Description
Technical Field
The invention belongs to the field of convolutional neural networks, and particularly relates to an image analysis system, method and device based on input and output quantization of a neural network.
Background
In recent years, the demand for high accuracy has driven neural network models to grow ever wider and deeper, and deploying such large, deep networks on mobile terminals has become a major problem. Many lightweight network design methods have been derived accordingly; quantization, which covers both weight quantization and activation-value quantization, is among the most common. However, in the field of neural network quantization, leaving the input and output layers unquantized is the generally accepted practice in academia. Yet these two layers are crucial to the overall accuracy of the network: the input layer determines the accuracy of the basic features the network extracts, and the output layer is the layer closest to the ground truth during training, so an inaccurate output layer greatly affects the calculation of the network's overall loss. If the input and output layers are left unquantized while the intermediate layers are quantized, however, the consistency of the entire network is lost.
Disclosure of Invention
In order to solve the above problem in the prior art, namely that the quantization of the entire neural network is inconsistent because the input and output layers are not quantized when the network is made lightweight, the present invention provides an image analysis system based on neural network input and output quantization, comprising: an image acquisition module, an input data quantization module, an input layer quantization module, a hidden layer processing module, an output layer quantization module and an output module;
the image acquisition module is configured to acquire an input image and input the input image into an input layer of the convolutional neural network;
the input data quantization module is configured to perform thermometer coding on the input picture to generate a low-bit channel expansion image with n times of channel number;
the input layer quantization module is configured to expand a convolution kernel channel of the input layer of the convolutional neural network by n times to obtain a quantized input layer;
the hidden layer processing module is configured to, based on the channel expansion image, perform the convolution operation through the quantized input layer and transmit the result to the hidden layer to obtain a characteristic image;
the output layer quantization module is configured to set the weight of the last layer of the corresponding convolutional neural network output layer as k branches, quantize the activation value of each branch by a low bit number, generate a quantized activation value, and further obtain an output layer with a multi-branch structure;
and the output module is configured to obtain a final calculation result of the neural network after weighted accumulation is carried out on the output of each branch of the output layer of the multi-branch structure based on the characteristic image.
In some preferred embodiments, the input data quantization module specifically generates a 4-bit, 17-channel expanded image from an input picture of 8-bit RGB three-channel data by thermometer coding, where n takes the value 17.
In some preferred embodiments, the input layer quantization module specifically includes expanding the number of convolution kernel channels of the convolutional neural network from 3 to 51, where n takes a value of 17.
In some preferred embodiments, the output layer quantization module quantizes the activation value of the output layer to a 4-bit activation value, that is, a quantized activation value, and increases the weight of the last layer of the output layer from one branch to k branches.
In some preferred embodiments, the k branches include a positive integer branch, a negative integer branch, a positive fractional branch, and a negative fractional branch.
In some preferred embodiments, the system further comprises a convolution training module, which repeatedly executes the functions of the picture acquisition module, the input data quantization module, the input layer quantization module, the hidden layer processing module, the output layer quantization module and the output module to generate a quantized characteristic image; wherein the integer part of positive values and the positive integer branch are used to calculate a positive-integer L1 loss, the fractional part of positive values and the positive fractional branch a positive-fractional L1 loss, the integer part of negative values and the negative integer branch a negative-integer L1 loss, and the fractional part of negative values and the negative fractional branch a negative-fractional L1 loss.
The output layer quantization module can set any number of output layer branches, and the weighting value of each branch can be set to ×2, ×1, ÷2, ÷4, ÷8 or ÷16, respectively.
In another aspect of the present invention, an image analysis method based on input and output quantization of a neural network is provided, where the method includes:
step S100, acquiring an input picture, and inputting the input picture into an input layer of a convolutional neural network;
step S200, carrying out thermometer coding on the input picture to generate a channel expansion image with n times of channel number and low bit number;
step S300, expanding a convolution kernel channel of the input layer of the convolution neural network to n times to obtain a quantized input layer;
step S400, based on the channel expansion image, performing the convolution operation through the quantized input layer and transmitting the result to the hidden layer to obtain a characteristic image;
step S500, setting the weight of the last layer of the corresponding convolutional neural network output layer as k branches, and quantizing the activation value of each branch with a low bit number to generate a quantized activation value, so as to obtain an output layer with a multi-branch structure;
and step S600, based on the characteristic image, obtaining a final calculation result of the neural network after weighted accumulation of the output of each branch of the output layer of the multi-branch structure.
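Steps S100 to S600 can be sketched end to end. The following toy version is only an illustrative sketch, not the patent's implementation: it uses 1×1 "convolutions" expressed as matrix products, random stand-in weights, and omits the actual low-bit rounding of the branch activations.

```python
import numpy as np

rng = np.random.default_rng(0)

# S100: acquire an input picture (H x W x 3, 8-bit)
img = rng.integers(0, 256, size=(4, 4, 3), dtype=np.int64)

# S200: thermometer coding -- each 8-bit value becomes 17 4-bit slots
slots = np.clip(img[..., None] - 15 * np.arange(17), 0, 15)
x = slots.reshape(4, 4, 51).astype(float)          # n = 17, 3 -> 51 channels

# S300: input-layer kernels (1x1 here) with channels repeated 17x
w = rng.standard_normal((8, 3))                    # 8 output channels over RGB
wq = np.repeat(w, 17, axis=1)                      # quantized input layer, 8 x 51

# S400: convolution through the quantized input layer -> characteristic image
feat = x @ wq.T                                    # H x W x 8

# S500/S600: k = 4 branch heads, weighted accumulation of their outputs
heads = [rng.standard_normal((2, 8)) for _ in range(4)]
branch_weights = [2, 1, 0.5, 0.25]
out = sum(bw * (feat @ h.T) for bw, h in zip(branch_weights, heads))
print(out.shape)                                   # (4, 4, 2)
```

Because the 17 thermometer slots of each channel sum back to the original 8-bit value, the expanded convolution reproduces the unexpanded one exactly, which matches the text's observation that ordinary multiply-accumulate convolution needs no extra accumulation step.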
In a third aspect of the present invention, an electronic device is provided, including: at least one processor; and a memory communicatively coupled to at least one of the processors; wherein the memory stores instructions executable by the processor for execution by the processor to implement the image analysis method based on neural network input-output quantization described above.
In a fourth aspect of the present invention, a computer-readable storage medium is provided, which stores computer instructions for being executed by the computer to implement the above-mentioned image analysis method based on neural network input and output quantization.
The invention has the beneficial effects that:
(1) The image analysis system based on neural network input and output quantization of the present invention quantizes the input and output layers while preserving the quantization consistency of the whole network, by thermometer-coding the network input and by the output-branch design method; almost no network accuracy is lost, and the accuracy of the lightweight neural network is improved. The invention enables every layer of the network to be deployed on lightweight terminal equipment without adding extra processing modules for the input and output layers, with almost no loss of network accuracy.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is a block diagram of an image analysis system based on neural network input/output quantization according to an embodiment of the present invention;
FIG. 2 is a flow chart of an image analysis method based on neural network input/output quantization according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the principle of thermometer coding in an image analysis system based on input and output quantization of a neural network according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the principle of convolution kernel single channel expansion in an image analysis system based on neural network input-output quantization according to the present invention;
FIG. 5 is a schematic diagram illustrating the principle of convolution kernel three-channel expansion in the image analysis system based on neural network input/output quantization.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
The invention provides an image analysis system based on input and output quantization of a neural network, which provides a neural network input thermometer coding and output branch design method.
The image analysis system based on neural network input and output quantization of the present invention comprises: a picture acquisition module, an input data quantization module, an input layer quantization module, a hidden layer processing module, an output layer quantization module and an output module;
the image acquisition module is configured to acquire an input image and input the input image into an input layer of the convolutional neural network;
the input data quantization module is configured to perform thermometer coding on the input picture to generate a low-bit channel expansion image with n times of channel number;
the input layer quantization module is configured to expand a convolution kernel channel of the input layer of the convolutional neural network by n times to obtain a quantized input layer;
the hidden layer processing module is configured to, based on the channel expansion image, perform the convolution operation through the quantized input layer and transmit the result to the hidden layer to obtain a characteristic image;
the output layer quantization module is configured to set the weight of the last layer of the corresponding convolutional neural network output layer as k branches, quantize the activation value of each branch by a low bit number, generate a quantized activation value, and further obtain an output layer with a multi-branch structure;
and the output module is configured to obtain a final calculation result of the neural network after weighted accumulation is carried out on the output of each branch of the output layer of the multi-branch structure based on the characteristic image.
In order to more clearly describe the image analysis system based on input and output quantization of the neural network, the following describes each functional module in the embodiment of the present invention in detail with reference to fig. 1.
The image analysis system based on neural network input and output quantization of the first embodiment of the invention comprises a picture acquisition module, an input data quantization module, an input layer quantization module, a hidden layer processing module, an output layer quantization module and an output module, wherein each functional module is described in detail as follows:
the image acquisition module is configured to acquire an input image and input the input image into an input layer of the convolutional neural network;
the input data quantization module is configured to perform thermometer coding on the input picture to generate a low-bit channel expansion image with n times of channel number;
in this embodiment, the input data quantization module specifically includes generating a 4-bit 17-channel extended image by using an input picture of 8-bit RGB three-channel data in a thermometer coding manner, where n takes a value of 17. The original input picture is 8 bits of RGB three-channel data, each channel has a value of 0-255, and 8 bits are changed into a plurality of 4 bits by thermometer coding. Taking the R channel as an example, if the component value of R is 56, a 4-bit maximum can represent 15, so that 56 needs 3 bits of 0x1111 and one bit of 0x1011 to represent, and the rest are all 0. I.e., the decimal 56 for the R channel is expressed in 4-bit thermometer coding, it should be 15, 15, 15, 11, 0, 0, 0, 0, 0, 0, 0, 0. As shown in fig. 3, thermometer coding is used to represent decimal numbers, taking 49, 156, 255 as an example.
The input layer quantization module is configured to expand a convolution kernel channel of the input layer of the convolutional neural network by n times to obtain a quantized input layer;
in this embodiment, the input layer quantization module specifically expands the number of convolution kernel channels of the convolutional neural network from 3 to 51, where n takes a value of 17. Since a 4-bit representation has a maximum value of 15 and an 8-bit representation has a maximum value of 255, an 8-bit representation requires at least 17 4-bit representations, and thus 3 channels of RGB are extended to 17 times, i.e., 51 channels, after thermometer coding. Although the original single channel is expanded to 17 channels, the channels of the convolution kernels corresponding to the 17 channels are still the same, and therefore, the channels of the corresponding convolution kernels only need to be duplicated and expanded to 17. Because the convolution operation is originally a multiply-add mode, 17 channels do not need to be additionally accumulated into one channel, and the convolution operation can be realized only through normal convolution operation. The convolution kernel channel number is expanded by copying the pixel value of each channel of 3 channels by 16, and the size of each channel is changed from 3 x1 to 3 x 17 by copying. As shown in fig. 4 and 5, the convolution kernel single channel and the single convolution kernel and the copy extension corresponding to the RGB three channels.
The hidden layer processing module is configured to, based on the channel expansion image, perform the convolution operation through the quantized input layer and transmit the result to the hidden layer to obtain a characteristic image;
the output layer quantization module is configured to set the weight of the last layer of the corresponding convolutional neural network output layer as k branches, quantize the activation value of each branch by a low bit number, generate a quantized activation value, and further obtain an output layer with a multi-branch structure;
in this embodiment, the output layer quantization module quantizes the activation value of the output layer to a 4-bit activation value, that is, a quantized activation value, and increases the weight of the last layer of the output layer from one branch to k branches. If the output layer is simply quantized with 4 bits, no matter whether the integer bits and the decimal bits are respectively several bits, the expression space of the output layer has 16 kinds, at this time, the basic classification task cannot be completed, and the complex target detection task cannot be completed, because the 4-bit data only has 16 expression modes of 0000-. The output layer cannot be simply low bit quantized.
In order to increase the representable space of the output layer, a branch structure is added to the output layer, and finally the branches are accumulated before entering the neural network's data post-processing stage. Compared with quantizing a single branch and using it as the network output, the multi-branch structure not only increases the representable space of the output but also increases the number of weights in the last layer. Because the amount of data in the last layer of a neural network is usually small, the extra parameters introduced by the added branches are almost negligible compared with the huge parameter count of the whole network. Quantizing the activation value of the output layer and compensating the resulting accuracy loss by adding branches is therefore worthwhile.
In this embodiment, the k branches comprise a positive integer branch, a negative integer branch, a positive fractional branch and a negative fractional branch. To bring the output of the multi-branch low-bit quantization closer to the output of a single-branch unquantized neural network, the four branches are preferably treated as the positive integer, negative integer, positive fractional and negative fractional parts respectively, and the final result of the network is the sum of the four branches.
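The four-way split above can be illustrated with a small decomposition function. This is only a sketch of the idea (the function is illustrative, not from the patent): any full-precision value splits into positive-integer, positive-fractional, negative-integer and negative-fractional parts whose sum reconstructs it exactly.

```python
def decompose(x):
    """Split x into the four branch targets: positive integer,
    positive fraction, negative integer, negative fraction.
    At most one sign's pair is non-zero, and the parts sum back to x."""
    pos = max(x, 0.0)
    neg = min(x, 0.0)
    pos_int = float(int(pos))            # truncation toward zero
    neg_int = float(int(neg))
    return pos_int, pos - pos_int, neg_int, neg - neg_int

print(decompose(3.75))    # (3.0, 0.75, 0.0, 0.0)
print(decompose(-3.25))   # (0.0, 0.0, -3.0, -0.25)
```

Each of the four parts then fits a low-bit quantizer of its own, which is what enlarges the representable output space.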
And the output module is configured to obtain a final calculation result of the neural network after weighted accumulation is carried out on the output of each branch of the output layer of the multi-branch structure based on the characteristic image.
If only the quantization of the input layer or only the quantization of the output layer were performed, the neural network could easily become impossible to deploy on specific equipment; each functional module of the present application should therefore be regarded as part of one integral technical scheme, and should not be split apart excessively and compared in isolation with other documents or experiments.
In this embodiment, the system further comprises a convolution training module. Specifically, after the full-precision output layer is modified into the multi-branch structure, a full-precision multi-branch guidance method may be adopted to train the network: the functions of the picture acquisition module, the input data quantization module, the input layer quantization module, the hidden layer processing module, the output layer quantization module and the output module are executed repeatedly to generate a quantized characteristic image, wherein the integer part of positive values and the positive integer branch are used to calculate a positive-integer L1 loss, the fractional part of positive values and the positive fractional branch a positive-fractional L1 loss, the integer part of negative values and the negative integer branch a negative-integer L1 loss, and the fractional part of negative values and the negative fractional branch a negative-fractional L1 loss. In this embodiment, the network training method is: repeat the functions of the above modules on the training data to generate a calculation result and run a stochastic gradient descent algorithm; iteration stops, yielding the trained image analysis network, when the per-iteration accuracy improvement of the image analysis network falls below a preset threshold and the per-iteration decrease of the loss function falls below a preset loss threshold, or when a preset number of iterations is reached.
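The four guidance losses described above can be sketched as follows. This assumes NumPy arrays and a mean-reduced L1 penalty (the reduction and all names are assumptions; the patent does not specify them):

```python
import numpy as np

def guidance_l1(full, branches):
    """full: full-precision output array.
    branches: dict with keys 'pos_int', 'pos_frac', 'neg_int', 'neg_frac'
    holding the corresponding quantized branch outputs.
    Returns the four L1 penalties, each against its part of `full`."""
    pos = np.clip(full, 0, None)
    neg = np.clip(full, None, 0)
    targets = {
        "pos_int": np.floor(pos),
        "pos_frac": pos - np.floor(pos),
        "neg_int": -np.floor(-neg),        # integer part, toward zero
        "neg_frac": neg + np.floor(-neg),
    }
    return {k: float(np.abs(branches[k] - t).mean()) for k, t in targets.items()}

full = np.array([2.5, -1.25])
perfect = {"pos_int": np.array([2.0, 0.0]),
           "pos_frac": np.array([0.5, 0.0]),
           "neg_int": np.array([0.0, -1.0]),
           "neg_frac": np.array([0.0, -0.25])}
print(guidance_l1(full, perfect))   # all four losses are 0.0
```

When every branch matches its part of the full-precision output, all four penalties vanish, which is the sense in which the full-precision layer "guides" the branches.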
In this embodiment, the output layer quantization module may set any number of output layer branches; for example, the weighting value of each branch may be set to ×2, ×1, ÷2, ÷4, ÷8 or ÷16, or branches of other values or forms may be set as needed.
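Since the example branch weights are all powers of two, the weighted accumulation can be realized with shifts rather than multiplications on lightweight hardware. A minimal sketch (the function and the sample branch outputs are illustrative assumptions):

```python
def weighted_accumulate(branch_sums, shifts=(1, 0, -1, -2)):
    """Accumulate branch outputs with power-of-two weights:
    a shift of +1 means x2, 0 means x1, -1 means /2, -2 means /4."""
    total = 0.0
    for value, s in zip(branch_sums, shifts):
        total += value * (2.0 ** s)
    return total

print(weighted_accumulate([3, 5, 8, 4]))   # 3*2 + 5*1 + 8/2 + 4/4 = 16.0
```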
It should be noted that, the image analysis system based on input and output quantization of a neural network provided in the above embodiment is only illustrated by the division of the above functional modules, and in practical applications, the above functions may be allocated to different functional modules according to needs, that is, the modules or steps in the embodiment of the present invention are further decomposed or combined, for example, the modules in the above embodiment may be combined into one module, or may be further split into multiple sub-modules, so as to complete all or part of the above described functions. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the modules or steps, and are not to be construed as unduly limiting the present invention.
The image analysis method based on neural network input and output quantization of the second embodiment of the present invention includes steps S100 to S600, and each step is detailed as follows:
step S100, acquiring an input picture, and inputting the input picture into an input layer of a convolutional neural network;
step S200, carrying out thermometer coding on the input picture to generate a channel expansion image with n times of channel number and low bit number;
step S300, expanding a convolution kernel channel of the input layer of the convolution neural network to n times to obtain a quantized input layer;
step S400, based on the channel expansion image, performing the convolution operation through the quantized input layer and transmitting the result to the hidden layer to obtain a characteristic image;
step S500, setting the weight of the last layer of the corresponding convolutional neural network output layer as k branches, and quantizing the activation value of each branch with a low bit number to generate a quantized activation value, thereby obtaining an output layer with a multi-branch structure;
and step S600, based on the characteristic image, obtaining a final calculation result of the neural network after weighted accumulation of the output of each branch of the output layer of the multi-branch structure.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiments, and will not be described herein again.
An electronic apparatus according to a third embodiment of the present invention includes:
at least one processor; and a memory communicatively coupled to at least one of the processors; wherein the memory stores instructions executable by the processor for execution by the processor to implement the image analysis method based on neural network input-output quantization described above.
A computer-readable storage medium of a fourth embodiment of the present invention stores computer instructions for being executed by the computer to implement the image analysis method based on input and output quantization of a neural network described above.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes and related descriptions of the storage device and the processing device described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.
The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.
Claims (10)
1. An image analysis system based on neural network input and output quantization, the system comprising: a picture acquisition module, an input data quantization module, an input layer quantization module, a hidden layer processing module, an output layer quantization module and an output module;
the image acquisition module is configured to acquire an input image and input the input image into an input layer of the convolutional neural network;
the input data quantization module is configured to perform thermometer coding on the input picture to generate a low-bit channel expansion image with n times of channel number;
the input layer quantization module is configured to expand a convolution kernel channel of the input layer of the convolutional neural network by n times to obtain a quantized input layer;
the hidden layer processing module is configured to, based on the channel expansion image, perform the convolution operation through the quantized input layer and transmit the result to the hidden layer to obtain a characteristic image;
the output layer quantization module is configured to set the weight of the last layer of the corresponding convolutional neural network output layer as k branches, quantize the activation value of each branch by a low bit number, generate a quantized activation value, and further obtain an output layer with a multi-branch structure;
And the output module is configured to obtain a final calculation result of the neural network after weighted accumulation is carried out on the output of each branch of the output layer of the multi-branch structure based on the characteristic image.
2. The image analysis system based on neural network input-output quantization of claim 1, wherein the input data quantization module specifically performs thermometer coding to generate a 4-bit, 17-fold channel-expanded image from an input picture of 8-bit RGB three-channel data, n taking the value 17.
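As an illustration of the thermometer coding of claim 2, the following Python sketch expands each 8-bit channel into 17 channels of 4-bit values (0–15) whose sum recovers the original pixel value; the function name and exact clipping scheme are assumptions, since the claim fixes only the bit width and channel count:

```python
import numpy as np

def thermometer_encode(img_u8: np.ndarray, levels: int = 15, n: int = 17) -> np.ndarray:
    """Expand each 8-bit channel into n low-bit thermometer channels.

    Channel i of a pixel value v holds clip(v - levels*i, 0, levels),
    so the n channels sum back to v (n=17, levels=15 covers 0..255).
    Hypothetical reconstruction; the claim fixes only 4 bits / 17 channels.
    """
    h, w, c = img_u8.shape
    v = img_u8.astype(np.int32)[..., None]               # (h, w, c, 1)
    out = np.clip(v - levels * np.arange(n), 0, levels)  # (h, w, c, n)
    return out.astype(np.uint8).reshape(h, w, c * n)     # e.g. 3 -> 51 channels
```

An 8-bit RGB picture thus becomes a 51-channel image whose entries all fit in 4 bits, matching the 3-to-51 kernel expansion of claim 3.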
3. The image analysis system based on neural network input-output quantization of claim 2, wherein the input layer quantization module specifically expands the number of convolution kernel channels of the convolutional neural network input layer from 3 to 51, where n takes the value 17.
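Because the 17 thermometer channels of a pixel sum to its original value, the input layer convolution can be preserved by tiling each original kernel channel 17 times; a minimal sketch of the expansion in claim 3 (the function name and tiling order are assumptions):

```python
import numpy as np

def expand_kernel(weight: np.ndarray, n: int = 17) -> np.ndarray:
    """Tile each input channel of a conv kernel n times so that convolving
    the thermometer-expanded image reproduces the original convolution
    (the n thermometer channels of a pixel sum to its 8-bit value).

    weight: (out_ch, in_ch, kh, kw) -> (out_ch, in_ch * n, kh, kw),
    e.g. 3 input channels -> 51, as in claim 3.
    """
    return np.repeat(weight, n, axis=1)
```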
4. The image analysis system based on neural network input-output quantization of claim 1, wherein the output layer quantization module is specifically configured to quantize the activation values of the output layer to 4-bit activation values (i.e., quantized activation values) and to increase the weights of the last layer of the output layer from a single branch to k branches.
5. The neural network input-output quantization based image analysis system of claim 4, wherein the k branches comprise a positive integer branch, a negative integer branch, a positive fractional branch, and a negative fractional branch.
6. The image analysis system based on neural network input-output quantization of claim 5, further comprising a convolution training module, which repeatedly invokes the functions of the picture acquisition module, the input data quantization module, the input layer quantization module, the hidden layer processing module, the output layer quantization module, and the output module to generate quantized feature images; wherein the integer part of positive values and the positive integer branch are used to compute a positive-integer L1 loss, the fractional part of positive values and the positive fractional branch a positive-fractional L1 loss, the integer part of negative values and the negative integer branch a negative-integer L1 loss, and the fractional part of negative values and the negative fractional branch a negative-fractional L1 loss.
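A hypothetical sketch of the four-way decomposition against which the per-branch L1 losses of claim 6 are computed; the claim names the branches but not the exact split, so the floor/ceil decomposition is an assumption:

```python
import numpy as np

def branch_targets(x: np.ndarray):
    """Split a real-valued activation tensor into the four branch targets of
    claim 5: positive-integer, positive-fractional, negative-integer and
    negative-fractional parts. The four parts always sum back to x.
    (Assumed decomposition; the patent does not give explicit formulas.)"""
    pos = np.maximum(x, 0.0)
    neg = np.minimum(x, 0.0)
    pos_int, pos_frac = np.floor(pos), pos - np.floor(pos)
    neg_int, neg_frac = np.ceil(neg), neg - np.ceil(neg)
    return pos_int, pos_frac, neg_int, neg_frac

def l1_loss(branch_out: np.ndarray, target: np.ndarray) -> float:
    """Per-branch L1 loss, as used during convolution training."""
    return float(np.abs(branch_out - target).mean())
```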
7. The image analysis system according to claim 6, wherein the output layer quantization module is capable of setting the number of output layer branches to any number, the branch weighting values being respectively set to ×1/2, ×1, ×2, ×4, ×8, or ×1/16.
8. An image analysis method based on neural network input-output quantization, the method comprising:
step S100, acquiring an input picture, and inputting the input picture into an input layer of a convolutional neural network;
step S200, performing thermometer coding on the input picture to generate a low-bit channel-expanded image with n times the number of channels;
step S300, expanding the convolution kernel channels of the convolutional neural network input layer to n times their number to obtain a quantized input layer;
step S400, convolving, in the quantized input layer, the thermometer-coded low-bit data with the correspondingly expanded convolution kernels, and transmitting the result to the hidden layers;
step S500, setting the weights of the last layer of the output layer of the convolutional neural network as k branches to obtain an output layer with a multi-branch structure, and quantizing the activation values to low bit numbers to generate quantized activation values;
step S600, based on the input layer and output layer quantization, obtaining a channel-expanded quantized feature image, namely the final calculation result of the neural network, by weighted accumulation of the branch outputs of the multi-branch output layer.
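Steps S500–S600 can be sketched as follows; the uniform rounding scheme and the example branch weights are assumptions, as the method fixes only the 4-bit width and the weighted accumulation:

```python
import numpy as np

def quantize_activation(a: np.ndarray, bits: int = 4):
    """Step S500 (sketch): uniformly quantize non-negative activations to
    `bits` bits. Returns the integer codes and the scale used."""
    qmax = 2 ** bits - 1
    scale = float(a.max()) / qmax if a.max() > 0 else 1.0
    return np.round(np.clip(a / scale, 0, qmax)), scale

def accumulate_branches(branch_outputs, branch_weights):
    """Step S600 (sketch): weighted accumulation over the k branch outputs,
    yielding the final calculation result of the network."""
    return sum(w * o for w, o in zip(branch_weights, branch_outputs))
```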
9. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the processor, the instructions being executed by the processor to implement the image analysis method based on neural network input-output quantization of claim 8.
10. A computer-readable storage medium storing computer instructions which, when executed by a computer, implement the image analysis method based on neural network input-output quantization of claim 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110469141.8A CN113177634B (en) | 2021-04-28 | 2021-04-28 | Image analysis system, method and equipment based on neural network input and output quantification |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110469141.8A CN113177634B (en) | 2021-04-28 | 2021-04-28 | Image analysis system, method and equipment based on neural network input and output quantification |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113177634A true CN113177634A (en) | 2021-07-27 |
CN113177634B CN113177634B (en) | 2022-10-14 |
Family
ID=76925199
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110469141.8A Active CN113177634B (en) | 2021-04-28 | 2021-04-28 | Image analysis system, method and equipment based on neural network input and output quantification |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113177634B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023020456A1 (en) * | 2021-08-16 | 2023-02-23 | 北京百度网讯科技有限公司 | Network model quantification method and apparatus, device, and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107609638A (en) * | 2017-10-12 | 2018-01-19 | 湖北工业大学 | A kind of method based on line decoder and interpolation sampling optimization convolutional neural networks |
CN109978135A (en) * | 2019-03-04 | 2019-07-05 | 清华大学 | Neural network compression method and system based on quantization |
CN111105017A (en) * | 2019-12-24 | 2020-05-05 | 北京旷视科技有限公司 | Neural network quantization method and device and electronic equipment |
CN111178258A (en) * | 2019-12-29 | 2020-05-19 | 浪潮(北京)电子信息产业有限公司 | Image identification method, system, equipment and readable storage medium |
Non-Patent Citations (1)
Title |
---|
ROHIT K.TRIPATHY 等: "Deep UQ: Learning deep neural network surrogate models for high dimensional uncertainty quantification", 《JOURNAL OF COMPUTATIONAL PHYSICS》 * |
Also Published As
Publication number | Publication date |
---|---|
CN113177634B (en) | 2022-10-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110222821B (en) | Weight distribution-based convolutional neural network low bit width quantization method | |
CN113011581B (en) | Neural network model compression method and device, electronic equipment and readable storage medium | |
CN111783961A (en) | Activation fixed point fitting-based convolutional neural network post-training quantization method and system | |
CN113408715B (en) | Method and device for fixing neural network | |
CN109284761B (en) | Image feature extraction method, device and equipment and readable storage medium | |
CN109740737B (en) | Convolutional neural network quantization processing method and device and computer equipment | |
CN112990438B (en) | Full-fixed-point convolution calculation method, system and equipment based on shift quantization operation | |
CN111985495A (en) | Model deployment method, device, system and storage medium | |
CN114418105B (en) | Method and device for processing quantum application problem based on quantum circuit | |
CN113177634B (en) | Image analysis system, method and equipment based on neural network input and output quantification | |
US11907834B2 (en) | Method for establishing data-recognition model | |
CN114943335A (en) | Layer-by-layer optimization method of ternary neural network | |
CN112686384A (en) | Bit-width-adaptive neural network quantization method and device | |
CN111260056B (en) | Network model distillation method and device | |
US20230058500A1 (en) | Method and machine learning system to perform quantization of neural network | |
EP4128067A1 (en) | Method and system for generating a predictive model | |
CN114418104B (en) | Quantum application problem processing method and device | |
CN110276448B (en) | Model compression method and device | |
CN117795526A (en) | Quantized perception training method, device, equipment, medium and convolutional neural network | |
CN114139678A (en) | Convolutional neural network quantization method and device, electronic equipment and storage medium | |
CN114118358A (en) | Image processing method, image processing apparatus, electronic device, medium, and program product | |
CN112232477A (en) | Image data processing method, apparatus, device and medium | |
CN116384452B (en) | Dynamic network model construction method, device, equipment and storage medium | |
CN115238873B (en) | Neural network model deployment method and device, and computer equipment | |
JP7120288B2 (en) | Neural network weight reduction device, neural network weight reduction method and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||