CN111144457A - Image processing method, device, equipment and storage medium - Google Patents

Image processing method, device, equipment and storage medium

Info

Publication number
CN111144457A
Authority
CN
China
Prior art keywords
data
image
weight
quantization
mapped
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911280488.7A
Other languages
Chinese (zh)
Other versions
CN111144457B (en)
Inventor
曹效伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Reach Best Technology Co Ltd
Original Assignee
Reach Best Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Reach Best Technology Co Ltd filed Critical Reach Best Technology Co Ltd
Priority to CN201911280488.7A priority Critical patent/CN111144457B/en
Publication of CN111144457A publication Critical patent/CN111144457A/en
Application granted granted Critical
Publication of CN111144457B publication Critical patent/CN111144457B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to an image processing method, an apparatus, a device and a storage medium, which are applied to a terminal device or a server, and the method includes: acquiring weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network; mapping the weight data and the input image data of the target convolutional layer to a set threshold interval, and respectively obtaining a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data; and obtaining image characteristic data output by the target convolutional layer according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, the weight scaling scale and the image scaling scale. The method and the device can realize the quantization processing of the target convolution layer in the neural network, reduce the size of the model, improve the image processing speed and the generalization capability of the quantization processing, reduce the precision loss and further improve the image processing quality.

Description

Image processing method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of deep learning technologies, and in particular, to an image processing method, apparatus, device, and storage medium.
Background
Today, with the rapid development of artificial intelligence, deep learning techniques play an irreplaceable role in more and more business scenarios. As model structures become more complex and mobile-terminal application scenarios (edge computing) become more common, how to improve the inference speed of neural network models is receiving more and more attention.
Quantization techniques are commonly used in the related art to quantize all of the convolutions within a neural network in order to improve the inference speed of the neural network model. However, this quantization scheme may change the distribution of the output tensors of the quantized model, resulting in a large loss of model precision and poor generalization capability of the model.
Disclosure of Invention
The present disclosure provides an image processing method, apparatus, device and storage medium to at least solve the above technical problems in the related art.
The technical scheme of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided an image processing method, which performs feature extraction on an input image by using a pre-trained convolutional neural network, where the convolutional neural network includes a plurality of convolutional layers; the method comprises the following steps:
acquiring weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network;
mapping the weight data and the input image data of the target convolutional layer to a set threshold interval, and respectively obtaining a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data;
and obtaining image characteristic data output by the target convolutional layer according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, the weight scaling scale and the image scaling scale.
In an embodiment, the step of obtaining the image feature data output by the target convolutional layer according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale includes:
obtaining a multiplication result of the weight scaling, the image scaling and the convolution operation result;
carrying out batch processing on the multiplication result, and adding the batch processing result and the offset to obtain an addition result;
and calculating the addition result by using an activation function to obtain image characteristic data output by the target convolution layer.
In an embodiment, the mapping the weight data of the target convolutional layer and the input image data to a set threshold interval to obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data respectively includes:
determining an upper threshold and a lower threshold of data to be mapped, wherein the data to be mapped comprises the weight data and/or the input image data;
and quantizing the data to be mapped based on the upper threshold and the lower threshold to obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data.
In an embodiment, the step of determining the upper threshold and the lower threshold of the data to be mapped includes:
and determining an upper threshold and a lower threshold of the data to be mapped based on the numerical distribution range of the data to be mapped.
In an embodiment, the step of performing quantization processing on the data to be mapped based on the upper threshold and the lower threshold to obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data includes:
in response to determining that the value of the data to be mapped is greater than or equal to the upper threshold, determining a quantized value of the data to be mapped as the upper threshold;
in response to determining that the value of the data to be mapped is less than or equal to the lower threshold, determining a quantized value of the data to be mapped as the lower threshold;
and in response to the fact that the numerical value of the data to be mapped is determined to be between the upper limit threshold and the lower limit threshold, carrying out quantization processing on the data to be mapped based on a preset quantization algorithm, and taking the result of the quantization processing as the quantization value of the data to be mapped.
According to a second aspect of the embodiments of the present disclosure, there is provided an image processing apparatus that performs feature extraction on an input image using a pre-trained convolutional neural network, the convolutional neural network including a plurality of convolutional layers; the device comprises:
a weight data acquisition module configured to perform acquiring weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network;
a weight data quantization module configured to perform mapping of the weight data of the target convolutional layer and the input image data to a set threshold interval, and obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data, respectively;
and an image feature acquisition module configured to obtain image feature data output by the target convolutional layer according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale.
In one embodiment, the image feature obtaining module includes:
a multiplication result obtaining unit configured to perform obtaining a multiplication result of the weight scaling scale, the image scaling scale, and the convolution operation result;
an addition result obtaining unit configured to perform batch processing on the multiplication result and add the batch processing result and an offset to obtain an addition result;
and the image characteristic acquisition unit is configured to perform operation on the addition result by using an activation function to obtain image characteristic data output by the target convolutional layer.
In one embodiment, the weight data quantization module comprises:
a data threshold determination unit configured to perform determining an upper threshold and a lower threshold of data to be mapped, the data to be mapped including the weight data and/or the input image data;
and the weight data quantization unit is configured to perform quantization processing on the data to be mapped based on the upper threshold and the lower threshold, so as to obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data.
In an embodiment, the data threshold determination unit is further configured to perform determining an upper threshold and a lower threshold of the data to be mapped based on a numerical value distribution range of the data to be mapped.
In an embodiment, the weight data quantization unit is further configured to perform:
in response to determining that the value of the data to be mapped is greater than or equal to the upper threshold, determining a quantized value of the data to be mapped as the upper threshold;
in response to determining that the value of the data to be mapped is less than or equal to the lower threshold, determining a quantized value of the data to be mapped as the lower threshold;
and in response to the fact that the numerical value of the data to be mapped is determined to be between the upper limit threshold and the lower limit threshold, carrying out quantization processing on the data to be mapped based on a preset quantization algorithm, and taking the result of the quantization processing as the quantization value of the data to be mapped.
According to a third aspect of the embodiments of the present disclosure, there is provided an image processing electronic device that performs feature extraction on an input image using a pre-trained convolutional neural network, the convolutional neural network including a plurality of convolutional layers; the electronic device includes:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the image processing method according to any one of the above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium having instructions that, when executed by a processor of an image processing electronic device, enable the image processing electronic device to perform the image processing method according to any one of the above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, which, when executed by a processor of an image processing electronic device, enables the image processing electronic device to perform the image processing method as defined in any one of the above.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
the method comprises the steps of obtaining weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network, mapping the weight data and the input image data of the target convolutional layer to a set threshold interval, respectively obtaining a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data, further obtaining image characteristic data output by the target convolutional layer according to a convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale, wherein the distribution of the image quantization value of the target convolutional layer obtained after quantization is approximate to the distribution of the input image data of the convolutional layer before quantization because the weight data and the input image data of the target convolutional layer are quantized in a mode of mapping to a corresponding set threshold space, furthermore, the quantization processing of the target convolutional layer can be independent of the processing of other convolutional layers in the current neural network, so that only part of convolutional layers (such as convolutional layers for classification) in the neural network can be quantized, the size of the model is reduced, the image processing speed is increased, the generalization capability of the quantization processing can be improved, the precision loss is reduced, and the image processing quality can be improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
Fig. 1A is a schematic diagram of a convolutional neural network structure according to an example.
FIG. 1B is a schematic diagram of a structure of a convolution kernel in a convolutional neural network, according to an example.
Fig. 1C is a schematic diagram of a process for calibration according to an example quantization scheme.
FIG. 1D is a process diagram of a quantization calculation after calibration according to an example quantization scheme.
FIG. 2A is a flow diagram illustrating an image processing method according to an exemplary embodiment.
FIG. 2B is a diagram illustrating a convolution calculation process, according to an exemplary embodiment.
FIG. 3 is a flow chart illustrating how image feature data output by the target convolutional layer is obtained according to an exemplary embodiment.
FIG. 4 is a flowchart illustrating how to map the weight data and input image data for the target convolutional layer to a set threshold interval, according to an exemplary embodiment.
FIG. 5 is a flowchart illustrating how to map the weight data and input image data for the target convolutional layer to a set threshold interval according to yet another exemplary embodiment.
Fig. 6 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment.
Fig. 7 is a block diagram illustrating an image processing apparatus according to still another exemplary embodiment.
FIG. 8 is a block diagram illustrating an image processing electronic device according to an exemplary embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
FIG. 1A is a schematic diagram of a convolutional neural network, according to an example; FIG. 1B is a schematic diagram of a structure of a convolution kernel in a convolutional neural network, according to an example; FIG. 1C is a schematic diagram of a process for calibration according to an exemplary quantization scheme; FIG. 1D is a process diagram of a quantization calculation after calibration according to an example quantization scheme.
As shown in fig. 1A, a CNN (convolutional neural network) is composed of a plurality of convolution kernels that perform convolution calculations. Typically, one part of the convolution kernels performs convolution calculations on input samples to extract features, and another part performs convolution calculations on the extracted features to obtain classification results. As shown in FIG. 1B, each convolution kernel consists of three computation parts: convolution, batch processing, and an activation function. The convolution part performs a convolution calculation on the input data and the weight; the operation result is input to the batch processing and added with the offset; and the output of the batch processing passes through the activation function to complete the convolution calculation.
Further, as shown in fig. 1C, because the numerical range that INT8 (Integer 8 bit, 8-bit integer format) can express is much smaller than that of FP32 (Floating Point 32 bit, single-precision floating-point format), a calibration step is added in the current quantization technology in order to keep the distributions consistent. In this calibration process, the convolution of the original data and the quantized weight is calculated to obtain the distribution of the output tensor for each data sample; statistical analysis is then performed on this distribution to obtain an appropriate scale and offset, so that the distribution falls into the range that INT8 can express, that is, -128 to 127 or 0 to 255; finally, the scale and offset are merged into the scaling scale of the weight and the offset of the batch processing. As shown in fig. 1D, after calibration on a batch of data, the scaling scale of the weights and the offset of the batch processing are updated, and INT8 convolution calculation begins.
The inventors have noted that this approach suffers from the following disadvantages. The resulting scale and offset are statistically based, i.e. strongly dependent on the selected calibration data set. Once other data is used, the statistically obtained scaling scale and offset will differ, so the generalization ability of the quantization scheme is reduced. For example, in a model calibrated with one batch of data from the training dataset, a relatively large degradation in accuracy may occur on another batch of training data, because the scaling of the convolution is based on the data used in calibration and has been folded into the scaling scale of the weights. Furthermore, when another batch of data whose distribution differs greatly from that of the calibration set is used for calculation, the calibrated scale and offset will no longer fit, resulting in a large error and thereby reducing the generalization capability of the quantization scheme. This cannot be solved well by enlarging the calibration set, because when the distribution differences of the data within the calibration set are large, the statistically obtained scaling scale and offset cannot fit each data sample well, i.e. the generalization capability is poor. On the other hand, the scope of the current quantization technique is the convolution of the whole network (i.e., global quantization); quantizing only part of the convolutions in the neural network cannot be realized.
However, in a neural network, one part of the convolution kernels is generally responsible for feature extraction, and the extracted features are then input into another part of the convolution kernels for classification. Global quantization affects the extracted features, thereby causing a great loss of classification accuracy and poor model generalization capability.
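To make the statistical nature of the related-art calibration concrete, the following is a minimal NumPy sketch of what such a calibration step might look like. The percentile-based tail clipping, the function name and the tensor shapes are illustrative assumptions only, not taken from this disclosure or from any particular framework.

```python
# Hedged sketch of the related-art calibration step (NOT this disclosure's
# method): the scale and offset are statistics of one calibration batch,
# which is why they generalize poorly to data with a different distribution.
import numpy as np

def calibrate_output_range(calib_outputs: np.ndarray):
    """Derive a statistical scale/offset so FP32 outputs fit into [-128, 127]."""
    lo, hi = np.percentile(calib_outputs, [0.1, 99.9])  # assumed tail-clipping heuristic
    scale = (hi - lo) / 255.0
    offset = lo + 128.0 * scale  # chosen so that lo maps to -128
    return scale, offset

# output tensor distribution gathered by convolving the calibration batch
# with the quantized weights (random placeholder data here)
calib_outputs = np.random.randn(1024).astype(np.float32)
scale, offset = calibrate_output_range(calib_outputs)
quantized = np.clip(np.round((calib_outputs - offset) / scale), -128, 127)
```

Any later input whose distribution differs from calib_outputs will be clipped or coarsely sampled by this fixed scale and offset, which is exactly the generalization problem described above.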
In view of this, the present application provides the following image processing method, apparatus, device and storage medium, so as to solve the technical problem that the convolution computation scheme in the related art changes the distribution of the output tensors of the quantized model, which in turn causes a large loss of model precision and poor model generalization capability. Specifically, this is achieved through the following technical solutions:
FIG. 2A is a flow diagram illustrating an image processing method according to an exemplary embodiment. FIG. 2B is a diagram illustrating a convolution calculation process, according to an exemplary embodiment. The image processing method of the embodiment can be used for computer equipment, such as a terminal or a server, and performs feature extraction on an input image by using a pre-trained convolutional neural network, wherein the convolutional neural network comprises a plurality of convolutional layers. As shown in fig. 2A, the following steps S101-S103 are included.
In step S101, weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network are acquired.
For the first convolutional layer, the input image data may be the data of the input image; for each subsequent convolutional layer, it may be the output image data of the previous convolutional layer.
In practical applications, such as applications in which image data is processed by using a convolutional neural network, compression processing is usually performed on the convolutional neural network in order to reduce the amount of computation and increase the processing speed.
In this embodiment, after the convolutional neural network is trained, the weight of the convolutional kernel of each convolutional layer in the network can be obtained. Further, when a target convolutional layer that needs to be quantized is determined, weight data and input image data of the target convolutional layer may be acquired.
It should be noted that, considering that the neural network generally includes a convolution kernel for performing feature extraction and a convolution kernel for performing classification, in order to avoid affecting the extracted features, the convolution kernel for performing classification may be selected as the target convolution layer in this embodiment, so that accuracy of feature extraction may be ensured, and quantization processing on the convolutional neural network may be implemented.
In step S102, the weight data of the target convolutional layer and the input image data are mapped to a set threshold interval, and a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data are obtained, respectively.
In this embodiment, after obtaining the weight data and the input image data of the target convolutional layer in the pre-trained convolutional neural network, the weight data and the input image data of the target convolutional layer may be mapped to a set threshold interval, and a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data are obtained respectively.
It is to be noted that the computation time of a neural network is dominated by FP32-type convolution calculations. Therefore, this embodiment quantizes the FP32-type weight data and input image data into INT8-type weight data and input image data, which reduces the size of the model and improves the inference speed. For the general way of quantizing data from the FP32 type to the INT8 type, reference may be made to explanations in the related art; this embodiment does not limit it.
The inventor has noted that in the current quantization technology only the weight of the convolution kernel is quantized, which causes the distribution of the output tensors of the quantized model to change, leading to the technical problems of great loss of model precision and poor model generalization capability. In view of this, as shown in fig. 2B, in the present embodiment the weight data of the target convolutional layer and the input image data are both quantized by mapping to the corresponding set threshold interval, and the scaling scale of the quantization process is calculated, so that the weight quantization value and the weight scaling scale corresponding to the weight data, and the image quantization value and the image scaling scale corresponding to the input image data, can be obtained.
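As a concrete illustration of this step, the following is a minimal sketch assuming symmetric per-tensor quantization to the interval [-127, 127]; the function name and tensor shapes are placeholders, and a real implementation may use a different quantization formula.

```python
# Hedged sketch: map FP32 weight data and input image data into INT8,
# keeping the scaling scale of each tensor for later inverse mapping.
import numpy as np

def quantize_int8(data_fp32: np.ndarray):
    """Map FP32 data symmetrically into [-127, 127]; record the scaling scale."""
    scale = np.abs(data_fp32).max() / 127.0
    quantized = np.round(data_fp32 / scale).astype(np.int8)
    return quantized, scale

# placeholder tensors standing in for the target layer's real data
weight_fp32 = np.random.randn(16, 3, 3, 3).astype(np.float32)  # weight data
input_fp32 = np.random.randn(3, 32, 32).astype(np.float32)     # input image data

weight_q, weight_scale = quantize_int8(weight_fp32)  # weight quantization value / scale
input_q, input_scale = quantize_int8(input_fp32)     # image quantization value / scale
```

Both quantized tensors stay in INT8, while their scaling scales are kept in FP32 for the inverse mapping described below.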
In another embodiment, the above-mentioned manner of performing quantization processing on the weights and the input data respectively to obtain the quantized values of the weights and the input quantized values may be referred to the following embodiments shown in fig. 4 or fig. 5, and will not be described in detail herein.
In step S103, image feature data output by the target convolutional layer is obtained according to a convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale.
In this embodiment, after the weights and the input data are quantized respectively to obtain weighted quantized values and input quantized values, convolution operation may be performed on the weighted quantized values and the input quantized values to obtain operation results.
For example, in the case of quantizing FP32-type weight data and input image data into INT8-type weight data and input image data, an INT8 convolution operation may be performed on the obtained INT8-type weight quantization value and input quantization value.
For example, as shown in fig. 2B, after performing convolution operation on the weighted quantization value and the input quantization value to obtain an operation result, the image feature data output by the target convolutional layer may be obtained according to the convolution operation result of the weighted quantization value and the image quantization value of the target convolutional layer, and the weighted scaling scale and the image scaling scale.
It should be noted that, the above-mentioned manner of obtaining the image feature data output by the target convolutional layer according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale may also refer to the following embodiment shown in fig. 3, and will not be described in detail herein.
As can be seen from the above description, in this embodiment the weight data and the input image data of the target convolutional layer in the pre-trained convolutional neural network are obtained, and the weight data and the input image data of the target convolutional layer are mapped to the set threshold interval to obtain, respectively, the weight quantization value and the weight scaling scale corresponding to the weight data, and the image quantization value and the image scaling scale corresponding to the input image data. The image feature data output by the target convolutional layer can then be obtained according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale. Since the weight data and the input image data of the target convolutional layer are both quantized by mapping to the corresponding set threshold interval, the distribution of the image quantization values of the target convolutional layer obtained after quantization approximates the distribution of the input image data of the convolutional layer before quantization. The quantization processing of the target convolutional layer can therefore be independent of the processing of the other convolutional layers in the current neural network, making it possible to quantize only part of the convolutional layers (such as the convolutional layers used for classification) in the neural network, reduce the size of the model, increase the image processing speed, improve the generalization capability of the quantization processing, reduce the precision loss, and thereby improve the image processing quality.
FIG. 3 is a flow chart illustrating how image feature data output by the target convolutional layer is obtained according to an exemplary embodiment. On the basis of the above embodiments, this embodiment illustrates how the image feature data output by the target convolutional layer is obtained. As shown in fig. 3, the step in step S103 of obtaining the image feature data output by the target convolutional layer according to the convolution operation result between the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale, may include the following steps S201 to S203:
in step S201, a multiplication result of the weight scaling scale, the image scaling scale, and the convolution operation result is obtained.
In this embodiment, after performing convolution operation on the weight quantization value and the input quantization value to obtain an operation result, a multiplication result of the weight scaling, the image scaling, and the convolution operation result may be obtained.
For example, after performing the convolution operation on the weight quantization value and the input quantization value to obtain an operation result, the product of the weight scaling scale, the input scaling scale and the operation result may be calculated, and this product may be used as the inversely mapped data; see the "amplification" process shown in fig. 2B. It will be appreciated that the amplified data is restored to the expression domain of the original FP32 convolution, i.e. its distribution remains similar to the distribution of the data convolved in the original FP32 format; the theoretical basis for this is the distributive and associative laws of convolution.
It is worth noting that the "amplification" process shown in fig. 2B can be understood by analogy with an amplifier: when the scaling scale is larger than 1, the effect is amplification; when the scaling scale is smaller than 1, the effect is reduction.
In step S202, the multiplication result is batch processed, and the batch processing result is added to the offset to obtain an addition result.
In this embodiment, after the multiplication result of the weight scaling, the image scaling, and the convolution operation result is obtained, the data may be Batch processed (Batch), and a predetermined offset may be added to the Batch processed data to obtain an addition result.
The manner of performing batch processing on the data may refer to a batch processing scheme in the related art, which is not limited in this embodiment.
In step S203, the addition result is calculated by using an activation function, so as to obtain image feature data output by the target convolutional layer.
In this embodiment, after the multiplication result is batch processed and the batch processing result is added to the offset to obtain an addition result, the addition result may be operated by using an activation function to obtain image feature data output by the target convolutional layer.
That is, after the multiplication result of the weight scaling scale, the image scaling scale and the convolution operation result has been batch processed and added to the predetermined offset, the resulting sum may be passed through the activation function to obtain the target processing result, namely the convolution output of the target convolutional layer.
As can be seen from the above description, in this embodiment, the multiplication result of the weight scaling, the image scaling and the convolution operation result is obtained, the multiplication result is subjected to batch processing, the batch processing result is added to the offset to obtain an addition result, and the addition result is further subjected to operation by using the activation function to obtain the image characteristic data output by the target convolution layer.
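Combining steps S201 to S203, a target-layer forward pass might look like the following sketch. It assumes a single-channel 2-D convolution via scipy.signal.correlate2d, a ReLU activation, and a scalar bias standing in for the batch-processing offset; none of these choices is mandated by the disclosure.

```python
# A minimal sketch of the quantized target-layer forward pass
# (assumptions: single channel, SciPy correlation as the convolution,
# ReLU as the activation, scalar bias as the batch-processing offset).
import numpy as np
from scipy.signal import correlate2d

def quantize_int8(x):
    scale = np.abs(x).max() / 127.0
    return np.round(x / scale).astype(np.int8), scale

def target_layer_forward(input_q, weight_q, input_scale, weight_scale, bias):
    # INT8 convolution, accumulated in int32 to avoid overflow
    conv_q = correlate2d(input_q.astype(np.int32),
                         weight_q.astype(np.int32), mode="valid")
    # step S201: multiply by both scaling scales ("amplification" back to FP32)
    conv_fp32 = conv_q.astype(np.float32) * input_scale * weight_scale
    # step S202: add the batch-processing offset (a plain bias here)
    summed = conv_fp32 + bias
    # step S203: activation function
    return np.maximum(summed, 0.0)

image = np.random.randn(8, 8).astype(np.float32)
kernel = np.random.randn(3, 3).astype(np.float32)
iq, iscale = quantize_int8(image)
wq, wscale = quantize_int8(kernel)
features = target_layer_forward(iq, wq, iscale, wscale, bias=0.1)
```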
FIG. 4 is a flowchart illustrating how to map the weight data and input image data for the target convolutional layer to a set threshold interval, according to an exemplary embodiment. The present embodiment exemplifies how to map the weight data of the target convolutional layer and the input image data to the set threshold interval on the basis of the above-described embodiments. As shown in fig. 4, the step of mapping the weight data of the target convolutional layer and the input image data to a set threshold interval to obtain the weight quantization value and the weight scaling scale corresponding to the weight data and the image quantization value and the image scaling scale corresponding to the input image data in step S102 may include the following steps S301 to S302:
in step S301, an upper threshold and a lower threshold of data to be mapped are determined.
In this embodiment, after obtaining the weight data of the target convolutional layer in the pre-trained convolutional neural network and the input image data, an upper threshold and a lower threshold of data to be mapped may be determined, where the data to be mapped may include the weight data and/or the input image data.
For example, after the data to be mapped is obtained, a numerical distribution range of the data to be mapped may be determined, and then an upper threshold and a lower threshold of the data to be mapped may be determined based on the data distribution range. For example, two end points of the data distribution range may be respectively determined as an upper threshold and a lower threshold of the data to be mapped.
It should be noted that, the above-mentioned manner for determining the numerical distribution range of the data to be mapped may refer to explanation and description in the related art, and this embodiment does not limit this.
In step S302, the data to be mapped is quantized based on the upper threshold and the lower threshold, so as to obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data.
In this embodiment, after the upper threshold and the lower threshold of the data to be mapped are determined, quantization processing may be performed on the data to be mapped based on the upper threshold and the lower threshold to obtain a weight quantization value and an input quantization value, and a weight scaling scale and an input scaling scale may be recorded.
For example, if the expression range of the currently selected INT8 is -128 to 127, the maximum absolute value of the data to be mapped may be determined and mapped to 127, the upper bound of the absolute value of INT8, thereby obtaining the scaling scale of the quantized data.
In another embodiment, the manner of performing quantization processing on the data to be mapped based on the upper threshold and the lower threshold to obtain the weight quantization value and the input quantization value may also be referred to the following embodiment shown in fig. 5, which is not described in detail herein.
As can be seen from the above description, in the present embodiment the upper threshold and the lower threshold of the data to be mapped are determined, and the data to be mapped is quantized based on these thresholds to obtain the weight quantization value and the weight scaling scale corresponding to the weight data, and the image quantization value and the image scaling scale corresponding to the input image data. A subsequent convolution operation can then be performed on the weight quantization value and the input quantization value, and the image feature data output by the target convolutional layer can be obtained from the convolution operation result together with the weight scaling scale and the image scaling scale. This improves the generalization capability of the quantization scheme; and since only the target convolutional layer in the neural network is quantized, the accuracy of subsequent feature extraction can be ensured and the loss of classification precision reduced.
FIG. 5 is a flowchart illustrating how to map the weight data and input image data for the target convolutional layer to a set threshold interval according to yet another exemplary embodiment. The present embodiment exemplifies how to map the weight data of the target convolutional layer and the input image data to the set threshold interval on the basis of the above-described embodiments. As shown in fig. 5, the step of mapping the weight data of the target convolutional layer and the input image data to a set threshold interval to obtain the weight quantization value and the weight scaling scale corresponding to the weight data and the image quantization value and the image scaling scale corresponding to the input image data in step S102 may include the following steps S401 to S406:
in step S401, an upper threshold and a lower threshold of the data to be mapped are determined.
For the explanation and explanation of step S401, reference may be made to the above embodiments, which are not described herein again.
In step S402, it is determined whether the value of the data to be mapped is greater than or equal to the upper threshold: if yes, go to step S403; if not, go to step S404.
In step S403, the quantization value of the data to be mapped is determined as the upper threshold.
In step S404, it is determined whether the value of the data to be mapped is less than or equal to the lower threshold: if yes, go to step S405; if not, go to step S406.
In this embodiment, the data to be mapped includes the weight data and/or the input image data.
For example, after determining the upper threshold and the lower threshold of the data to be mapped, the numerical value of the data to be mapped may be compared with the upper threshold and the lower threshold, and then the following cases may be classified based on different obtained comparison results:
in the first case: if the numerical value of the data to be mapped is determined to be greater than or equal to the upper threshold, the quantized value of the data to be mapped may be determined to be the upper threshold.
In the second case: if it is determined that the value of the data to be mapped is less than or equal to the lower threshold, the quantized value of the data to be mapped may be determined as the lower threshold.
In the third case: if it is determined that the numerical value of the data to be mapped is between the upper threshold and the lower threshold, quantization processing may be performed on the data to be mapped based on a preset quantization algorithm, and a result of the quantization processing may be used as a quantization value of the data to be mapped.
In this embodiment, performing quantization processing on the data to be mapped based on a preset quantization algorithm and taking the result as the quantization value of the data to be mapped may include: mapping the data to be mapped from the single-precision floating-point format FP32 to the 8-bit integer format INT8, and taking the mapping result as the quantization value of the data to be mapped. For example, after the scaling scale of the quantized data (e.g., the weight scaling scale or the input scaling scale) is obtained, each element in the data to be mapped may be divided by the corresponding scaling scale to obtain the quantization value of the data to be mapped (e.g., the weight quantization value or the input quantization value), thereby completing the INT8 mapping.
It can be understood that setting the upper threshold and the lower threshold, determining the quantization value of the data to be mapped as the upper threshold when its value is greater than or equal to the upper threshold, determining it as the lower threshold when its value is less than or equal to the lower threshold, and quantizing the data based on a preset quantization algorithm when its value lies between the two thresholds, can effectively avoid the long-tail effect of the data to be mapped.
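A hedged sketch of this three-case rule (steps S402 to S406) is given below; the symmetric INT8 mapping used for the in-range case is an assumed choice of "preset quantization algorithm".

```python
# Sketch of threshold-clamped quantization: out-of-range values are clamped
# to the thresholds (cases 1 and 2), in-range values are quantized (case 3).
import numpy as np

def threshold_quantize(data_fp32: np.ndarray, lower: float, upper: float):
    clipped = np.clip(data_fp32, lower, upper)   # cases 1 and 2
    scale = max(abs(lower), abs(upper)) / 127.0  # assumed symmetric scale
    return np.round(clipped / scale).astype(np.int8), scale  # case 3

data = 3.0 * np.random.randn(1000).astype(np.float32)  # long-tailed placeholder data
q, s = threshold_quantize(data, lower=-2.5, upper=2.5)
```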
For example, the process of quantizing the FP32 type convolution into the INT8 type convolution in this embodiment can be visually shown by the following formulas (1) to (7):
input.32 ≈ Threshold(input)    (1)
weight.32 ≈ Threshold(weight)    (2)
scale = max(abs(input.32)) / 127    (3)
input.8 = input.32 / scale    (4)
scale′ = max(abs(weight.32)) / 127    (5)
weight.8 = weight.32 / scale′    (6)
input * weight ≈ input.32 * weight.32 = (input.8 × scale) * (weight.8 × scale′) = (input.8 * weight.8) × scale × scale′    (7)
where Threshold represents the threshold clipping operation, input represents the input data, weight represents the weight, and the suffixes ".32" and ".8" denote the FP32 and INT8 formats, respectively.
Therefore, because both the input data and the weight are mapped and inversely mapped, apart from the information loss caused by the difference in sampling density, the distribution of the calculation result is guaranteed to remain approximately consistent, which ensures that the quantization scheme has good generalization capability. Meanwhile, a single convolution kernel in the neural network can be quantized without affecting the other convolution kernels, which continue to be calculated in the FP32 domain, so the precision loss of the whole network is lower and fine-grained INT8 quantization is achieved.
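As a quick numeric illustration of formula (7) on single elements (the values and scaling scales below are made up for the example):

```python
x, w = 0.83, -0.41                     # an FP32 input element and weight
sx, sw = 1.0 / 127, 0.5 / 127          # assumed scaling scales (formulas (3) and (5))
xq, wq = round(x / sx), round(w / sw)  # INT8 quantized values (formulas (4) and (6))
print(x * w)                           # -0.3403  (FP32 product)
print(xq * wq * sx * sw)               # about -0.3385 (formula (7) reconstruction)
```

The small gap between the two printed values is the rounding loss of formulas (4) and (6); the overall distribution of results is preserved, which is the point of the scheme.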
Fig. 6 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment. The image processing apparatus of this embodiment can be applied to computer equipment, such as a terminal or a server, and performs feature extraction on an input image by using a pre-trained convolutional neural network, wherein the convolutional neural network comprises a plurality of convolutional layers. As shown in fig. 6, the apparatus includes: a weight data obtaining module 110, a weight data quantizing module 120, and an image feature obtaining module 130, wherein:
a weight data acquisition module 110 configured to perform acquiring weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network;
a weight data quantization module 120 configured to perform mapping of the weight data of the target convolutional layer and the input image data to a set threshold interval, and obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data, respectively;
an image feature obtaining module 130 configured to obtain image feature data output by the target convolutional layer according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale.
As can be seen from the above description, in this embodiment the weight data and the input image data of the target convolutional layer in the pre-trained convolutional neural network are obtained, and the weight data and the input image data of the target convolutional layer are mapped to the set threshold interval to obtain, respectively, the weight quantization value and the weight scaling scale corresponding to the weight data, and the image quantization value and the image scaling scale corresponding to the input image data. The image feature data output by the target convolutional layer can then be obtained according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale. Since the weight data and the input image data of the target convolutional layer are both quantized by mapping to the corresponding set threshold interval, the distribution of the image quantization values of the target convolutional layer obtained after quantization approximates the distribution of the input image data of the convolutional layer before quantization. The quantization processing of the target convolutional layer can therefore be independent of the processing of the other convolutional layers in the current neural network, making it possible to quantize only part of the convolutional layers (such as the convolutional layers used for classification) in the neural network, reduce the size of the model, increase the image processing speed, improve the generalization capability of the quantization processing, reduce the precision loss, and thereby improve the image processing quality.
Fig. 7 is a block diagram illustrating an image processing apparatus according to still another exemplary embodiment. The image processing apparatus of this embodiment can be applied to computer equipment, such as a terminal or a server, and performs feature extraction on an input image by using a pre-trained convolutional neural network, wherein the convolutional neural network comprises a plurality of convolutional layers. The weight data obtaining module 210, the weight data quantizing module 220, and the image feature obtaining module 230 have the same functions as the weight data obtaining module 110, the weight data quantizing module 120, and the image feature obtaining module 130 in the embodiment shown in fig. 6, and are not described herein again. As shown in fig. 7, the image feature obtaining module 230 may include:
a multiplication result obtaining unit 231 configured to perform obtaining a multiplication result of the weight scaling scale, the image scaling scale, and the convolution operation result;
an addition result obtaining unit 232 configured to perform batch processing on the multiplication result and add the batch processing result and the offset to obtain an addition result;
an image feature obtaining unit 233 configured to perform an operation on the addition result by using an activation function, so as to obtain image feature data output by the target convolutional layer.
In an embodiment, the weight data quantization module 220 may include:
a data threshold determination unit 221 configured to perform determining an upper threshold and a lower threshold of data to be mapped, the data to be mapped including the weight data and/or the input image data;
a weight data quantization unit 222 configured to perform quantization processing on the data to be mapped based on the upper threshold and the lower threshold, and obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data.
In an embodiment, the data threshold determining unit 221 may be further configured to perform determining an upper threshold and a lower threshold of the data to be mapped based on a numerical value distribution range of the data to be mapped.
In an embodiment, the weight data quantization unit 222 is further configured to perform:
in response to determining that the value of the data to be mapped is greater than or equal to the upper threshold, determining a quantized value of the data to be mapped as the upper threshold;
in response to determining that the value of the data to be mapped is less than or equal to the lower threshold, determining a quantized value of the data to be mapped as the lower threshold;
and in response to the fact that the numerical value of the data to be mapped is determined to be between the upper limit threshold and the lower limit threshold, carrying out quantization processing on the data to be mapped based on a preset quantization algorithm, and taking the result of the quantization processing as the quantization value of the data to be mapped.
In an embodiment, the weight data quantization unit 222 may be further configured to perform mapping of the data to be mapped in a single-precision floating point format FP32 to an 8-bit integer number format INT8, and use the mapping result as a quantization value of the data to be mapped.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
It should be noted that, all the above-mentioned optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described in detail herein.
The embodiment of the image processing apparatus of the present disclosure can be applied to a network device. The apparatus embodiments may be implemented by software, or by hardware, or by a combination of hardware and software. Taking a software implementation as an example, as an apparatus in a logical sense, it is formed by the processor of the device in which it is located reading the corresponding computer program instructions from a nonvolatile memory into internal memory for running, where the computer program is used to execute the image processing method provided by the embodiments shown in fig. 2A to 5. From the hardware level, as shown in fig. 8, which is a hardware structure diagram of the device in which the image processing apparatus of the present disclosure is located, besides the processor, the network interface, the memory and the nonvolatile memory shown in fig. 8, the device may also include other hardware, such as a forwarding chip responsible for processing packets; in terms of hardware structure, the device may also be a distributed device and may include multiple interface cards, so as to extend message processing at the hardware level.
On the other hand, the present application also provides a computer-readable storage medium, which, when a computer program stored in the storage medium is executed by a processor of an image processing electronic device, enables the image processing electronic device to execute the image processing method provided by the embodiment shown in fig. 2A to 5.
On the other hand, the present application also provides a computer program product, which, when executed by a processor of an image processing electronic device, enables the image processing electronic device to execute the image processing method provided by the embodiment shown in fig. 2A to 5.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles, including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. The specification and examples are to be considered exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. An image processing method is characterized in that feature extraction is carried out on an input image by utilizing a pre-trained convolutional neural network, wherein the convolutional neural network comprises a plurality of convolutional layers; the method comprises the following steps:
acquiring weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network;
mapping the weight data and the input image data of the target convolutional layer to a set threshold interval, and respectively obtaining a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data;
and obtaining the image feature data output by the target convolutional layer according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale.
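A minimal Python sketch of the three claimed steps for a single-channel layer follows, reusing the quantize_fp32_to_int8 helper sketched earlier; SciPy's correlate2d (cross-correlation, which is what deep-learning "convolution" typically computes) stands in for the convolution, and the per-tensor thresholds t_img and t_w are assumed inputs:

import numpy as np
from scipy.signal import correlate2d

def quantized_conv2d(image_fp32, kernel_fp32, t_img, t_w):
    # Map the input image data and weight data to the set threshold interval,
    # obtaining INT8 quantization values and their scaling scales.
    q_img, s_img = quantize_fp32_to_int8(image_fp32, t_img)
    q_w, s_w = quantize_fp32_to_int8(kernel_fp32, t_w)
    # Convolve in the integer domain; accumulate in int32 to avoid overflow.
    acc = correlate2d(q_img.astype(np.int32), q_w.astype(np.int32), mode="valid")
    # Recover FP32 feature data via the product of the two scaling scales.
    return acc.astype(np.float32) * (s_img * s_w)

The rescaling step works because convolution is bilinear: convolving the dequantized tensors equals convolving the integer tensors and then multiplying by s_img * s_w, so the expensive arithmetic runs entirely in INT8/INT32.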
2. The image processing method according to claim 1, wherein the step of obtaining the image feature data output by the target convolutional layer according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale comprises:
obtaining a multiplication result of the weight scaling scale, the image scaling scale, and the convolution operation result;
performing batch processing on the multiplication result, and adding the batch processing result and an offset to obtain an addition result;
and computing the addition result with an activation function to obtain the image feature data output by the target convolutional layer.
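One plausible Python reading of claim 2, continuing the sketch above; "batch processing" is interpreted here as batch normalization and the activation function as ReLU, neither of which the claim specifies:

import numpy as np

def postprocess(conv_acc_int32, s_w, s_img, gamma, beta, mean, var, bias, eps=1e-5):
    # Multiplication result of the two scaling scales and the convolution result.
    x = conv_acc_int32.astype(np.float32) * (s_w * s_img)
    # Batch processing, read here as batch normalization (an assumption).
    x = gamma * (x - mean) / np.sqrt(var + eps) + beta
    # Add the offset to obtain the addition result.
    x = x + bias
    # Activation function, assumed ReLU, yields the layer's feature data.
    return np.maximum(x, 0.0)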
3. The image processing method according to claim 1, wherein the step of mapping the weight data of the target convolutional layer and the input image data to a set threshold interval to obtain a weight quantization value and a weight scaling scale corresponding to the weight data and an image quantization value and an image scaling scale corresponding to the input image data, respectively, comprises:
determining an upper threshold and a lower threshold of data to be mapped, wherein the data to be mapped comprises the weight data and/or the input image data;
and quantizing the data to be mapped based on the upper threshold and the lower threshold to obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data.
4. The image processing method according to claim 3, wherein the step of determining an upper threshold and a lower threshold for the data to be mapped comprises:
and determining an upper threshold and a lower threshold of the data to be mapped based on the numerical distribution range of the data to be mapped.
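Two common readings of "the numerical distribution range" are sketched below, under the assumption of thresholds symmetric about zero: take the maximum absolute value, or a high percentile of the absolute values so that rare outliers do not stretch the quantization range. Neither strategy is confirmed by the source:

import numpy as np

def thresholds_from_distribution(data, percentile=99.9):
    # percentile=100 reduces to the plain max-abs rule.
    t = float(np.percentile(np.abs(data), percentile))
    return t, -t  # (upper threshold, lower threshold)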
5. The image processing method according to claim 3, wherein the step of performing quantization processing on the data to be mapped based on the upper threshold and the lower threshold to obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data comprises:
in response to determining that the value of the data to be mapped is greater than or equal to the upper threshold, determining a quantized value of the data to be mapped as the upper threshold;
in response to determining that the value of the data to be mapped is less than or equal to the lower threshold, determining a quantized value of the data to be mapped as the lower threshold;
and in response to determining that the value of the data to be mapped lies between the upper threshold and the lower threshold, performing quantization processing on the data to be mapped based on a preset quantization algorithm, and taking the result of the quantization processing as the quantized value of the data to be mapped.
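A worked numerical illustration of claim 5, under the same assumed conventions as the sketches above (symmetric thresholds, INT8 range of plus or minus 127, round-to-nearest): with an upper threshold of 6.0 and a lower threshold of -6.0, the scaling scale is 6.0 / 127, approximately 0.0472. A value of 7.5 exceeds the upper threshold, so its quantized value is the upper threshold (INT8 code 127); a value of -9.1 falls below the lower threshold and saturates to it (code -127); an in-range value of 3.0 maps to round(3.0 / 0.0472) = 64, which dequantizes to about 3.02.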
6. An image processing apparatus, characterized in that feature extraction is performed on an input image by using a pre-trained convolutional neural network, the convolutional neural network comprising a plurality of convolutional layers; the apparatus comprises:
a weight data acquisition module configured to acquire weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network;
a weight data quantization module configured to map the weight data and the input image data of the target convolutional layer to a set threshold interval, and to obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data, respectively;
and an image feature acquisition module configured to obtain the image feature data output by the target convolutional layer according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling scale and the image scaling scale.
7. The image processing apparatus according to claim 6, wherein the image feature acquisition module includes:
a multiplication result obtaining unit configured to obtain a multiplication result of the weight scaling scale, the image scaling scale, and the convolution operation result;
an addition result obtaining unit configured to perform batch processing on the multiplication result and to add the batch processing result and an offset to obtain an addition result;
and an image feature acquisition unit configured to compute the addition result with an activation function to obtain the image feature data output by the target convolutional layer.
8. The image processing apparatus according to claim 6, wherein the weight data quantization module includes:
a data threshold determination unit configured to determine an upper threshold and a lower threshold of data to be mapped, the data to be mapped comprising the weight data and/or the input image data;
and a weight data quantization unit configured to perform quantization processing on the data to be mapped based on the upper threshold and the lower threshold, to obtain a weight quantization value and a weight scaling scale corresponding to the weight data, and an image quantization value and an image scaling scale corresponding to the input image data.
9. An image processing electronic device characterized by performing feature extraction on an input image using a pre-trained convolutional neural network, the convolutional neural network comprising a plurality of convolutional layers; the electronic device includes:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the image processing method of any one of claims 1 to 5.
10. A storage medium having instructions stored thereon which, when executed by a processor of an image processing electronic device, enable the image processing electronic device to perform the image processing method of any one of claims 1 to 5.
CN201911280488.7A 2019-12-13 2019-12-13 Image processing method, device, equipment and storage medium Active CN111144457B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911280488.7A CN111144457B (en) 2019-12-13 2019-12-13 Image processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111144457A true CN111144457A (en) 2020-05-12
CN111144457B CN111144457B (en) 2024-02-27

Family

ID=70518187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911280488.7A Active CN111144457B (en) 2019-12-13 2019-12-13 Image processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111144457B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112613401A (en) * 2020-12-22 2021-04-06 贝壳技术有限公司 Face detection method and device, electronic equipment and storage medium
CN112766277A (en) * 2021-02-07 2021-05-07 普联技术有限公司 Channel adjustment method, device and equipment of convolutional neural network model
CN113382204A (en) * 2021-05-22 2021-09-10 特斯联科技集团有限公司 Intelligent processing method and device for fire-fighting hidden danger
CN113762494A (en) * 2020-06-04 2021-12-07 合肥君正科技有限公司 Method for improving precision of low-bit neural network model through weight preprocessing
WO2022111617A1 (en) * 2020-11-30 2022-06-02 华为技术有限公司 Model training method and apparatus
WO2023125785A1 (en) * 2021-12-29 2023-07-06 杭州海康威视数字技术股份有限公司 Data processing method, network training method, electronic device, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304919A (en) * 2018-01-29 2018-07-20 百度在线网络技术(北京)有限公司 Method and apparatus for generating convolutional neural networks
CN108491927A (en) * 2018-03-16 2018-09-04 新智认知数据服务有限公司 A kind of data processing method and device based on neural network
CN109284761A (en) * 2018-09-04 2019-01-29 苏州科达科技股份有限公司 A kind of image characteristic extracting method, device, equipment and readable storage medium storing program for executing
CN109840589A (en) * 2019-01-25 2019-06-04 深兰人工智能芯片研究院(江苏)有限公司 A kind of method, apparatus and system running convolutional neural networks on FPGA
CN110414630A (en) * 2019-08-12 2019-11-05 上海商汤临港智能科技有限公司 The training method of neural network, the accelerated method of convolutional calculation, device and equipment

Similar Documents

Publication Publication Date Title
CN111144457A (en) Image processing method, device, equipment and storage medium
CN109840589B (en) Method and device for operating convolutional neural network on FPGA
CN111144937B (en) Advertisement material determining method, device, equipment and storage medium
CN107545889B (en) Model optimization method and device suitable for pattern recognition and terminal equipment
US10810721B2 (en) Digital image defect identification and correction
WO2019238029A1 (en) Convolutional neural network system, and method for quantifying convolutional neural network
JP6991983B2 (en) How and systems to train machine learning systems
CN110929865B (en) Network quantification method, service processing method and related product
CN113011532B (en) Classification model training method, device, computing equipment and storage medium
CN109766476B (en) Video content emotion analysis method and device, computer equipment and storage medium
CN113538070B (en) User life value cycle detection method and device and computer equipment
CN114677548A (en) Neural network image classification system and method based on resistive random access memory
CN110503182A (en) Network layer operation method and device in deep neural network
CN111461302A (en) Data processing method, device and storage medium based on convolutional neural network
CN115797643A (en) Image denoising method, device and system, edge device and storage medium
CN113516697A (en) Image registration method and device, electronic equipment and computer-readable storage medium
CN111815510A (en) Image processing method based on improved convolutional neural network model and related equipment
CN114078471A (en) Network model processing method, device, equipment and computer readable storage medium
CN114155388B (en) Image recognition method and device, computer equipment and storage medium
CN117496990A (en) Speech denoising method, device, computer equipment and storage medium
CN113947185B (en) Task processing network generation method, task processing device, electronic equipment and storage medium
CN115705486A (en) Method and device for training quantitative model, electronic equipment and readable storage medium
CN112259239B (en) Parameter processing method and device, electronic equipment and storage medium
CN113255670A (en) Unbalanced small sample target detection method and device and computer equipment
CN111767980B (en) Model optimization method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant