CN115294108A - Target detection method, target detection model quantification device, and medium - Google Patents


Info

Publication number
CN115294108A
CN115294108A · Application CN202211197005.9A
Authority
CN
China
Prior art keywords
parameter
target
quantized
information
quantization information
Prior art date
Legal status
Granted
Application number
CN202211197005.9A
Other languages
Chinese (zh)
Other versions
CN115294108B (en)
Inventor
王程
艾国
杨作兴
Current Assignee
Shenzhen MicroBT Electronics Technology Co Ltd
Original Assignee
Shenzhen MicroBT Electronics Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen MicroBT Electronics Technology Co Ltd filed Critical Shenzhen MicroBT Electronics Technology Co Ltd
Priority to CN202211197005.9A priority Critical patent/CN115294108B/en
Publication of CN115294108A publication Critical patent/CN115294108A/en
Application granted granted Critical
Publication of CN115294108B publication Critical patent/CN115294108B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the present application provides a target detection method, a target detection model quantization method, an apparatus, and a medium. The target detection method includes: receiving an image to be detected; processing the image to be detected with the target detection model to obtain a corresponding detection result, where, during processing, integer calculation is performed according to fixed-point calculation results corresponding to parameters of the network layers in the target detection model; and outputting the detection result. The fixed-point calculation result is determined by: determining original quantization information corresponding to a parameter to be quantized of a network layer; determining change information of the quantization information corresponding to that parameter; establishing, according to the change information, a mapping relation between a parameter identifier and target quantization information; mapping the quantization information of the parameter to be quantized to the target quantization information; and performing fixed-point calculation according to the target quantization information. Embodiments of the present application can improve the processing performance of the target detection model.

Description

Target detection method, target detection model quantification device, and medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a target detection method, a target detection model quantization method, an apparatus, and a medium.
Background
Neural network models are widely applied in artificial intelligence fields such as computer vision and speech recognition. A trained neural network model often has millions or even tens of millions of parameters, which are usually stored and computed as floating-point numbers; this increases the storage and computing resources the model consumes. At present, the parameters of a neural network model can be quantized: because quantization converts the parameters from higher-bit floating-point numbers to lower-bit fixed-point numbers, it reduces the storage and computing resources consumed by the model.
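As an illustration of the float-to-fixed-point conversion described above (this sketch is not taken from the patent; the 8-bit width, the symmetric scale formula, and all names are assumptions):

```python
def quantize_to_int8(weights):
    """Map float weights to int8 codes plus a scale (symmetric quantization)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # each weight becomes a small integer code in [-127, 127]
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.02]
q, scale = quantize_to_int8(weights)
```

Storage drops from one float per weight to one small integer per weight plus a single shared scale, which is why quantization reduces both memory and compute cost.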
In current quantization methods, the floating-point numbers of a network layer's parameters are obtained, fixed-point calculation is performed on them, and the fixed-point calculation result is stored. The fixed-point calculation result of the network layer can then be used in the processing procedure of the neural network model, such as computer vision processing or speech recognition processing.
In practical applications, for a given network layer, the parameter corresponding to a fixed-point calculation result may not match the parameter required by the processing procedure; in that case, applying the fixed-point calculation result in that network layer's processing degrades the processing performance of the neural network model.
Disclosure of Invention
An embodiment of the present application provides a quantization method for a target detection model that matches the parameters corresponding to fixed-point calculation results with the parameters required by the processing procedure, thereby improving the processing performance of the target detection model.
Correspondingly, an embodiment of the present application also provides a quantization apparatus for the target detection model, a target detection method and apparatus, an electronic device, and a machine-readable medium, to ensure the implementation and application of the above method.
In order to solve the above problem, an embodiment of the present application discloses a target detection method, which is applied to an embedded neural network processor, and the method includes:
receiving an image to be detected;
processing an image to be detected according to the target detection model to obtain a corresponding detection result; in the process of processing the image to be detected, integer calculation is carried out according to the fixed point calculation result corresponding to the parameters of the network layer in the target detection model;
outputting the detection result;
wherein the fixed-point calculation result is determined by: determining, according to the processing result of the target detection model on a calibration image set, original quantization information corresponding to a parameter to be quantized of a network layer; determining change information of the quantization information corresponding to the parameter to be quantized of the network layer in the course of processing a preset image with the target detection model, the change information being used to modify original quantization information that does not meet a preset condition into target quantization information that does meet the preset condition, where the preset condition represents the requirement of the target detection model's processing procedure on quantization information; establishing, according to the change information, a mapping relation between a parameter identifier and the target quantization information; mapping, according to the mapping relation, the quantization information of the parameter to be quantized of the network layer to the target quantization information; and performing fixed-point calculation according to the target quantization information of the parameter to be quantized of the network layer.
In order to solve the above problem, an embodiment of the present application discloses a method for quantizing a target detection model, where the method is used to quantize parameters to be quantized in a network layer in the target detection model, and includes:
receiving a set of calibration images;
according to the processing result of the target detection model on the calibration image set, determining original quantization information corresponding to the parameter to be quantized of the network layer;
determining change information of the quantization information corresponding to the parameter to be quantized of the network layer in the course of processing a preset image with the target detection model; the change information is used to modify original quantization information that does not meet a preset condition into target quantization information that does meet the preset condition; the preset condition represents the requirement of the target detection model's processing procedure on quantization information;
according to the change information, establishing a mapping relation between the parameter identification and the target quantization information;
according to the mapping relation, the quantization information of the parameters to be quantized of the network layer is mapped into target quantization information;
performing fixed-point calculation according to target quantization information of a parameter to be quantized of a network layer;
and storing a fixed-point calculation result corresponding to the parameter to be quantized so as to perform integer calculation in the process of processing the image to be detected in the environment of the embedded neural network processor.
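The steps above can be sketched as a calibrate, fake-quantize, and remap pipeline. This is a hypothetical illustration: the parameter identifiers, the dictionary-based tables, and the "scale" field are all assumptions, since the patent names the stages but not their data formats:

```python
def calibrate(calibration_stats):
    """Statistics stage: derive original quantization info per parameter id
    from the model's processing results on the calibration image set."""
    return {pid: {"scale": s["max_abs"] / 127.0}
            for pid, s in calibration_stats.items()}

def fake_quant_pass(quant_info, required):
    """Simulated processing stage: record change info wherever the original
    info fails the preset condition (here: it must equal a required value)."""
    return {pid: required[pid]
            for pid in quant_info
            if pid in required and quant_info[pid] != required[pid]}

def remap(quant_info, changes):
    """Mapping stage: replace non-conforming info with target info."""
    return {pid: changes.get(pid, info) for pid, info in quant_info.items()}

stats = {"conv1.weight": {"max_abs": 2.54}, "conv1.out": {"max_abs": 6.35}}
required = {"conv1.out": {"scale": 0.1}}  # e.g. output scale must match the next layer
info = calibrate(stats)
changes = fake_quant_pass(info, required)
info = remap(info, changes)
```

After the remap, every entry that participates in fixed-point calculation satisfies the processing procedure's requirement, which is the matching property the claims describe.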
In order to solve the above problem, an embodiment of the present application discloses a target detection apparatus, which is applied to an embedded neural network processor, and the apparatus includes:
the receiving module is used for receiving an image to be detected;
the processing module is used for processing the image to be detected according to the target detection model so as to obtain a corresponding detection result; in the process of processing the image to be detected, integer calculation is carried out according to the fixed point calculation result corresponding to the parameters of the network layer in the target detection model;
the output module is used for outputting the detection result;
wherein the fixed-point calculation result is determined by: determining, according to the processing result of the target detection model on a calibration image set, original quantization information corresponding to a parameter to be quantized of a network layer; determining change information of the quantization information corresponding to the parameter to be quantized of the network layer in the course of processing a preset image with the target detection model, the change information being used to modify original quantization information that does not meet a preset condition into target quantization information that does meet the preset condition, where the preset condition represents the requirement of the target detection model's processing procedure on quantization information; establishing, according to the change information, a mapping relation between a parameter identifier and the target quantization information; mapping, according to the mapping relation, the quantization information of the parameter to be quantized of the network layer to the target quantization information; and performing fixed-point calculation according to the target quantization information of the parameter to be quantized of the network layer.
Optionally, the processing result includes: floating point numbers corresponding to the first parameters of the network layer under the condition of calibrating the image set;
the determining original quantization information of the parameter to be quantized of the network layer includes:
counting floating point numbers corresponding to the first parameters of the network layer under the condition of calibrating the image set to obtain corresponding statistical results;
and determining the original quantitative information of the second parameter of the network layer according to the statistical result.
Optionally, the determining the variation information of the parameter to be quantized of the network layer includes:
in the process of processing the preset image by using the target detection model, if the original quantization information corresponding to the parameter to be quantized of the network layer does not meet the preset condition, the change information of the parameter to be quantized of the network layer is generated.
Optionally, the determining original quantization information of a parameter to be quantized of a network layer includes:
establishing a first mapping table; the first mapping table includes: and mapping relation between the parameter identification corresponding to the parameter to be quantized of the network layer and the original quantization information.
Optionally, the parameter identification includes: parameter memory addresses, or parameter indexes or parameter meaning information corresponding to the parameter memory addresses.
Optionally, the mapping relationship between the parameter identifier and the target quantization information includes:
mapping relation between parameter index of parameter to be quantized and target quantization information; or alternatively
Mapping relation between a first parameter index of the parameter to be quantized and a second parameter index of the target quantization information; or
And mapping relation between the first parameter meaning information of the parameter to be quantized and the second parameter meaning information corresponding to the source parameter of the target quantization information.
Optionally, the mapping the quantization information of the parameter to be quantized of the network layer to the target quantization information includes:
according to the parameter index of the parameter to be quantized, searching in the mapping relation between the parameter index of the parameter to be quantized and the target quantization information to obtain the target quantization information corresponding to the parameter to be quantized; or alternatively
According to the parameter index of the parameter to be quantized, searching in the mapping relation between the first parameter index of the parameter to be quantized and the second parameter index of the target quantization information to obtain the target parameter index of the target quantization information corresponding to the parameter to be quantized; determining a target parameter memory address corresponding to the target parameter index; determining target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relation between the parameter memory address and the original quantization information; or
According to first parameter meaning information of a parameter to be quantized, searching in a mapping relation C between the first parameter meaning information of the parameter to be quantized and second parameter meaning information corresponding to a source parameter of target quantization information to obtain target parameter meaning information of the target quantization information corresponding to the parameter to be quantized; determining a target parameter index corresponding to the target parameter meaning information according to the mapping relation between the parameter index corresponding to the parameter to be quantized and the parameter meaning information; determining a target parameter memory address corresponding to the target parameter index; and determining the target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relation between the parameter memory address and the original quantization information.
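The longest lookup chain above (parameter meaning → parameter index → memory address → quantization info) can be illustrated as follows; every table entry and name here is invented for the example:

```python
# Hypothetical tables; the patent names these mapping relations but not their formats.
meaning_map_c = {"layer3.input": "layer2.output"}           # mapping relation C
meaning_to_index = {"layer2.output": 7, "layer3.input": 9}  # meaning -> parameter index
index_to_address = {7: 0x1000, 9: 0x1040}                   # index -> memory address
address_to_info = {0x1000: {"scale": 0.05},                 # address -> original info
                   0x1040: {"scale": 0.2}}

def lookup_target_info(meaning):
    """Resolve target quantization info:
    meaning -> (relation C) -> index -> address -> info."""
    target_meaning = meaning_map_c.get(meaning, meaning)
    idx = meaning_to_index[target_meaning]
    addr = index_to_address[idx]
    return address_to_info[addr]

# layer3.input takes over layer2.output's quantization info via relation C
info = lookup_target_info("layer3.input")
```

The effect is that `layer3.input`'s own info (`{"scale": 0.2}`) is bypassed in favor of the source parameter's info, which is what "mapping to target quantization information" accomplishes.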
Optionally, searching for a mapping relationship C between first parameter meaning information of the parameter to be quantized and second parameter meaning information corresponding to a source parameter of the target quantization information includes:
executing N times of queries aiming at the mapping relation C; n is a positive integer;
and under the condition that the parameter meaning information matched with the parameter meaning information obtained by the (N-1) th query does not exist in the Nth query representation mapping relation C, taking the (N-1) th parameter meaning information as target parameter meaning information.
In order to solve the above problem, an embodiment of the present application discloses a quantization apparatus for a target detection model, where the apparatus is configured to quantize a parameter to be quantized in a network layer in the target detection model, and the apparatus includes:
a receiving module for receiving a set of calibration images;
the original quantization information determining module is used for determining original quantization information corresponding to the parameter to be quantized of the network layer according to the processing result of the target detection model on the calibration image set;
a change information determining module, configured to determine change information of the quantization information corresponding to the parameter to be quantized of the network layer in the course of processing a preset image with the target detection model; the change information is used to modify original quantization information that does not meet a preset condition into target quantization information that does meet the preset condition; the preset condition represents the requirement of the target detection model's processing procedure on quantization information;
the first establishing module is used for establishing a mapping relation between the parameter identification and the target quantization information according to the change information;
the mapping module is used for mapping the quantization information of the parameters to be quantized of the network layer into target quantization information according to the mapping relation;
the fixed-point calculation module is used for carrying out fixed-point calculation according to the target quantization information of the parameter to be quantized of the network layer;
and the storage module is used for storing the fixed-point calculation result corresponding to the parameter to be quantized so as to perform integer calculation in the process of processing the image to be detected in the environment of the embedded neural network processor.
Optionally, the processing result includes: floating point numbers corresponding to the first parameters of the network layer under the condition of calibrating the image set;
the original quantization information determination module includes:
the statistical module is used for counting floating point numbers corresponding to the first parameter of the network layer under the condition of the calibration image set to obtain a corresponding statistical result;
and the parameter determining module is used for determining the original quantitative information of the second parameter of the network layer according to the statistical result.
Optionally, the change information determining module is specifically configured to, in a process of processing a preset image by using the target detection model, generate change information of a parameter to be quantized of the network layer if original quantization information corresponding to the parameter to be quantized of the network layer does not meet a preset condition.
Optionally, the original quantization information determining module includes:
the second establishing module is used for establishing a first mapping table; the first mapping table includes: and mapping relation between the parameter identification corresponding to the parameter to be quantized of the network layer and the original quantization information.
Optionally, the parameter identification includes: parameter memory addresses, or parameter indexes or parameter meaning information corresponding to the parameter memory addresses.
Optionally, the mapping relationship between the parameter identifier and the target quantization information includes:
mapping relation between parameter index of parameter to be quantized and target quantization information; or
Mapping relation between a first parameter index of the parameter to be quantized and a second parameter index of the target quantization information; or
And mapping relation between the first parameter meaning information of the parameter to be quantized and the second parameter meaning information corresponding to the source parameter of the target quantization information.
Optionally, the mapping module includes:
the first mapping module is used for searching a mapping relation between the parameter index of the parameter to be quantized and the target quantization information according to the parameter index of the parameter to be quantized so as to obtain the target quantization information corresponding to the parameter to be quantized; or
The second mapping module is used for searching the mapping relation between the first parameter index of the parameter to be quantized and the second parameter index of the target quantization information according to the parameter index of the parameter to be quantized so as to obtain the target parameter index of the target quantization information corresponding to the parameter to be quantized; determining a target parameter memory address corresponding to the target parameter index; determining target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relation between the parameter memory address and the original quantization information; or
The third mapping module is used for searching a mapping relation C between the first parameter meaning information of the parameter to be quantized and second parameter meaning information corresponding to the source parameter of the target quantization information according to the first parameter meaning information of the parameter to be quantized so as to obtain target parameter meaning information of the target quantization information corresponding to the parameter to be quantized; determining a target parameter index corresponding to the target parameter meaning information according to the mapping relation between the parameter index corresponding to the parameter to be quantized and the parameter meaning information; determining a target parameter memory address corresponding to the target parameter index; and determining the target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relation between the parameter memory address and the original quantization information.
Optionally, the third mapping module includes:
the query module is used for executing N times of queries aiming at the mapping relation C; n is a positive integer;
and the target parameter meaning information determining module is used for taking the parameter meaning information of the (N-1) th time as the target parameter meaning information under the condition that the parameter meaning information matched with the parameter meaning information obtained by the (N-1) th time query does not exist in the Nth time query representation mapping relation C.
The embodiment of the application also discloses an electronic device, which comprises: a processor; and a memory having executable code stored thereon that, when executed, causes the processor to perform a method as described in embodiments of the present application.
The embodiment of the application also discloses a machine-readable medium, wherein executable codes are stored on the machine-readable medium, and when the executable codes are executed, a processor is caused to execute the method according to the embodiment of the application.
The embodiment of the application has the following advantages:
after the statistical link determines the original quantization information corresponding to the parameter to be quantized of the network layer, the embodiment of the application further comprises a forging processing link for processing the preset image by using the target detection model and a mapping link for mapping the quantization information of the parameter to be quantized of the network layer into the target quantization information. The method comprises the steps that a target detection model is used for detecting an image to be detected, wherein a forgery processing link corresponds to an actual processing link for processing the image to be detected by using the target detection model; in the counterfeit processing link, the change information of the quantization information corresponding to the parameter to be quantized of the network layer can be determined, and the mapping relation between the parameter identifier and the target quantization information is established according to the change information. In the mapping link, the quantization information of the parameter to be quantized of the network layer is mapped into target quantization information according to the mapping relation.
The forgery processing link of the embodiment of the application establishes the mapping relation between the parameter identification and the target quantization information, and the mapping relation comprises the information of the target quantization information which is corresponding to one parameter identification and meets the preset condition; in this way, the mapping link maps the quantization information of the parameter to be quantized of the network layer into the target quantization information meeting the preset condition, so that the target quantization information participating in fixed-point calculation can meet the preset condition. The preset condition can be used for representing the requirement of the processing process of the target detection model on the quantitative information, so that the matching between the parameters corresponding to the fixed-point calculation result and the parameters required by the processing process can be realized, and the processing performance of the target detection model can be improved.
Embodiments of the present application can store the fixed-point calculation result corresponding to the parameter to be quantized, so that integer calculation can be performed while processing the image to be detected in an NPU (neural-network processing unit) environment. Because the image to be detected can be processed using the fixed-point calculation result in integer form, the share of integer calculation in the target detection model's processing can be increased, further reducing the storage and computing resources consumed by the target detection model and improving its processing efficiency. For example, the calculations in the processing of the target detection model may include integer calculations such as integer multiplication, integer addition, and integer shifting; because the processing procedure then does not need to involve floating-point calculation, the storage and computing resources consumed by the target detection model are reduced and its processing efficiency is improved.
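Integer multiplication, addition, and shifting as mentioned above are commonly combined to rescale accumulators without any floating-point arithmetic. This sketch shows a generic fixed-point requantization idiom, not the patent's specific method; the multiplier and shift values are illustrative:

```python
def requantize(acc, multiplier, shift):
    """Rescale an integer accumulator to the int8 range using only integer
    multiply, add, and shift; approximates scaling by multiplier / 2**shift
    (shift is assumed to be >= 1)."""
    rounded = (acc * multiplier + (1 << (shift - 1))) >> shift  # round half up
    return max(-128, min(127, rounded))                         # clamp to int8

# approximate a floating-point scale of about 0.0047 as 77 / 2**14
out = requantize(1000, 77, 14)
```

On an NPU, the multiplier and shift are computed once offline from the stored quantization information, so the online path touches only integers, which is exactly what makes the stored fixed-point calculation results useful at inference time.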
Drawings
FIG. 1 is a schematic structural diagram of an object detection model according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of the network layer structure in the target detection model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of the network layer structure of the residual error network according to one embodiment of the present application;
FIG. 4 is a flow chart illustrating steps of a method for quantifying an object detection model according to an embodiment of the present application;
FIG. 5 is a flow chart illustrating steps of a target detection method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an apparatus for quantizing a target detection model according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the structure of an object detection device according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an apparatus provided in an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
The neural network is a machine learning technology which simulates the neural network of the human brain so as to realize artificial intelligence, and is the basis of deep learning. The neural network model is a mathematical model based on a neural network, can be applied to the artificial intelligence fields of computer vision, voice recognition and the like, and is used for processing images so as to complete tasks in the artificial intelligence field. Tasks in the field of artificial intelligence may include: computer vision tasks, speech recognition tasks, etc., which may further include: an image classification task, a target detection task, etc. The image classification task can classify the image or pixel points or regions in the image into one of a plurality of categories; the target detection task can detect whether the image to be detected contains targets such as pedestrians and vehicles, and if yes, position information of the targets can be given.
The neural network model may include: and a plurality of network layers, wherein different network layers can be connected with each other. The parameters of the network layer may be the parameters of the network layer itself, or may be related parameters related to the parameters of the network layer. The parameters of the network layer may include: at least one of a weight parameter, an input parameter, and an output parameter.
A network layer may include a plurality of neurons. The weight parameter may characterize the strength of the connection between neurons. The input parameter may refer to a parameter corresponding to an input feature provided by a network layer. The output parameter may refer to a parameter corresponding to an output characteristic of one network layer.
The neural network model of the embodiment of the application can be a target detection model, the target detection model can extract the feature representation of the image to be detected, whether the image to be detected contains the targets such as pedestrians is detected according to the feature representation, and if yes, the position information of the targets such as the pedestrians can be given. The targets may include: moving objects such as pedestrians, vehicles, animals and the like, it can be understood that the embodiment of the present application is not limited to the specific target to be detected.
Referring to fig. 1, a schematic structural diagram of a target detection model according to an embodiment of the present application is shown, where the target detection model specifically includes: a feature extraction unit 101, a feature fusion unit 102, and a detection unit 103.
The feature extraction unit 101 may be configured to perform feature extraction on an image to be detected. The feature extraction unit 101 may be configured to receive an image to be detected, and extract an image feature of the image from the image to be detected, where the image feature may refer to a deep image feature.
The feature fusion unit 102 can fuse the image features extracted by the feature extraction unit 101 to obtain fused image features, which can improve the diversity of the features and the performance of the target detection model.
The detection unit 103 is configured to perform target detection according to the feature of the fused image output by the feature fusion unit 102 to obtain a corresponding detection result.
The feature extraction unit 101 may be a backbone network, which may include: VGG (Visual Geometry Group network), ResNet (residual network), a lightweight network, and the like. It is understood that, in the embodiment of the present application, the specific network corresponding to the feature extraction unit 101 is not limited.
Wherein the residual network may be a convolutional network. A convolutional network is a deep feedforward artificial neural network with good performance in image recognition. The convolutional network may specifically include a convolutional layer and a pooling layer. The convolutional layer is used to automatically extract features from an input image to obtain a feature map. The pooling layer is used to pool the feature map to reduce the number of features in the feature map. The pooling operations of the pooling layer may include: maximum pooling, average pooling, random pooling, and the like, which may be selected according to actual needs.
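As a hedged illustration of the pooling operation described above (the feature-map values here are arbitrary examples, not taken from the patent), 2x2 maximum pooling can be sketched as:

```python
import numpy as np

# Minimal sketch of 2x2 max pooling on a 4x4 feature map: each
# non-overlapping 2x2 window is reduced to its maximum value, which
# reduces the number of features in the feature map by a factor of 4.
fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 5],
                 [0, 1, 3, 2],
                 [2, 6, 0, 1]], dtype=np.float32)

h, w = fmap.shape
# Reshape so that axes 1 and 3 index positions inside each 2x2 window,
# then take the maximum over those axes.
pooled = fmap.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
# pooled:
# [[4. 5.]
#  [6. 3.]]
```

Average pooling would replace `.max(axis=(1, 3))` with `.mean(axis=(1, 3))`.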
Referring to fig. 2, a schematic diagram of a structure of a network layer in a target detection model according to an embodiment of the present application is shown, where the network layer in fig. 2 may be an example of a component of the target detection model, and should not be construed as a limitation to the structure of the network layer in the target detection model, for example, the structure of the network layer in the embodiment of the present application may be a packet convolution network structure, or a residual error network structure, etc. The network layers in fig. 2 may include: a first network layer 201, a second network layer 202, a third network layer 203, a fourth network layer 204, a fifth network layer 205, and a sixth network layer 206.
Existing quantization methods generally use the output parameter of the ith (i may be a positive integer) network layer as the input parameter of the (i + 1)th network layer. For example, the input parameter of the third network layer 203 in fig. 2 may be referred to as the third input parameter; an existing quantization method uses the output parameter (second output parameter) of the second network layer 202 as the third input parameter, performs fixed-point calculation on the floating point number of the second output parameter, and stores the obtained fixed-point calculation result A as the fixed-point calculation result of the third network layer 203.
In practical applications, the processing procedure of the object detection model may require the output parameter (first output parameter) of the first network layer 201 as the third input parameter. In this case, since the second output parameter corresponding to the fixed-point calculation result a does not match the first output parameter required by the processing procedure, applying the fixed-point calculation result a to the processing procedure of the network layer will result in a decrease in the processing performance of the target detection model.
The embodiment of the present application provides a method for quantifying a target detection model, which may specifically include: according to the processing result of the target detection model on the calibration image set, determining original quantization information corresponding to the parameter to be quantized of the network layer; in the process of processing a preset image by using a target detection model, determining the change information of quantization information corresponding to the parameter to be quantized of a network layer; the change information is used for modifying the original quantization information which does not accord with the preset condition into target quantization information which accords with the preset condition; the preset condition can be used for representing the requirement of the processing process of the target detection model on quantitative information; according to the change information, establishing a mapping relation between the parameter identification and the target quantization information; according to the mapping relation, the quantization information of the parameters to be quantized of the network layer is mapped into target quantization information; performing fixed-point calculation according to target quantization information of a parameter to be quantized of a network layer; and storing a fixed-point calculation result corresponding to the parameter to be quantized so as to perform integer calculation in the process of processing the image to be detected in the environment of the embedded neural network processor.
In the conventional technology, after original quantization information corresponding to a parameter to be quantized of a network layer is determined, fixed-point calculation is performed according to the original quantization information to obtain a corresponding fixed-point calculation result a. Since the original quantization information corresponding to the fixed-point calculation result a does not match the parameters required by the processing procedure, applying the fixed-point calculation result a to the processing procedure of the network layer will result in a decrease in the processing performance of the target detection model.
According to the quantization method of the embodiment of the application, after the original quantization information corresponding to the parameters to be quantized of the network layer is determined in the statistics link, a forgery processing link, in which a preset image is processed by the target detection model, and a mapping link, in which the quantization information of the parameters to be quantized of the network layer is mapped into target quantization information, are added. The forgery processing link corresponds to the actual processing link of processing an image to be detected with the target detection model. In the forgery processing link, during the processing of the preset image by the target detection model, the change information of the quantization information corresponding to the parameters to be quantized of the network layer is determined, and the mapping relation between the parameter identifier and the target quantization information is established according to the change information. In the mapping link, the quantization information of the parameters to be quantized of the network layer is mapped into target quantization information according to the mapping relation.
The forgery processing link of the embodiment of the application establishes the mapping relation between the parameter identifier and the target quantization information; the mapping relation records, for a parameter identifier, the target quantization information that meets the preset condition. In this way, the mapping link maps the quantization information of the parameter to be quantized of the network layer into target quantization information meeting the preset condition, so that the target quantization information participating in fixed-point calculation can meet the preset condition. Since the preset condition can be used to represent the requirement of the processing process of the target detection model on the quantization information, the parameters corresponding to the fixed-point calculation result can be matched with the parameters required by the processing process, and the processing performance of the target detection model can be improved.
Taking fig. 2 as an example, the original quantization information uses the output parameter of the second network layer 202 as the input parameter of the third network layer 203, and the processing procedure of the target detection model requires the output parameter of the first network layer 201 as the input parameter of the third network layer 203. In the embodiment of the present application, a forgery processing link and a mapping link are added in the quantization process, and the input parameter of the third network layer 203 can be mapped from the output parameter of the second network layer 202 to the output parameter of the first network layer 201, so that the parameter corresponding to the fixed point calculation result can be matched with the parameter required by the processing process.
Referring to fig. 3, a schematic structural diagram of a network layer of a residual error network according to an embodiment of the present application is shown, where the network layer in fig. 3 may be an example of a component of the residual error network, and the network layer in fig. 3 may include: the device comprises an activation module, a first convolution module, a second convolution module, a third convolution module and a fourth convolution module.
In this embodiment, the original quantization information uses the output parameter (activated output parameter) of the activation module as the input parameter of the fourth convolution module, and the processing procedure of the target detection model requires that the input parameter (activated input parameter) of the activation module is used as the input parameter of the fourth convolution module. In the embodiment of the application, a forgery processing link and a mapping link are added in the quantization process, and the input parameter of the fourth convolution module can be mapped into the activation input parameter, so that the parameter corresponding to the fixed point calculation result can be matched with the parameter required by the processing process.
The NPU of the embodiment of the application adopts a data-driven parallel computing architecture, simulates human neurons and synapses at the circuit level to implement artificial intelligence computation, and can improve the processing efficiency of multimedia data such as images.
Method embodiment one
Referring to fig. 4, a schematic flow chart illustrating steps of a quantization method of a target detection model according to an embodiment of the present application is shown, where the method may be used to quantize a parameter to be quantized of a network layer in the target detection model, and specifically may include the following steps:
step 401, receiving a calibration image set;
step 402, determining original quantization information corresponding to a parameter to be quantized of a network layer according to a processing result of a target detection model on a calibration image set;
step 403, determining the variation information of the quantization information corresponding to the parameter to be quantized of the network layer in the process of processing the preset image by using the target detection model; the change information is used for modifying the original quantization information which does not accord with the preset condition into target quantization information which accords with the preset condition; the preset condition represents the requirement of the processing process of the target detection model on quantitative information; the preset image may be any one of all images in the set of calibration images.
Step 404, according to the change information, establishing a mapping relation between a parameter identifier and target quantization information, wherein the parameter identifier can be used for identifying a parameter to be quantized of a network layer;
step 405, according to the mapping relation, mapping the quantization information of the parameter to be quantized of the network layer into target quantization information;
step 406, performing fixed-point calculation according to the target quantization information of the parameter to be quantized of the network layer;
step 407, storing a fixed-point calculation result corresponding to the parameter to be quantized, so as to perform integer calculation in the process of processing the image to be detected in the environment of the embedded neural network processor.
The embodiment of the method shown in fig. 4 is used for quantizing parameters to be quantized of a network layer in a target detection model. The execution subject of the first embodiment of the method shown in fig. 4 may be an operation framework of the object detection model, and the operation framework may support training and processing of the object detection model. In practical application, the running framework may load data of the target detection model, and quantize parameters of the target detection model in a training process, or quantize parameters of the target detection model after training is completed.
In the embodiment of the present application, the quantization may be a readjustment of the numerical range. Assuming that r represents a floating point number of a parameter to be quantized of the network layer and q represents a quantized fixed point integer, the mapping of the floating point number to the fixed point integer may be expressed as:
q = round(r / s) + z
(1)
where s represents the parameter of the proportional relationship between floating point numbers and integers, also called scale; z represents the zero point parameter, i.e., the fixed-point integer to which the floating point number 0 is mapped; round() represents a rounding function.
Wherein, scale and zero point are calculated as follows:
s = (r_max - r_min) / (q_max - q_min)
(2)
z = round(q_max - r_max / s)
(3)
where r_max and r_min respectively represent the maximum and minimum values of the floating point number, and q_max and q_min respectively represent the maximum and minimum values of the quantized fixed-point integer.
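As a hedged illustration (a sketch only, assuming an asymmetric 8-bit range of -128 to 127; the function names are not from the patent), equations (1) to (3) can be expressed as:

```python
import numpy as np

# Sketch of the quantization mapping of equations (1)-(3), assuming
# asymmetric 8-bit quantization with q_min = -128 and q_max = 127.
def compute_qparams(r_max, r_min, q_max=127, q_min=-128):
    s = (r_max - r_min) / (q_max - q_min)  # equation (2): scale
    z = round(q_max - r_max / s)           # equation (3): zero point
    return s, z

def quantize(r, s, z, q_max=127, q_min=-128):
    # equation (1): map floating point numbers to fixed-point integers,
    # clamped to the representable range
    q = np.clip(np.round(r / s) + z, q_min, q_max)
    return q.astype(np.int8)

s, z = compute_qparams(r_max=6.0, r_min=-2.0)
q = quantize(np.array([-2.0, 0.0, 6.0]), s, z)
```

Note that the floating point number 0 maps to the zero point z, consistent with the definition of z above.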
The parameter to be quantized of the network layer may be the parameter of the network layer itself, or may be a related parameter related to the parameter of the network layer. scale may be one of the relevant parameters, called the proportional relation parameter. According to the embodiment of the application, the matching between the parameters corresponding to the fixed-point calculation result and the parameters required by the processing process can be realized aiming at the parameters such as the proportional relation parameters. For convenience of description, parameters such as the weight parameter, the input parameter, and the output parameter are referred to as a first parameter, and parameters such as scale and zero point are referred to as a second parameter.
In step 401, the set of calibration images may be used to count the maximum and minimum values of the floating point number of the first parameter. The set of calibration images may include: a plurality of calibration images associated with the task. It is understood that the embodiments of the present application do not impose limitations on the specific calibration images. The path of the calibration image set and the number of calibration images in the calibration image set may be saved as configuration parameters.
The embodiment of the application can utilize the target detection model to process the calibration image set, and the obtained processing result can include: the first parameter of the network layer corresponds to a floating point number under the condition of the calibration image set. Floating point numbers typically occupy 32 bits.
The step 402 of determining the original quantization information of the parameter to be quantized of the network layer may specifically include: counting floating point numbers corresponding to the first parameters of the network layer under the condition of calibrating the image set to obtain corresponding statistical results; and determining the original quantitative information of the second parameter of the network layer according to the statistical result.
The statistical result may include: maximum and minimum values of the weight parameters or input parameters or output parameters of the network layer that need to be quantized. Further, the original quantization information of scale can be obtained according to the formula (2) and the original quantization information of zero point can be obtained according to the formula (3) according to the configuration parameters.
The configuration parameters may include: quantization mode parameters such as quantization bit parameter, symmetric quantization, asymmetric quantization, and the like. The quantization bit parameter may represent the number of bits occupied by the quantized fixed point calculation result, such as 8 bits, and represents quantization to 8 bits. In the case of the symmetric quantization method, the 0 point of the floating point number is mapped to the 0 point of the fixed point calculation result. In the case of the asymmetric quantization method, the 0 point of the floating-point number is not mapped to the 0 point of the fixed-point calculation result. For example, in the case of a symmetric quantization approach, the range of the fixed point calculation result may be: -127 to 127.
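The two quantization modes described above can be sketched as follows (a hedged example assuming 8-bit quantization; in the symmetric mode the zero point is fixed at 0 and the fixed-point range is -127 to 127, while in the asymmetric mode the full range -128 to 127 is used):

```python
# Sketch of symmetric vs. asymmetric 8-bit quantization parameters.
def qparams(r_min, r_max, symmetric=True):
    if symmetric:
        # Floating point 0 maps to fixed point 0; range is -127..127.
        s = max(abs(r_min), abs(r_max)) / 127.0
        return s, 0
    # Asymmetric: zero point derived from the statistics; range -128..127.
    s = (r_max - r_min) / 255.0   # q_max - q_min = 127 - (-128)
    z = round(127 - r_max / s)
    return s, z

print(qparams(-2.0, 6.0, symmetric=True))   # zero point is 0
print(qparams(-2.0, 6.0, symmetric=False))
```

The choice between the two modes would be stored in the quantization mode parameter of the configuration parameters.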
The embodiment of the application can also establish a first mapping table; the first mapping table may include: and mapping relation between the parameter identification corresponding to the parameter to be quantized of the network layer and the original quantization information.
The parameter identifier may be used to uniquely identify a parameter to be quantized of the network layer, and the parameter identifier may correspond to a parameter to be quantized (hereinafter referred to as a parameter to be quantized). The parameter identification may include: parameter memory addresses, parameter indexes corresponding to the parameter memory addresses, parameter meaning information and the like. The parameter meaning information may include: network layer identification, parameter type and parameter number. The network layer identifier may characterize the network layer identifier of the network layer where the parameter identifier is located. The network layer identification may be information such as a name of the network layer. The parameter type may characterize an input type or an output type. The parameter number may characterize the number of inputs or outputs of the network layer where the parameter is located.
The raw quantization information may characterize the values of the parameters to be quantized, which are obtained based on the statistical step of step 402. The original quantization information of the embodiment of the present application may include: scale and zero point, etc. The original quantization information may include: the maximum and minimum values of the floating-point number of the second parameter.
Referring to table 1, an example of a first mapping table of one embodiment of the present application is shown. The parameter identifier may be a parameter memory address, and the original quantization information may include: the maximum and minimum values of the floating point number. The embodiment of the application can adopt a key value pair (key, value) format to store the mapping relation between the parameter identifier and the original quantization information, wherein the key corresponds to the parameter identifier, and the original quantization information corresponds to the value.
The source parameter for the second floating-point number may be layer2_scale_out. The source parameter may characterize the source of the second parameter; for example, the source parameter layer2_scale_out may represent that the scale is derived from an output parameter of network layer 2. It will be appreciated that the source parameter is used to illustrate the original quantization information, such as the second floating point number, and is not part of table 1.
TABLE 1
[Table 1: key-value mapping in which each key is a parameter memory address and each value is the original quantization information, such as the maximum and minimum floating-point values of the parameter]
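The key-value storage described above can be sketched as follows (a hedged example; the memory addresses and floating-point values are hypothetical placeholders, not values from the patent):

```python
# Sketch of the first mapping table in key-value (key, value) form:
# the key is the parameter identifier (here a parameter memory address)
# and the value is the original quantization information (maximum and
# minimum floating-point values).
first_mapping_table = {
    0x7F00A000: {"max": 6.0, "min": -2.0},  # e.g. a weight parameter
    0x7F00A040: {"max": 3.5, "min": 0.0},   # e.g. layer2_scale_out
}

info = first_mapping_table[0x7F00A000]
```

Any later step can then recover the original quantization information of a parameter by a single lookup on its memory address.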
The embodiment of the present application may further establish a mapping relationship from the parameter memory address to the parameter index, where the mapping relationship is shown in table 2. In practical application, the parameter memory address to be quantized of at least one network layer in the target detection model can be traversed according to the parameter structure information of the network layer in the target detection model, and the mapping relation from the parameter memory address to the parameter index is established according to the traversing result. For example, the parameter index corresponding to the first parameter memory address of the first network layer in the target detection model is index 0, and the parameter index corresponding to the second parameter memory address of the first network layer in the target detection model is index 1. For another example, a parameter index corresponding to the last parameter memory address of the jth (j may be a positive integer) network layer in the target detection model is an index M, and a parameter index corresponding to the first parameter memory address of the (j + 1) th network layer in the target detection model is an index (M + 1), and so on. According to the embodiment of the application, the parameter indexes can be arranged according to the sequence of the network layers from low to high. It can be understood that, in the embodiment of the present application, no limitation is imposed on a specific parameter index corresponding to a parameter memory address.
TABLE 2
[Table 2: mapping from parameter memory addresses to parameter indexes, with indexes arranged in the low-to-high order of the network layers]
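The traversal described above can be sketched as follows (a hedged example; the layer names and memory addresses are hypothetical placeholders):

```python
# Sketch of building the address-to-index mapping of table 2 by
# traversing the parameter memory addresses of each network layer in
# low-to-high layer order and assigning consecutive indexes.
layers = [
    {"name": "layer1", "param_addrs": [0x1000, 0x1040]},
    {"name": "layer2", "param_addrs": [0x2000]},
]

addr_to_index = {}
for layer in layers:                              # low-to-high layer order
    for addr in layer["param_addrs"]:
        addr_to_index[addr] = len(addr_to_index)  # next free index

# addr_to_index: {0x1000: 0, 0x1040: 1, 0x2000: 2}
```

This matches the description above: the last index of layer j is immediately followed by the first index of layer j + 1.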
The embodiment of the application can also establish a second mapping table according to the parameter structure information of the network layer; the second mapping table may include: and mapping relation between parameter indexes corresponding to the parameters to be quantized of the network layer and the parameter meaning information.
Referring to table 3, a schematic diagram of a mapping relationship between a parameter index corresponding to a parameter to be quantized in a network layer and parameter meaning information is shown in this embodiment of the present application. The parameter meaning information may include: network layer identification, parameter type (input or output) and parameter number (input or output).
TABLE 3
[Table 3: mapping from parameter indexes to parameter meaning information, i.e. network layer identifier, parameter type, and parameter number]
In step 403, at least one preset image may be processed by using the target detection model to implement a forgery process of the preset image. The counterfeit processing link corresponds to an actual processing link for processing an image to be detected by using the target detection model. The preset image may be any image, such as the preset image may be any one of a set of calibration images, or the preset image may be an image different from the calibration image.
In the embodiment of the application, in the process of processing the preset image by using the target detection model, if the original quantization information corresponding to the parameter to be quantized of the network layer does not meet the preset condition, the change information of the parameter to be quantized of the network layer can be generated. The change information is used to modify the original quantization information that does not meet the preset condition into target quantization information that meets the preset condition. For example, the source parameter corresponding to the original quantization information of index 0 is layer2_scale_out, and in the process of processing the preset image by using the target detection model, the quantization information corresponding to index 0 is modified so that the source parameter corresponding to the target quantization information is layer1_scale_out. It can be understood that, according to the actual application requirements, a person skilled in the art may modify the original quantization information into target quantization information meeting the preset condition, and the specific target quantization information is not limited in the embodiment of the present application.
It should be noted that quantization information corresponding to one parameter to be quantized may be changed once or multiple times. Examples of the change information of one change may include: the quantization information of the parameter to be quantized is changed into a source parameter a. Examples of the change information of the plurality of changes may include: the quantization information of the parameter to be quantized is first changed into a source parameter A and then changed into a source parameter B.
It should be noted that the target detection model may include: the original quantization information corresponding to the first part of the parameters to be quantized can be changed once or for multiple times, and the original quantization information corresponding to the second part of the parameters to be quantized can not be changed. Therefore, step 406 may perform fixed-point calculation according to the target quantization information corresponding to the first part of the parameters to be quantized and the original quantization information corresponding to the second part of the parameters to be quantized.
In addition, the parameter memory address can be used in both the statistical step 402 and the falsification processing step 403. Step 403 may also establish a mapping table similar to table 1, or table 2, or table 3.
However, for the same parameter, the statistical step of step 402 and the falsification processing step of step 403 may correspond to different parameter memory addresses. Since the parameter index and the parameter meaning information are obtained from the parameter structure information of the network layer in the target detection model, and that structure information remains fixed across the statistical step of step 402 and the falsification processing step of step 403, the same parameter corresponds to the same parameter index and the same parameter meaning information in both steps. Therefore, in the subsequent processing flow of the embodiment of the present application, the parameter index or the parameter meaning information may be used as the parameter identifier.
In step 404, a mapping relationship between the parameter identifier and the target quantization information may be established according to the variation information.
In one implementation, the change information may include: if the parameter memory address of the parameter to be quantized and the target quantization information are the same, the mapping relationship between the parameter identifier and the target quantization information may include mapping relationship a: and mapping relation between the parameter index of the parameter to be quantized and the target quantization information. The table 2 may be queried according to the parameter memory address of the parameter to be quantized in the change information, so as to obtain the parameter index of the parameter to be quantized.
Table 4 shows an example of a mapping relationship between the parameter index and the target quantization information according to an embodiment of the present application. The target quantization information corresponding to the index 0 may be a first target floating point number, where the first target floating point number may meet a preset condition.
TABLE 4
[Table 4: mapping from parameter indexes to target quantization information, e.g. index 0 mapped to the first target floating point number]
In another implementation, the change information may include: the mapping relationship between the parameter identifier and the target quantization information may include mapping relationship B: and mapping relation between the first parameter index of the parameter to be quantized and the second parameter index of the target quantization information. The table 2 may be looked up according to the memory address of the first parameter to obtain the index of the first parameter. Table 2 may be looked up according to the second parameter memory address to obtain the second parameter index.
Table 5 shows an example of a mapping relationship between a first parameter index of a parameter to be quantized and a second parameter index of target quantization information according to an embodiment of the present application. The target quantization information corresponding to the index 0 may be a first target floating point number, where the first target floating point number may meet a preset condition.
TABLE 5
[table image in the original publication: first parameter index of the parameter to be quantized → second parameter index of the target quantization information]
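Mapping relationship B and its later resolution can be sketched similarly; the reverse lookup from parameter index back to memory address, and all concrete values, are hypothetical illustrations.

```python
# Hypothetical sketch of mapping relationship B (all values illustrative).
table2 = {0x1000: 0, 0x1008: 1}                    # "Table 2": memory address -> parameter index
addr_by_index = {i: a for a, i in table2.items()}  # reverse lookup: index -> memory address
table5 = {0: 1}                                    # mapping B: first index -> second index
original_info = {0x1000: 0.125, 0x1008: 0.5}       # memory address -> original quantization info

def resolve_mapping_b(param_index: int) -> float:
    """Follow mapping B, then recover the target quantization information."""
    target_index = table5[param_index]         # first parameter index -> second parameter index
    target_addr = addr_by_index[target_index]  # target index -> target memory address
    return original_info[target_addr]          # address -> original quantization information
```

The extra hop through the memory address matches the mapping scheme B resolution described later in this section.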
In yet another implementation, the change information may likewise include the first parameter memory address of the parameter to be quantized and the second parameter memory address of the source parameter of the target quantization information. In this case, the mapping relationship between the parameter identifier and the target quantization information may include mapping relationship C: a mapping relation between the first parameter meaning information of the parameter to be quantized and the second parameter meaning information corresponding to the source parameter of the target quantization information. Table 2 may be queried according to the first parameter memory address to obtain the first parameter index, and Table 3 may be looked up according to the first parameter index to obtain the first parameter meaning information. Likewise, Table 2 may be queried according to the second parameter memory address to obtain the second parameter index, and Table 3 may be looked up according to the second parameter index to obtain the second parameter meaning information.
Table 6 shows an example of the mapping relationship between first parameter meaning information of a parameter to be quantized and second parameter meaning information corresponding to the source parameter of target quantization information according to an embodiment of the present application. The mapping relation between layer3-input1 and layer1-out1 may represent that the 1st input of network layer 3 is derived from the 1st output of network layer 1.
TABLE 6
[table image in the original publication: first parameter meaning information → second parameter meaning information, e.g. layer3-input1 → layer1-out1]
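Mapping relationship C can be sketched as a lookup keyed by parameter meaning information; only the layer3-input1 row mirrors Table 6, and every other name and value below is a hypothetical illustration.

```python
# Hypothetical sketch of mapping relationship C (only the layer3-input1 row
# mirrors Table 6; every other value is illustrative).
table6 = {"layer3-input1": "layer1-out1"}        # meaning info -> source meaning info
table3 = {"layer1-out1": 4, "layer3-input1": 7}  # "Table 3": meaning info -> parameter index
addr_by_index = {4: 0x2000, 7: 0x2018}           # parameter index -> memory address
original_info = {0x2000: 0.0625, 0x2018: 1.0}    # memory address -> original quantization info

def resolve_mapping_c(meaning: str) -> float:
    """Resolve parameter meaning information to its target quantization information."""
    source = table6[meaning]  # e.g. layer3-input1 draws its info from layer1-out1
    index = table3[source]    # meaning information -> parameter index
    return original_info[addr_by_index[index]]
```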
In practical applications, the quantization information of one parameter to be quantized may undergo a plurality of changes. In this case, a single data record may be kept in Table 6, and the second parameter meaning information of that record may correspond to the target quantization information at the end point of the chain of changes (the last change). Alternatively, a plurality of data records may be kept in Table 6, with one data record per change.
For example, the change information of a plurality of changes may be: the quantization information of parameter X is first changed to that of source parameter A and then changed to that of source parameter B. The single data record may then be the mapping relation between parameter X and parameter B, while the plurality of data records may be the mapping relation between parameter X and parameter A together with the mapping relation between parameter A and parameter B.
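The two record-keeping options above can be sketched as follows; the parameter names X, A and B come from the example, while the data layout is a hypothetical illustration.

```python
# Hypothetical sketch of the two record-keeping options for a parameter whose
# quantization information changes more than once (X first to A, then to B).
changes = [("X", "A"), ("X", "B")]  # chronological change records

# Option 1: a single data record per parameter, keeping only the end point.
single = {}
for param, source in changes:
    single[param] = source  # a later change overwrites the earlier one
# single is now {"X": "B"}

# Option 2: one data record per change, chained through intermediate parameters.
chained = {"X": "A", "A": "B"}
```

Option 1 makes each lookup a single query; option 2 defers the chain-following to lookup time, as described below for mapping scheme C.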
In step 405, the quantization information of the parameter to be quantized of the network layer is mapped into the target quantization information according to the mapping relationship, so that the parameters corresponding to the fixed-point calculation result can match the parameters required by the processing procedure.
Corresponding to mapping relationship A through mapping relationship C, the embodiments of the present application may respectively provide the following mapping schemes for mapping the quantization information of the parameter to be quantized of the network layer into the target quantization information:
the mapping scheme A is used for searching a mapping relation between the parameter index of the parameter to be quantized and the target quantization information according to the parameter index of the parameter to be quantized so as to obtain the target quantization information corresponding to the parameter to be quantized; or
According to the mapping scheme B, according to the parameter index of the parameter to be quantized, searching is carried out in the mapping relation between the first parameter index of the parameter to be quantized and the second parameter index of the target quantization information, so as to obtain the target parameter index of the target quantization information corresponding to the parameter to be quantized; determining a target parameter memory address corresponding to the target parameter index; determining target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relation between the parameter memory address and the original quantization information; or alternatively
According to the mapping scheme C, searching in the mapping relation between the first parameter meaning information of the parameter to be quantized and the second parameter meaning information corresponding to the source parameter of the target quantization information according to the first parameter meaning information of the parameter to be quantized so as to obtain the target parameter meaning information of the target quantization information corresponding to the parameter to be quantized; determining a target parameter index corresponding to the target parameter meaning information according to the mapping relation between the parameter index corresponding to the parameter to be quantized and the parameter meaning information; determining a target parameter memory address corresponding to the target parameter index; and determining the target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relation between the parameter memory address and the original quantization information.
In practical application, the parameter structure information of the network layer in the target detection model may be traversed, and mapping scheme C may be executed according to the parameter meaning information in the parameter structure information. Alternatively, the parameter structure information of the network layer in the target detection model may be traversed, Table 3 may be queried according to the parameter meaning information in the parameter structure information to determine the parameter index of the parameter to be quantized, and mapping scheme A or mapping scheme B may be executed according to the parameter index of the parameter to be quantized.
In a particular implementation, mapping scheme C may perform one query, or multiple queries, of mapping relationship C. An example of a single query: the parameter meaning information in the parameter structure information is layer3-input1, and Table 6 shows that the target quantization information of layer3-input1 is provided by layer1-out1; therefore, the target quantization information corresponding to layer3-input1 may be determined according to the original quantization information corresponding to layer1-out1.
An example of multiple queries: the parameter meaning information in the parameter structure information is parameter X, and Table 6 shows that the target quantization information of parameter X is provided by parameter A, whose target quantization information is in turn provided by parameter B; therefore, the target quantization information corresponding to parameter X may be determined according to the original quantization information of parameter B.
The multiple-query procedure may proceed as follows. In the first query, mapping relationship C between the first parameter meaning information of the parameter to be quantized and the second parameter meaning information corresponding to the source parameter of the target quantization information is searched according to the first parameter meaning information, so as to obtain the second parameter meaning information. The second query then judges whether third parameter meaning information matching the second parameter meaning information exists in mapping relationship C; if so, the third query judges whether fourth parameter meaning information matching the third parameter meaning information exists in mapping relationship C, and so on. When the Nth query indicates that no parameter meaning information matching that obtained by the (N-1)th query exists in mapping relationship C, the parameter meaning information of the (N-1)th query is taken as the target parameter meaning information.
Therefore, searching mapping relationship C between the first parameter meaning information of the parameter to be quantized and the second parameter meaning information corresponding to the source parameter of the target quantization information may specifically include: executing N queries against mapping relationship C, where N is a positive integer; when the Nth query indicates that no parameter meaning information matching that obtained by the (N-1)th query exists in mapping relationship C, the parameter meaning information of the (N-1)th query may be taken as the target parameter meaning information.
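The N-query procedure above can be sketched as a simple loop that follows the chain of source parameters until a query finds no further match; the chain values are hypothetical.

```python
# Hypothetical sketch of the N-query lookup: follow mapping relationship C
# until a query finds no further match (chain values are illustrative).
mapping_c = {"X": "A", "A": "B"}  # parameter meaning info -> source meaning info

def resolve_target_meaning(meaning: str) -> str:
    """Query mapping C repeatedly; the last matched meaning info is the target."""
    while meaning in mapping_c:       # query N: does a matching record exist?
        meaning = mapping_c[meaning]  # yes: follow one more link in the chain
    return meaning                    # no match: the result of query N-1 is the target
```

A production version would also need to guard against cycles in the chain, which the publication does not discuss.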
In step 406, a fixed-point calculation is performed according to the target quantization information of the parameter to be quantized of the network layer, so as to process the image to be detected according to the fixed-point calculation result corresponding to the parameter to be quantized of the network layer.
The fixed point calculation of step 406 may include: fixed point calculation of the second parameter.
It should be noted that the second parameters of the target detection model may fall into two parts: the original quantization information corresponding to a first part of the second parameters may be changed once or multiple times, while the original quantization information corresponding to a second part of the second parameters is not changed. Therefore, the fixed-point calculation of the second parameters in step 406 specifically includes: performing fixed-point calculation according to the target quantization information corresponding to the first part of the second parameters and the original quantization information corresponding to the second part of the second parameters.
The fixed-point calculation of the second parameter may be used to convert the floating point number of the second parameter into an integer, and may be performed according to Equation (4) and Equation (5).
M = Mo × 2^(-n) (4)
Mint = round(Mo × 2^k) (5)
wherein M may be any floating point number; Mo is an intermediate calculation value and may range over [0.5, 1]; n may be a positive integer; k may be the bit width used to represent Mo in fixed point; and Mint may be the fixed point calculation result.
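The decomposition in Equation (4) matches Python's math.frexp, which returns Mo in [0.5, 1) together with the exponent. The 15-bit rounding width below is an assumed value for illustration, since the publication does not state the bit width.

```python
import math

def to_fixed_point(m: float, frac_bits: int = 15):
    """Decompose m = mo * 2**e with mo in [0.5, 1), then round mo to an integer.

    frac_bits is an assumed precision; e is negative when m < 1, matching the
    document's M = Mo * 2**(-n) with n = -e a positive integer.
    """
    mo, e = math.frexp(m)                 # Eq. (4): m == mo * 2**e
    m_int = round(mo * (1 << frac_bits))  # Eq. (5): integer Mint approximating Mo
    return m_int, e

def mul_fixed_point(x: int, m_int: int, e: int, frac_bits: int = 15) -> int:
    """Multiply integer x by the original float m using only integer operations."""
    return (x * m_int) >> (frac_bits - e)

m_int, e = to_fixed_point(0.3)  # 0.3 == 0.6 * 2**-1
```

This is the same power-of-two rescaling trick used by common integer-only inference runtimes; the approximation error shrinks as frac_bits grows.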
The fixed point calculation of step 406 may further include: fixed-point calculation of the first parameter. For example, the fixed-point calculation of the first parameter may be performed using the fixed-point calculation result corresponding to the second parameter together with Equation (1). For another example, it may be performed using the fixed-point calculation result corresponding to the second parameter together with Equation (6).
Qo = (Si × Sw / So) × Σ(Qi − Zi)(Qw − Zw) + B / So + Zo (6)
Wherein i, w and o denote the input parameter, the weight parameter and the output parameter respectively; Ri denotes the floating point number of the input parameter and Rw the floating point number of the weight parameter; Qo denotes the fixed point calculation result of the output parameter, Qi that of the input parameter, and Qw that of the weight parameter. So denotes the proportional relation parameter of the output parameter, Sw that of the weight parameter, and Si that of the input parameter; So, Sw and Si are themselves fixed-point calculation results. B denotes the bias parameter; Zo denotes the fixed-point integer corresponding to the quantized floating point 0 in the output parameter, Zi the one in the input parameter, and Zw the one in the weight parameter.
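Under the common uniform-quantization convention r = S × (q − Z), which the symbols above suggest, the structure of Equation (6) can be sketched as follows; this is a generic illustration of the technique, not the publication's exact formula.

```python
def quantized_output(q_i, q_w, b, s_i, s_w, s_o, z_i, z_w, z_o):
    """Sketch of Eq. (6): fixed-point output Qo of one multiply-accumulate.

    Assumes the uniform-quantization convention r = S * (q - Z) for the input,
    weight and output parameters.
    """
    acc = sum((qi - z_i) * (qw - z_w) for qi, qw in zip(q_i, q_w))  # integer MAC
    real_out = s_i * s_w * acc + b      # real-valued pre-activation
    return round(real_out / s_o) + z_o  # requantize onto the output scale
```

In an integer-only deployment, the floating factor Si × Sw / So would itself be converted with Equations (4) and (5), so that the division by So becomes an integer multiply and shift.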
The embodiment of the application can store the fixed-point calculation result corresponding to the parameter to be quantified so as to perform integer calculation in the process of processing the image to be detected in the environment of the embedded neural network processor. The saved fixed point computation results may include: the fixed point calculation result of the second parameter and the fixed point calculation result of the first parameter, and the like.
In summary, in the quantization method of the target detection model in the embodiment of the present application, after the statistics step determines the original quantization information corresponding to the parameter to be quantized of the network layer, a simulated processing step, in which the target detection model processes a preset image, and a mapping step, in which the quantization information of the parameter to be quantized of the network layer is mapped to the target quantization information, are further added. The simulated processing step corresponds to the actual processing step in which the target detection model processes the image to be detected. In the simulated processing step, while the preset image is processed using the target detection model, the change information of the quantization information corresponding to the parameter to be quantized of the network layer is determined, and the mapping relationship between the parameter identifier and the target quantization information is established according to the change information. In the mapping step, the quantization information of the parameter to be quantized of the network layer is mapped into the target quantization information according to the mapping relationship.
The simulated processing step establishes a mapping relationship between the parameter identifier and the target quantization information, which records, for a given parameter identifier, the corresponding target quantization information meeting the preset condition. In this way, the mapping step maps the quantization information of the parameter to be quantized of the network layer into target quantization information meeting the preset condition, so that the target quantization information participating in the fixed-point calculation meets the preset condition. Since the preset condition represents the requirement of the processing procedure of the target detection model on quantization information, the parameters corresponding to the fixed-point calculation result can match the parameters required by the processing procedure, thereby improving the processing performance of the target detection model.
The embodiment of the application may save the fixed-point calculation result corresponding to the parameter to be quantized, so that integer calculation can be performed while the image to be detected is processed in the environment of an embedded neural network processor. Because the image to be detected can be processed using fixed-point calculation results in integer form, the range of integer calculation in the processing of the target detection model is increased, the storage and computing resources consumed by the target detection model are reduced, and its processing efficiency is improved. For example, the calculations in the processing of the target detection model may include integer calculations such as integer multiplication, integer addition and integer shifts; because the processing of the target detection model then involves no floating point calculation, the storage and computing resources consumed by the model are reduced and its processing efficiency is improved.
Method embodiment two
Referring to fig. 5, a schematic flow chart illustrating steps of a target detection method according to an embodiment of the present application is shown, where the method may be applied to an NPU, and specifically may include the following steps:
step 501, receiving an image to be detected;
step 502, processing an image to be detected according to a target detection model to obtain a corresponding detection result; in the process of processing the image to be detected, integer calculation is carried out according to the fixed point calculation result corresponding to the parameters of the network layer in the target detection model;
step 503, outputting the detection result;
wherein the determination process of the fixed point calculation result includes: according to the processing result of the target detection model on the calibration image set, determining original quantization information corresponding to the parameter to be quantized of the network layer; determining the change information of the quantization information corresponding to the parameter to be quantized of the network layer in the process of processing the preset image by using the target detection model; the change information is used for modifying the original quantization information which does not accord with the preset condition into target quantization information which accords with the preset condition; the preset condition represents the requirement of the processing process of the target detection model on quantitative information; according to the change information, establishing a mapping relation between the parameter identification and the target quantization information; according to the mapping relation, the quantization information of the parameters to be quantized of the network layer is mapped into target quantization information; and performing fixed-point calculation according to the target quantization information of the parameter to be quantized of the network layer.
The method embodiment shown in fig. 5 executes the actual processing link in which the image to be detected is processed using the target detection model.
Because the image to be detected can be processed using fixed-point calculation results in integer form during the processing of the target detection model, the range of integer calculation in that processing is increased, the storage and computing resources consumed by the target detection model are reduced, and its processing efficiency is improved. For example, the calculations in the processing of the target detection model may include integer calculations such as integer multiplication, integer addition and integer shifts; because the processing of the target detection model then need not involve floating point calculation, the storage and computing resources consumed by the model are reduced and its processing efficiency is improved.
Taking fig. 1 as an example, the target detection model may specifically include: feature extraction section 101, feature fusion section 102, and detection section 103. Any one of the feature extraction unit 101, the feature fusion unit 102, and the detection unit 103 performs integer calculation according to the fixed-point calculation result corresponding to the parameter of the network layer in the process of processing the image to be detected.
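The three-stage flow of fig. 1 can be sketched as a simple composition; the stage callables below are hypothetical stand-ins for the feature extraction unit 101, feature fusion unit 102 and detection unit 103, each of which would operate on integer fixed-point data.

```python
def detect(image, feature_extraction, feature_fusion, detection):
    """Sketch of the fig. 1 pipeline; each stage is an integer-domain callable."""
    features = feature_extraction(image)  # unit 101
    fused = feature_fusion(features)      # unit 102
    return detection(fused)               # unit 103
```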
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the embodiments are not limited by the order of acts described, as some steps may be performed in other orders or concurrently according to the embodiments. Further, those skilled in the art will also appreciate that the embodiments described in the specification are preferred embodiments, and the acts involved are not necessarily required by the embodiments of the application.
On the basis of the foregoing embodiment, this embodiment further provides a quantization apparatus for a target detection model, and referring to fig. 6, the quantization apparatus is configured to quantize a parameter to be quantized of a network layer in the target detection model, and specifically may include the following modules:
a receiving module 601, configured to receive a calibration image set;
an original quantization information determining module 602, configured to determine, according to a processing result of the target detection model on the calibration image set, original quantization information corresponding to a parameter to be quantized of the network layer;
a change information determining module 603, configured to determine change information of quantization information corresponding to a parameter to be quantized in a network layer in a process of processing a preset image by using a target detection model; the change information is used for modifying the original quantization information which does not accord with the preset condition into target quantization information which accords with the preset condition; the preset condition represents the requirement of the processing process of the target detection model on quantitative information;
a first establishing module 604, configured to establish a mapping relationship between the parameter identifier and the target quantization information according to the change information;
a mapping module 605, configured to map quantization information of a parameter to be quantized of the network layer into target quantization information according to the mapping relationship;
a fixed-point calculation module 606, configured to perform fixed-point calculation according to target quantization information of a parameter to be quantized in a network layer;
the saving module 607 is configured to save the fixed-point calculation result corresponding to the parameter to be quantized, so as to perform integer calculation in the process of processing the image to be detected in the environment of the embedded neural network processor.
Optionally, the processing result may specifically include: floating point numbers corresponding to the first parameter of the network layer under the condition of the calibration image set;
the original quantization information determining module 602 may specifically include:
the statistical module is used for counting floating point numbers corresponding to the first parameter of the network layer under the condition of the calibration image set to obtain a corresponding statistical result;
and the parameter determining module is used for determining the original quantitative information of the second parameter of the network layer according to the statistical result.
Optionally, the change information determining module 603 is specifically configured to generate the change information of the parameter to be quantized of the network layer if, in the process of processing the preset image using the target detection model, the original quantization information corresponding to the parameter to be quantized of the network layer does not meet the preset condition.
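The rule implemented by the change information determining module can be sketched as a filter over the parameters; the condition and values below are hypothetical illustrations.

```python
# Hypothetical sketch: record a change only for parameters whose original
# quantization information fails the preset condition.
def collect_changes(original_info, meets_condition, target_of):
    """Return change records mapping each non-conforming parameter to its target info."""
    changes = {}
    for param, info in original_info.items():
        if not meets_condition(info):          # original info violates the condition
            changes[param] = target_of(param)  # modify to conforming target info
    return changes
```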
Optionally, the original quantization information determining module 602 may specifically include:
the second establishing module is used for establishing a first mapping table; the first mapping table includes: and mapping relation between the parameter identification corresponding to the parameter to be quantized of the network layer and the original quantization information.
Optionally, the parameter identification specifically may include: parameter memory addresses, or parameter indexes corresponding to the parameter memory addresses, or parameter meaning information.
Optionally, the mapping relationship between the parameter identifier and the target quantization information may specifically include:
mapping relation between parameter index of parameter to be quantized and target quantization information; or
Mapping relation between a first parameter index of the parameter to be quantized and a second parameter index of the target quantization information; or
And mapping relation between the first parameter meaning information of the parameter to be quantized and the second parameter meaning information corresponding to the source parameter of the target quantization information.
Optionally, the mapping module may specifically include:
the first mapping module is used for searching a mapping relation between the parameter index of the parameter to be quantized and the target quantization information according to the parameter index of the parameter to be quantized so as to obtain the target quantization information corresponding to the parameter to be quantized; or
The second mapping module is used for searching the mapping relation between the first parameter index of the parameter to be quantized and the second parameter index of the target quantization information according to the parameter index of the parameter to be quantized so as to obtain the target parameter index of the target quantization information corresponding to the parameter to be quantized; determining a target parameter memory address corresponding to the target parameter index; determining target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relation between the parameter memory address and the original quantization information; or
The third mapping module is used for searching a mapping relation C between the first parameter meaning information of the parameter to be quantized and second parameter meaning information corresponding to the source parameter of the target quantization information according to the first parameter meaning information of the parameter to be quantized so as to obtain target parameter meaning information of the target quantization information corresponding to the parameter to be quantized; determining a target parameter index corresponding to the target parameter meaning information according to the mapping relation between the parameter index corresponding to the parameter to be quantized and the parameter meaning information; determining a target parameter memory address corresponding to the target parameter index; and determining the target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relation between the parameter memory address and the original quantization information.
Optionally, the third mapping module may specifically include:
the query module is used for executing N times of queries aiming at the mapping relation C; n may be a positive integer;
and the target parameter meaning information determining module is used for taking the parameter meaning information of the (N-1) th time as the target parameter meaning information under the condition that the parameter meaning information matched with the parameter meaning information obtained by the (N-1) th time query does not exist in the Nth time query representation mapping relation C.
Referring to fig. 7, an embodiment of the present application further provides a target detection apparatus. The apparatus may be applied to an NPU, and specifically may include the following modules:
a receiving module 701, configured to receive an image to be detected;
a processing module 702, configured to process, according to the target detection model, an image to be detected to obtain a corresponding detection result; in the process of processing the image to be detected, integer calculation is carried out according to the fixed point calculation result corresponding to the parameters of the network layer in the target detection model;
an output module 703, configured to output the detection result;
the determining process of the fixed-point calculation result may specifically include: according to the processing result of the target detection model on the calibration image set, determining original quantization information corresponding to the parameter to be quantized of the network layer; determining the change information of the quantization information corresponding to the parameter to be quantized of the network layer in the process of processing the preset image by using the target detection model; the change information is used for modifying the original quantization information which does not accord with the preset condition into target quantization information which accords with the preset condition; the preset condition represents the requirement of the processing process of the target detection model on quantitative information; according to the change information, establishing a mapping relation between the parameter identification and the target quantization information; according to the mapping relation, the quantization information of the parameters to be quantized of the network layer is mapped into target quantization information; and performing fixed-point calculation according to the target quantization information of the parameter to be quantized of the network layer.
Optionally, the processing result includes: floating point numbers corresponding to the first parameter of the network layer under the condition of the calibration image set;
the determining of the original quantization information of the parameter to be quantized of the network layer includes:
counting the floating point numbers corresponding to the first parameter of the network layer under the condition of the calibration image set to obtain a corresponding statistical result;
and determining the original quantitative information of the second parameter of the network layer according to the statistical result.
Optionally, the determining the variation information of the parameter to be quantized of the network layer includes:
in the process of processing the preset image by using the target detection model, if the original quantization information corresponding to the parameter to be quantized of the network layer does not meet the preset condition, the change information of the parameter to be quantized of the network layer is generated.
Optionally, the determining original quantization information of a parameter to be quantized of a network layer includes:
establishing a first mapping table; the first mapping table includes: and mapping relation between the parameter identification corresponding to the parameter to be quantized of the network layer and the original quantization information.
Optionally, the parameter identification includes: parameter memory addresses, or parameter indexes corresponding to the parameter memory addresses, or parameter meaning information.
Optionally, the mapping relationship between the parameter identifier and the target quantization information includes:
mapping relation between parameter index of parameter to be quantized and target quantization information; or alternatively
Mapping relation between a first parameter index of the parameter to be quantized and a second parameter index of the target quantization information; or alternatively
And mapping relation between the first parameter meaning information of the parameter to be quantized and the second parameter meaning information corresponding to the source parameter of the target quantization information.
Optionally, the mapping quantization information of a parameter to be quantized in a network layer to target quantization information includes:
searching, according to the parameter index of the parameter to be quantized, in the mapping relationship between the parameter index of the parameter to be quantized and the target quantization information, to obtain the target quantization information corresponding to the parameter to be quantized; or
searching, according to the parameter index of the parameter to be quantized, in the mapping relationship between the first parameter index of the parameter to be quantized and the second parameter index of the target quantization information, to obtain a target parameter index of the target quantization information corresponding to the parameter to be quantized; determining a target parameter memory address corresponding to the target parameter index; and determining the target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relationship between the parameter memory address and the original quantization information; or
searching, according to the first parameter meaning information of the parameter to be quantized, in a mapping relationship C between the first parameter meaning information of the parameter to be quantized and second parameter meaning information corresponding to a source parameter of the target quantization information, to obtain target parameter meaning information of the target quantization information corresponding to the parameter to be quantized; determining a target parameter index corresponding to the target parameter meaning information according to the mapping relationship between the parameter index corresponding to the parameter to be quantized and the parameter meaning information; determining a target parameter memory address corresponding to the target parameter index; and determining the target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relationship between the parameter memory address and the original quantization information.
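The second lookup path above (parameter index to target parameter index, then to a memory address, then to quantization information) can be sketched as a chain of table lookups. All table names and values here are hypothetical illustrations:

```python
# Hypothetical tables illustrating the second lookup path. The structures
# and contents are assumptions for illustration only.
index_to_target_index = {3: 7}               # first index -> second index
index_to_address = {3: 0x1000, 7: 0x2000}    # parameter index -> memory address
address_to_quant_info = {                    # memory address -> original quant info
    0x1000: {"scale": 0.05, "zero_point": 0},
    0x2000: {"scale": 0.02, "zero_point": 128},
}

def lookup_target_quant_info(param_index):
    # Step 1: map the parameter index to the target parameter index
    # (falling back to the parameter's own index when no remapping exists).
    target_index = index_to_target_index.get(param_index, param_index)
    # Step 2: resolve the target index to a memory address.
    target_address = index_to_address[target_index]
    # Step 3: read the quantization information stored at that address.
    return address_to_quant_info[target_address]

info = lookup_target_quant_info(3)  # resolves via index 7 to address 0x2000
```

The first and third paths differ only in the key used for the initial lookup (a direct index, or parameter meaning information resolved through an extra table).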
Optionally, the searching in the mapping relationship C between the first parameter meaning information of the parameter to be quantized and the second parameter meaning information corresponding to the source parameter of the target quantization information includes:
performing N queries against the mapping relationship C, where N is a positive integer;
and when the N-th query indicates that the mapping relationship C contains no parameter meaning information matching the parameter meaning information obtained by the (N-1)-th query, taking the parameter meaning information obtained by the (N-1)-th query as the target parameter meaning information.
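The N-query procedure amounts to following a chain through mapping relation C until the next query finds no match, then keeping the last result. A minimal sketch, assuming the chain is acyclic:

```python
# Hypothetical sketch of the N-query resolution over mapping relation C:
# keep querying with the previous result until a query finds no match,
# then use the last successfully obtained meaning information.
# Assumes mapping_c contains no cycles.

def resolve_target_meaning(mapping_c, start_meaning):
    current = start_meaning
    while current in mapping_c:      # query N succeeds: follow the chain
        current = mapping_c[current]
    return current                   # query found no match: previous result wins

mapping_c = {"layerA.scale": "layerB.scale", "layerB.scale": "layerC.scale"}
target = resolve_target_meaning(mapping_c, "layerA.scale")  # "layerC.scale"
```

Here three queries occur: the first two succeed, and the third finds no entry for "layerC.scale", so the second query's result becomes the target parameter meaning information.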
The present application further provides a non-transitory readable storage medium storing one or more modules (programs) that, when applied to a device, cause the device to execute the instructions of the method steps described in this application.
Embodiments of the present application provide one or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an electronic device to perform the methods as described in one or more of the above embodiments. In the embodiment of the present application, the electronic device includes various types of devices such as a terminal device and a server (cluster).
Embodiments of the disclosure may be implemented as an apparatus configured as desired using any suitable hardware, firmware, software, or any combination thereof; such an apparatus may include electronic devices such as terminal devices and servers (clusters). Fig. 8 schematically illustrates an example apparatus 1100 that may be used to implement various embodiments described herein.
For one embodiment, fig. 8 illustrates an example apparatus 1100 having one or more processors 1102, a control module (chipset) 1104 coupled to at least one of the processor(s) 1102, a memory 1106 coupled to the control module 1104, a non-volatile memory (NVM)/storage 1108 coupled to the control module 1104, one or more input/output devices 1110 coupled to the control module 1104, and a network interface 1112 coupled to the control module 1104.
The processor 1102 may include one or more single-core or multi-core processors, and the processor 1102 may include any combination of general-purpose or special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In some embodiments, the apparatus 1100 can be implemented as a terminal device, a server (cluster), or the like in the embodiments of the present application.
In some embodiments, the apparatus 1100 may include one or more computer-readable media (e.g., the memory 1106 or the NVM/storage 1108) having instructions 1114 and one or more processors 1102 in combination with the one or more computer-readable media configured to execute the instructions 1114 to implement modules to perform the actions described in this disclosure.
For one embodiment, control module 1104 may include any suitable interface controllers to provide any suitable interface to at least one of the processor(s) 1102 and/or to any suitable device or component in communication with control module 1104.
The control module 1104 may include a memory controller module to provide an interface to the memory 1106. The memory controller module may be a hardware module, a software module, and/or a firmware module.
The memory 1106 may be used to load and store data and/or instructions 1114 for the device 1100, for example. For one embodiment, memory 1106 may include any suitable volatile memory, such as suitable DRAM. In some embodiments, the memory 1106 may comprise double data rate fourth-generation synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, control module 1104 may include one or more input/output controllers to provide an interface to NVM/storage 1108 and input/output device(s) 1110.
For example, NVM/storage 1108 may be used to store data and/or instructions 1114. NVM/storage 1108 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more Hard Disk Drives (HDDs), one or more Compact Disc (CD) drives, and/or one or more Digital Versatile Disc (DVD) drives).
NVM/storage 1108 may include storage resources that are physically part of the device on which apparatus 1100 is installed, or it may be accessible by the device and need not be part of the device. For example, NVM/storage 1108 may be accessed over a network via input/output device(s) 1110.
Input/output device(s) 1110 may provide an interface for apparatus 1100 to communicate with any other suitable device; input/output device(s) 1110 may include communication components, audio components, sensor components, and so forth. Network interface 1112 may provide an interface for device 1100 to communicate over one or more networks; device 1100 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols, such as access to a communication-standard-based wireless network, e.g., WiFi, 2G, 3G, 4G, 5G, etc., or a combination thereof.
For one embodiment, at least one of the processor(s) 1102 may be packaged together with logic for one or more controller(s) (e.g., memory controller module) of the control module 1104. For one embodiment, at least one of the processor(s) 1102 may be packaged together with logic for one or more controller(s) of control module 1104 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 1102 may be integrated on the same die with logic for one or more controller(s) of the control module 1104. For one embodiment, at least one of the processor(s) 1102 may be integrated on the same die with logic for one or more controller(s) of control module 1104 to form a system on chip (SoC).
In various embodiments, the apparatus 1100 may be, but is not limited to being: a server, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.), among other terminal devices. In various embodiments, the apparatus 1100 may have more or fewer components and/or different architectures. For example, in some embodiments, device 1100 includes one or more cameras, keyboards, Liquid Crystal Display (LCD) screens (including touch screen displays), non-volatile memory ports, multiple antennas, graphics chips, Application Specific Integrated Circuits (ASICs), and speakers.
The detection device can adopt a main control chip as a processor or a control module, sensor data, position information and the like are stored in a memory or an NVM/storage device, a sensor group can be used as an input/output device, and a communication interface can comprise a network interface.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "include", "including", or any other variations thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device including a series of elements includes not only those elements but also other elements not explicitly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a/an ..." does not exclude the presence of additional like elements in a process, method, article, or terminal device that comprises the element.
The above is a detailed description of the target detection model quantization method and apparatus, the target detection method and apparatus, the electronic device, and the machine-readable medium provided by the present application. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the above descriptions of the embodiments are only intended to help understand the method and core ideas of the present application. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the application scope according to the ideas of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (12)

1. A target detection method is applied to an embedded neural network processor, and comprises the following steps:
receiving an image to be detected;
processing an image to be detected according to the target detection model to obtain a corresponding detection result; in the process of processing the image to be detected, integer calculation is carried out according to the fixed point calculation result corresponding to the parameters of the network layer in the target detection model;
outputting the detection result;
wherein the fixed point calculation result determination process comprises: according to the processing result of the target detection model on the calibration image set, determining original quantization information corresponding to the parameter to be quantized of the network layer; determining the change information of the quantization information corresponding to the parameter to be quantized of the network layer in the process of processing the preset image by using the target detection model; the change information is used for modifying the original quantization information which does not accord with the preset condition into the target quantization information which accords with the preset condition; the preset condition represents the requirement of the processing process of the target detection model for quantization information; according to the change information, establishing a mapping relation between the parameter identification and the target quantization information; according to the mapping relation, mapping the quantization information of the parameter to be quantized of the network layer into the target quantization information; and performing fixed-point calculation according to the target quantization information of the parameter to be quantized of the network layer.
2. The method of claim 1, wherein the processing results comprise: floating point numbers corresponding to the first parameters of the network layer under the condition of calibrating the image set;
the determining of the original quantization information corresponding to the parameter to be quantized of the network layer includes:
counting floating point numbers corresponding to the first parameter of the network layer under the condition of calibrating the image set to obtain a corresponding statistical result;
and determining the original quantitative information of the second parameter of the network layer according to the statistical result.
3. The method according to claim 1, wherein the determining original quantization information corresponding to the parameter to be quantized of the network layer comprises:
establishing a first mapping table; the first mapping table includes: a mapping relationship between the parameter identifier corresponding to the parameter to be quantized of the network layer and the original quantization information.
4. The method of claim 1, wherein the parameter identification comprises: parameter memory addresses, or parameter indexes corresponding to the parameter memory addresses, or parameter meaning information.
5. The method of claim 4, wherein the mapping between the parameter identification and the target quantization information comprises:
a mapping relationship between the parameter index corresponding to the parameter memory address of the parameter to be quantized and the target quantization information; or
a mapping relationship between a first parameter index of the parameter to be quantized and a second parameter index of the target quantization information; or
a mapping relationship between first parameter meaning information of the parameter to be quantized and second parameter meaning information corresponding to a source parameter of the target quantization information.
6. The method according to claim 5, wherein mapping the quantization information of the parameter to be quantized of the network layer to target quantization information comprises:
searching, according to the parameter index of the parameter to be quantized, in the mapping relationship between the parameter index of the parameter to be quantized and the target quantization information, to obtain the target quantization information corresponding to the parameter to be quantized; or
searching, according to the parameter index of the parameter to be quantized, in the mapping relationship between the first parameter index of the parameter to be quantized and the second parameter index of the target quantization information, to obtain a target parameter index of the target quantization information corresponding to the parameter to be quantized; determining a target parameter memory address corresponding to the target parameter index; and determining the target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relationship between the parameter memory address and the original quantization information; or
searching, according to the first parameter meaning information of the parameter to be quantized, in a mapping relationship C between the first parameter meaning information of the parameter to be quantized and second parameter meaning information corresponding to a source parameter of the target quantization information, to obtain target parameter meaning information of the target quantization information corresponding to the parameter to be quantized; determining a target parameter index corresponding to the target parameter meaning information according to the mapping relationship between the parameter index corresponding to the parameter to be quantized and the parameter meaning information; determining a target parameter memory address corresponding to the target parameter index; and determining the target quantization information corresponding to the parameter to be quantized according to the target parameter memory address and the mapping relationship between the parameter memory address and the original quantization information.
7. The method according to claim 6, wherein the searching in the mapping relationship C between the first parameter meaning information of the parameter to be quantized and the second parameter meaning information corresponding to the source parameter of the target quantization information comprises:
performing N queries against the mapping relationship C, where N is a positive integer;
and when the N-th query indicates that the mapping relationship C contains no parameter meaning information matching the parameter meaning information obtained by the (N-1)-th query, taking the parameter meaning information obtained by the (N-1)-th query as the target parameter meaning information.
8. A method for quantizing a parameter to be quantized of a network layer in a target detection model is characterized by comprising the following steps:
receiving a set of calibration images;
according to the processing result of the target detection model on the calibration image set, determining original quantization information corresponding to the parameter to be quantized of the network layer;
in the process of processing a preset image by using a target detection model, determining the change information of quantization information corresponding to the parameter to be quantized of a network layer; the change information is used for modifying the original quantization information which does not accord with the preset condition into target quantization information which accords with the preset condition; the preset condition represents the requirement of the processing process of the target detection model for quantization information;
according to the change information, establishing a mapping relation between the parameter identification and the target quantization information;
according to the mapping relation, the quantization information of the parameters to be quantized of the network layer is mapped into target quantization information;
performing fixed-point calculation according to target quantization information of a parameter to be quantized of a network layer;
and storing the fixed-point calculation result corresponding to the parameter to be quantized so as to perform integer calculation in the process of processing the image to be detected in the environment of the embedded neural network processor.
9. An object detection device applied to an embedded neural network processor, the device comprising:
the receiving module is used for receiving an image to be detected;
the processing module is used for processing the image to be detected according to the target detection model so as to obtain a corresponding detection result; in the process of processing the image to be detected, integer calculation is carried out according to the fixed point calculation result corresponding to the parameters of the network layer in the target detection model;
the output module is used for outputting the detection result;
wherein the fixed point calculation result determination process comprises: according to the processing result of the target detection model on the calibration image set, determining original quantization information corresponding to the parameter to be quantized of the network layer; determining the change information of the quantization information corresponding to the parameter to be quantized of the network layer in the process of processing the preset image by using the target detection model; the change information is used for modifying the original quantization information which does not accord with the preset condition into target quantization information which accords with the preset condition; the preset condition represents the requirement of the processing process of the target detection model for quantization information; according to the change information, establishing a mapping relation between the parameter identification and the target quantization information; according to the mapping relation, mapping the quantization information of the parameter to be quantized of the network layer into target quantization information; and performing fixed-point calculation according to the target quantization information of the parameter to be quantized of the network layer.
10. An apparatus for quantizing a parameter to be quantized in a network layer of a target detection model, the apparatus comprising:
a receiving module for receiving a set of calibration images;
the original quantization information determining module is used for determining original quantization information corresponding to the parameter to be quantized of the network layer according to the processing result of the target detection model on the calibration image set;
the change information determining module is used for determining the change information of the quantization information corresponding to the parameter to be quantized of the network layer in the process of processing the preset image by using the target detection model; the change information is used for modifying the original quantization information which does not accord with the preset condition into target quantization information which accords with the preset condition; the preset condition represents the requirement of the processing process of the target detection model for quantization information;
the first establishing module is used for establishing a mapping relation between the parameter identification and the target quantization information according to the change information;
the mapping module is used for mapping the quantization information of the parameters to be quantized of the network layer into target quantization information according to the mapping relation;
the fixed-point calculation module is used for carrying out fixed-point calculation according to the target quantization information of the parameter to be quantized of the network layer;
and the storage module is used for storing the fixed-point calculation result corresponding to the parameter to be quantized so as to perform integer calculation in the process of processing the image to be detected in the environment of the embedded neural network processor.
11. An electronic device, comprising: a processor; and
a memory having executable code stored thereon that, when executed, causes the processor to perform the method of any of claims 1-8.
12. A machine readable medium having executable code stored thereon, which when executed, causes a processor to perform the method of any of claims 1-8.
CN202211197005.9A 2022-09-29 2022-09-29 Target detection method, target detection model quantification device, and medium Active CN115294108B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211197005.9A CN115294108B (en) 2022-09-29 2022-09-29 Target detection method, target detection model quantification device, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211197005.9A CN115294108B (en) 2022-09-29 2022-09-29 Target detection method, target detection model quantification device, and medium

Publications (2)

Publication Number Publication Date
CN115294108A true CN115294108A (en) 2022-11-04
CN115294108B CN115294108B (en) 2022-12-16

Family

ID=83834728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211197005.9A Active CN115294108B (en) 2022-09-29 2022-09-29 Target detection method, target detection model quantification device, and medium

Country Status (1)

Country Link
CN (1) CN115294108B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845640A (en) * 2017-01-12 2017-06-13 南京大学 It is heterogeneous in layer based on depth convolutional neural networks to pinpoint quantization method at equal intervals
US20200234126A1 (en) * 2019-01-23 2020-07-23 Google Llc Look-up table based neural networks
CN111860841A (en) * 2020-07-28 2020-10-30 Oppo广东移动通信有限公司 Quantization model optimization method, device, terminal and storage medium
CN111860405A (en) * 2020-07-28 2020-10-30 Oppo广东移动通信有限公司 Quantification method and device of image recognition model, computer equipment and storage medium
CN112200296A (en) * 2020-07-31 2021-01-08 厦门星宸科技有限公司 Network model quantification method and device, storage medium and electronic equipment
CN114841325A (en) * 2022-05-20 2022-08-02 安谋科技(中国)有限公司 Data processing method and medium of neural network model and electronic device
CN114898108A (en) * 2022-03-30 2022-08-12 哈尔滨工业大学 CNN model lightweight method based on FPGA, target detection method and system
CN114998438A (en) * 2022-08-02 2022-09-02 深圳比特微电子科技有限公司 Target detection method and device and machine-readable storage medium

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845640A (en) * 2017-01-12 2017-06-13 南京大学 It is heterogeneous in layer based on depth convolutional neural networks to pinpoint quantization method at equal intervals
US20200234126A1 (en) * 2019-01-23 2020-07-23 Google Llc Look-up table based neural networks
CN111860841A (en) * 2020-07-28 2020-10-30 Oppo广东移动通信有限公司 Quantization model optimization method, device, terminal and storage medium
CN111860405A (en) * 2020-07-28 2020-10-30 Oppo广东移动通信有限公司 Quantification method and device of image recognition model, computer equipment and storage medium
CN112200296A (en) * 2020-07-31 2021-01-08 厦门星宸科技有限公司 Network model quantification method and device, storage medium and electronic equipment
CN114898108A (en) * 2022-03-30 2022-08-12 哈尔滨工业大学 CNN model lightweight method based on FPGA, target detection method and system
CN114841325A (en) * 2022-05-20 2022-08-02 安谋科技(中国)有限公司 Data processing method and medium of neural network model and electronic device
CN114998438A (en) * 2022-08-02 2022-09-02 深圳比特微电子科技有限公司 Target detection method and device and machine-readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BIN CHAI ET AL.: "Optimization of workpiece detection model based on channel pruning and fixed-point quantization", 《PROCEDIA COMPUTER SCIENCE》 *

Also Published As

Publication number Publication date
CN115294108B (en) 2022-12-16

Similar Documents

Publication Publication Date Title
CN109657696B (en) Multi-task supervised learning model training and predicting method and device
US11604960B2 (en) Differential bit width neural architecture search
US11093168B2 (en) Processing of neural networks on electronic devices
CN114708437B (en) Training method of target detection model, target detection method, device and medium
CN110738235A (en) Pulmonary tuberculosis determination method, pulmonary tuberculosis determination device, computer device, and storage medium
CN113836885A (en) Text matching model training method, text matching device and electronic equipment
CN109299276B (en) Method and device for converting text into word embedding and text classification
CN113435499A (en) Label classification method and device, electronic equipment and storage medium
CN115083435A (en) Audio data processing method and device, computer equipment and storage medium
CN116469110A (en) Image classification method, device, electronic equipment and computer readable storage medium
CN116912635B (en) Target tracking method and device
CN115294108B (en) Target detection method, target detection model quantification device, and medium
US12002272B2 (en) Method and device for classifing densities of cells, electronic device using method, and storage medium
CN112395388A (en) Information processing method and device
US11699077B2 (en) Multi-layer neural network system and method
CN117523218A (en) Label generation, training of image classification model and image classification method and device
CN113298083A (en) Data processing method and device
CN113284027A (en) Method for training group recognition model, and method and device for recognizing abnormal group
CN115705486A (en) Method and device for training quantitative model, electronic equipment and readable storage medium
CN114819149B (en) Data processing method, device and medium based on transforming neural network
CN112668332A (en) Triple extraction method, device, equipment and storage medium
WO2020263439A1 (en) Increased precision neural processing element
CN113963282A (en) Video replacement detection and training method and device of video replacement detection model
CN113139561A (en) Garbage classification method and device, terminal equipment and storage medium
US20220178814A1 (en) Method for calculating a density of stem cells in a cell image, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant