WO2021179587A1 - Neural network model quantization method and apparatus, electronic device, and computer-readable storage medium - Google Patents

Neural network model quantization method and apparatus, electronic device, and computer-readable storage medium

Info

Publication number
WO2021179587A1
WO2021179587A1 · PCT/CN2020/119608
Authority
WO
WIPO (PCT)
Prior art keywords
layer
neural network
integer
network model
quantized
Prior art date
Application number
PCT/CN2020/119608
Other languages
English (en)
French (fr)
Inventor
周舒畅
林大超
李翔
张志华
杨弋
王田
Original Assignee
北京迈格威科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京迈格威科技有限公司
Publication of WO2021179587A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Definitions

  • The present disclosure relates to the field of machine learning technology, and in particular to a neural network model quantization method and apparatus, and an electronic device.
  • At present, neural network models have been widely and successfully applied in many fields such as speech recognition, text recognition, and image and video recognition.
  • In some specified target tasks, the trained neural network model needs to be further deployed to a target device for acceleration.
  • Since a typical neural network model uses double-precision or single-precision floating-point operations, researchers have quantized neural network models to reduce their computational complexity and arithmetic-unit overhead, so that as many target devices as possible can meet the operational requirements.
  • However, the existing quantized neural network models still have floating-point parameters in the normalization layer, so floating-point arithmetic units are still required to support their operations, and they cannot run on target devices that only support fixed-point or low-bit-width operations. The existing quantized neural network models therefore still have the problem of being unable to run on such target devices due to their high computational complexity.
  • In view of this, the purpose of the present disclosure is to provide a neural network model quantization method, apparatus, electronic device, and computer-readable storage medium that reduce the computational complexity of a neural network model, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.
  • An embodiment of the present disclosure provides a neural network model quantization method, including: obtaining a neural network model, wherein the neural network model includes a convolutional layer, a normalization layer, and a quantized activation layer, and the output features of the quantized activation layer are integer features; converting the parameters of the convolutional layer into integer parameters; merging the normalization layer and the quantized activation layer to obtain a merged layer; and converting the parameters of the merged layer into integer parameters to obtain a quantized neural network model.
  • an embodiment of the present disclosure provides a possible implementation, wherein the step of merging the normalization layer and the quantized activation layer includes: combining the output of the normalization layer The feature is used as the input feature of the quantized activation layer, and the normalization layer and the quantized activation layer are combined to obtain the initial output feature of the combined layer; the initial output feature of the combined layer is subjected to a limit operation to obtain The target output feature of the merged layer.
  • An embodiment of the present disclosure provides a possible implementation, wherein the step of performing a limit operation on the initial output feature of the merged layer to obtain its target output feature includes: limiting the size of the initial output feature of the merged layer to $[0, 2^F-1]$; when the initial output feature of the merged layer is less than 0, the target output feature of the merged layer is set to 0, and when the initial output feature of the merged layer is greater than $2^F-1$, the target output feature of the merged layer is set to $2^F-1$.
  • An embodiment of the present disclosure provides a possible implementation, wherein the initial output feature of the merged layer is $\left[\frac{N-b}{t}\right]$, where N is the input feature of the normalization layer, b is the offset of the merged layer, t is the quantization interval of the merged layer, b and t are floating-point numbers, and b and t are both obtained by merging the parameters of the normalization layer and the quantized activation layer.
  • An embodiment of the present disclosure provides a possible implementation, wherein the step of converting the parameters of the merged layer into integer parameters includes: determining the common positive integer of the merged layer using a preset operation-equivalence algorithm; and, based on the common positive integer, converting the parameters b and t of the merged layer from floating-point numbers into integer parameters, obtaining parameter B and parameter T respectively, where B is an integer offset and T is an integer quantization interval.
  • An embodiment of the present disclosure provides a possible implementation, including: obtaining, based on the common positive integer, the integer offset of the merged layer, and the integer quantization interval, the initial output feature of the merged layer after integer-parameter conversion, which is $\left[\frac{KN-B}{T}\right]$; and performing a limit operation on the initial output feature of the merged layer after integer-parameter conversion to obtain the target output feature of the merged layer after integer-parameter conversion, where K is the common positive integer.
  • An embodiment of the present disclosure provides a possible implementation, wherein the step of determining the common positive integer of the merged layer using a preset operation-equivalence algorithm includes: selecting, from a preset range, each candidate sequence that satisfies preset conditions, wherein the preset range is $[1, 2^F-1]$; extracting the minimum value from each candidate sequence to form a target sequence; and taking the maximum value in the target sequence as the common positive integer of the merged layer.
  • An embodiment of the present disclosure provides a possible implementation, wherein the step of converting the parameters of the convolutional layer into integer parameters includes: performing scaling and offset processing on the weights of the convolutional layer to obtain the integer weights of the convolutional layer.
  • An embodiment of the present disclosure provides a possible implementation, wherein the neural network model is trained in the following manner: in the iterative training process of the neural network model, determining candidate values for the weights of the convolutional layer based on a preset quantization bit-width value, and determining the quantization weight of the convolutional layer according to the candidate values and the original weight of the convolutional layer, wherein the quantization weight of the convolutional layer is a floating-point number; setting each parameter in the normalization layer to a floating-point number, wherein the parameters include a mean, a variance, a scaling value, and an offset; and determining the output feature of the quantized activation layer based on the preset quantization bit-width value.
  • An embodiment of the present disclosure provides a possible implementation, wherein the step of determining the candidate values for the weights of the convolutional layer based on the preset quantization bit-width value, and determining the quantization weight of the convolutional layer according to the candidate values and the original weight, includes: determining the candidate values of the quantization weight of the convolutional layer based on the preset quantization bit-width value as $\left\{0, \frac{1}{2^F-1}, \frac{2}{2^F-1}, \ldots, 1\right\}$, where F is the preset quantization bit-width value and the value range of the weights of the convolutional layer is [0, 1]; and selecting a value that satisfies a preset condition from the candidate values as the quantization weight of the convolutional layer.
  • An embodiment of the present disclosure provides a possible implementation, wherein the step of determining the output feature of the quantized activation layer based on the preset quantization bit-width value includes: performing a limit operation on the output parameter of the quantized activation layer based on the preset quantization bit-width value to obtain the output feature of the quantized activation layer, where the limit operation is computed as $y = \text{clip}\left(\left[\frac{M}{s}\right], 0, 2^F-1\right)$, in which clip is the limit operator and y is the output feature of the quantized activation layer; when $\left[\frac{M}{s}\right] \le 0$, the output feature of the quantized activation layer is 0, and when $\left[\frac{M}{s}\right] \ge 2^F-1$, the output feature of the quantized activation layer is $2^F-1$; M is the input feature of the quantized activation layer, s is the quantization interval of the quantized activation layer, and F is the preset quantization bit-width value.
  • An embodiment of the present disclosure provides a possible implementation, wherein the quantized neural network model is sent to a target electronic device, so that the target task corresponding to the quantized neural network model is executed by the target electronic device.
  • An embodiment of the present disclosure also provides a neural network model quantization apparatus, including: an acquisition module for acquiring a neural network model, wherein the neural network model includes a convolutional layer, a normalization layer, and a quantized activation layer, and the output feature of the quantized activation layer is an integer feature; a processing module for converting the parameters of the convolutional layer into integer parameters; a merging module for merging the normalization layer and the quantized activation layer to obtain a merged layer; and a quantization module for converting the parameters of the merged layer into integer parameters to obtain a quantized neural network model.
  • An embodiment of the present disclosure provides an electronic device, including a processor and a storage device; the storage device stores a computer program, and the computer program, when run by the processor, performs the method described in any one of the above first aspects.
  • An embodiment of the present disclosure provides a mobile device communicatively connected to the electronic device described in any one of the above first aspects; the mobile device is used to obtain the quantized neural network model produced by the electronic device and to perform image processing based on the quantized neural network model, wherein the mobile device includes a smart terminal and/or a smart camera.
  • the embodiments of the present disclosure provide a computer-readable storage medium on which a computer program is stored, and when the computer program is run by a processor, the steps of any one of the methods described above are executed.
  • The embodiments of the present disclosure provide a neural network model quantization method, apparatus, and electronic device.
  • First, a neural network model is obtained, including a convolutional layer, a normalization layer, and a quantized activation layer, where the output feature of the quantized activation layer is an integer feature.
  • Then the parameters of the convolutional layer are converted into integer parameters; the normalization layer and the quantized activation layer are merged to obtain a merged layer; and finally the parameters of the merged layer are converted into integer parameters to obtain the quantized neural network model.
  • Converting the parameters in the merged layer obtained by merging the normalization layer and the quantized activation layer into integer parameters reduces the computational complexity of the neural network model, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.
  • FIG. 1 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure
  • FIG. 2 shows a flowchart of a neural network model quantization method provided by an embodiment of the present disclosure
  • FIG. 3 shows a quantization flowchart of a convolutional neural network provided by an embodiment of the present disclosure
  • FIG. 4 shows a flow chart of parameter dumping and deployment of a convolutional neural network provided by an embodiment of the present disclosure
  • FIG. 5 shows a schematic structural diagram of a neural network model quantification device provided by an embodiment of the present disclosure
  • FIG. 6 shows a schematic structural diagram of a neural network model quantization device provided by an embodiment of the present disclosure.
  • To improve on this problem, the embodiments of the present disclosure provide a neural network model quantization method, apparatus, electronic device, and computer-readable storage medium, which can be applied to reduce the computational complexity of a neural network model so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.
  • First, an example electronic device 100 for implementing the neural network model quantization method, apparatus, and electronic device according to an embodiment of the present disclosure is described with reference to FIG. 1.
  • The electronic device 100 may include one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and an image acquisition device 110; these components may be interconnected through a bus system 112 and/or other forms of connection mechanisms (not shown). It should be noted that the components and structure of the electronic device 100 shown in FIG. 1 are only exemplary and not restrictive; the electronic device may also have other components and structures as required.
  • The processor 102 may be implemented in at least one hardware form of a digital signal processor (DSP), a field-programmable gate array (FPGA), or a programmable logic array (PLA); the processor 102 may be one of, or a combination of several of, a central processing unit (CPU), a graphics processing unit (GPU), or another processing unit with data processing and/or instruction execution capabilities, and can control other components in the electronic device 100 to perform desired functions.
  • the storage device 104 may include one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include random access memory (RAM) and/or cache memory (cache), for example.
  • the non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, and the like.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to implement the client functions (implemented by the processor) in the embodiments of the present disclosure described below and/or other desired functions.
  • Various application programs and various data, such as data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.
  • the input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, and a touch screen.
  • the output device 108 may output various information (for example, images or sounds) to the outside (for example, a user), and may include one or more of a display, a speaker, and the like.
  • the image capture device 110 may capture images (for example, photos, videos, etc.) desired by the user, and store the captured images in the storage device 104 for use by other components.
  • Exemplarily, the example electronic device used to implement the neural network model quantization method, apparatus, and electronic device according to the embodiments of the present disclosure may be implemented as a smart terminal such as a smartphone, a tablet computer, or a computer.
  • This embodiment provides a neural network model quantization method, which can be executed by the above-mentioned electronic device such as a computer. Referring to the flowchart of the neural network model quantization method shown in FIG. 2, the method mainly includes the following steps S202 to S208:
  • Step S202: Obtain a neural network model, wherein the neural network model includes a convolutional layer, a normalization layer, and a quantized activation layer, and the output feature of the quantized activation layer is an integer feature.
  • The above neural network model may be a model for which quantized neural network training has been completed, and may be a convolutional neural network in which the convolutional layer, the normalization layer, and the quantized activation layer are connected in sequence. Because floating-point parameters still exist in a model that has completed quantized training, loading the model onto a target device places high demands on the device's computing power. Therefore, the trained model needs to be further quantized: without affecting the accuracy of the neural network model, the floating-point parameters in the model can be equivalently converted into integer parameters, thereby reducing the computational complexity of the neural network model and lowering the hardware requirements on the target device.
  • Step S204: Convert the parameters of the convolutional layer into integer parameters.
  • The parameters of the convolutional layer may be the weights of the convolutional layer. Because the weights of the convolutional layer of the neural network model before quantization are floating-point parameters, the computational complexity is relatively high when the neural network performs a recognition task. Performing scaling and offset processing on the weights of the convolutional layer to obtain integer weights reduces the computational complexity of model inference and also reduces the memory footprint of the neural network model.
  • Based on a preset quantization bit-width value, the weights of the convolutional layer are converted from floating-point parameters into integer parameters: the weights are first scaled by a preset multiple, and then decreased or increased by a preset value (for example, the preset value may be 0) to obtain integer weights. The preset multiple may be determined according to the preset quantization bit-width value; for example, the preset multiple may be $2^F-1$, where F is the preset quantization bit-width value. The size of the preset quantization bit-width value may be determined according to the hardware configuration of the target device to be adapted; for example, it may be 2, 4, or 8.
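  • As a concrete illustration, the following minimal Python sketch converts floating-point convolution weights in [0, 1] into integer weights by the scale-then-shift procedure described above; the function name and the placement of rounding are illustrative assumptions, not taken from the patent.

```python
import numpy as np

# A minimal sketch (names and rounding placement are illustrative assumptions) of
# converting floating-point convolution weights in [0, 1] into integer weights:
# scale by the preset multiple 2**F - 1, then shift by a preset value (0 here).
def integerize_conv_weights(w: np.ndarray, F: int, shift: int = 0) -> np.ndarray:
    scale = 2 ** F - 1                            # preset multiple from bit width F
    return np.round(w * scale).astype(np.int32) - shift

print(integerize_conv_weights(np.array([0.0, 1/3, 2/3, 1.0]), F=2))  # [0 1 2 3]
```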
  • Step S206: Merge the normalization layer and the quantized activation layer to obtain a merged layer.
  • The normalization layer (also called the batch normalization layer) and the quantized activation layer of the neural network model can be integrated into one merged network layer that includes the original operations of both; the input feature of the merged layer is the same as the input feature of the normalization layer, and the output feature of the merged layer is the same as the output feature of the quantized activation layer.
  • Step S208: Convert the parameters of the merged layer into integer parameters to obtain a quantized neural network model.
  • The operations in the merged layer can be converted and simplified so that the equivalently converted expression equals the expression before conversion, thereby converting the floating-point parameters in the merged layer's expression into integer parameters.
  • By converting the parameters of both the convolutional layer and the merged layer into integer parameters, a further quantized neural network model is obtained.
  • The quantized neural network model can then be loaded onto the target device, so that it can achieve inference acceleration on the target device.
  • The above neural network model quantization method converts the weights of the convolutional layer of the trained neural network model into integer weights and, based on a preset operation-equivalence algorithm, converts the parameters in the merged layer obtained by merging the normalization layer and the quantized activation layer into integer parameters, which reduces the computational complexity of the neural network model, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.
  • To reduce the computational complexity of the neural network model, this embodiment provides an implementation for merging the normalization layer and the quantized activation layer; for details, refer to the following steps (1) and (2):
  • Step (1): Use the output feature of the normalization layer as the input feature of the quantized activation layer, and merge the normalization layer and the quantized activation layer to obtain the initial output feature of the merged layer.
  • The formula of the normalization layer is $\hat{N} = \gamma\,\frac{N-\text{mean}}{\sqrt{\text{variance}+\epsilon}}+\beta$, and the formula of the quantized activation layer is $y=\left[\frac{M}{s}\right]$, where M is the input feature of the quantized activation layer, s is the quantization interval of the quantized activation layer, and [·] is the rounding symbol. Substituting the input feature N of the normalization layer into its formula gives the output feature of the normalization layer; using this output feature as the input feature M of the quantized activation layer and substituting it into the activation formula yields the initial output feature of the merged layer, $\left[\frac{1}{s}\left(\gamma\,\frac{N-\text{mean}}{\sqrt{\text{variance}+\epsilon}}+\beta\right)\right]$.
  • This initial output feature is simplified to $\left[\frac{N-b}{t}\right]$, where N is the input feature of the normalization layer, b is the offset of the merged layer, t is the quantization interval of the merged layer, b and t are floating-point numbers, and both are obtained by merging the parameters of the normalization layer and the quantized activation layer: $t=\frac{s\sqrt{\text{variance}+\epsilon}}{\gamma}$ and $b=\text{mean}-\frac{\beta\sqrt{\text{variance}+\epsilon}}{\gamma}$. Here mean is the mean, variance is the variance, γ is the scaling of the normalization layer, β is the offset of the normalization layer, and ε is a manually set constant; for example, its value can be much less than 1 and nonzero, to prevent the denominator containing ε from being 0.
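  • For illustration, the following sketch folds the normalization-layer parameters and the activation quantization interval s into the merged-layer offset b and quantization interval t, using the simplification derived above; the helper name and sign conventions follow the reconstruction in this section, not a formula stated verbatim in the patent.

```python
import math

# A minimal sketch, under the sign conventions reconstructed above, of folding the
# normalization-layer parameters and the activation quantization interval s into the
# merged layer's offset b and quantization interval t.
def fold_bn_into_merged_layer(mean, variance, gamma, beta, eps, s):
    std = math.sqrt(variance + eps)
    t = s * std / gamma            # merged-layer quantization interval
    b = mean - beta * std / gamma  # merged-layer offset
    return b, t
```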
  • Step (2): Perform a limit operation on the initial output feature of the merged layer to obtain the target output feature of the merged layer.
  • A limit operation is applied to the initial output feature $\left[\frac{N-b}{t}\right]$ of the merged layer to obtain the target output feature $\text{clip}\left(\left[\frac{N-b}{t}\right],\,0,\,2^F-1\right)$, where clip is the limit operation, b is the offset of the merged layer, t is the quantization interval of the merged layer, b and t are floating-point numbers, F is the preset quantization bit-width value, and [·] is the rounding symbol.
  • So that the output feature of the merged layer is also an integer parameter, the output feature can be fixed based on the preset quantization bit-width value and the limit operation. The purpose of the limit operation is to restrict the target output feature of the merged layer to $[0, 2^F-1]$: when the initial output feature is within $[0, 2^F-1]$, the target output feature equals the initial output feature; when the initial output feature is less than 0, the target output feature is 0; and when the initial output feature is greater than $2^F-1$, the target output feature is $2^F-1$.
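  • A minimal sketch of the merged layer's floating-point computation, combining the rounding and the limit operation just described (names are illustrative):

```python
import numpy as np

# A minimal sketch of the merged layer in floating point: round (N - b) / t,
# then clip the result into [0, 2**F - 1].
def merged_layer_float(N: np.ndarray, b: float, t: float, F: int) -> np.ndarray:
    y = np.round((N - b) / t)
    return np.clip(y, 0, 2 ** F - 1).astype(np.int64)
```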
  • To further reduce the computational complexity of the neural network model, this embodiment provides an implementation for converting the parameters of the merged layer into integer parameters; for details, refer to the following steps 1) and 2):
  • Step 1): Determine the common positive integer of the merged layer using a preset operation-equivalence algorithm.
  • Each candidate sequence that satisfies the preset conditions can be selected from a preset range, where the preset range is $[1, 2^F-1]$; the minimum value is extracted from each candidate sequence to form a target sequence, and the maximum value in the target sequence is taken as the common positive integer of the merged layer. One or more candidate sequences satisfying the preset condition may be selected from the preset range $[1, 2^F-1]$.
  • The preset conditions can be determined according to actual circumstances. The selection inequality substitutes the pairs i and j satisfying j > i into its expression and extracts the maximum of that expression over such pairs, and substitutes the pairs satisfying j < i and extracts its minimum over such pairs.
  • Each candidate sequence is searched, the minimum value of the sequence is extracted from each candidate sequence, and the maximum value is then selected from the target sequence formed by the extracted minimums as the common positive integer in the merged layer's expression.
  • In a specific implementation, a brute-force search algorithm can also be used to search the preset range for a common positive integer that allows the merged layer's expression to be equivalently converted into integer parameters.
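  • Since the exact candidate-sequence inequality is not reproduced in this text, the following sketch takes the brute-force route mentioned above: it directly searches for a common positive integer K, a T adjacent to t×K, and an integer offset B near b×K under which the all-integer formula reproduces the floating-point merged layer on every integer input in a test range. This is an assumption-laden stand-in for the patent's search, not the patented algorithm itself.

```python
import numpy as np

# Brute-force stand-in for the candidate-sequence search: look for a common
# positive integer K, a T adjacent to t*K, and an integer offset B near b*K such
# that round((K*N - B) / T) matches round((N - b) / t) for every integer input N.
def search_common_positive_integer(b, t, F, n_max=1024):
    N = np.arange(n_max + 1)
    target = np.clip(np.round((N - b) / t), 0, 2 ** F - 1)
    for K in range(1, 2 ** F):                       # preset range [1, 2**F - 1]
        for T in {int(np.floor(t * K)), int(np.ceil(t * K))}:
            if T <= 0:
                continue
            for B in (int(np.floor(b * K)), int(np.ceil(b * K))):
                y = np.clip(np.round((K * N - B) / T), 0, 2 ** F - 1)
                if np.array_equal(y, target):
                    return K, T, B
    return None  # no exactly equivalent integer parameters found in this range
```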
  • Step 2): Based on the common positive integer, convert the parameters b and t of the merged layer from floating-point numbers into integer parameters, obtaining parameter B and parameter T respectively, where B is the integer offset and T is the integer quantization interval.
  • Specifically, the integer offset B and the integer quantization interval T can be obtained based on the common positive integer K found by the above search that meets the preset condition: the offset of the merged layer is converted from the floating-point parameter b into the integer parameter B, and the quantization interval of the merged layer is converted from the floating-point parameter t into the integer parameter T.
  • In a specific implementation, the integer quantization interval T may first be set according to the common positive integer K, and the integer offset B then obtained from the formula conversion relationship. For example, T can be set to either of the two integers adjacent to $t \times K$, and each of these two integers verified against the following inequality: $\max_i\left(iT-K[it-b]\right) < \min_i\left(iT-K[it-b]\right) + K$.
  • The neural network model quantization method provided by this embodiment of the present disclosure may further include: obtaining, based on the common positive integer, the integer offset of the merged layer, and the integer quantization interval, the initial output feature of the merged layer after integer-parameter conversion, which is $\left[\frac{KN-B}{T}\right]$; and performing a limit operation on it to obtain the target output feature of the merged layer after integer-parameter conversion, where K is the common positive integer.
  • Specifically, from the common positive integer K, the integer offset B, and the integer quantization interval T, the integer-converted initial output feature equivalent to the initial output feature $\left[\frac{N-b}{t}\right]$ can be determined, and a limit operation on it gives the target output feature $\text{clip}\left(\left[\frac{KN-B}{T}\right],\,0,\,2^F-1\right)$, where clip is the limit operator: when the integer-converted initial output feature is greater than $2^F-1$, the target output feature is set to $2^F-1$; when it is less than 0, the target output feature is set to 0; and when it is greater than 0 and less than $2^F-1$, the target output feature equals the initial output feature, whose value remains unchanged.
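  • A minimal sketch of the merged layer after integer-parameter conversion, under the reconstruction above (B ≈ K·b, T ≈ t·K); on the target hardware the division becomes an integer operation, so no floating point is needed:

```python
import numpy as np

# A minimal sketch of the merged layer after integer-parameter conversion
# (B ~ K*b, T ~ t*K in the reconstruction above); on the target hardware the
# division would be performed as an integer operation.
def merged_layer_int(N: np.ndarray, K: int, B: int, T: int, F: int) -> np.ndarray:
    y = np.round((K * N - B) / T)
    return np.clip(y, 0, 2 ** F - 1).astype(np.int64)
```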
  • In a specific implementation, the above neural network model may be a neural network model for which training has been completed; it may be trained with reference to the following steps a to c:
  • Step a: In the iterative training process of the neural network model, determine candidate values for the weights of the convolutional layer based on a preset quantization bit-width value, and determine the quantization weight of the convolutional layer according to the candidate values and the original weight of the convolutional layer, where the quantization weight of the convolutional layer is a floating-point number.
  • In each training iteration, the candidate values of the quantization weight of the convolutional layer can be determined based on the preset quantization bit-width value as $\left\{0, \frac{1}{2^F-1}, \frac{2}{2^F-1}, \ldots, 1\right\}$, where F is the preset quantization bit-width value and the value range of the weights of the convolutional layer is [0, 1]. A value satisfying a preset condition can be selected from the candidate values as the quantization weight of the convolutional layer.
  • The preset quantization bit-width value limits the value range and expression space of the convolutional layer's weights, so that each weight is stored as a floating-point number conforming to the preset quantization bit width.
  • The preset condition may be the candidate with the smallest absolute difference from the weight of the convolutional layer, that is, the value closest to that weight.
  • For example, when the preset quantization bit-width value is 2, the number of candidate values is limited to $2^F = 2^2 = 4$, obtained uniformly from [0, 1], namely 0, 1/3, 2/3, and 1. If a weight of the convolutional layer is 0.6223, the absolute difference between the weight and each candidate value is computed, and the candidate with the smallest absolute difference, 0.67 (that is, 2/3), is taken as the quantization weight. In this way, quantization of the convolutional layer's parameters is realized during training, and the weights of the convolutional layer are converted into floating-point numbers conforming to the preset quantization bit width.
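  • The following sketch reproduces the training-time weight quantization just described, including the F = 2, weight = 0.6223 example (the function name is illustrative):

```python
import numpy as np

# A minimal sketch of the training-time weight quantization described above:
# the candidates are i / (2**F - 1) for i = 0 .. 2**F - 1, and each weight is
# snapped to the candidate with the smallest absolute difference.
def quantize_weight(w: float, F: int) -> float:
    candidates = np.arange(2 ** F) / (2 ** F - 1)  # F=2 -> [0, 1/3, 2/3, 1]
    return float(candidates[np.argmin(np.abs(candidates - w))])

print(quantize_weight(0.6223, F=2))  # 0.666..., the "0.67" of the example above
```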
  • Step b: Set each parameter in the normalization layer to a floating-point number, where the parameters include the mean, variance, scaling value, and offset.
  • The parameters of the normalization layer can be quantized; the formula of the normalization layer is $\hat{N} = \gamma\,\frac{N-\text{mean}}{\sqrt{\text{variance}+\epsilon}}+\beta$, and each parameter in the normalization layer can be set to a floating-point number.
  • Step c: Determine the output feature of the quantized activation layer based on the preset quantization bit-width value.
  • A limit operation can be performed on the output parameter of the quantized activation layer based on the preset quantization bit-width value to obtain the output feature of the quantized activation layer; the limit operation is computed as $y = \text{clip}\left(\left[\frac{M}{s}\right],\,0,\,2^F-1\right)$, where clip is the limit operator, M is the input feature of the quantized activation layer, s is the quantization interval of the quantized activation layer, and F is the preset quantization bit-width value.
  • The limit operation restricts the output feature of the quantized activation layer to $[0, 2^F-1]$: when $\left[\frac{M}{s}\right]$ is between 0 and $2^F-1$, the value computed by the activation formula is output as the output feature; when $\left[\frac{M}{s}\right] \le 0$, 0 is output; and when $\left[\frac{M}{s}\right] \ge 2^F-1$, $2^F-1$ is output.
  • The neural network model quantization method provided in this embodiment may further include: sending the quantized neural network model to a target electronic device, so that the target task corresponding to the quantized neural network model is executed by the target electronic device.
  • For example, the quantized convolutional neural network, in which every network layer uses integer parameters, is sent to a target electronic device that needs to perform target recognition; the target electronic device executes the target task corresponding to the quantized model to achieve inference acceleration. The target task may be related to the training process of the neural network model; for example, the target task is to recognize a target in an image, and the neural network model is trained on that target.
  • The above neural network model quantization method integrates the normalization layer, which uses floating-point operations, and the quantized activation function of the neural network model into a network layer that uses only integer-parameter operations, so that the quantized neural network model can run on target devices that only support fixed-point or integer operations, reducing the computational complexity and resource overhead of the neural network model.
  • This embodiment provides an example of applying the foregoing neural network model quantization method to quantize a convolutional neural network. Referring to the convolutional neural network quantization flowchart shown in FIG. 3, refer to the following steps S302 to S306 for details:
  • Step S302: In the iterative training of the convolutional neural network, quantize the parameters of the target network layer of the convolutional neural network until the iterative training is completed, obtaining the trained convolutional neural network.
  • The target network layer may include the convolutional layer, the normalization layer, and the quantized activation layer, and may also include other network layers.
  • Based on the preset quantization bit width, the parameters of the target network layer can be converted into floating-point parameters conforming to that bit width, and a limit operation can be applied to the output features of the quantized activation layer to restrict the output of the target network layer to $[0, 2^F-1]$, so that the output features of the target network layer conform to the preset quantization bit-width value.
  • Step S304: Perform network-layer merging and parameter conversion on the trained convolutional neural network, and convert the floating-point parameters in the trained convolutional neural network into integer parameters based on the preset operation-equivalence algorithm to obtain the quantized convolutional neural network.
  • Specifically, the normalization layer and the quantized activation layer of the convolutional neural network can be merged, so that a single merged layer performs their combined operations.
  • The weights of the convolutional layer are scaled and shifted to convert the parameters of the convolutional layer into integer parameters, and the parameters in the merged layer's expression can then be converted into integer parameters.
  • Step S306: Load the quantized convolutional neural network onto the target device, so that the quantized convolutional neural network can accelerate inference on the target device.
  • The quantized convolutional neural network, whose parameters are integers, can be transplanted or loaded onto the target device that needs to perform target recognition. Because the quantized network has reduced computational complexity, it can achieve accelerated inference for the target recognition process on the target device.
  • The neural network model quantization method provided in this embodiment enables the quantized convolutional neural network to run on target devices that only support fixed-point or integer operations, reducing the computational complexity and resource overhead of the convolutional neural network.
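  • Tying steps S302 to S306 together, the following sketch (reusing the helper sketches above and making the same assumptions) outlines the deployment-time conversion of a conv + BN + quantized-activation block into integer-only parameters:

```python
import numpy as np

# An end-to-end sketch of steps S302-S306, reusing the helper sketches above and
# making the same assumptions: quantize the conv weights, fold BN + quantized
# activation into (b, t), then search for integer parameters (K, T, B).
def deploy_block(conv_w, mean, variance, gamma, beta, eps, s, F):
    W = np.round(conv_w * (2 ** F - 1)).astype(np.int32)   # integer conv weights
    b, t = fold_bn_into_merged_layer(mean, variance, gamma, beta, eps, s)
    params = search_common_positive_integer(b, t, F)       # (K, T, B) or None
    return W, params
```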
  • An embodiment of the present disclosure provides a neural network model quantization device.
  • The device includes the following modules:
  • the obtaining module 51 is used to obtain a neural network model; wherein the neural network model includes a convolutional layer, a normalization layer, and a quantized activation layer; wherein, the output feature of the quantized activation layer is an integer type feature.
  • the processing module 52 is used to convert the parameters of the convolutional layer into integer parameters.
  • the merging module 53 is used to merge the normalization layer and the quantized activation layer to obtain a merged layer.
  • the quantization module 54 is used to convert the parameters of the merging layer into integer-type parameters to obtain a quantized neural network model.
  • The above neural network model quantization device converts the weights of the convolutional layer of the trained neural network model into integer weights and, based on a preset operation-equivalence algorithm, converts the parameters in the merged layer obtained by merging the normalization layer and the quantized activation layer into integer parameters, which reduces the computational complexity of the neural network model, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.
  • The merging module 53 is optionally configured to use the output feature of the normalization layer as the input feature of the quantized activation layer, merge the normalization layer and the quantized activation layer to obtain the initial output feature of the merged layer, and perform a limit operation on the initial output feature of the merged layer to obtain the target output feature of the merged layer.
  • The merging module 53 is optionally configured to limit the size of the initial output feature of the merged layer to $[0, 2^F-1]$: when the initial output feature of the merged layer is less than 0, the target output feature is set to 0, and when the initial output feature is greater than $2^F-1$, the target output feature is set to $2^F-1$.
  • The initial output feature of the merged layer is $\left[\frac{N-b}{t}\right]$, where N is the input feature of the normalization layer, b is the offset of the merged layer, t is the quantization interval of the merged layer, b and t are floating-point numbers, and both are obtained by merging the parameters of the normalization layer and the quantized activation layer.
  • The quantization module 54 is optionally used to determine the common positive integer of the merged layer using a preset operation-equivalence algorithm, and, based on the common positive integer, to convert the parameters b and t of the merged layer from floating-point numbers into integer parameters, obtaining parameter B and parameter T respectively, where B is the integer offset and T is the integer quantization interval.
  • The quantization module 54 is optionally configured to obtain, based on the common positive integer, the integer offset of the merged layer, and the integer quantization interval, the initial output feature of the merged layer after integer-parameter conversion, which is $\left[\frac{KN-B}{T}\right]$; it is also used to perform a limit operation on that initial output feature to obtain the target output feature of the merged layer after integer-parameter conversion, where K is the common positive integer.
  • The quantization module 54 is optionally used to select each candidate sequence satisfying the preset conditions from a preset range, where the preset range is $[1, 2^F-1]$; to extract the minimum value from each candidate sequence to form a target sequence; and to take the maximum value in the target sequence as the common positive integer of the merged layer.
  • the above-mentioned processing module 52 is optionally configured to perform scaling and offset processing on the weight of the convolutional layer to obtain the integer weight of the convolutional layer.
  • the above-mentioned device further includes:
  • The training module 60 is used, in the iterative training process of the neural network model, to determine candidate values for the weights of the convolutional layer based on the preset quantization bit-width value and to determine the quantization weight of the convolutional layer from the candidate values and the original weight, where the quantization weight of the convolutional layer is a floating-point number; to set each parameter in the normalization layer to a floating-point number, where the parameters include the mean, variance, scaling value, and offset; and to determine the output feature of the quantized activation layer based on the preset quantization bit-width value.
  • The training module 60 is optionally used to determine the candidate values of the quantization weight of the convolutional layer based on the preset quantization bit-width value as $\left\{0, \frac{1}{2^F-1}, \frac{2}{2^F-1}, \ldots, 1\right\}$, where F is the preset quantization bit-width value and the value range of the weights of the convolutional layer is [0, 1], and to select a value satisfying the preset condition from the candidate values as the quantization weight of the convolutional layer.
  • The training module 60 is optionally configured to perform a limit operation on the output parameter of the quantized activation layer based on the preset quantization bit-width value to obtain the output feature of the quantized activation layer; the limit operation is computed as $y = \text{clip}\left(\left[\frac{M}{s}\right],\,0,\,2^F-1\right)$, where clip is the limit operator, M is the input feature of the quantized activation layer, s is the quantization interval of the quantized activation layer, and F is the preset quantization bit-width value.
  • the above-mentioned device further includes:
  • the sending module is used to send the quantized neural network model to the target electronic device, so as to execute the target task corresponding to the quantized neural network model through the target electronic device.
  • The above neural network model quantization device can integrate the normalization layer, which uses floating-point operations, and the quantized activation function of the neural network model into a network layer that uses only integer-parameter operations, so that the quantized neural network model can run on target devices that only support fixed-point or integer operations, reducing the computational complexity and resource overhead of the neural network model.
  • the embodiments of the present disclosure provide a mobile device, wherein the mobile device can be communicatively connected with the electronic device provided in the foregoing embodiment, and the electronic device can be used to execute the neural network model quantification method provided in the foregoing embodiment.
  • the above-mentioned mobile device is used to obtain a quantized neural network model obtained from the operation of the electronic device, and perform image processing based on the quantized neural network model; wherein, the mobile device includes a smart terminal and/or a smart camera.
  • the embodiments of the present disclosure provide a computer-readable medium, wherein the computer-readable medium stores computer-executable instructions, and when the computer-executable instructions are called and executed by a processor, the computer-executable instructions cause The processor implements the neural network model quantization method described in the foregoing embodiment.
  • The computer program products of the neural network model quantization method, apparatus, and electronic device provided by the embodiments of the present disclosure include a computer-readable storage medium storing program code, and the instructions included in the program code can be used to execute the methods described in the above method embodiments.
  • The terms "installed", "connected", and "coupled" should be interpreted broadly; for example, a connection may be fixed, detachable, or integral; mechanical or electrical; direct, or indirect through an intermediate medium; or internal communication between two components.
  • If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present disclosure, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • The aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
  • This embodiment provides a neural network model quantization method, device, electronic equipment, and computer-readable storage medium.
  • The weights of the convolutional layer of the trained neural network model are converted into integer weights, and the parameters of the merged layer obtained by merging the normalization layer and the quantized activation layer are converted into integer parameters, thereby reducing the computational complexity of the neural network model, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a neural network model quantization method, apparatus, electronic device, and computer-readable storage medium, relating to the field of machine learning technology. The method includes: obtaining a neural network model, wherein the neural network model includes a convolutional layer, a normalization layer, and a quantized activation layer, and the output features of the quantized activation layer are integer features; converting the parameters of the convolutional layer into integer parameters; merging the normalization layer and the quantized activation layer to obtain a merged layer; and converting the parameters of the merged layer into integer parameters to obtain a quantized neural network model. The present disclosure reduces the computational complexity of the neural network model, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.

Description

Neural network model quantization method and apparatus, electronic device, and computer-readable storage medium
Cross-Reference to Related Applications
This application claims priority to the Chinese patent application No. 202010164284.3, entitled "Neural network model quantization method and apparatus, and electronic device", filed with the Chinese Patent Office on March 10, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of machine learning technology, and in particular to a neural network model quantization method and apparatus, and an electronic device.
Background
At present, neural network models have been widely and successfully applied in many fields such as speech recognition, text recognition, and image and video recognition. In some specified target tasks, the trained neural network model needs to be further deployed to a target device for acceleration. Since typical neural network models use double-precision or single-precision floating-point operations, researchers have quantized neural network models to reduce their computational complexity and arithmetic-unit overhead, so that as many target devices as possible can meet the operational requirements. However, existing quantized neural network models still have floating-point parameters in the normalization layer, so floating-point arithmetic units are still required to support their operations, and they cannot run on target devices that only support fixed-point or low-bit-width operations. Existing quantized neural network models therefore still have the problem of being unable to run on such target devices due to their high computational complexity.
Summary
In view of this, the purpose of the present disclosure is to provide a neural network model quantization method, apparatus, electronic device, and computer-readable storage medium that reduce the computational complexity of a neural network model, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.
To achieve the above purpose, the technical solutions adopted by the embodiments of the present disclosure are as follows:
An embodiment of the present disclosure provides a neural network model quantization method, including: obtaining a neural network model, wherein the neural network model includes a convolutional layer, a normalization layer, and a quantized activation layer, and the output features of the quantized activation layer are integer features; converting the parameters of the convolutional layer into integer parameters; merging the normalization layer and the quantized activation layer to obtain a merged layer; and converting the parameters of the merged layer into integer parameters to obtain a quantized neural network model.
Optionally, an embodiment of the present disclosure provides a possible implementation, wherein the step of merging the normalization layer and the quantized activation layer includes: using the output feature of the normalization layer as the input feature of the quantized activation layer, and merging the normalization layer and the quantized activation layer to obtain the initial output feature of the merged layer; and performing a limit operation on the initial output feature of the merged layer to obtain the target output feature of the merged layer.
Optionally, an embodiment of the present disclosure provides a possible implementation, wherein the step of performing a limit operation on the initial output feature of the merged layer to obtain its target output feature includes: limiting the size of the initial output feature of the merged layer to $[0, 2^F-1]$; when the initial output feature of the merged layer is less than 0, the target output feature is set to 0, and when the initial output feature is greater than $2^F-1$, the target output feature is set to $2^F-1$.
Optionally, an embodiment of the present disclosure provides a possible implementation, wherein the initial output feature of the merged layer is
$$\left[\frac{N-b}{t}\right]$$
where N is the input feature of the normalization layer, b is the offset of the merged layer, t is the quantization interval of the merged layer, b and t are floating-point numbers, and b and t are both obtained by merging the parameters of the normalization layer and the quantized activation layer.
Optionally, an embodiment of the present disclosure provides a possible implementation, wherein the step of converting the parameters of the merged layer into integer parameters includes: determining the common positive integer of the merged layer using a preset operation-equivalence algorithm; and, based on the common positive integer, converting the parameters b and t of the merged layer from floating-point numbers into integer parameters, obtaining parameter B and parameter T respectively, where B is an integer offset and T is an integer quantization interval.
Optionally, an embodiment of the present disclosure provides a possible implementation, including: obtaining, based on the common positive integer, the integer offset of the merged layer, and the integer quantization interval, the initial output feature of the merged layer after integer-parameter conversion, which is
$$\left[\frac{KN-B}{T}\right]$$
and performing a limit operation on the initial output feature of the merged layer after integer-parameter conversion to obtain the target output feature of the merged layer after integer-parameter conversion, where K is the common positive integer. Optionally, an embodiment of the present disclosure provides a possible implementation, wherein the step of determining the common positive integer of the merged layer using a preset operation-equivalence algorithm includes: selecting, from a preset range, each candidate sequence that satisfies preset conditions, wherein the preset range is $[1, 2^F-1]$; extracting the minimum value from each candidate sequence to form a target sequence; and taking the maximum value in the target sequence as the common positive integer of the merged layer.
Optionally, an embodiment of the present disclosure provides a possible implementation, wherein the step of converting the parameters of the convolutional layer into integer parameters includes: performing scaling and offset processing on the weights of the convolutional layer to obtain the integer weights of the convolutional layer.
Optionally, an embodiment of the present disclosure provides a possible implementation, wherein the neural network model is trained as follows: in the iterative training process of the neural network model, determining candidate values for the weights of the convolutional layer based on a preset quantization bit-width value, and determining the quantization weight of the convolutional layer according to the candidate values and the original weight of the convolutional layer, wherein the quantization weight of the convolutional layer is a floating-point number; setting each parameter in the normalization layer to a floating-point number, wherein the parameters include a mean, a variance, a scaling value, and an offset; and determining the output feature of the quantized activation layer based on the preset quantization bit-width value.
Optionally, an embodiment of the present disclosure provides a possible implementation, wherein the step of determining the candidate values for the weights of the convolutional layer based on the preset quantization bit-width value, and determining the quantization weight of the convolutional layer according to the candidate values and the original weight of the convolutional layer, includes: determining the candidate values of the quantization weight of the convolutional layer based on the preset quantization bit-width value as
$$\left\{0,\ \frac{1}{2^F-1},\ \frac{2}{2^F-1},\ \ldots,\ 1\right\}$$
where F is the preset quantization bit-width value and the value range of the weights of the convolutional layer is [0, 1]; and selecting a value that satisfies a preset condition from the candidate values as the quantization weight of the convolutional layer.
Optionally, an embodiment of the present disclosure provides a possible implementation, wherein the step of determining the output feature of the quantized activation layer based on the preset quantization bit-width value includes: performing a limit operation on the output parameter of the quantized activation layer based on the preset quantization bit-width value to obtain the output feature of the quantized activation layer, where the limit operation is computed as
$$y = \text{clip}\left(\left[\frac{M}{s}\right],\ 0,\ 2^F-1\right)$$
where clip is the limit operator and y is the output feature of the quantized activation layer; when $\left[\frac{M}{s}\right] \le 0$, the output feature of the quantized activation layer is 0, and when $\left[\frac{M}{s}\right] \ge 2^F-1$, the output feature of the quantized activation layer is $2^F-1$; M is the input feature of the quantized activation layer, s is the quantization interval of the quantized activation layer, and F is the preset quantization bit-width value.
Optionally, an embodiment of the present disclosure provides a possible implementation, wherein the quantized neural network model is sent to a target electronic device, so that the target task corresponding to the quantized neural network model is executed by the target electronic device.
An embodiment of the present disclosure also provides a neural network model quantization apparatus, including: an acquisition module for acquiring a neural network model, wherein the neural network model includes a convolutional layer, a normalization layer, and a quantized activation layer, and the output feature of the quantized activation layer is an integer feature; a processing module for converting the parameters of the convolutional layer into integer parameters; a merging module for merging the normalization layer and the quantized activation layer to obtain a merged layer; and a quantization module for converting the parameters of the merged layer into integer parameters to obtain a quantized neural network model.
An embodiment of the present disclosure provides an electronic device, including a processor and a storage device; the storage device stores a computer program, and the computer program, when run by the processor, performs the method described in any one of the above first aspects.
An embodiment of the present disclosure provides a mobile device communicatively connected to the electronic device described in any one of the above first aspects; the mobile device is used to obtain the quantized neural network model produced by the electronic device and to perform image processing based on the quantized neural network model, wherein the mobile device includes a smart terminal and/or a smart camera.
An embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of any one of the methods described above are executed.
The embodiments of the present disclosure provide a neural network model quantization method, apparatus, and electronic device. First, a neural network model is obtained (including a convolutional layer, a normalization layer, and a quantized activation layer, where the output features of the quantized activation layer are integer features); then the parameters of the convolutional layer are converted into integer parameters; the normalization layer and the quantized activation layer are then merged to obtain a merged layer; finally, the parameters of the merged layer are converted into integer parameters to obtain the quantized neural network model. By converting the weights of the convolutional layer of the trained neural network model into integer weights and, based on a preset operation-equivalence algorithm, converting the parameters in the merged layer obtained by merging the normalization layer and the quantized activation layer into integer parameters, the computational complexity of the neural network model is reduced, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.
Other features and advantages of the embodiments of the present disclosure will be set forth in the following description; alternatively, some features and advantages can be inferred or unambiguously determined from the description, or learned by implementing the above techniques of the embodiments of the present disclosure.
To make the above objects, features, and advantages of the present disclosure more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
To describe the technical solutions of the specific embodiments of the present disclosure or the prior art more clearly, the drawings required for describing the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure;
FIG. 2 shows a flowchart of a neural network model quantization method provided by an embodiment of the present disclosure;
FIG. 3 shows a quantization flowchart of a convolutional neural network provided by an embodiment of the present disclosure;
FIG. 4 shows a flowchart of parameter conversion and deployment of a convolutional neural network provided by an embodiment of the present disclosure;
FIG. 5 shows a schematic structural diagram of a neural network model quantization apparatus provided by an embodiment of the present disclosure;
FIG. 6 shows a schematic structural diagram of a neural network model quantization apparatus provided by an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions of the present disclosure are described below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present disclosure.
Considering that existing quantized neural network models still have the problem of being unable to run on target devices that only support fixed-point or low-bit-width operations due to their high computational complexity, the embodiments of the present disclosure provide a neural network model quantization method, apparatus, electronic device, and computer-readable storage medium to improve on this problem; they can be applied to reduce the computational complexity of a neural network model, so that the quantized model can run with acceleration on target devices that only support fixed-point or low-bit-width operations. The embodiments of the present disclosure are described in detail below.
First, an example electronic device 100 for implementing the neural network model quantization method, apparatus, and electronic device of the embodiments of the present disclosure is described with reference to FIG. 1.
As shown in the schematic structural diagram of FIG. 1, the electronic device 100 may include one or more processors 102, one or more storage devices 104, an input device 106, an output device 108, and an image acquisition device 110; these components may be interconnected through a bus system 112 and/or other forms of connection mechanisms (not shown). It should be noted that the components and structure of the electronic device 100 shown in FIG. 1 are only exemplary and not restrictive; the electronic device may also have other components and structures as required.
The processor 102 may be implemented in at least one hardware form of a digital signal processor (DSP), a field-programmable gate array (FPGA), or a programmable logic array (PLA); the processor 102 may be one of, or a combination of several of, a central processing unit (CPU), a graphics processing unit (GPU), or another processing unit with data processing and/or instruction execution capabilities, and can control other components in the electronic device 100 to perform desired functions.
The storage device 104 may include one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, and flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to implement the client functions (implemented by the processor) in the embodiments of the present disclosure described below and/or other desired functions. Various application programs and various data, such as data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, and a touch screen.
The output device 108 may output various information (for example, images or sounds) to the outside (for example, a user), and may include one or more of a display, a speaker, and the like.
The image acquisition device 110 may capture images desired by the user (for example, photos and videos) and store the captured images in the storage device 104 for use by other components.
Exemplarily, the example electronic device for implementing the neural network model quantization method, apparatus, and electronic device according to the embodiments of the present disclosure may be implemented as a smart terminal such as a smartphone, a tablet computer, or a computer.
This embodiment provides a neural network model quantization method, which can be executed by the above-mentioned electronic device such as a computer. Referring to the flowchart of the neural network model quantization method shown in FIG. 2, the method mainly includes the following steps S202 to S208:
Step S202: Obtain a neural network model, wherein the neural network model includes a convolutional layer, a normalization layer, and a quantized activation layer, and the output features of the quantized activation layer are integer features.
The above neural network model may be a model for which quantized neural network training has been completed, and may be a convolutional neural network in which the convolutional layer, the normalization layer, and the quantized activation layer are connected in sequence. Because floating-point parameters still exist in a model that has completed quantized training, loading the model onto a target device places high demands on the device's computing power. Therefore, the trained model needs to be further quantized: without affecting the accuracy of the neural network model, the floating-point parameters in the model can be equivalently converted into integer parameters, thereby reducing the computational complexity of the neural network model and lowering the hardware requirements on the target device.
Step S204: Convert the parameters of the convolutional layer into integer parameters.
The parameters of the convolutional layer may be the weights of the convolutional layer. Because the weights of the convolutional layer of the neural network model before quantization are floating-point parameters, the computational complexity is relatively high when the neural network performs a recognition task. Performing scaling and offset processing on the weights of the convolutional layer to obtain integer weights reduces the computational complexity of model inference and also reduces the memory footprint of the neural network model. Based on a preset quantization bit-width value, the weights of the convolutional layer are converted from floating-point parameters into integer parameters. Specifically, the weights of the convolutional layer are first scaled by a preset multiple and then decreased or increased by a preset value (for example, the preset value may be 0) to obtain integer weights. The preset multiple may be determined according to the preset quantization bit-width value; for example, the preset multiple may be $2^F-1$, where F is the preset quantization bit-width value. The size of the preset quantization bit-width value may be determined according to the hardware configuration of the target device to be adapted; for example, it may be 2, 4, or 8.
Step S206: Merge the normalization layer and the quantized activation layer to obtain a merged layer.
The normalization layer (also called the batch normalization layer) and the quantized activation layer of the neural network model can be integrated into one merged network layer that includes the original operations of both; the input feature of the merged layer is the same as the input feature of the normalization layer, and the output feature of the merged layer is the same as the output feature of the quantized activation layer.
Step S208: Convert the parameters of the merged layer into integer parameters to obtain a quantized neural network model.
The operations in the merged layer can be converted and simplified so that the equivalently converted expression equals the expression before conversion, thereby converting the floating-point parameters in the merged layer's expression into integer parameters. By converting the parameters of both the convolutional layer and the merged layer into integer parameters, a further quantized neural network model is obtained. The quantized neural network model can then be loaded onto the target device, so that it can achieve inference acceleration on the target device.
The neural network model quantization method provided in this embodiment converts the weights of the convolutional layer of the trained neural network model into integer weights and, based on a preset operation-equivalence algorithm, converts the parameters in the merged layer obtained by merging the normalization layer and the quantized activation layer into integer parameters, which reduces the computational complexity of the neural network model, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.
To reduce the computational complexity of the neural network model, this embodiment provides an implementation for merging the normalization layer and the quantized activation layer; for details, refer to the following steps (1) and (2):
Step (1): Use the output feature of the normalization layer as the input feature of the quantized activation layer, and merge the normalization layer and the quantized activation layer to obtain the initial output feature of the merged layer.
The formula of the normalization layer is
$$\hat{N} = \gamma\,\frac{N-\text{mean}}{\sqrt{\text{variance}+\epsilon}}+\beta$$
and the formula of the quantized activation layer is
$$y=\left[\frac{M}{s}\right]$$
where M is the input feature of the quantized activation layer, s is the quantization interval of the quantized activation layer, and [·] in the quantized activation layer's formula is the rounding symbol. Substituting the input feature N of the normalization layer into its formula gives the output feature of the normalization layer; using this output feature as the input feature M of the quantized activation layer and substituting it into the activation formula yields the initial output feature of the merged layer,
$$\left[\frac{1}{s}\left(\gamma\,\frac{N-\text{mean}}{\sqrt{\text{variance}+\epsilon}}+\beta\right)\right]$$
This initial output feature is simplified to
$$\left[\frac{N-b}{t}\right]$$
where N is the input feature of the normalization layer, b is the offset of the merged layer, t is the quantization interval of the merged layer, b and t are floating-point numbers, and both are obtained by merging the parameters of the normalization layer and the quantized activation layer:
$$t=\frac{s\sqrt{\text{variance}+\epsilon}}{\gamma},\qquad b=\text{mean}-\frac{\beta\sqrt{\text{variance}+\epsilon}}{\gamma}$$
where mean is the mean, variance is the variance, γ is the scaling of the normalization layer, β is the offset of the normalization layer, and ε is a manually set constant; for example, its value can be much less than 1 and nonzero, to prevent the denominator containing ε from being 0.
步骤(2):对合并层的初始输出特征进行限值运算,得到合并层的目标输出特征。
The clipping operation is applied to the initial output feature $[Nt-b]$ of the merged layer to obtain the target output feature of the merged layer:

$$\text{clip}\left([Nt-b],\,0,\,2^F-1\right)$$

where clip is the clipping operation, b is the offset of the merged layer, t is the quantization interval of the merged layer, b and t are both floating-point numbers, F is the preset quantization bit-width value, $[Nt-b]$ is the initial output feature of the merged layer, and [·] denotes rounding. To make the output feature of the merged layer an integer parameter as well, the output feature can be fixed-point clipped based on the preset quantization bit-width value. The purpose of the clipping operation is to restrict the target output feature of the merged layer to $[0,\,2^F-1]$: when the initial output feature $[Nt-b]$ lies within $[0,\,2^F-1]$, the target output feature is $[Nt-b]$; when $[Nt-b]<0$, the target output feature is 0; and when $[Nt-b]>2^F-1$, the target output feature is $2^F-1$.
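Under the same assumptions, the merged layer's floating-point forward pass could be sketched as follows; the function name is a placeholder:

```python
import numpy as np

def merged_layer_float(N, b, t, F):
    """Merged-layer forward with floating-point b, t: round N*t - b, then
    clip the result into [0, 2**F - 1] per the formulas above."""
    y = np.rint(N * t - b)               # initial output feature [N*t - b]
    return np.clip(y, 0, 2 ** F - 1)     # target output feature
```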
To further reduce the computational complexity of the neural network model, this embodiment provides an implementation of converting the parameters of the merged layer into integer parameters, which may be performed with reference to the following steps 1) to 2):
Step 1): determine the common positive integer of the merged layer using a preset operation-equivalence algorithm.
Candidate sequences satisfying a preset condition can be selected from a preset range, where the preset range is $[1,\,2^F-1]$; the minimum value is extracted from each candidate sequence to form a target sequence, and the maximum value in the target sequence is taken as the common positive integer of the merged layer. One or more candidate sequences may be selected from the preset range $[1,\,2^F-1]$, and the preset condition can be determined according to the actual situation. For example, let the elements of the candidate sequence S be $S[i]=[it-b]$, where b is the offset of the merged layer and t is the quantization interval of the merged layer; the values i and j are selected recursively from the preset range $[1,\,2^F-1]$, and the pairs i and j satisfying the following inequality over the difference quotient are substituted into the candidate sequence:

$$\max_{j>i}\frac{S[j]-S[i]}{j-i}\;<\;\min_{j<i}\frac{S[j]-S[i]}{j-i}$$

That is, the pairs i and j satisfying $j>i$ are substituted into $\frac{S[j]-S[i]}{j-i}$ and the maximum of this quotient over $j>i$ is extracted, and the pairs satisfying $j<i$ are substituted into the same quotient and its minimum over $j<i$ is extracted. Each candidate sequence is searched, the minimum value of each sequence is extracted, and the maximum value of the target sequence formed by these minima is selected as the common positive integer in the computation formula of the merged layer. In a specific implementation, a brute-force search algorithm may also be used to search the preset range for a common positive integer that allows the computation formula of the merged layer to be equivalently transferred to integer parameters.
Step 2): based on the common positive integer, convert the parameters b and t of the merged layer from floating-point numbers into integer parameters, obtaining parameter B and parameter T respectively, where B is the integer offset and T is the integer quantization interval.
Specifically, the integer offset B and the integer quantization interval T can be obtained based on the common positive integer K found by the above search that meets the preset condition, whereby the offset of the merged layer is converted from the floating-point parameter b into the integer parameter B, and the quantization interval of the merged layer is converted from the floating-point parameter t into the integer parameter T.
In a specific implementation, the integer quantization interval T may first be set according to the common positive integer K, and the integer offset B then obtained from the formula conversion relation. For example, T may be set to each of the two integers adjacent to t×K, and the two integers verified one by one against the following inequality:

$$\max_i\left(iT-K\,[it-b]\right)<\min_i\left(iT-K\,[it-b]\right)+K$$

where i is any integer in the preset range $[1,\,2^F-1]$. The integers in the preset range are substituted into the above inequality one by one; when the maximum of $(iT-K\,[it-b])$ is smaller than the minimum of $(iT-K\,[it-b])+K$, the values i and T satisfying the inequality are obtained. Substituting them into $B=\max_i\left(iT-K\,[it-b]\right)$ then solves for the integer offset B and the integer quantization interval T.
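As one possible reading of this search, the brute-force sketch below enumerates K over the preset range and tests the two integers adjacent to t×K against the inequality; the function name and the rounding convention (NumPy's rint) are assumptions:

```python
import numpy as np

def find_integer_params(b: float, t: float, F: int):
    """Search for a common positive integer K plus integer T, B such that
    the staircase S[i] = [i*t - b] can be reproduced with integer-only
    arithmetic; acceptance follows max_i(i*T - K*S[i]) < min_i(...) + K."""
    idx = np.arange(1, 2 ** F)                   # preset range [1, 2**F - 1]
    S = np.rint(idx * t - b)                     # S[i] = [i*t - b]
    for K in range(1, 2 ** F):                   # candidate common integers
        for T in (int(np.floor(t * K)), int(np.ceil(t * K))):  # ints next to t*K
            r = idx * T - K * S                  # i*T - K*[i*t - b]
            if r.max() < r.min() + K:            # inequality from the text
                B = int(r.max())                 # B = max_i(i*T - K*[i*t - b])
                return K, T, B
    return None                                  # no exact integer transfer found
```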
Optionally, the neural network model quantization method provided by the embodiments of the present disclosure may further include: obtaining, based on the common positive integer, the integer offset of the merged layer, and the integer quantization interval, the initial output feature of the merged layer after conversion to integer parameters, the initial output feature after conversion being

$$\left[\frac{NT-B}{K}\right]$$

and performing a clipping operation on the initial output feature of the merged layer after conversion to integer parameters to obtain the target output feature of the merged layer after conversion to integer parameters, where K is the common positive integer.
Specifically, based on the obtained common positive integer K, integer offset B, and integer quantization interval T, the initial output feature after conversion to integer parameters, equivalent to the initial output feature $[Nt-b]$, can be determined as

$$\left[\frac{NT-B}{K}\right]$$

By applying the clipping operation to this initial output feature, the target output feature after conversion to integer parameters is obtained:

$$\text{clip}\left(\left[\frac{NT-B}{K}\right],\,0,\,2^F-1\right)$$

where clip is the clipping operator. This means that when the initial output feature $\left[\frac{NT-B}{K}\right]$ after conversion to integer parameters is greater than $2^F-1$, the target output feature of the merged layer after conversion is set to $2^F-1$; when it is less than 0, the target output feature is set to 0; and when it is greater than 0 and less than $2^F-1$, the target output feature equals the initial output feature after conversion, with the value unchanged.
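Assuming B is taken as the maximum in the search above, the integer-only forward pass could be sketched as follows; ceiling division is a convention assumed here, since the published formulas are images:

```python
def merged_layer_int(N: int, K: int, T: int, B: int, F: int) -> int:
    """Integer-only merged-layer forward: recover S[N] via ceiling division
    (consistent with B = max in the search sketch), then clip."""
    y = -((B - N * T) // K)            # ceil((N*T - B) / K) in pure ints
    lo, hi = 0, 2 ** F - 1
    return min(max(y, lo), hi)         # clip into [0, 2**F - 1]

# Example with the hypothetical parameters K=1, T=1, B=1, F=2:
assert merged_layer_int(3, 1, 1, 1, 2) == 2
```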
In a specific implementation, the above neural network model may be a neural network model whose training has been completed; its training may be performed with reference to the following steps a to c:
Step a: in the iterative training process of the neural network model, determine candidate values of the convolution layer's weights based on the preset quantization bit-width value, and determine the quantized weights of the convolution layer according to the candidate values and the original weights of the convolution layer, where the quantized weights of the convolution layer are floating-point numbers.
In each training iteration of the neural network model, the candidate values of the convolution layer's quantized weights can be determined based on the preset quantization bit-width value as

$$\left\{\frac{i}{2^F-1}\;\middle|\;i=0,1,\ldots,2^F-1\right\}$$

where F is the preset quantization bit-width value and the weights of the convolution layer take values in $[0,1]$. A value satisfying a preset condition can be selected from the candidate values as the quantized weight of the convolution layer. The preset quantization bit-width value restricts the value range and representation space of the convolution layer's weights, so that the weights are transferred to floating-point numbers conforming to the preset quantization bit width. The preset condition may be the value with the smallest absolute difference from the convolution layer's weight, i.e., the value closest to that weight. For example, when the preset quantization bit-width value is 2, the number of candidate values of the quantized weights is limited to $2^F=2^2=4$, and these 4 candidates are taken uniformly from $[0,1]$, namely $0$, $\frac{1}{3}$, $\frac{2}{3}$, and $1$. If a weight of the convolution layer is 0.6223, the absolute differences between this weight and each candidate value are computed, and the candidate with the smallest absolute difference, $\frac{2}{3}\approx 0.67$, is taken as the quantized weight of the convolution layer. Quantization of the convolution layer's parameters is thereby achieved during training, converting the weights into floating-point numbers that conform to the preset quantization bit width.
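A minimal sketch of this training-time weight quantization, assuming weights in [0, 1] (the function name is a placeholder):

```python
import numpy as np

def quantize_weight(w: np.ndarray, F: int) -> np.ndarray:
    """Snap each float weight in [0, 1] to the nearest of the 2**F uniform
    candidate values i / (2**F - 1), i = 0 .. 2**F - 1."""
    levels = 2 ** F - 1
    return np.rint(w * levels) / levels     # nearest candidate value

print(quantize_weight(np.array([0.6223]), F=2))   # -> [0.66666667], i.e. 2/3
```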
Step b: set each parameter in the normalization layer as a floating-point number, where the parameters include the mean, the variance, the scaling value, and the offset.
The parameters of the normalization layer can be quantized. The formula of the normalization layer is

$$\gamma\cdot\frac{N-\text{mean}}{\sqrt{\text{variance}+\epsilon}}+\beta$$

and each parameter in the normalization layer can be set as a floating-point number.
Step c: determine the output feature of the quantized activation layer based on the preset quantization bit-width value.
A clipping operation can be applied to the output parameters of the quantized activation layer based on the preset quantization bit-width value to obtain the output feature of the quantized activation layer, where the clipping operation is computed as

$$\text{clip}\left(\left[\frac{M}{s}\right],\,0,\,2^F-1\right)$$

where clip is the clipping operator and $\left[\frac{M}{s}\right]$ determines the output feature of the quantized activation layer: when $\left[\frac{M}{s}\right]<0$, the output feature of the quantized activation layer is 0, and when $\left[\frac{M}{s}\right]>2^F-1$, the output feature of the quantized activation layer is $2^F-1$. M is the input feature of the quantized activation layer, s is the quantization interval of the quantized activation layer, and F is the preset quantization bit-width value. The clipping operation restricts the output feature of the quantized activation layer to $[0,\,2^F-1]$: when $\left[\frac{M}{s}\right]$ lies between 0 and $2^F-1$, the value computed by $\left[\frac{M}{s}\right]$ is output as the output feature of the quantized activation layer; when $\left[\frac{M}{s}\right]<0$, 0 is output as the output feature; and when $\left[\frac{M}{s}\right]>2^F-1$, $2^F-1$ is output as the output feature.
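This training-time activation could be sketched as follows (the rounding convention is assumed):

```python
import numpy as np

def quantized_activation(M: np.ndarray, s: float, F: int) -> np.ndarray:
    """Training-time quantized activation: round M/s, then clip the
    output feature into [0, 2**F - 1] per the formula above."""
    return np.clip(np.rint(M / s), 0, 2 ** F - 1)
```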
In a specific implementation, the neural network model quantization method provided by this embodiment may further include: sending the quantized neural network model to a target electronic device, so that the target electronic device executes the target task corresponding to the quantized neural network model. By sending the quantized convolutional neural network, in which every network layer uses integer parameters, to a target electronic device that needs to perform target recognition, and executing the corresponding target task on that device, accelerated inference of the target task can be achieved. The target task may be related to the training process of the neural network model; for example, the target task is recognizing a target in an image, and the neural network model was trained on that target.
The neural network model quantization method provided by this embodiment integrates the normalization layer containing floating-point operations and the quantized activation function of the neural network model into a single network layer that operates only on integer parameters, so that the quantized neural network model can run on target devices that only support fixed-point or integer operations, reducing the computational complexity and resource overhead of the neural network model.
On the basis of the foregoing embodiments, this embodiment provides an example of quantizing a convolutional neural network by applying the foregoing neural network model quantization method. Referring to the convolutional neural network quantization flowchart shown in FIG. 3, the process may be performed with reference to the following steps S302 to S306:
Step S302: in the iterative training of the convolutional neural network, quantize the parameters of the target network layers of the convolutional neural network until the iterative training ends, obtaining the trained convolutional neural network.
The target network layers may include the convolution layer, the normalization layer, and the quantized activation layer, and may also include other network layers. In the training stage of the convolutional neural network, referring to the parameter transfer and deployment flowchart of the convolutional neural network shown in FIG. 4, in each training iteration the parameters of the target network layers can be converted into floating-point parameters based on the preset quantization bit-width value, and a clipping operation can be applied to the output feature of the quantized activation layer to restrict the output of the target network layers to $[0,\,2^F-1]$, so that the output features of the target network layers conform to the preset quantization bit-width value.
Step S304: perform network-layer merging and parameter deployment on the trained convolutional neural network, and transfer the floating-point parameters in the trained convolutional neural network to integer parameters based on the preset operation-equivalence algorithm, obtaining the quantized convolutional neural network.
As shown in FIG. 4, in the deployment stage of the convolutional neural network, the normalization layer and the quantized activation layer of the convolutional neural network can be merged so that a single merged layer performs the combined operation. The weights of the convolution layer are scaled and offset to convert the convolution layer's parameters into integer parameters, and the parameters in the merged layer's computation formula can then be converted into integer parameters.
Step S306: load the quantized convolutional neural network onto the target device, so that the quantized convolutional neural network achieves inference acceleration on the target device.
As shown in FIG. 4, in the acceleration stage of the convolutional neural network, the quantized convolutional neural network with integer parameters can be ported or loaded onto a target device that needs to perform target recognition. Since the quantized convolutional neural network has reduced computational complexity, it can achieve accelerated inference of the target recognition process on the target device.
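Tying the three stages together, a hedged end-to-end sketch reusing the hypothetical helpers from the earlier blocks (the model/block attribute names are invented for illustration) might look like:

```python
def deploy_quantized_network(model, F):
    """Sketch of stages S302-S306: every helper here is hypothetical and
    defined (or sketched) in the earlier code blocks."""
    for block in model.blocks:
        # Transfer stage: fold each (normalization, activation) pair into b, t
        b, t = merge_norm_and_activation(block.mean, block.variance,
                                         block.gamma, block.beta, block.s)
        # Deployment stage: integer conv weights plus integer K, T, B per block
        block.int_weight = convert_conv_weights_to_int(block.weight, F)
        block.K, block.T, block.B = find_integer_params(b, t, F)
    return model   # ready to load onto a fixed-point / low-bit-width device
```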
The neural network model quantization method provided by this embodiment enables the quantized convolutional neural network to run on target devices that only support fixed-point or integer operations, reducing the computational complexity and resource overhead of the convolutional neural network.
Corresponding to the neural network model quantization method provided in the second embodiment, an embodiment of the present disclosure provides a neural network model quantization apparatus. Referring to the schematic structural diagram of a neural network model quantization apparatus shown in FIG. 5, the apparatus includes the following modules:
An obtaining module 51, configured to obtain a neural network model, where the neural network model includes a convolution layer, a normalization layer, and a quantized activation layer, and the output features of the quantized activation layer are integer features.
A processing module 52, configured to convert the parameters of the convolution layer into integer parameters.
A merging module 53, configured to merge the normalization layer and the quantized activation layer to obtain a merged layer.
A quantization module 54, configured to convert the parameters of the merged layer into integer parameters to obtain the quantized neural network model.
The neural network model quantization apparatus provided by this embodiment transfers the weights of the convolution layer of the trained neural network model to integer weights and converts the parameters of the merged layer obtained by merging the normalization layer and the quantized activation layer into integer parameters based on a preset operation-equivalence algorithm, which reduces the computational complexity of the neural network model, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.
In one implementation, the merging module 53 is optionally configured to take the output feature of the normalization layer as the input feature of the quantized activation layer, merge the normalization layer and the quantized activation layer to obtain the initial output feature of the merged layer, and perform a clipping operation on the initial output feature of the merged layer to obtain the target output feature of the merged layer.
In one implementation, the merging module 53 is optionally configured to set the target output feature of the merged layer to 0 when the initial output feature of the merged layer is less than 0, and to $2^F-1$ when the initial output feature of the merged layer is greater than $2^F-1$, thereby restricting the magnitude of the initial output feature of the merged layer to $[0,\,2^F-1]$.
In one implementation, the initial output feature of the merged layer is $[Nt-b]$, where N is the input feature of the normalization layer, b is the offset of the merged layer, t is the quantization interval of the merged layer, b and t are both floating-point numbers, and b and t are obtained by merging the parameters of the normalization layer and the quantized activation layer.
In one implementation, the quantization module 54 is optionally configured to determine the common positive integer of the merged layer using a preset operation-equivalence algorithm, and to convert, based on the common positive integer, the parameters b and t of the merged layer from floating-point numbers into integer parameters, obtaining parameter B and parameter T respectively, where B is the integer offset and T is the integer quantization interval.
In one implementation, the quantization module 54 is optionally configured to obtain, based on the common positive integer, the integer offset of the merged layer, and the integer quantization interval, the initial output feature of the merged layer after conversion to integer parameters, which is $\left[\frac{NT-B}{K}\right]$, and to perform a clipping operation on this initial output feature to obtain the target output feature of the merged layer after conversion to integer parameters, where K is the common positive integer. In one implementation, the quantization module 54 is optionally configured to select, from a preset range, candidate sequences satisfying a preset condition, where the preset range is $[1,\,2^F-1]$; to extract the minimum value from each candidate sequence to form a target sequence; and to take the maximum value in the target sequence as the common positive integer of the merged layer.
In one implementation, the processing module 52 is optionally configured to scale and offset the weights of the convolution layer to obtain the integer weights of the convolution layer.
In one implementation, referring to the schematic structural diagram of another neural network model quantization apparatus shown in FIG. 6, the apparatus further includes:
A training module 60, configured to: in the iterative training process of the neural network model, determine candidate values of the convolution layer's weights based on the preset quantization bit-width value, and determine the quantized weights of the convolution layer according to the candidate values and the original weights of the convolution layer, where the quantized weights of the convolution layer are floating-point numbers; set each parameter in the normalization layer as a floating-point number, where the parameters include the mean, the variance, the scaling value, and the offset; and determine the output feature of the quantized activation layer based on the preset quantization bit-width value.
In one implementation, the training module 60 is optionally configured to determine, based on the preset quantization bit-width value, the candidate values of the convolution layer's quantized weights as $\frac{i}{2^F-1}$, $i=0,1,\ldots,2^F-1$, where F is the preset quantization bit-width value and the weights of the convolution layer take values in $[0,1]$, and to select a value satisfying a preset condition from the candidate values as the quantized weight of the convolution layer.
In one implementation, the training module 60 is optionally configured to apply a clipping operation to the output parameters of the quantized activation layer based on the preset quantization bit-width value to obtain the output feature of the quantized activation layer, where the clipping operation is computed as

$$\text{clip}\left(\left[\frac{M}{s}\right],\,0,\,2^F-1\right)$$

where clip is the clipping operator and $\left[\frac{M}{s}\right]$ determines the output feature of the quantized activation layer: when $\left[\frac{M}{s}\right]<0$, the output feature of the quantized activation layer is 0, and when $\left[\frac{M}{s}\right]>2^F-1$, the output feature of the quantized activation layer is $2^F-1$. M is the input feature of the quantized activation layer, s is the quantization interval of the quantized activation layer, and F is the quantization bit-width value.
In one implementation, the apparatus further includes: a sending module, configured to send the quantized neural network model to a target electronic device, so that the target electronic device executes the target task corresponding to the quantized neural network model.
The neural network model quantization apparatus provided by this embodiment can integrate the normalization layer containing floating-point operations and the quantized activation function of the neural network model into a single network layer that operates only on integer parameters, so that the quantized neural network model can run on target devices that only support fixed-point or integer operations, reducing the computational complexity and resource overhead of the neural network model.
The implementation principle and technical effects of the apparatus provided by this embodiment are the same as those of the foregoing embodiments. For brevity, for matters not mentioned in the apparatus embodiment, reference may be made to the corresponding content in the foregoing method embodiments.
An embodiment of the present disclosure provides a mobile device, where the mobile device may be communicatively connected to the electronic device provided in the foregoing embodiments, and the electronic device may be used to execute the neural network model quantization method provided in the foregoing embodiments.
The mobile device is configured to obtain the quantized neural network model produced by running the electronic device and to perform image processing based on the quantized neural network model, where the mobile device includes a smart terminal and/or a smart camera.
An embodiment of the present disclosure provides a computer-readable medium storing computer-executable instructions that, when invoked and executed by a processor, cause the processor to implement the neural network model quantization method described in the foregoing embodiments.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the system described above may refer to the corresponding process in the foregoing embodiments, which will not be repeated here.
The computer program product of the neural network model quantization method, apparatus, and electronic device provided by the embodiments of the present disclosure includes a computer-readable storage medium storing program code, and the instructions included in the program code can be used to execute the methods described in the foregoing method embodiments. For specific implementation, reference may be made to the method embodiments, which will not be repeated here.
In addition, in the description of the embodiments of the present disclosure, unless otherwise explicitly specified and defined, the terms "mounted", "connected", and "coupled" should be understood broadly; for example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct, indirect via an intermediate medium, or an internal communication between two elements. Those of ordinary skill in the art can understand the specific meanings of the above terms in the present disclosure according to specific circumstances.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present disclosure, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure. The foregoing storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
In the description of the present disclosure, it should be noted that orientations or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", and "outer" are based on the orientations or positional relationships shown in the drawings and are only for convenience of describing the present disclosure and simplifying the description, rather than indicating or implying that the referred devices or elements must have specific orientations or be constructed and operated in specific orientations; they therefore cannot be construed as limiting the present disclosure. In addition, the terms "first", "second", and "third" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance.
Finally, it should be noted that the above embodiments are merely specific implementations of the present disclosure used to illustrate, not limit, its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person skilled in the art can still, within the technical scope disclosed by the present disclosure, modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions of some technical features; such modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure and shall all be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Industrial Applicability
This embodiment provides a neural network model quantization method, apparatus, electronic device, and computer-readable storage medium. By transferring the weights of the convolution layer of the trained neural network model to integer weights and converting the parameters of the merged layer obtained by merging the normalization layer and the quantized activation layer into integer parameters, the computational complexity of the neural network model is reduced, so that the quantized neural network model can run with acceleration on target devices that only support fixed-point or low-bit-width operations.

Claims (16)

  1. A neural network model quantization method, characterized by comprising:
    obtaining a neural network model, wherein the neural network model comprises a convolution layer, a normalization layer, and a quantized activation layer, and output features of the quantized activation layer are integer features;
    converting parameters of the convolution layer into integer parameters;
    merging the normalization layer and the quantized activation layer to obtain a merged layer;
    converting parameters of the merged layer into integer parameters to obtain a quantized neural network model.
  2. The method according to claim 1, characterized in that the step of merging the normalization layer and the quantized activation layer comprises:
    taking an output feature of the normalization layer as an input feature of the quantized activation layer, and merging the normalization layer and the quantized activation layer to obtain an initial output feature of the merged layer;
    performing a clipping operation on the initial output feature of the merged layer to obtain a target output feature of the merged layer.
  3. The method according to claim 1 or 2, characterized in that the step of performing a clipping operation on the initial output feature of the merged layer to obtain the target output feature of the merged layer comprises:
    when the initial output feature of the merged layer is less than 0, setting the target output feature of the merged layer to 0; when the initial output feature of the merged layer is greater than $2^F-1$, setting the target output feature of the merged layer to $2^F-1$.
  4. The method according to any one of claims 1 to 3, characterized in that the initial output feature of the merged layer is $[Nt-b]$, wherein N is the input feature of the normalization layer, b is an offset of the merged layer, t is a quantization interval of the merged layer, b and t are both floating-point numbers, and b and t are obtained by merging parameters of the normalization layer and the quantized activation layer.
  5. The method according to any one of claims 1 to 4, characterized in that the step of converting the parameters of the merged layer into integer parameters comprises:
    determining a common positive integer of the merged layer using a preset operation-equivalence algorithm;
    converting, based on the common positive integer, the parameters b and t of the merged layer from floating-point numbers into integer parameters, obtaining a parameter B and a parameter T respectively;
    wherein B is an integer offset and T is an integer quantization interval.
  6. The method according to any one of claims 1 to 5, characterized in that the method further comprises:
    obtaining, based on the common positive integer, the integer offset of the merged layer, and the integer quantization interval, an initial output feature of the merged layer after conversion to integer parameters, the initial output feature being $\left[\frac{NT-B}{K}\right]$;
    performing a clipping operation on the initial output feature of the merged layer after conversion to integer parameters to obtain a target output feature of the merged layer after conversion to integer parameters, wherein K is the common positive integer.
  7. The method according to any one of claims 1 to 6, characterized in that the step of determining the common positive integer of the merged layer using the preset operation-equivalence algorithm comprises:
    selecting, from a preset range, candidate sequences satisfying a preset condition, wherein the preset range is $[1,\,2^F-1]$;
    extracting a minimum value from each of the candidate sequences to form a target sequence;
    taking a maximum value in the target sequence as the common positive integer of the merged layer.
  8. The method according to any one of claims 1 to 7, characterized in that converting the parameters of the convolution layer into integer parameters comprises:
    scaling and offsetting weights of the convolution layer to obtain integer weights of the convolution layer.
  9. The method according to any one of claims 1 to 8, characterized in that, before being obtained, the neural network model is trained in the following manner:
    in an iterative training process of the neural network model, determining candidate values of weights of the convolution layer based on a preset quantization bit-width value, and determining quantized weights of the convolution layer according to the candidate values and original weights of the convolution layer, wherein the quantized weights of the convolution layer are floating-point numbers;
    setting each parameter in the normalization layer as a floating-point number, wherein the parameters include a mean, a variance, a scaling value, and an offset;
    determining an output feature of the quantized activation layer based on the preset quantization bit-width value.
  10. The method according to claim 9, characterized in that the step of determining the candidate values of the weights of the convolution layer based on the preset quantization bit-width value and determining the quantized weights of the convolution layer according to the candidate values and the original weights of the convolution layer comprises:
    determining, based on the preset quantization bit-width value, the candidate values of the quantized weights of the convolution layer as $\frac{i}{2^F-1}$, $i=0,1,\ldots,2^F-1$,
    wherein F is the preset quantization bit-width value, and the weights of the convolution layer take values in $[0,1]$;
    selecting a value satisfying a preset condition from the candidate values as the quantized weight of the convolution layer.
  11. The method according to claim 9 or 10, characterized in that the step of determining the output feature of the quantized activation layer based on the quantization bit-width value comprises:
    performing a clipping operation on output parameters of the quantized activation layer based on the preset quantization bit-width value to obtain the output feature of the quantized activation layer, wherein the clipping operation is computed as
    $$\text{clip}\left(\left[\frac{M}{s}\right],\,0,\,2^F-1\right)$$
    wherein clip is the clipping operator, and $\left[\frac{M}{s}\right]$ determines the output feature of the quantized activation layer: when $\left[\frac{M}{s}\right]<0$, the output feature of the quantized activation layer is 0, and when $\left[\frac{M}{s}\right]>2^F-1$, the output feature of the quantized activation layer is $2^F-1$; M is the input feature of the quantized activation layer, s is the quantization interval of the quantized activation layer, and F is the preset quantization bit-width value.
  12. The method according to any one of claims 1 to 11, characterized in that the quantized neural network model is sent to a target electronic device, so that the target electronic device executes a target task corresponding to the quantized neural network model.
  13. A neural network model quantization apparatus, characterized by comprising:
    an obtaining module, configured to obtain a neural network model, wherein the neural network model comprises a convolution layer, a normalization layer, and a quantized activation layer, and output features of the quantized activation layer are integer features;
    a processing module, configured to convert parameters of the convolution layer into integer parameters;
    a merging module, configured to merge the normalization layer and the quantized activation layer to obtain a merged layer;
    a quantization module, configured to convert parameters of the merged layer into integer parameters to obtain a quantized neural network model.
  14. An electronic device, characterized by comprising: a processor and a storage device;
    wherein a computer program is stored on the storage device, and the computer program, when run by the processor, executes the method according to any one of claims 1 to 12.
  15. A mobile device, characterized by being communicatively connected to the electronic device according to claim 14;
    wherein the mobile device is configured to obtain the quantized neural network model produced by running the electronic device, and to perform image processing based on the quantized neural network model; the mobile device includes a smart terminal and/or a smart camera.
  16. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when run by a processor, executes the steps of the method according to any one of claims 1 to 12.