WO2019238029A1 - Convolutional neural network system and method for quantizing a convolutional neural network - Google Patents

Convolutional neural network system and method for quantizing a convolutional neural network

Info

Publication number
WO2019238029A1
Authority
WO
WIPO (PCT)
Prior art keywords
convolution
layer
quantization
convolution layer
quantized
Prior art date
Application number
PCT/CN2019/090660
Other languages
English (en)
French (fr)
Inventor
郭鑫
罗龙强
余国生
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2019238029A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Definitions

  • This application relates to the field of convolutional neural networks, and more specifically, to a convolutional neural network system and a method for quantizing a convolutional neural network.
  • After training, a deep convolutional neural network has millions or even tens of millions of parameters, for example the weight parameters and bias parameters included in the model parameters, as well as the feature map parameters of each convolution layer. Model parameters and feature map parameters are stored as 32-bit values. Because of the large number of parameters and the large amount of data, the entire convolution calculation process requires a large amount of storage and computing resources. The development of deep convolutional neural networks is moving toward "deeper, larger, and more complex" designs. As far as the model size of deep convolutional neural networks is concerned, such models cannot be ported to mobile phones or embedded chips, and when transmitted over a network, the high bandwidth occupancy often becomes a problem for engineering implementation.
  • At present, solutions for reducing the complexity of a convolutional neural network without reducing its accuracy are mainly realized by quantizing the parameters of the convolutional neural network.
  • However, current quantization methods cause the accuracy of the convolutional neural network to decline, affecting the user experience.
  • This application provides a convolutional neural network system and a method for quantizing a convolutional neural network.
  • Convolution calculation is performed on the quantized input data, the quantized weight, and the quantized offset to obtain the calculation result of each convolution layer. This reduces the amount of calculation of the convolutional neural network and improves the quantization accuracy of the convolutional neural network.
  • A convolutional neural network system is provided, including: a quantization module configured to quantize the input data of the i-th convolution layer of the system and the weight and offset of the i-th convolution layer, respectively, where i is a positive integer; and a convolution module configured to perform convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the quantized offset to obtain the convolution result of the i-th convolution layer.
  • The convolutional neural network system uses the quantization module to quantize the weights and offsets of the convolution layers and the input data to the convolution layers.
  • The convolution module performs convolution calculation on the quantized input data, the quantized weights, and the quantized offsets to obtain the calculation result of each convolution layer. This makes the calculation results obtained with this system more accurate, reduces the amount of calculation of the convolutional neural network, reduces the amount of data stored for the convolutional neural network model and the convolution results, and improves the quantization accuracy of the convolutional neural network.
  • The convolution module includes: a multiplier configured to multiply the quantized input data of the i-th convolution layer by the quantized weight; and an adder configured to add the output of the multiplier and the quantized offset to obtain the convolution result of the i-th convolution layer.
  • When i is equal to 1, the input data of the i-th convolution layer is the original input picture; when i is greater than 1, the input data of the i-th convolution layer is feature map data.
  • The quantization module is further configured to: perform, on the convolution result, the inverse quantization corresponding to the weight quantization and the offset quantization, where the convolution result of the i-th convolution layer after the inverse quantization is the input data of the (i+1)-th convolution layer.
  • the reversibility of quantization and the accuracy of the model are maintained, and the accuracy and precision of the convolutional neural network are further improved.
  • The quantization module is further configured to: perform, on the convolution result, the inverse quantization corresponding to the weight quantization and the offset quantization, and then perform feature map inverse quantization on the result obtained after that inverse quantization. The convolution module is further configured to: perform convolution calculation on the result of the feature map inverse quantization, the weight of the (i+1)-th convolution layer, and the offset of the (i+1)-th convolution layer to obtain the convolution result of the (i+1)-th convolution layer.
  • the reversibility of the quantization is maintained, and the accuracy and precision of the convolutional neural network are further improved.
  • The quantization module is further configured to: correct the quantized offset; and the convolution module is specifically configured to: perform convolution calculation on the quantized input data, the quantized weight, and the corrected offset to obtain the convolution result of the i-th convolution layer.
  • In this way, the accuracy of quantization is further improved, and the accuracy and precision of the model are guaranteed.
  • The system further includes: a quantization parameter obtaining module configured to obtain the quantization parameter of the input data of the i-th convolution layer, the quantization parameter of the weight of the i-th convolution layer, and the quantization parameter of the offset. The quantization module is specifically configured to: quantize the input data of the i-th convolution layer according to the quantization parameter of the input data of the i-th convolution layer, quantize the weight according to the quantization parameter of the weight of the i-th convolution layer, and quantize the offset according to the quantization parameter of the offset of the i-th convolution layer.
  • A method for quantizing a convolutional neural network is provided, including: quantizing the input data of the i-th convolution layer of the convolutional neural network and the weight and offset of the i-th convolution layer, respectively, where i is a positive integer; and performing convolution calculation on the quantized input data of the convolution layer, the quantized weight, and the quantized offset to obtain the convolution result of the i-th convolution layer.
  • The quantization method quantizes the weights and offsets of the convolution layers and the input data input to the convolution layers, and performs convolution calculation on the quantized input data, the quantized weight, and the quantized offset to obtain the calculation result of each convolution layer.
  • This makes the calculation results more accurate, reduces the amount of calculation of the convolutional neural network, reduces the amount of stored data for the convolutional neural network model and the convolution results, and improves the quantization accuracy of the convolutional neural network.
  • Performing convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the quantized offset to obtain the convolution result of the i-th convolution layer includes: multiplying the quantized input data of the i-th convolution layer by the quantized weight; and adding the result of the multiplication and the quantized offset to obtain the convolution result of the i-th convolution layer.
  • When i is equal to 1, the input data of the i-th convolution layer is the original input picture; when i is greater than 1, the input data of the i-th convolution layer is feature map data.
  • When the data for the (i+1)-th layer convolution calculation is data to be quantized, the method further includes: performing, on the convolution result of the i-th convolution layer, the inverse quantization corresponding to the weight quantization and the offset quantization, where the convolution result of the i-th convolution layer after the inverse quantization is the input data of the (i+1)-th convolution layer.
  • The method further includes: performing, on the convolution result of the i-th convolution layer, the inverse quantization corresponding to the weight quantization and the offset quantization; performing feature map inverse quantization on the result obtained after that inverse quantization; and performing convolution calculation on the result of the feature map inverse quantization, the weight of the (i+1)-th convolution layer, and the offset of the (i+1)-th convolution layer to obtain the convolution result of the (i+1)-th convolution layer.
  • The method further includes: correcting the quantized offset. Performing convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the quantized offset then includes: performing convolution calculation on the quantized input data of the convolution layer, the quantized weight, and the corrected offset to obtain the convolution result of the i-th convolution layer.
  • The method further includes: obtaining the quantization parameter of the input data of the i-th convolution layer, the quantization parameter of the weight of the i-th convolution layer, and the quantization parameter of the offset. Quantizing the input data of the i-th convolution layer of the convolutional neural network and the weight and offset of the i-th convolution layer respectively then includes: quantizing the input data of the i-th convolution layer according to the quantization parameter of the input data of the i-th convolution layer, quantizing the weight according to the quantization parameter of the weight of the i-th convolution layer, and quantizing the offset according to the quantization parameter of the offset of the i-th convolution layer.
  • A chip is provided, including a quantization module and a convolution module, and configured to support the chip in performing the corresponding functions in the foregoing method.
  • A computer system is provided, including a quantization module and a convolution module, and configured to support the computer system in performing the corresponding functions in the foregoing method.
  • a computer-readable storage medium for storing a computer program, the computer program including instructions for executing the method of the second aspect or any one of the possible implementation manners of the second aspect.
  • A computer program product is provided, including instructions for executing the method of the second aspect or any one of the possible implementation manners of the second aspect.
  • FIG. 1 is a schematic block diagram of a convolutional neural network system structure according to an embodiment of the present application.
  • FIG. 2 is a schematic block diagram of a structure of a convolutional neural network system according to another embodiment of the present application.
  • FIG. 3 is a schematic block diagram of a structure of a convolutional neural network system according to another embodiment of the present application.
  • FIG. 4 is a schematic block diagram of a convolutional neural network system structure according to another embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a convolutional neural network quantization method according to an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a convolutional neural network quantization method according to another embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a convolutional neural network quantization method according to another embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a convolutional neural network quantization method according to another embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a convolutional neural network quantization method according to another embodiment of the present application.
  • Quantization is the process of mapping a set of numbers in an original range to a target range through a mathematical transformation. Available methods include table lookup, shifting, and truncation. Often a linear transformation is used, and this transformation is usually implemented as a multiplication.
  • Inverse quantization is the process of transforming the quantized numbers back to the original range based on the preceding linear transformation (the quantization process). Inverse quantization ensures that when the system performs calculations on the quantized data under a given calculation rule, the results after inverse quantization remain very close, in value range, to the results that the same calculation rule would produce on data in the original range. On this basis, the accuracy loss of the convolutional neural network is small.
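  • As a concrete illustration of the two definitions above, the following minimal Python sketch implements linear quantization and inverse quantization with a power-of-2 multiplier; the 8-bit signed clipping range and the shift value are illustrative assumptions, not values taken from this publication.

```python
import numpy as np

def quantize(x: np.ndarray, shift: int) -> np.ndarray:
    """Linear quantization: amplify by 2**shift, round, and clip (overflow protection)."""
    q = np.round(x * (1 << shift))
    return np.clip(q, -128, 127)  # illustrative 8-bit signed target range

def dequantize(q: np.ndarray, shift: int) -> np.ndarray:
    """Inverse quantization: remove the same amplification multiplier."""
    return q / (1 << shift)

x = np.array([0.12, -0.53, 0.91])
q = quantize(x, shift=7)          # amplification multiplier 2**7 = 128
print(dequantize(q, shift=7))     # close to x, so the accuracy loss is small
```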
  • Each layer of data is given an amplification multiplier.
  • The multiply-accumulate output of the convolution performed with this amplification multiplier must then have the same amplification multiplier (quantization parameter) removed, to ensure that the value range of the whole calculation process is reversible and that the ranges remain comparable.
  • The premise of this reversible calculation is that the multiply-accumulate underlying the convolution calculation is a linear process, as the sketch below demonstrates.
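  • A small numeric check of this premise (a sketch under illustrative values, not part of the publication): because multiply-accumulate is linear, amplification multipliers applied to the data and the weights factor out of the result and can be removed afterwards.

```python
import numpy as np

x = np.random.rand(5)                  # input data
w = np.random.rand(5)                  # weights
gamma, alpha = 2.0 ** 7, 2.0 ** 5      # amplification multipliers (quantization parameters)

exact = np.dot(x, w)                   # multiply-accumulate in the original range
scaled = np.dot(x * gamma, w * alpha)  # the same multiply-accumulate on amplified data
print(np.allclose(scaled / (gamma * alpha), exact))  # True: the multiplier is removable
```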
  • The retrain method refers to the process of slightly modifying the characteristics of a trained convolutional neural network model (hereinafter, the "model") and retraining on that basis; it is a fine-tuning of network training.
  • Hyperparameters, a common concept in deep learning, refer to the configuration parameters of the model training process.
  • Convolutional layers in deep convolutional neural networks generally include multiple layers. The model obtained after neural network training contains millions or even tens of millions of parameters. These parameters can include the weight parameters and bias parameters of each convolution layer in the convolutional neural network model, as well as the feature map parameters of each convolution layer. Because of the large number of parameters and the large amount of data, the entire convolution calculation process requires a large amount of storage and computing resources. At present, solutions for reducing the complexity of a convolutional neural network without reducing its accuracy are mainly realized by quantizing the parameters of the convolutional neural network.
  • OP stands for overflow protection
  • round stands for rounding
  • FM stands for feature map
  • W stands for weight
  • Bias stands for bias
  • α and β denote the quantization parameters of the weight and the bias, respectively.
  • In the prior art, the scheme for quantizing a model is mainly a quantization design for the weights and feature maps.
  • The prior-art quantization schemes do not quantize the offset, yet the offset is a very important parameter in the model and has a very large impact on the accuracy of the entire model.
  • The purpose of model compression is mainly to reduce the complexity of the model and reduce the size of the model.
  • Model compression is mainly applied to classification and target detection. Classification distinguishes what the object in a picture is, while target detection first finds the position of the object in the picture and then determines what the object is.
  • In view of this, the present application provides a convolutional neural network system that can support the quantization of offsets, improve the calculation accuracy of the system, reduce the amount of calculation of the convolutional neural network, reduce the amount of data stored for the convolutional neural network model and the convolution results, and improve the user experience.
  • FIG. 1 is a schematic block diagram of a convolutional neural network system provided in the present application. As shown in FIG. 1, the system 100 includes a quantization module 110 and a convolution module 120.
  • The quantization module 110 is configured to quantize the input data of the i-th convolution layer of the system (the input data of the i-th convolution calculation) and the weight and offset of the i-th convolution layer, where i is a positive integer and the data of the i-th convolution calculation is data to be quantized.
  • the convolution module 120 is configured to perform convolution calculation on the quantized input data of the i-th layer convolution layer, the quantized weight, and the quantized offset to obtain a convolution result of the i-th convolution layer.
  • The convolutional neural network system uses the quantization module to quantize the weights and offsets of the convolution layers and the input data input to the convolution layers.
  • The convolution module then performs convolution calculation on the quantized input data, the quantized weight, and the quantized offset to obtain the calculation result of each convolution layer. This makes the calculation results obtained with this system more accurate, reduces the amount of calculation of the convolutional neural network, reduces the amount of data stored for the convolutional neural network model and the convolution results, and improves the quantization accuracy of the convolutional neural network.
  • The convolutional layer includes multiple layers, and each convolution layer has its own convolution model, that is, its own weight and offset values; the convolution model can be understood as a computational model.
  • The convolution model of the i-th convolution layer (the i-th convolution layer is a logical convolution layer concept) is the model shown in formula (2):
  • y = a * x + b    (2)
  • where a can be understood as the weight, b can be understood as the offset, and x can be understood as the input data input to the i-th convolution layer.
  • the quantization module 110 may perform quantization processing on multiple convolutional layers in the convolutional neural network.
  • For the i-th convolution layer, the weights, offsets, and input data are quantized.
  • the data calculated by the i-th layer convolution is data to be quantized.
  • the i-th layer convolution layer may also be referred to as a convolution layer to be quantized.
  • the data of the i-th convolution calculation includes the input data of the i-th convolution layer, the weight of the i-th convolution layer, and the offset.
  • The input data of the i-th convolution layer can also be called the input data of the i-th convolution calculation; that is, quantizing the input data of the i-th convolution layer can also be called quantizing the input data of the i-th convolution calculation.
  • The quantization module 110 obtains the quantized input data, the quantized weight, and the quantized offset by quantizing the data of the i-th convolution calculation. The quantization of the weights and offsets can be seen as quantization of the convolution model of the i-th convolution layer.
  • the convolution module 120 may perform a convolution calculation on each of the quantized convolution layers to obtain a calculation result of each convolution layer.
  • Although the convolutional layer of a convolutional neural network generally includes multiple layers, this does not mean that all convolution layers of the network need to be quantized; that is, in the convolutional neural network provided in this application, only some of the convolution layers may need to be quantized.
  • The convolution layers that need to be quantized may be continuous or discontinuous. For example, assuming the network has 10 convolution layers in total, the layers to be quantized may be the 2nd to 6th layers, or the 2nd, 4th, 7th, and 9th layers; this is not restricted in the embodiments of the present application.
  • the convolution module 120 may also perform convolution calculations on the convolution layers without quantization to obtain the convolution results.
  • the embodiments of the present application are not limited herein.
  • The system 100 may further include other modules, such as an input module and a pooling module, used to support the system in completing other functions of the convolutional neural network.
  • the embodiments of the present application are not limited herein.
  • system 100 may be a chip or a device, and the chip or device may include a quantization module 110, a convolution module 120, and the like.
  • the embodiments of the present application are not limited herein.
  • When i is equal to 1, the input data of the i-th convolution layer is the original input picture.
  • the input data of the first convolution layer is the original input picture input to the system. That is, the quantization module 110 needs to quantize the original input picture input to the system, and needs to quantize the weight and offset of the first layer convolution layer.
  • the convolution module 120 performs a convolution calculation on the quantized original input picture, the quantized weight, and the quantized offset of the first-layer convolution layer to obtain a convolution calculation result of the first-layer convolution layer.
  • For this quantization, the quantization formula shown in formula (3) may be used, where:
  • IMG represents the original input picture input to the first convolution layer
  • input_img is the matrix representing the original input picture IMG
  • Q(IMG) represents the quantized input data of the first convolution layer
  • γ_1 represents the quantization parameter of the feature map data of the first layer
  • The quantization parameter is a parameter used in the quantization process, equivalent to an amplification multiplier
  • α_1 represents the quantization parameter of the weight of the first convolution layer
  • W_1 represents the weight of the first convolution layer
  • Bias_1 represents the bias of the first convolution layer.
  • The quantization module 110 may also use other formulas to quantize the original input picture of the first convolution layer, for example any modified form of formula (3). The embodiments of the present application are not limited herein; a hedged sketch follows below.
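  • Since formula (3) itself is not reproduced in this text, the following sketch only assumes that the picture is quantized by the same round-after-left-shift pattern later given for the weights in formula (16); the image values and γ_1 are illustrative.

```python
import numpy as np

def quantize_input_picture(input_img: np.ndarray, gamma_1: int) -> np.ndarray:
    """Q(IMG): amplify the original input picture by 2**gamma_1 and round."""
    return np.round(input_img * (1 << gamma_1))

img = np.random.rand(4, 4)                     # stand-in for the original input picture IMG
q_img = quantize_input_picture(img, gamma_1=7)
```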
  • When i is greater than 1, the input data of the i-th convolution layer is feature map data.
  • the convolutional layer of the convolutional neural network includes multiple layers.
  • the input data of the i-th convolutional layer is feature map data.
  • the feature map data represents the calculation result of a convolutional layer, which is an intermediate calculation result for the entire convolutional neural network.
  • the input data is the feature map data of the i-1th convolution layer, that is, the convolution result of the i-1th convolution layer.
  • For example, when i is 5, the feature map data is the convolution result of the 4th convolution layer, that is, the convolution result (feature map data) of the 4th convolution layer calculated by the convolution module 120.
  • the fourth convolution layer can be a quantized convolution layer or a non-quantized convolution layer.
  • When the quantization module 110 quantizes the input data of the i-th convolution layer (that is, quantizes the feature map of the (i-1)-th convolution layer), the quantization formula shown in formula (4) may be used.
  • Q(FM_i) represents the quantized input data of the i-th convolution layer
  • FM_{i-1} represents the input data of the i-th convolution layer (the convolution result of the (i-1)-th convolution layer)
  • γ_i represents the quantization parameter of the feature map data of the i-th convolution layer
  • γ_j represents the quantization parameter of the feature map data of the j-th convolution layer
  • The quantization parameter is a parameter used in the quantization process.
  • α_i represents the quantization parameter of the weight of the i-th convolution layer, W_i represents the weight of the i-th convolution layer, and Bias_i represents the bias of the i-th convolution layer.
  • Besides formula (4), the quantization module 110 may also use other formulas to quantize the input data of the i-th convolution layer (the feature map of the (i-1)-th convolution layer), for example any modified form of formula (4). The embodiments of the present application are not limited herein.
  • the convolution module 120 includes:
  • the multiplier 121 is configured to perform a multiplication operation on the quantized input data of the i-th convolution layer and the quantized weight.
  • the adder 122 is configured to add an output result of the multiplier and the quantized offset to obtain a convolution result of the i-th convolution layer.
  • The convolution module 120 includes the multiplier 121 for multiplying the quantized input data of the convolution layer by the quantized weight to obtain a calculation result.
  • The adder 122 is configured to add the output of the multiplier 121 and the quantized offset to obtain the convolution result of the i-th convolution layer, as in the model shown in formula (2).
  • That is, the multiplier 121 first multiplies the quantized weight a by the quantized input data x, and the adder 122 then adds the multiplier's result and the quantized offset b to obtain the convolution result of the i-th convolution layer.
  • The convolution result of the i-th convolution layer can be used as the output of the entire convolutional part, and the output can be passed to other processing layers of the convolutional neural network (such as the activation function layer) so that the network can perform subsequent calculations.
  • the embodiments of the present application are not limited herein.
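  • The multiplier/adder split can be sketched as follows; a 1-D dot product stands in for the full 2-D convolution, and all operand values are illustrative.

```python
import numpy as np

def conv_layer(q_input: np.ndarray, q_weight: np.ndarray, q_bias: float) -> float:
    product = np.dot(q_input, q_weight)  # multiplier 121: quantized input x quantized weight
    return product + q_bias              # adder 122: add the quantized offset

result = conv_layer(np.array([3.0, 1.0, 2.0]), np.array([2.0, 5.0, 4.0]), q_bias=10.0)
print(result)  # (3*2 + 1*5 + 2*4) + 10 = 29.0
```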
  • the quantization module 110 is further configured to:
  • The convolution model of the i-th convolution layer is quantized (quantization of the weights and offsets) before the convolution calculation on the input data of the i-th convolution layer, and the quantized model is used to perform the convolution calculation. Therefore, after the convolution result of the i-th convolution layer is obtained, inverse quantization is required in order to ensure that the range of the convolution result (feature map data) of the i-th convolution layer lies in the same range as the output of a convolution calculation performed with an unquantized model.
  • The inverse-quantized convolution result of the i-th convolution layer is the input data of the (i+1)-th convolution layer.
  • the data of the i + 1th layer convolution calculation includes the input data of the i + 1th layer convolution layer, the weight of the i + 1th layer convolution layer, and the offset.
  • the quantization module 110 also needs to quantize the input data and quantize the weight and offset of the (i + 1) th convolutional layer.
  • The convolution module 120 also needs to perform convolution calculation on the quantized input data of the (i+1)-th convolution layer, the quantized weight, and the quantized offset to obtain the convolution result of the (i+1)-th convolution layer. This can improve the accuracy and precision of the convolutional neural network.
  • When the quantization module 110 performs model inverse quantization on the convolution result of the i-th convolution layer (inverse quantization corresponding to the quantization of the weights and the quantization of the offset), the inverse quantization may use the formula shown in formula (5):
  • MODEL_quantize_reverse represents the result of model inverse quantization of the convolution result of the i-th convolution layer
  • α_i represents the quantization parameter of the weight of the i-th convolution layer
  • W_i represents the weight of the i-th convolution layer
  • Q(IMG) represents the quantized original input picture of the first convolution layer
  • Bias_i represents the offset of the i-th convolution layer
  • Q(FM_{i-1}) represents the quantized feature map data of the (i-1)-th convolution layer (also known as the quantized convolution result of the (i-1)-th convolution layer)
  • γ_i represents the quantization parameter of the feature map of the i-th convolution layer
  • γ_j represents the quantization parameter of the feature map data of the j-th convolution layer.
  • The key to model inverse quantization is to cancel the quantization parameters (also known as the magnification) of the model while maintaining accuracy.
  • This is the process in which the quantization module 110 quantizes the input data of the (i+1)-th convolution layer.
  • Q(FM_{i+1}) represents the quantized input data of the (i+1)-th convolution layer
  • α_i represents the quantization parameter of the weight of the i-th convolution layer
  • W_i represents the weight of the i-th convolution layer
  • Q(IMG) represents the quantized input data of the first convolution layer
  • Bias_i represents the offset of the i-th convolution layer
  • Q(FM_i) represents the quantized feature map data of the i-th convolution layer (the quantized convolution result of the i-th convolution layer)
  • γ_i represents the quantization parameter of the feature map of the i-th convolution layer
  • γ_j represents the quantization parameter of the feature map data of the j-th convolution layer
  • N represents the total number of convolution layers.
  • Formula (7) is a formula for quantizing the input data of the (i+1)-th convolution layer, where MODEL_quantize_reverse represents the result of model inverse quantization of the convolution result of the i-th convolution layer, Q(FM_{i+1}) represents the quantized input data of the (i+1)-th convolution layer, and γ_i represents the quantization parameter of the feature map of the i-th convolution layer.
  • The quantization module 110 may also use other formulas to quantize the input data of the (i+1)-th convolution layer.
  • the embodiments of the present application are not limited herein.
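  • Formula (5) is not reproduced in this text; the sketch below therefore shows only the scale-cancelling step that model inverse quantization performs, under the assumption that the convolution result carries the weight quantization multiplier 2**α_i, which must be divided out.

```python
import numpy as np

def model_dequantize(conv_result: np.ndarray, alpha_i: int) -> np.ndarray:
    """Cancel the model's quantization parameter (the magnification) from the result."""
    return conv_result / (1 << alpha_i)

out = model_dequantize(np.array([416.0, -96.0]), alpha_i=5)  # divide by 2**5
```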
  • the quantization module 110 is further configured to:
  • the convolution module 120 is further configured to perform a convolution calculation on the result of inverse quantization of the feature map, the weight of the i + 1th layer of the convolution layer, and the offset of the i + 1th layer of the convolution layer to obtain the The convolution result of the i + 1th convolution layer.
  • The convolution model of the i-th convolution layer is quantized (quantization of the weights and offsets) before the convolution calculation of the i-th convolution layer, and the quantized model performs the convolution calculations. Therefore, after obtaining the convolution result of the i-th convolution layer, it is necessary to perform model inverse quantization on the output result of the i-th convolution layer, in order to ensure that the range of the output result (feature map data) lies in the same range as the output of a convolution calculation performed with an unquantized model, and to maintain the reversibility of the quantization.
  • the model inverse quantization of the output result of the i-th convolution layer is similar to the inverse quantization process when the i + 1th convolution layer is the convolution layer to be quantized, and is not described here again.
  • When the data of the (i+1)-th convolution calculation is not data to be quantized (the (i+1)-th convolution layer is not a convolution layer to be quantized), the (i+1)-th convolution layer does not need to be quantized.
  • However, because the input data of the i-th convolution layer was quantized, feature map inverse quantization is still required.
  • the convolution module 120 performs a convolution calculation on the result of inverse quantization of the feature map, the weight of the i + 1th convolution layer, and the offset of the i + 1th convolution layer to obtain the i + 1th Convolution result of layer convolution.
  • When the quantization module 110 performs feature map inverse quantization on the result of model inverse quantization of the i-th convolution layer, the following formula (8) may be used:
  • α_i represents the quantization parameter of the weight of the i-th convolution layer
  • W_i represents the weight of the i-th convolution layer
  • Bias_i represents the bias of the i-th convolution layer
  • FM_i represents the result of feature map inverse quantization for the i-th convolution layer
  • Q(FM_{i-1}) represents the quantized feature map data of the (i-1)-th convolution layer
  • γ_i represents the quantization parameter of the input data of the i-th convolution layer
  • γ_j represents the quantization parameter of the feature map data of the j-th convolution layer.
  • Here, the j-th convolution layer is a convolution layer that needs to be quantized, located before the i-th convolution layer and consecutive with it.
  • When performing feature map inverse quantization on the result of model inverse quantization of the i-th convolution layer, in addition to the quantization parameter of the input data of the i-th convolution layer, the quantization parameters used when quantizing the input data of the consecutive convolution layers before the i-th layer that need to be quantized must also be combined. For example, suppose the i-th convolution layer is the 5th convolution layer, and the 4th and 3rd convolution layers are convolution layers that need to be quantized; then the feature map inverse quantization needs to combine the quantization parameters of the input data of the 4th and 3rd convolution layers, that is, j takes the values 4 and 3. If instead the 4th and 2nd convolution layers need to be quantized while the 3rd does not, then when performing feature map inverse quantization for the 5th convolution layer, only the quantization parameter of the input data of the 4th convolution layer is used, that is, j is 4.
  • the quantization module 110 may also use other formulas to perform feature map inverse quantization.
  • the embodiments of the present application are not limited herein.
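  • A hedged sketch of feature map inverse quantization as described above: besides the model's parameters, the feature map parameters γ_j of the run of consecutive quantized layers immediately before the current layer are divided out. The accumulation rule is inferred from the j = 4, 3 example, and all values are illustrative.

```python
import numpy as np

def feature_map_dequantize(x: np.ndarray, preceding_gammas) -> np.ndarray:
    """Divide out gamma_j for each consecutive quantized layer before the current one."""
    for gamma_j in preceding_gammas:
        x = x / (1 << gamma_j)
    return x

# For the 5th layer with the 4th and 3rd layers quantized, j takes 4 and 3:
out = feature_map_dequantize(np.array([512.0]), preceding_gammas=[3, 2])
```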
  • system 100 further includes:
  • a quantization parameter obtaining module 130 is configured to obtain quantization parameters of input data of the i-th convolution layer, quantization parameters of weights of the i-th convolution layer, and the offset quantization parameter;
  • The quantization module 110 is specifically configured to: quantize the input data of the i-th convolution layer according to the quantization parameter of the input data of the i-th convolution layer, quantize the weight according to the quantization parameter of the weight of the i-th convolution layer, and quantize the offset according to the quantization parameter of the offset of the i-th convolution layer.
  • Specifically, statistical analysis can be performed on all weights and biases in the model.
  • The maximum value is found for the weight and for the offset respectively, and a power of 2 is found such that the weight or offset multiplied by that power of 2 approaches the preset value range as closely as possible.
  • For example, with a value range from -128 to 127, the corresponding Max Shift value, that is, the shift length in binary, is calculated, and the quantization parameter is then determined according to the Max Shift value, as sketched below.
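  • A minimal sketch of the Max Shift derivation, assuming the 8-bit signed range of -128 to 127 stated above; the sample values are illustrative.

```python
import math

def max_shift(values, upper: int = 127) -> int:
    """Largest left shift such that max(|v|) * 2**shift still fits the target range."""
    peak = max(abs(v) for v in values)
    return math.floor(math.log2(upper / peak))

print(max_shift([0.31, -0.72, 0.05]))  # floor(log2(127 / 0.72)) = 7
```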
  • the quantization parameter obtaining module 130 is configured to obtain quantization parameters of input data of the i-th convolution layer, quantization parameters of weights of the i-th convolution layer, and the offset quantization parameter.
  • The quantization module 110 quantizes the input data of the i-th convolution layer according to the quantization parameter of the input data of the i-th convolution layer, quantizes the weight according to the quantization parameter of the weight of the i-th convolution layer, and quantizes the offset according to the quantization parameter of the offset of the i-th convolution layer.
  • The quantization parameter of the input data of the i-th convolution layer (the input data of the i-th convolution calculation), the quantization parameter of the weight of the i-th convolution layer, and the quantization parameter of the offset are described in detail below.
  • W_i represents the weight of the i-th convolution layer
  • k represents the target bit width.
  • The quantization multiplier is a real number and is not suitable for shift operations in the calculation process. Considering that shift operations in a computer correspond to multiplication and division by powers of 2, a better approach is to round the quantization multiplier down to the largest power of 2 not exceeding it. The following formula (10) is then used to calculate the quantization parameter α_i of the weight of the i-th convolution layer:
  • α_i represents the quantization parameter of the weight of the i-th convolution layer
  • Max represents the maximum value
  • abs represents the absolute value
  • Bias_i represents the bias of the i-th convolution layer
  • k represents the target bit width
  • β_i represents the quantization parameter of the offset of the i-th convolution layer
  • γ_i represents the quantization parameter of the feature map of the i-th convolution layer
  • floor represents rounding down
  • FM_i represents the feature map of the i-th convolution layer.
  • Based on the weight quantization parameter and shift and rounding operations, the quantized weight data of the i-th convolution layer is Q(W_i) (taking 8-bit fixed-point quantization as an example), which can be calculated by formula (16):
  • Q(W_i) = round(W_i << α_i)    (16)
  • where Q(W_i) represents the quantized weight data of the i-th convolution layer, round represents rounding, and W_i << α_i represents a left-shift calculation, that is, shifting W_i to the left by α_i bits.
  • Quantizing the bias is similar. Based on the offset quantization parameter and shift and rounding operations, the quantized offset data of the i-th convolution layer is Q(Bias_i) (taking 8-bit fixed-point quantization as an example), which can be calculated by formula (17):
  • Q(Bias_i) = round(Bias_i << β_i)    (17)
  • where Q(Bias_i) represents the quantized offset data of the i-th convolution layer, round represents rounding, and Bias_i << β_i represents a left-shift calculation, that is, shifting Bias_i to the left by β_i bits. A sketch of formulas (16) and (17) follows.
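  • A sketch of formulas (16) and (17) as described above, where the left shift on real-valued data is realized as multiplication by a power of 2; the shift values α_i = 7 and β_i = 12 are illustrative.

```python
import numpy as np

def quantize_by_shift(x: np.ndarray, shift: int) -> np.ndarray:
    """round(x << shift): shift left by `shift` bits, then round."""
    return np.round(x * (1 << shift))

w_i = np.array([0.31, -0.72])
bias_i = np.array([0.004])
q_w = quantize_by_shift(w_i, shift=7)         # Q(W_i) = round(W_i << alpha_i)
q_bias = quantize_by_shift(bias_i, shift=12)  # Q(Bias_i) = round(Bias_i << beta_i)
```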
  • the above-mentioned obtaining of quantization parameters and quantization of weights and biases may be performed in a training process (offline process).
  • the process of the convolution calculation performed by the convolution module 120 is performed in an actual process (online process) of analyzing the input data.
  • Each quantization parameter can also be obtained according to other methods or formulas, and the weights and offsets quantized accordingly.
  • the embodiments of the present application are not limited herein.
  • the quantization module 110 is further configured to:
  • the convolution module 120 is specifically configured to perform a convolution calculation using the modified offset to obtain the convolution result.
  • The convolution module 120 may perform convolution calculation on the quantized input data of the convolution layer, the quantized weight, and the corrected offset to obtain the convolution result of the i-th convolution layer.
  • The quantization module 110 may correct the quantized offset. To do so, the quantization parameter of the weight, the quantization parameter of the offset, and the quantization parameter of the feature map can be used.
  • Specifically, the quantization module 110 may correct the quantized offset of the current layer according to the quantization parameter of the current layer's weight, the quantization parameter of the current layer's offset, and the quantization parameters of the feature maps of the one or more consecutive convolution layers before the current layer that need to be quantized.
  • "One or more consecutive convolution layers before the current layer that need to be quantized" can be understood as follows: assume the current layer is the 5th convolution layer.
  • If the 4th and 3rd convolution layers need to be quantized and the 2nd convolution layer does not, then the consecutive quantized layers before the current layer are the 4th and 3rd convolution layers.
  • If the 4th and 2nd convolution layers need to be quantized but the 3rd does not, then the one or more consecutive quantized layers before the current layer comprise only the 4th convolution layer; a helper that captures this rule is sketched below.
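  • The rule can be captured by a small helper (the layer indices and flags are hypothetical, shown only to illustrate the walk backwards from the current layer):

```python
def consecutive_quantized_before(current_layer: int, is_quantized: dict) -> list:
    """Collect layers immediately before `current_layer` until a non-quantized layer."""
    layers = []
    layer = current_layer - 1
    while layer >= 1 and is_quantized.get(layer, False):
        layers.append(layer)
        layer -= 1
    return layers

flags = {1: False, 2: False, 3: True, 4: True, 5: True}
print(consecutive_quantized_before(5, flags))  # [4, 3]
```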
  • The correction of the quantized offset by the quantization module 110 includes correction with the quantization parameter of the weight and correction with the quantization parameter of the feature map.
  • Q (W i ) represents the weighted data of the i- th convolution layer after quantization
  • FM i-1 represents the feature map data of the i-1 convolution layer
  • The (i-1)-th convolution layer may be a convolution layer that needs to be quantized or a convolution layer that does not require quantization; here the value of i is greater than 1.
  • When i is 1, FM_{i-1} in formula (18) becomes input_img. Q(Bias_i) represents the quantized offset data of the i-th convolution layer, W_i represents the weight of the i-th convolution layer, Bias_i represents the bias of the i-th convolution layer, α_i represents the quantization parameter of the weight of the i-th convolution layer, and β_i represents the quantization parameter of the offset of the i-th convolution layer.
  • In formula (18), the quantization parameters of Q(W_i) and Q(Bias_i) are different, which makes it infeasible to apply a single uniform linear transformation to the entire convolution; the multiplier factors in their quantization processes must be aligned.
  • With the corrected bias, the bias and the weight share the same multiplier factor, which can be divided out during inverse quantization, so that the convolution process remains reversible and the accuracy and precision of the model are guaranteed.
  • For the convolution layers to be quantized, the quantization parameter of the weight can be directly used to quantize the bias; that is, the weight and the offset share one quantization parameter.
  • On this basis, the offset also needs to be corrected.
  • Suppose the first convolution layer and the second convolution layer are convolution layers that need to be quantized.
  • FM_1 = W_1 * input_img + Bias_1    (20)
  • IMG represents the original input picture input to the first layer of the convolutional layer
  • FM 1 represents the feature map data of the first layer of the convolutional layer.
  • α_1 represents the quantization parameter of the weight of the first convolution layer
  • W_1 represents the weight of the first convolution layer
  • Bias_1 represents the bias of the first convolution layer
  • Q(IMG) represents the quantized input data of the first convolution layer.
  • Q(FM_1) represents the quantized feature map data of the first convolution layer (the quantized convolution result of the first convolution layer), as shown in formula (22).
  • In formula (22), γ_1 represents the quantization parameter of the feature map data of the first convolution layer, and FM_1 represents the feature map data of the first convolution layer.
  • α_2 represents the quantization parameter of the weight of the second convolution layer
  • W_2 represents the weight of the second convolution layer
  • Bias_2 represents the offset of the second convolution layer
  • FM_1 represents the feature map data of the first convolution layer
  • Q(FM_1) represents the quantized feature map data of the first convolution layer
  • γ_1 represents the quantization parameter of the feature map of the first convolution layer. To keep the quantization multiplier factor consistent with that of the multiply-accumulated feature map, the offset quantization parameter must be corrected again and merged with the first correction to obtain the quantized Q(Bias_2), as shown in formula (24):
  • γ_2 represents the quantization parameter of the feature map data of the second convolution layer
  • Q(FM_2) represents the quantized feature map data of the second convolution layer.
  • the quantization module 110 may also use other formulas to correct the offset quantization parameter.
  • the embodiments of the present application are not limited herein.
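  • The bias correction in the two-layer example can be sketched as follows. Since Q(FM_1) * Q(W_2) carries the combined multiplier 2**(γ_1 + α_2), sharing that multiplier means re-quantizing the layer-2 bias with the merged shift instead of its own β_2. The merging rule is inferred from the discussion above rather than copied from formula (24), and the shift values are illustrative.

```python
import numpy as np

def corrected_bias(bias: np.ndarray, alpha_w: int, gamma_prev_fm: int) -> np.ndarray:
    """Quantize the bias with the multiplier shared by the multiply-accumulate path."""
    return np.round(bias * (1 << (alpha_w + gamma_prev_fm)))

q_bias_2 = corrected_bias(np.array([0.02]), alpha_w=5, gamma_prev_fm=7)  # shift by 12 bits
```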
  • Inverse quantization of the feature map may also be performed by a feature map inverse quantizer; likewise, feature map quantization can be performed by a feature map quantizer, model inverse quantization by a model inverse quantizer, and quantization parameter correction by a quantization parameter modifier.
  • The feature map inverse quantizer, feature map quantizer, model inverse quantizer, and quantization parameter modifier can be integrated in the quantization module or set separately.
  • The quantization module may further include a model quantizer for quantizing the model and a quantization parameter acquisition module, which are not limited in the embodiments of the present application.
  • the system 200 includes a picture quantizer 211, a model quantizer 212, a multiplier 213, an adder 214, a model inverse quantizer 215, a feature map inverse quantizer 216, and a feature map quantizer 217.
  • the picture quantizer 211, the model quantizer 212, and the feature map quantizer 217 may be integrated in one quantization module, or may be separately provided.
  • the model inverse quantizer 215 and the feature map inverse quantizer 216 may also be integrated in one inverse quantization module, or may be separately provided.
  • the multiplier 213 and the adder 214 may also be integrated in one convolution module, or may be set separately, which is not limited in the embodiment of the present application.
  • FIG. 4 shows the processing flow of the system 200 when the current convolution layer is a convolution layer that needs to be quantized.
  • the model quantizer 212 first quantizes the weight and offset of each convolutional layer to be quantized, and obtains the quantized weight and offset of each convolutional layer. Assume that the data calculated by the first layer of convolution is the data to be quantized.
  • The picture quantizer 211 performs picture quantization on the original input picture to obtain the quantized picture, and the convolution calculation is performed by combining the quantized picture with the quantized weight and offset of the first convolution layer.
  • The multiplier 213 multiplies the quantized picture by the quantized weight of the first convolution layer, and the adder 214 adds the output of the multiplier and the quantized offset of the first convolution layer to obtain the convolution result of the first convolution layer.
  • The model inverse quantizer 215 performs model inverse quantization on the convolution result of the first layer. The subsequent processing flow is determined according to whether the second convolution layer is a convolution layer that needs to be quantized.
  • If the second convolution layer is a convolution layer that needs to be quantized, that is, if the data of the second layer's convolution calculation is data to be quantized, the feature map quantizer 217 performs feature map quantization on the result of the model inverse quantizer 215, and the result of the feature map quantization is input to the multiplier 213.
  • The multiplier 213 multiplies the result of the feature map quantization by the quantized weight of the second convolution layer, and the adder 214 adds the quantized offset of the second convolution layer to the output of the multiplier 213 to obtain the feature map data of the second convolution layer. The same processing flow is then used for subsequent convolution layers that need to be quantized.
  • If the second convolution layer does not need to be quantized, the feature map inverse quantizer 216 performs feature map inverse quantization on the result of the model inverse quantizer 215, and the result of the feature map inverse quantization is input to the multiplier 213.
  • The multiplier 213 multiplies the result of the feature map inverse quantization by the weight of the second convolution layer, and the adder 214 adds the offset of the second convolution layer to the output of the multiplier 213 to obtain the feature map data of the second convolution layer. The same processing flow is then used for subsequent convolution layers that do not require quantization.
  • The system further includes a quantization corrector 218, which can correct the quantized offset; the offset input to the adder 214 can thus also be an offset corrected by the quantization corrector 218.
  • the quantization corrector 218 may be integrated in the model quantizer 212 or may be set separately.
  • If the last convolution layer is a convolution layer that needs to be quantized, the model inverse quantizer 215 first performs model inverse quantization on the convolution result of the last convolution layer, and the result obtained by the feature map inverse quantizer 216 after feature map inverse quantization of the result of the model inverse quantizer 215 is the output result of all the convolution layers. If the last convolution layer is a convolution layer that does not require quantization, the feature map data input to the last layer from the second-to-last layer is determined according to whether the second-to-last layer is a convolution layer that needs to be quantized.
  • For the first convolution layer, the input to the multiplier 213 is the quantized picture.
  • For the other convolution layers, the input to the multiplier 213 is the output of the feature map quantizer 217 or the feature map inverse quantizer 216. That is, for the convolution calculation of one original picture, the picture quantizer 211 has only one input and one output: for the first convolution layer the input is the quantized picture, and for the other convolution layers the picture quantizer 211 produces no output. The dotted line in FIG. 4 indicates that the picture quantizer 211 inputs data to the multiplier 213 only for the first convolution layer.
  • For example, taking a network whose last (10th) convolution layer needs to be quantized, the adder 214 adds the quantized offset of the 10th convolution layer and the output of the multiplier 213 to obtain the feature map data (convolution result) of the 10th convolution layer.
  • The model inverse quantizer 215 then performs model inverse quantization on the convolution result of the 10th layer, and the feature map inverse quantizer 216 performs feature map inverse quantization on the result of the model inverse quantizer 215 to obtain the convolution result of the entire convolutional part.
  • The convolutional neural network system uses the quantization module to quantize the weight and offset of each convolution layer that needs to be quantized and the input data input to those convolution layers.
  • The convolution module performs convolution calculation on the quantized input data, the quantized weight, and the quantized offset to obtain the calculation result of each convolution layer, and the quantized model and feature map are inverse quantized so that the quantization remains reversible, guaranteeing the accuracy of quantization. The per-layer flow is sketched below.
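  • The per-layer flow of FIG. 4 for a quantized layer can be summarized in one hedged end-to-end sketch; power-of-2 quantization parameters are assumed throughout, a 1-D dot product stands in for the 2-D convolution, and all shapes and values are illustrative.

```python
import numpy as np

def run_quantized_layer(fm, w, bias, gamma, alpha):
    q_fm = np.round(fm * (1 << gamma))                # picture / feature map quantizer
    q_w = np.round(w * (1 << alpha))                  # model quantizer: weight
    q_bias = np.round(bias * (1 << (gamma + alpha)))  # corrected (merged-shift) bias
    conv = q_fm @ q_w + q_bias                        # multiplier 213 and adder 214
    return conv / (1 << (gamma + alpha))              # model + feature map inverse quantization

fm, w = np.random.rand(4), np.random.rand(4)
print(run_quantized_layer(fm, w, bias=0.1, gamma=7, alpha=5), fm @ w + 0.1)  # close
```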
  • the application also provides a method for quantizing a convolutional neural network.
  • The method 300 for quantizing a convolutional neural network may be applied to a convolutional neural network system (apparatus).
  • The convolutional neural network system may be the system provided above in this application, or it may be an existing convolutional neural network system, which is not limited in the embodiments of the present application.
  • FIG. 6 shows a schematic flowchart of a convolutional neural network quantization method 300 provided in the present application.
  • the method 300 may be executed by a chip, and the chip may include a quantization module, a convolution module, and the like. Or it can also be executed by a computer system.
  • the computer system may include a quantization module, a convolution module, and the like.
  • the chip or the computer system may be a convolutional neural network system (apparatus) provided by the present application. This application is not limited here.
  • the method 300 includes:
  • S310: Quantize the input data of the i-th convolution layer of the convolutional neural network and the weight and offset of the i-th convolution layer, respectively, where i is a positive integer.
  • the data calculated by the i-th layer convolution is data to be quantized.
  • S320 Perform convolution calculation on the quantized input data of the i-th layer convolution layer, the quantized weight, and the quantized offset to obtain a convolution result of the i-th layer convolution layer.
  • The quantization method quantizes the weights and offsets of the convolution layers and the input data input to the convolution layers, and performs convolution calculation on the quantized input data, the quantized weights, and the quantized offsets to obtain the calculation result of each convolution layer.
  • This makes the calculation results more accurate, reduces the amount of calculation of the convolutional neural network, reduces the amount of stored data for the convolutional neural network model and the convolution results, and improves the quantization accuracy of the convolutional neural network.
  • Optionally, in an embodiment, performing convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the quantized offset to obtain the convolution result of the i-th convolution layer includes:
  • S321: Perform a multiplication operation on the quantized input data of the i-th convolution layer and the quantized weight.
  • S322: Perform an addition operation on the result of the multiplication operation and the quantized offset to obtain the convolution result of the i-th convolution layer.
  • Optionally, when i is equal to 1, the input data of the i-th convolution layer is the original input picture; or, when i is greater than 1, the input data of the i-th convolution layer is feature map data.
  • Optionally, when the data on which the (i+1)-th layer convolution calculation is to be performed is data to be quantized, the method 300 further includes: S330, performing, on the convolution result of the i-th convolution layer, dequantization corresponding to the quantization of the weight and the quantization of the offset, where the dequantized convolution result of the i-th convolution layer is the input data of the (i+1)-th convolution layer.
  • Optionally, when the data on which the (i+1)-th layer convolution calculation is to be performed is not data to be quantized, the method 300 further includes: S340, performing, on the convolution result of the i-th convolution layer, dequantization corresponding to the quantization of the weight and the quantization of the offset; S350, performing feature map dequantization on the result obtained by that dequantization; and S360, performing convolution calculation on the result of the feature map dequantization, the weight of the (i+1)-th convolution layer, and the offset of the (i+1)-th convolution layer to obtain the convolution result of the (i+1)-th convolution layer. The routing between these two cases is sketched below.
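Continuing the sketch above, the choice between the two optional paths might look as follows; `next_layer_quantized`, `gamma_prev`, and `gamma_i` are again invented names, and the single accumulated shift is a simplification of the per-layer feature-map bookkeeping in the description.

```python
def to_next_layer(fm_deq, gamma_prev, gamma_i, next_layer_quantized):
    # fm_deq: model-dequantized result of layer i, still scaled by 2**gamma_prev.
    if next_layer_quantized:
        # S330-style path: stay in the integer domain; re-quantize the feature
        # map so layer i+1 receives input scaled by 2**gamma_i (quantizer 217).
        return quantize(fm_deq, gamma_i - gamma_prev)
    # S340/S350-style path: feature map dequantization (216) removes the
    # accumulated scale; layer i+1 then runs an ordinary convolution (S360).
    return fm_deq / (2.0 ** gamma_prev)
```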
  • Optionally, the method 300 further includes: S311, correcting the quantized offset. In that case, performing convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the quantized offset includes:
  • performing convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the corrected offset to obtain the convolution result of the i-th convolution layer, as illustrated after this paragraph.
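One way to see why the corrected offset keeps the computation reversible, using the α/β/γ shift notation of the formulas earlier in this description (the single accumulated feature-map shift γ below is a simplifying assumption, not the patent's exact per-layer bookkeeping):

$$\sum Q(W_i)\,Q(\mathrm{FM}_{i-1}) + Q(\mathrm{Bias}_i)\cdot 2^{\alpha_i+\gamma-\beta_i} \;\approx\; 2^{\alpha_i+\gamma}\Big(\sum W_i\,\mathrm{FM}_{i-1} + \mathrm{Bias}_i\Big) \;=\; 2^{\alpha_i+\gamma}\,\mathrm{FM}_i$$

After the correction, the multiplication term and the addition term share the single multiplier factor $2^{\alpha_i+\gamma}$, so dequantization can divide it out in one step, which is exactly the reversibility requirement.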
  • Optionally, the method 300 further includes: obtaining the quantization parameter of the input data of the i-th convolution layer, the quantization parameter of the weight of the i-th convolution layer, and the quantization parameter of the offset.
  • Separately quantizing the input data of the i-th convolution layer of the convolutional neural network and the weight and the offset of the i-th convolution layer then includes:
  • quantizing the input data of the i-th convolution layer according to the quantization parameter of the input data of the i-th convolution layer, quantizing the weight according to the quantization parameter of the weight of the i-th convolution layer, and quantizing the offset according to the quantization parameter of the offset of the i-th convolution layer; one possible derivation of such parameters is sketched below.
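As a hedged illustration of how such per-tensor parameters could be derived, following the maximum-absolute-value, power-of-two-shift approach the description outlines (the exact range and rounding conventions used here, such as 2**(bits-1)-1 for a signed target, are assumptions):

```python
import numpy as np

def quant_shift(tensor, bits=8):
    # Largest power-of-two shift s such that tensor * 2**s stays inside the
    # signed `bits`-bit range, computed as floor(log2(range / max_abs)).
    max_abs = float(np.max(np.abs(tensor)))
    if max_abs == 0.0:
        return 0
    return int(np.floor(np.log2((2 ** (bits - 1) - 1) / max_abs)))

# One parameter per tensor and per layer: alpha for the weight, beta for
# the offset, gamma for the feature map, each from its own statistics.
w = np.random.uniform(-0.3, 0.3, size=(3, 3))
alpha = quant_shift(w)
w_q = np.round(w * 2.0 ** alpha)   # quantized weights, within [-127, 127]
```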
  • An embodiment of the present application further provides a computer-readable medium for storing computer program code, where the computer program includes instructions for performing the method for quantizing a convolutional neural network in the foregoing method 300.
  • the readable medium may be a read-only memory (ROM) or a random access memory (RAM), which is not limited in the embodiment of the present application.
  • the present application also provides a computer program product.
  • the computer program product includes instructions that, when the instructions are executed, cause a device to perform operations corresponding to the foregoing methods.
  • the present application also provides a computer system including a chip or a device for performing a method for quantizing a convolutional neural network according to an embodiment of the present application.
  • the chip or the device may be a convolutional neural network system provided by the present application.
  • An embodiment of the present application further provides a system chip.
  • the system chip includes a processing unit and a communication unit.
  • the processing unit may be, for example, a processor.
  • the communication unit may be, for example, an input / output interface, a pin, or a circuit.
  • the processing unit can execute computer instructions to cause a chip in the communication device to execute any of the convolutional neural network quantization methods provided by the embodiments of the present application.
  • the computer instructions are stored in a storage unit.
  • the storage unit is a storage unit in the chip, such as a register, a cache, etc.
  • The storage unit may also be a storage unit located outside the chip in the terminal, such as a ROM or another type of static storage device that can store static information and instructions, a RAM, or the like.
  • The processor mentioned in any of the above may be a CPU, a microprocessor, an ASIC, or one or more integrated circuits for controlling the program execution of the above-mentioned method for quantizing a convolutional neural network.
  • The processing unit and the storage unit may be decoupled and respectively disposed on different physical devices, connected in a wired or wireless manner to realize their respective functions, so as to support the system chip in implementing the various functions in the foregoing embodiments.
  • Alternatively, the processing unit and the memory may be coupled on the same device.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • The device embodiments described above are only illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other division manners.
  • For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
  • Based on such an understanding, the technical solution of the present application essentially, or the part of it that contributes to the existing technology, or a part of the technical solution, may be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • The foregoing storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.


Abstract

A convolutional neural network system (100) and a method (300) for quantizing a convolutional neural network. The system (100) includes: a quantization module (110), configured to separately quantize the input data of the i-th convolution layer of the convolutional neural network and the weight and offset of the i-th convolution layer, where i is a positive integer (S310); and a convolution module (120), configured to perform convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the quantized offset to obtain the convolution result of the i-th convolution layer (S320). The convolutional neural network system (100) quantizes the weight and offset of each convolution layer that needs to be quantized, together with the input data fed into that convolution layer, and performs convolution calculation with the quantized input data, the quantized weight, and the quantized offset to obtain the calculation result of each convolution layer, which reduces the calculation amount of the convolutional neural network and improves the precision of its quantization.


Claims (15)

  1. A convolutional neural network system, characterized by comprising:
    a quantization module, configured to separately quantize input data of an i-th convolution layer of the system and a weight and an offset of the i-th convolution layer, where i is a positive integer; and
    a convolution module, configured to perform convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the quantized offset to obtain a convolution result of the i-th convolution layer.
  2. The system according to claim 1, characterized in that the convolution module comprises:
    a multiplier, configured to perform a multiplication operation on the quantized input data of the i-th convolution layer and the quantized weight; and
    an adder, configured to perform an addition operation on an output result of the multiplier and the quantized offset to obtain the convolution result of the i-th convolution layer.
  3. The system according to claim 1 or 2, characterized in that:
    when i is equal to 1, the input data of the i-th convolution layer is an original input picture; or,
    when i is greater than 1, the input data of the i-th convolution layer is feature map data.
  4. The system according to any one of claims 1 to 3, characterized in that, when data on which an (i+1)-th layer convolution calculation is to be performed is data to be quantized,
    the quantization module is further configured to:
    perform, on the convolution result of the i-th convolution layer, dequantization corresponding to the quantization of the weight and the quantization of the offset, where the dequantized convolution result of the i-th convolution layer is input data of the (i+1)-th convolution layer.
  5. The system according to any one of claims 1 to 3, characterized in that, when data on which an (i+1)-th layer convolution calculation is to be performed is not data to be quantized, the quantization module is further configured to:
    perform, on the convolution result of the i-th convolution layer, dequantization corresponding to the quantization of the weight and the quantization of the offset; and
    perform feature map dequantization on the result obtained by the dequantization; and
    the convolution module is further configured to: perform convolution calculation on the result of the feature map dequantization, a weight of the (i+1)-th convolution layer, and an offset of the (i+1)-th convolution layer to obtain a convolution result of the (i+1)-th convolution layer.
  6. The system according to any one of claims 1 to 5, characterized in that the quantization module is further configured to:
    correct the quantized offset; and
    the convolution module is specifically configured to: perform convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the corrected offset to obtain the convolution result of the i-th convolution layer.
  7. The system according to claim 6, characterized in that the system further comprises:
    a quantization parameter obtaining module, configured to obtain a quantization parameter of the input data of the i-th convolution layer, a quantization parameter of the weight of the i-th convolution layer, and a quantization parameter of the offset;
    the quantization module is specifically configured to: quantize the input data of the i-th convolution layer according to the quantization parameter of the input data of the i-th convolution layer, quantize the weight according to the quantization parameter of the weight of the i-th convolution layer, and quantize the offset according to the quantization parameter of the offset of the i-th convolution layer.
  8. A method for quantizing a convolutional neural network, characterized by comprising:
    separately quantizing input data of an i-th convolution layer of the convolutional neural network and a weight and an offset of the i-th convolution layer, where i is a positive integer; and
    performing convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the quantized offset to obtain a convolution result of the i-th convolution layer.
  9. The method according to claim 8, characterized in that performing convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the quantized offset to obtain the convolution result of the i-th convolution layer comprises:
    performing a multiplication operation on the quantized input data of the i-th convolution layer and the quantized weight; and
    performing an addition operation on the result of the multiplication operation and the quantized offset to obtain the convolution result of the i-th convolution layer.
  10. The method according to claim 8 or 9, characterized in that:
    when i is equal to 1, the input data of the i-th convolution layer is an original input picture; or,
    when i is greater than 1, the input data of the i-th convolution layer is feature map data.
  11. The method according to any one of claims 8 to 10, characterized in that, when data on which an (i+1)-th layer convolution calculation is to be performed is data to be quantized, the method further comprises:
    performing, on the convolution result of the i-th convolution layer, dequantization corresponding to the quantization of the weight and the quantization of the offset, where the dequantized convolution result of the i-th convolution layer is input data of the (i+1)-th convolution layer.
  12. The method according to any one of claims 8 to 10, characterized in that, when data on which an (i+1)-th layer convolution calculation is to be performed is not data to be quantized, the method further comprises:
    performing, on the convolution result of the i-th convolution layer, dequantization corresponding to the quantization of the weight and the quantization of the offset;
    performing feature map dequantization on the result obtained by the dequantization; and
    performing convolution calculation on the result of the feature map dequantization, a weight of the (i+1)-th convolution layer, and an offset of the (i+1)-th convolution layer to obtain a convolution result of the (i+1)-th convolution layer.
  13. The method according to any one of claims 8 to 12, characterized in that the method further comprises:
    correcting the quantized offset;
    wherein performing convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the quantized offset comprises:
    performing convolution calculation on the quantized input data of the i-th convolution layer, the quantized weight, and the corrected offset to obtain the convolution result of the i-th convolution layer.
  14. The method according to claim 13, characterized in that the method further comprises:
    obtaining a quantization parameter of the input data of the i-th convolution layer, a quantization parameter of the weight of the i-th convolution layer, and a quantization parameter of the offset;
    wherein separately quantizing the input data of the i-th convolution layer of the convolutional neural network and the weight and the offset of the i-th convolution layer comprises:
    quantizing the input data of the i-th convolution layer according to the quantization parameter of the input data of the i-th convolution layer, quantizing the weight according to the quantization parameter of the weight of the i-th convolution layer, and quantizing the offset according to the quantization parameter of the offset of the i-th convolution layer.
  15. A computer-readable storage medium for storing a computer program, characterized in that the computer program comprises instructions for performing the method for quantizing a convolutional neural network according to any one of claims 8 to 14.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810603231.XA 2018-06-12 2018-06-12 Convolutional neural network system and method for quantizing a convolutional neural network
CN201810603231.X 2018-06-12
