CN110363281A - Convolutional neural network quantization method, apparatus, computer and storage medium - Google Patents

Convolutional neural network quantization method, apparatus, computer and storage medium

Info

Publication number
CN110363281A
CN110363281A (application CN201910489092.7A)
Authority
CN
China
Prior art keywords
quantization
full precision
network
model
neural networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910489092.7A
Other languages
Chinese (zh)
Inventor
宋利
周逸伦
陈立
张文军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN201910489092.7A priority Critical patent/CN110363281A/en
Publication of CN110363281A publication Critical patent/CN110363281A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The present invention provides a convolutional neural network quantization method, in which: a full-precision model of the convolutional neural network to be quantized is trained, and the standard deviation of each layer's weights and of each layer's response distribution is computed; scale factors for the model's parameters and features are estimated from these per-layer standard deviations and a hyperparameter; quantization modules implementing scale-factor-based forward computation and backward gradient propagation are built for the network to be optimized, yielding a corresponding quantized network; the quantized network is fine-tuned to determine the best scale factor; and the quantized network generated by the best scale factor is retrained to obtain the final quantized neural network model. The present invention also provides a convolutional neural network quantization apparatus, a computer, and a storage medium. The invention addresses the complex implementation and high computational cost of existing model quantization methods.

Description

Convolutional neural network quantization method, apparatus, computer and storage medium
Technical field
The present invention relates to deep neural network compression, and in particular to a method, apparatus, computer device and storage medium for compressing and accelerating convolutional neural networks by means of model quantization.
Background technique
In computer vision and other technical fields, deep learning has proven to be a highly effective approach, achieving strong results in tasks such as image classification, object detection and semantic segmentation. As the theory matures, deep neural network models are trending toward more parameters, deeper networks and larger computational budgets. At the same time, industry is gradually applying deep learning to concrete scenarios, which imposes strict requirements on model size, computational performance, power consumption and similar metrics.
In recent years the application of neural networks has penetrated many areas. Although they improve accuracy to a considerable extent, a neural network comprises many layers and a large number of parameters, and therefore demands substantial computation and storage. Compression and acceleration of deep neural networks aim to reduce model storage volume and computation through methods such as pruning and quantization, while keeping the performance of the existing deep neural network essentially unchanged.
A search of the prior art finds Chinese invention patent application No. 201811284341.0, which discloses a neural network compression method combining pruning and weight quantization: pruning the neural network model reduces the number of connections, i.e. the number of parameters of the model, and quantizing the model reduces the storage that the weights occupy inside the model. However, that application does not address quantization of the features, so at deployment time the quantized weights must first be restored to full-precision floating point before any computation. It therefore cannot reduce the computational complexity of inference and achieves no acceleration.
Summary of the invention
The object of the present invention is to provide a convolutional neural network quantization method, apparatus and computer device that address the complex implementation and high computational cost of existing convolutional neural network quantization methods.
The first object of the present invention is a convolutional neural network quantization method, comprising:
S1: training the full-precision model of the convolutional neural network to be quantized, and computing the standard deviation of each layer's weights of the full-precision model and the standard deviation of each layer's response distribution;
S2: estimating the scale factor sα of the full-precision model's weights and responses from the per-layer weight standard deviation, the per-layer response-distribution standard deviation and a hyperparameter α, the hyperparameter α being a constant greater than 0;
S3: building, for the convolutional neural network to be optimized, quantization modules that implement forward computation and backward gradient propagation based on the scale factor sα, and obtaining the corresponding quantized network;
the forward computation comprising: for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtaining the corresponding low-precision value Q(x) using a quantization function Q;
the backward gradient propagation comprising: using a straight-through estimator with a custom gradient, so that the propagated gradient matches the gradient of the quantization function;
S4: fine-tuning the quantized network obtained in S3 and, over different hyperparameters α, selecting the sα with the best post-fine-tuning quantized-network accuracy as the best scale factor s*;
S5: retraining the quantized network generated by the best scale factor s* to obtain the final quantized neural network model.
Optionally, the best scale factor s* is determined by sampling α:
sα = T1/(α·σ)
where α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of a layer's weights or of a layer's response distribution of the full-precision model, computed from the corresponding full-precision model; and T1 is a threshold related to the quantization bit width.
A series of scale factors s is obtained by sampling α over a range; the full-precision network is then quantized according to each s and the quantized networks are trained, and the optimal scale factor s* is determined from the accuracy of the quantized networks.
Optionally, the corresponding low-precision value Q(x) is obtained using the quantization function Q:
Q(x) = clip(round(s·x), T2, T1)/s
where x is a full-precision floating-point value; s is the quantization scale factor, which scales the original x into a range suitable for quantization; round(·) converts a floating-point number to an integer; and clip(·) is a truncation that limits the range of the fixed-point number after quantization.
Optionally, in the quantization modules the backward gradient is propagated with truncation:
∂L/∂x = ∂L/∂Q(x) if T2 ≤ s·x ≤ T1, and 0 otherwise
where α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of a layer's weights or of a layer's response distribution of the full-precision model; x is a full-precision floating-point value; and s is the quantization scale factor described above, which scales the original x into a range suitable for quantization.
The above method of the present invention applies simultaneously to the weights and the features of the neural network, so it reduces the storage that the weights occupy inside the model and, under hardware support, enables low-bit fixed-point arithmetic and hence acceleration.
The second object of the present invention is a convolutional neural network quantization apparatus, comprising:
a scale factor estimation module, which trains the full-precision model of the convolutional neural network to be quantized, computes the standard deviation of each layer's weights and of each layer's response distribution, and estimates the scale factor sα of the full-precision model's weights and responses from these per-layer standard deviations and the hyperparameter α, the hyperparameter α being a constant greater than 0;
a quantization module, comprising the following two submodules:
a forward computation submodule, which, for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtains the corresponding low-precision value Q(x) using the quantization function Q;
a backward gradient propagation submodule, which uses a straight-through estimator with a custom gradient so that the propagated gradient matches the gradient of the quantization function;
the quantization module implements, for the convolutional neural network to be optimized, the forward computation and backward gradient propagation based on the scale factor sα, yielding the corresponding quantized network;
a best scale factor computation module, which fine-tunes the quantized network produced by the quantization module and, over different hyperparameters α, selects the sα with the best post-fine-tuning quantized-network accuracy as the best scale factor s*;
a network training module, which retrains the quantized network generated by the best scale factor s* to obtain the final quantized neural network model.
The third object of the present invention is a computer comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the program, can perform the convolutional neural network quantization method described above.
The fourth object of the present invention is a computer-readable storage medium storing a computer program which, when executed by a processor, can perform the convolutional neural network quantization method described above.
Compared with the prior art, the present invention has at least one of the following beneficial effects:
Embodiments of the invention account for the feasibility of efficient fixed-point arithmetic: uniform quantization converts floating-point values to low-bit fixed-point values and requires only one additional scaling operation. Compared with other methods, the embodiments balance the compression and the acceleration of deep convolutional network models.
Compared with widely used methods, the embodiments ensure a better theoretical acceleration while also preserving the performance of the quantized convolutional neural network model.
Brief description of the drawings
Other features, objects and advantages of the invention will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the drawings:
Fig. 1 is the flow chart of the method of one embodiment of the invention;
Fig. 2 is a schematic diagram of the forward quantization function and the backward gradient function in one embodiment of the invention, where the left side shows the case containing both positive and negative values and the right side the case containing only non-negative values;
Fig. 3 is the overall quantized network structure in one embodiment of the invention;
Fig. 4 is the system module block diagram in one embodiment of the invention.
Detailed description of the embodiments
The present invention is described in detail below with specific embodiments. The following embodiments will help those skilled in the art to further understand the invention but do not limit it in any way. It should be pointed out that a person of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the protection scope of the present invention.
Referring to Fig. 1, the flow of the compression-and-acceleration method for convolutional neural networks in the embodiment of the invention proceeds through the following steps:
S1: train the full-precision model of the convolutional neural network to be quantized, and compute the standard deviation of each layer's weights and the standard deviation of each layer's response distribution. Here the full-precision model is the full-precision deep neural network model, i.e. the convolutional neural network model in full-precision floating point.
In this step, the standard deviation of each layer's weights is computed by reading the data directly out of the full-precision model. The standard deviation of each layer's response distribution is computed by feeding one or more batches of training data into the full-precision model, collecting sampled data of each layer's response, and computing the standard deviation of the response distribution from those samples.
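By way of illustration, this step can be sketched in Python with NumPy; forward_fn is a hypothetical callable that runs the full-precision model on a batch and returns a dict of per-layer responses:

```python
import numpy as np

def weight_std(layer_weights):
    # Per-layer weight standard deviation, read directly from the
    # trained full-precision model.
    return float(np.std(layer_weights))

def response_std(forward_fn, layer_name, batches):
    # Per-layer response standard deviation, estimated by feeding one
    # or more batches of training data through the full-precision model
    # and pooling the sampled responses of the given layer.
    samples = [forward_fn(batch)[layer_name].ravel() for batch in batches]
    return float(np.std(np.concatenate(samples)))
```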
S2: estimate the scale factor sα of the full-precision model's weights and responses from the standard deviation of each layer's weights, the standard deviation of each layer's response distribution, and the hyperparameter α.
This step can use the following formula (1):
sα = T1/(α·σ)   (1)
In formula (1), σ is the standard deviation of a layer's weights or of its response distribution, computed from the corresponding full-precision model, and T1 is the upper clipping threshold determined by the quantization bit width. α is the hyperparameter to be determined; the optimal α can be found by sampling.
Let α = Δα, 2Δα, …, kΔα and compute the corresponding scale factor s for each value, where Δα is the sampling resolution and k is the number of samples.
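A minimal sketch of this sampling, assuming formula (1) with the signed threshold T1 = 2^(n-1) - 1 (the non-negative case of S3 below would use T1 = 2^n - 1 instead):

```python
import numpy as np

def candidate_scales(sigma, n_bits, delta_alpha=1.0, k=6):
    # Sample alpha = delta, 2*delta, ..., k*delta and map each sample to
    # a scale factor via formula (1): s = T1 / (alpha * sigma), with the
    # signed threshold T1 = 2**(n_bits - 1) - 1.
    t1 = 2 ** (n_bits - 1) - 1
    alphas = delta_alpha * np.arange(1, k + 1)
    return {float(a): t1 / (a * sigma) for a in alphas}
```

For example, candidate_scales(sigma=0.02, n_bits=4) returns six (α, s) candidates for a layer whose weight standard deviation is 0.02.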
S3: for the convolutional neural network to be optimized, build quantization modules implementing forward computation and backward gradient propagation based on the scale factor sα, and obtain the corresponding quantized network.
In this step the forward computation comprises: for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtain the corresponding low-precision value Q(x) using the quantization function Q.
Specifically, to estimate the corresponding low-bit fixed-point number from a full-precision floating-point number, each full-precision value x is mapped to the low-precision value Q(x) by the quantization function Q. The embodiment of the invention uses the following quantization function Q, formula (2):
Q(x) = clip(round(s·x), T2, T1)/s   (2)
where s is the quantization scale factor, which scales the original x into a range suitable for quantization; round(·) converts a floating-point number to an integer; and clip(·) is a truncation that limits the range of the integer after quantization:
clip(v, T2, T1) = min(max(v, T2), T1)
The values of T1 and T2 depend on the actual situation. If the parameters and features after quantization are represented by n-bit integers, then in the general case T1 = 2^(n-1) - 1 and T2 = 1 - 2^(n-1), so the quantized values range from 1 - 2^(n-1) to 2^(n-1) - 1. When the network contains ReLU layers and the model values are non-negative, one can set T1 = 2^n - 1 and T2 = 0, so the quantized values range from 0 to 2^n - 1.
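A sketch of formula (2) under these threshold choices; dividing the clipped integer by s to return to the original range reflects a simulated-quantization reading of Q(x) during training:

```python
import numpy as np

def quantize(x, s, n_bits, nonnegative=False):
    # Formula (2): scale x by s, round to an integer, clip to the n-bit
    # range [T2, T1], then divide by s so the simulated low-precision
    # value can keep flowing through the full-precision training graph.
    if nonnegative:                              # e.g. after a ReLU layer
        t1, t2 = 2 ** n_bits - 1, 0
    else:                                        # signed weights/responses
        t1, t2 = 2 ** (n_bits - 1) - 1, 1 - 2 ** (n_bits - 1)
    return np.clip(np.round(s * x), t2, t1) / s
```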
In the present embodiment the backward gradient propagation comprises: using a straight-through estimator with a custom gradient, so that the propagated gradient matches the gradient of the quantization function;
where α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of a layer's weights or of a layer's response distribution, computed from the corresponding full-precision model; and x is a full-precision floating-point value.
The above quantization function of the embodiment applies simultaneously to the quantization of the neural network's weights and responses. The embodiment therefore reduces the storage that the weights occupy inside the model and, under hardware support, enables low-bit fixed-point arithmetic and hence acceleration. At the same time, since the quantization function uses uniform quantization, both the efficiency of quantization and the precision of low-bit fixed-point arithmetic are ensured.
Since the round operation of the quantization function is not differentiable, automatic differentiation cannot be used when training the quantized network, and a custom gradient is required. The common approach is the STE (Straight-Through Estimator):
∂L/∂x = ∂L/∂Q(x), i.e. ∂Q(x)/∂x ≈ 1
where Q is the quantization function of the invention, x is the input of Q, and ∂Q(x)/∂x is the partial derivative of the quantization function Q with respect to its input x.
Since the quantization function contains a clip operation, the values of x truncated by clip(·) would produce a gradient mismatch. In backpropagation the gradient therefore needs to be truncated as well, which is realized with formula (4): the gradient passes through unchanged where s·x lies inside the clip window [T2, T1] and is zeroed outside it (see Fig. 2):
∂L/∂x = ∂L/∂Q(x) if T2 ≤ s·x ≤ T1, and 0 otherwise   (4)
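A sketch of the truncated straight-through estimator of formula (4); in a real framework this would be registered as the custom backward pass of the quantization module:

```python
import numpy as np

def quantize_grad(grad_output, x, s, t1, t2):
    # Truncated straight-through estimator (formula 4): pass the incoming
    # gradient where s*x lies inside the clip window [t2, t1], and zero
    # it where the forward clip() saturated, avoiding gradient mismatch.
    inside = (s * x >= t2) & (s * x <= t1)
    return grad_output * inside
```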
S4: fine-tune the quantized networks obtained in S3 and, over the different hyperparameters α, select the sα with the best post-fine-tuning quantized-network accuracy as the best scale factor s*.
In this step, the quantization modules corresponding to each scale factor s are inserted into the full-precision model to be quantized, yielding a quantized network. Each quantized network is fine-tuned for a small number of iterations to obtain a quantized model. Evaluating these quantized models yields the best-performing model and the corresponding best scale factor s*.
Specifically, the optimal scale factor s* can be estimated by minimizing the quantization error EQ. From the quantization function of this embodiment, the quantization error is generated by the round operation and the clip operation:
EQ = ER + EC
where ER is the error introduced by round and EC the error introduced by clip. To minimize the total quantization error the two must be balanced, so that neither operation's error becomes excessive. Since the actual distribution of the model parameters is unknown, it can be approximated as Gaussian for the analysis. If the parameter distributions of the layers are assumed identical up to scale, the minimization can be considered to reach its optimum when every layer's distribution is truncated at the same clipping rate. For a zero-mean Gaussian, P(-ασ ≤ x ≤ ασ) is a fixed value for any given α > 0. Here ασ denotes the truncation value, which must be no smaller than the maximum quantized value:
ασ ≥ max(Q(x))
Substituting the specific quantization function gives ασ ≥ T1/s, and taking the equality yields the optimal scale factor s* = T1/(α·σ).
In the embodiment, α takes values such as 1, 2, 3, 4, 5 and 6. A small amount of fine-tuning restores the accuracy of each quantized model at low computational cost, and the preliminary results are then compared to obtain the optimal s*.
The present embodiment thus obtains a series of scale factors s by sampling α over a range, quantizes the full-precision network according to each s, and trains the quantized networks, determining the optimal scale factor s* from the accuracy of the quantized networks. This way of determining s* effectively reduces the complexity of searching for the quantization parameters: instead of searching for an optimal scale factor in every layer, the search space shrinks to a single optimal hyperparameter α for the whole network. At the same time, the embodiment keeps the quantization error of the quantized model minimal, guaranteeing a good quantization result.
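The selection in S4 can then be sketched as follows; build_quantized_net, finetune and evaluate are hypothetical helpers standing in for inserting the quantization modules, the brief fine-tuning run, and the accuracy measurement:

```python
def select_best_scale(build_quantized_net, candidates, finetune, evaluate):
    # For each sampled alpha (candidate scale factor), insert the matching
    # quantization modules, fine-tune briefly, and keep the scale whose
    # quantized network reaches the best accuracy.
    best_s, best_acc = None, float("-inf")
    for alpha, s in sorted(candidates.items()):
        qnet = build_quantized_net(s)   # hypothetical constructor
        finetune(qnet)                  # a small number of iterations only
        acc = evaluate(qnet)
        if acc > best_acc:
            best_s, best_acc = s, acc
    return best_s
```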
S5: retrain the quantized network generated by the best scale factor s* to obtain the final quantized neural network model. Specifically, the quantization modules corresponding to s* are inserted into the network to be quantized and a complete training run is performed, finally yielding the quantized neural network model with the best performance.
As shown in Fig. 4, and corresponding to the above method, an embodiment of the invention also provides a convolutional neural network quantization apparatus, comprising:
a scale factor estimation module, which trains the full-precision model of the convolutional neural network to be quantized, computes the per-layer standard deviations of the full-precision model, and estimates the scale factor sα of the model's parameters and features from the per-layer standard deviations and the hyperparameter α, the hyperparameter α being a constant greater than 0;
a quantization module, comprising the following two submodules:
a forward computation submodule, which, for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtains the corresponding low-precision value Q(x) using the quantization function Q;
a backward gradient propagation submodule, which uses a straight-through estimator with a custom gradient so that the propagated gradient matches the gradient of the quantization function;
the quantization module implements, for the convolutional neural network to be optimized, the forward computation and backward gradient propagation based on the scale factor sα, yielding the corresponding quantized network;
a best scale factor computation module, which fine-tunes the quantized network produced by the quantization module and, over different hyperparameters α, selects the sα with the best post-fine-tuning quantized-network accuracy as the best scale factor s*;
a network training module, which retrains the quantized network generated by the best scale factor s* to obtain the final quantized neural network model.
The techniques in the modules of the convolutional neural network quantization apparatus of this embodiment can be implemented with the corresponding steps of the convolutional neural network quantization method above and are not repeated here.
Another embodiment of the invention provides a computer device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the program, it can perform the convolutional neural network quantization method of the above embodiments.
Another embodiment of the invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, can perform the convolutional neural network quantization method of the above embodiments.
Based on the above description, in a concrete application embodiment of the invention, the VGG-Small network is quantized with the above method:
The VGG-Small used in the embodiment of the invention comprises 6 convolutional layers and one fully connected layer. The channel counts of the convolutional layers are 128, 128, 256, 256, 512 and 512, and the convolution kernel size is 3x3. A MAX pooling is performed after every two convolutional layers, and each convolutional layer is followed by batch normalization (BatchNormalization) and a constraint on the feature range. The embodiment quantizes the last 5 convolutional layers of the network and keeps the 1st convolutional layer and the fully connected layer in full precision. The overall quantized network structure is shown in Fig. 3, where Data denotes the input data layer, Conv a convolutional layer, BatchNorm a batch normalization layer, QuantizationConv a convolutional layer to which the quantization method of the invention is applied, MaxPool a max pooling layer, and InnerProduct a fully connected layer. The k after a convolutional or pooling layer is the kernel size and s the stride; the n after a convolutional layer is the number of output channels, and the n after the fully connected layer is the number of output nodes.
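As an illustration of the structure of Fig. 3, the layer sequence can be written down as a plain list; the pooling kernel/stride of 2 and the 10-way classifier for Cifar-10 are assumptions, since the figure itself is not reproduced here:

```python
# (layer, output channels / nodes, kernel k, stride s, quantized?)
# Each convolutional layer is followed by batch normalization.
VGG_SMALL = [
    ("Conv",             128, 3, 1, False),   # 1st conv kept full precision
    ("QuantizationConv", 128, 3, 1, True),
    ("MaxPool",         None, 2, 2, None),    # assumed k=2, s=2
    ("QuantizationConv", 256, 3, 1, True),
    ("QuantizationConv", 256, 3, 1, True),
    ("MaxPool",         None, 2, 2, None),
    ("QuantizationConv", 512, 3, 1, True),
    ("QuantizationConv", 512, 3, 1, True),
    ("MaxPool",         None, 2, 2, None),
    ("InnerProduct",      10, None, None, False),  # assumed 10 output nodes
]
```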
Implementation conditions and evaluation of the above embodiment:
The code of the embodiment can be implemented in C++ and Python, with Caffe as the deep learning framework. During training, the batch size of each iteration is 100 and the optimizer is SGD with momentum; the learning rate starts at 0.01 and decreases as the number of iterations grows, reaching 2×10^-5 after 100,000 iterations. In the parameter setting of the objective function, λ is set to 10^4 and α is set to 10^5.
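Collected as a sketch, the training setup above might be expressed as a configuration of this shape (the key names are illustrative, not Caffe's actual prototxt fields):

```python
TRAIN_CONFIG = {
    "framework": "Caffe (C++ / Python)",
    "batch_size": 100,           # per-iteration batch size
    "optimizer": "SGD + momentum",
    "base_lr": 0.01,             # decays as iterations increase
    "final_lr": 2e-5,            # reached after 100,000 iterations
    "max_iter": 100_000,
    "lambda_obj": 1e4,           # objective-function lambda, as stated
    "alpha_obj": 1e5,            # objective-function alpha, as stated
}
```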
Compressed deep neural network models are generally evaluated with the metrics of the original network, reflecting how the model's performance changes after compression. In addition, the compression ratio or the bit width after compression can reflect the compression and acceleration of the deep neural network model. Since the acceleration of a deep neural network model usually attracts the most attention, the bit width after compression and the performance of the quantized network are used here as evaluation metrics.
Table 1: Comparison between the embodiment of the invention and existing methods on the VGG-Small network and the Cifar-10 dataset
As the results of the above embodiment show, the embodiment of the invention addresses the complex implementation and high computational cost of existing model quantization methods, balancing the compression and acceleration of deep convolutional network models.
It should be noted that the steps of the method provided by the invention can be realized with the corresponding modules, units, etc. of the apparatus described; those skilled in the art may refer to the technical solution of the system to realize the flow of the method, i.e. the embodiments of the apparatus can be regarded as preferred implementations of the method, which are not repeated here.
Those skilled in the art will appreciate that, besides realizing the apparatus provided by the invention purely as computer-readable program code, the method steps can equally be realized by logic programming, so that each apparatus takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. The apparatus provided by the invention can therefore be regarded as a hardware component; the means it contains for realizing various functions can be regarded as structures within that hardware component, and the means for realizing various functions can be regarded both as software modules implementing the method and as structures within the hardware component.
Specific embodiments of the present invention have been described above. It is to be understood that the invention is not limited to the particular implementations described; those skilled in the art can make various changes or modifications within the scope of the claims without affecting the substance of the invention.

Claims (10)

1. A convolutional neural network quantization method, characterized by comprising:
S1: training the full-precision model of the convolutional neural network to be quantized, and computing the standard deviation of each layer's weights of the full-precision model and the standard deviation of each layer's response distribution;
S2: estimating the scale factor sα of the full-precision model's weights and responses from the per-layer weight standard deviation, the per-layer response-distribution standard deviation and a hyperparameter α, the hyperparameter α being a constant greater than 0;
S3: building, for the convolutional neural network to be optimized, quantization modules implementing forward computation and backward gradient propagation based on the scale factor sα, and obtaining the corresponding quantized network;
the forward computation comprising: for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtaining the corresponding low-precision value Q(x) using a quantization function Q;
the backward gradient propagation comprising: using a straight-through estimator with a custom gradient so that the propagated gradient matches the gradient of the quantization function;
S4: fine-tuning the quantized network obtained in S3 and, over different hyperparameters α, selecting the sα with the best post-fine-tuning quantized-network accuracy as the best scale factor s*;
S5: retraining the quantized network generated by the best scale factor s* to obtain the final quantized neural network model.
2. The convolutional neural network quantization method according to claim 1, characterized in that the best scale factor s* is determined by sampling α:
sα = T1/(α·σ)
where α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of a layer's weights or of a layer's response distribution of the full-precision model, computed from the corresponding full-precision model; and T1 is a threshold related to the quantization bit width.
3. The convolutional neural network quantization method according to claim 2, characterized in that a series of scale factors s is obtained by sampling α over a range; the full-precision network is then quantized according to each s and the quantized networks are trained, and the optimal scale factor s* is determined from the accuracy of the quantized networks.
4. The convolutional neural network quantization method according to claim 1, characterized in that the corresponding low-precision value Q(x) is obtained using the quantization function Q:
Q(x) = clip(round(s·x), T2, T1)/s
where x is a full-precision floating-point value; s is the quantization scale factor; round(·) converts a floating-point number to an integer; and clip(·) is a truncation that limits the range of the fixed-point number after quantization.
5. The convolutional neural network quantization method according to claim 1, characterized in that the quantization modules propagate the backward gradient with truncation:
∂L/∂x = ∂L/∂Q(x) if T2 ≤ s·x ≤ T1, and 0 otherwise
where α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of a layer's weights or of a layer's response distribution of the full-precision model; x is a full-precision floating-point value; and s is the quantization scale factor described above.
6. A convolutional neural network quantization apparatus, characterized by comprising:
a scale factor estimation module, which trains the full-precision model of the convolutional neural network to be quantized, computes the per-layer standard deviations of the full-precision model, and estimates the scale factor sα of the full-precision model's weights and responses from the per-layer weight standard deviation, the per-layer response-distribution standard deviation and a hyperparameter α, the hyperparameter α being a constant greater than 0;
a quantization module, comprising the following two submodules:
a forward computation submodule, which, for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtains the corresponding low-precision value Q(x) using a quantization function Q;
a backward gradient propagation submodule, which uses a straight-through estimator with a custom gradient so that the propagated gradient matches the gradient of the quantization function;
the quantization module implementing, for the convolutional neural network to be optimized, forward computation and backward gradient propagation based on the scale factor sα and obtaining the corresponding quantized network;
a best scale factor computation module, which fine-tunes the quantized network produced by the quantization module and, over different hyperparameters α, selects the sα with the best post-fine-tuning quantized-network accuracy as the best scale factor s*;
a network training module, which retrains the quantized network generated by the best scale factor s* to obtain the final quantized neural network model.
7. The convolutional neural network quantization apparatus according to claim 6, characterized in that, in the best scale factor computation module, the best scale factor s* is determined by sampling α:
sα = T1/(α·σ)
where α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of a layer's weights or of a layer's response distribution of the full-precision model, computed from the corresponding full-precision model; and T1 is a threshold related to the quantization bit width;
a series of scale factors s is obtained by sampling α over a range; the full-precision network is then quantized according to each s and the quantized networks are trained, and the optimal scale factor s* is determined from the accuracy of the quantized networks.
8. The convolutional neural network quantization apparatus according to claim 6, characterized in that, in the quantization module:
the forward computation submodule obtains the corresponding low-precision value Q(x) using the quantization function Q:
Q(x) = clip(round(s·x), T2, T1)/s
where x is a full-precision floating-point value; s is the quantization scale factor; round(·) converts a floating-point number to an integer; and clip(·) is a truncation that limits the range of the fixed-point number after quantization;
the backward gradient propagation submodule propagates the backward gradient with truncation:
∂L/∂x = ∂L/∂Q(x) if T2 ≤ s·x ≤ T1, and 0 otherwise
where α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of a layer's weights or of a layer's response distribution of the full-precision model; x is a full-precision floating-point value; and s is the quantization scale factor described above.
9. A computer, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, characterized in that the processor, when executing the program, can perform the convolutional neural network quantization method of any one of claims 1-5.
10. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, can perform the convolutional neural network quantization method of any one of claims 1-5.
CN201910489092.7A 2019-06-06 2019-06-06 Convolutional neural network quantization method, apparatus, computer and storage medium Pending CN110363281A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910489092.7A CN110363281A (en) 2019-06-06 2019-06-06 Convolutional neural network quantization method, apparatus, computer and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910489092.7A CN110363281A (en) 2019-06-06 2019-06-06 Convolutional neural network quantization method, apparatus, computer and storage medium

Publications (1)

Publication Number Publication Date
CN110363281A true CN110363281A (en) 2019-10-22

Family

ID=68215700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910489092.7A Pending CN110363281A (en) 2019-06-06 2019-06-06 A kind of convolutional neural networks quantization method, device, computer and storage medium

Country Status (1)

Country Link
CN (1) CN110363281A (en)

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110889204A (en) * 2019-11-06 2020-03-17 北京中科胜芯科技有限公司 Neural network model parameter compression method
CN110889204B (en) * 2019-11-06 2021-07-30 北京中科胜芯科技有限公司 Neural network model parameter compression method
CN110942148A (en) * 2019-12-11 2020-03-31 北京工业大学 Adaptive asymmetric quantization deep neural network model compression method
CN113269320A (en) * 2020-02-14 2021-08-17 阿里巴巴集团控股有限公司 Processing unit, computing device, system on chip, data center and related methods
CN111275059A (en) * 2020-02-26 2020-06-12 腾讯科技(深圳)有限公司 Image processing method and device and computer readable storage medium
CN113496274A (en) * 2020-03-20 2021-10-12 郑桂忠 Quantification method and system based on operation circuit architecture in memory
CN111582229A (en) * 2020-05-21 2020-08-25 中国科学院空天信息创新研究院 Network self-adaptive semi-precision quantized image processing method and system
CN113762499A (en) * 2020-06-04 2021-12-07 合肥君正科技有限公司 Method for quantizing weight by channels
CN113762499B (en) * 2020-06-04 2024-04-02 合肥君正科技有限公司 Method for quantizing weights by using multiple channels
CN113762496A (en) * 2020-06-04 2021-12-07 合肥君正科技有限公司 Method for reducing inference operation complexity of low-bit convolutional neural network
CN111612147A (en) * 2020-06-30 2020-09-01 上海富瀚微电子股份有限公司 Quantization method of deep convolutional network
CN111814448B (en) * 2020-07-03 2024-01-16 思必驰科技股份有限公司 Pre-training language model quantization method and device
CN111814448A (en) * 2020-07-03 2020-10-23 苏州思必驰信息科技有限公司 Method and device for quantizing pre-training language model
CN111768002A (en) * 2020-07-10 2020-10-13 南开大学 Deep neural network quantization method based on elastic significance
WO2022012233A1 (en) * 2020-07-15 2022-01-20 安徽寒武纪信息科技有限公司 Method and computing apparatus for quantification calibration, and computer-readable storage medium
WO2022021834A1 (en) * 2020-07-29 2022-02-03 北京迈格威科技有限公司 Neural network model determination method and apparatus, and electronic device, and medium, and product
CN112101524A (en) * 2020-09-07 2020-12-18 上海交通大学 Method and system for on-line switching bit width quantization neural network
CN112381205A (en) * 2020-09-29 2021-02-19 北京清微智能科技有限公司 Neural network low bit quantization method
CN112580805A (en) * 2020-12-25 2021-03-30 三星(中国)半导体有限公司 Method and device for quantizing neural network model
CN112651500A (en) * 2020-12-30 2021-04-13 深圳金三立视频科技股份有限公司 Method for generating quantization model and terminal
CN112396042A (en) * 2021-01-20 2021-02-23 鹏城实验室 Real-time updated target detection method and system, and computer-readable storage medium
CN112884146A (en) * 2021-02-25 2021-06-01 香港理工大学深圳研究院 Method and system for training model based on data quantization and hardware acceleration
CN112884146B (en) * 2021-02-25 2024-02-13 香港理工大学深圳研究院 Method and system for training model based on data quantization and hardware acceleration
CN113537511A (en) * 2021-07-14 2021-10-22 中国科学技术大学 Automatic gradient quantization federal learning framework and method
CN113537511B (en) * 2021-07-14 2023-06-20 中国科学技术大学 Automatic gradient quantization federal learning device and method
CN114677548A (en) * 2022-05-26 2022-06-28 之江实验室 Neural network image classification system and method based on resistive random access memory
CN114677548B (en) * 2022-05-26 2022-10-14 之江实验室 Neural network image classification system and method based on resistive random access memory
CN117196418A (en) * 2023-11-08 2023-12-08 江西师范大学 Reading teaching quality assessment method and system based on artificial intelligence
CN117196418B (en) * 2023-11-08 2024-02-02 江西师范大学 Reading teaching quality assessment method and system based on artificial intelligence

Similar Documents

Publication Publication Date Title
CN110363281A (en) Convolutional neural network quantization method, apparatus, computer and storage medium
Ding et al. Regularizing activation distribution for training binarized deep networks
US11270187B2 (en) Method and apparatus for learning low-precision neural network that combines weight quantization and activation quantization
JP2019513265A (en) Method and apparatus for automatic multi-threshold feature filtering
WO2020119318A1 (en) Self-adaptive selection and design method for convolutional-layer hardware accelerator
CN111612147A (en) Quantization method of deep convolutional network
Qin et al. Distribution-sensitive information retention for accurate binary neural network
Raposo et al. Positnn: Training deep neural networks with mixed low-precision posit
WO2024021624A1 (en) Deep learning-based method and apparatus for skyline query cardinality estimation
CN113935489A (en) Variational quantum model TFQ-VQA based on quantum neural network and two-stage optimization method thereof
CN110852417A (en) Single-depth neural network model robustness improving method for application of Internet of things
CN115238893A (en) Neural network model quantification method and device for natural language processing
CN114239799A (en) Efficient target detection method, device, medium and system
WO2022241932A1 (en) Prediction method based on non-intrusive attention preprocessing process and bilstm model
Yu et al. Boosted dynamic neural networks
CN113743593B (en) Neural network quantization method, system, storage medium and terminal
CN116579408A (en) Model pruning method and system based on redundancy of model structure
US20220207374A1 (en) Mixed-granularity-based joint sparse method for neural network
CN113920124A (en) Brain neuron iterative segmentation method based on segmentation and error guidance
CN113158134A (en) Method and device for constructing non-invasive load identification model and storage medium
CN112668639A (en) Model training method and device, server and storage medium
CN115034388B (en) Determination method and device for quantization parameters of ranking model and electronic equipment
CN113160795B (en) Language feature extraction model training method, device, equipment and storage medium
Ma et al. AxBy-ViT: Reconfigurable Approximate Computation Bypass for Vision Transformers
CN115543911B (en) Method for calculating computing capacity of heterogeneous computing equipment

Legal Events

Date Code Title Description
PB01	Publication
SE01	Entry into force of request for substantive examination
RJ01	Rejection of invention patent application after publication (application publication date: 20191022)