CN110363281A - Convolutional neural network quantization method, device, computer, and storage medium - Google Patents
- Publication number: CN110363281A (application CN201910489092.7A)
- Authority
- CN
- China
- Prior art keywords
- quantization
- full precision
- network
- model
- neural networks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The present invention provides a convolutional neural network quantization method, in which: a full-precision model of the convolutional neural network to be quantized is trained, and the standard deviation of each layer's weights and of each layer's response distribution is computed; the scale factors of the full-precision model's parameters and features are estimated from these standard deviations and a hyperparameter; for the convolutional neural network to be optimized, quantization modules comprising a forward computation and a backward gradient propagation function based on the scale factor are built, yielding the corresponding quantized network; the quantized network is fine-tuned to determine the best scale factor; and the quantized network generated with the best scale factor is retrained to obtain the final quantized neural network model. The present invention also provides a convolutional neural network quantization device, a computer, and a storage medium. The present invention addresses the complex implementation and high computational complexity of existing model quantization methods.
Description
Technical field
The present invention relates to deep neural network compression methods, and specifically to a method, device, computer equipment, and storage medium for compressing and accelerating convolutional neural networks by means of model quantization.
Background technique
In computer vision and other technical fields, deep learning has proven to be a remarkably useful approach, achieving good results on tasks such as image classification, object detection, and semantic segmentation. As the theory matures, deep neural network models are trending toward more parameters, deeper networks, and greater computational cost. At the same time, industry is gradually applying deep learning to concrete scenarios, which imposes strict requirements on model size, computational performance, power consumption, and other metrics.
In recent years, neural networks have penetrated many application areas. Although they improve processing accuracy to some extent, their many layers and large parameter counts demand substantial computation and storage. Deep neural network compression and acceleration aim to reduce model storage and computation, by methods such as pruning and quantization, while keeping the performance of the existing deep neural network essentially unchanged.
A search of the prior art finds Chinese invention patent application No. 201811284341.0, which discloses a neural network compression method combining pruning and weight quantization: pruning reduces the number of connections of the model, i.e., its number of parameters, and quantizing the neural network model reduces the storage occupied by its weights. However, that patent does not address the quantization of features; at deployment, the quantized weights must first be restored to full-precision floating-point numbers before computation. That patent therefore cannot reduce the computational complexity of model inference and achieves no acceleration.
Summary of the invention
The object of the present invention is to provide a convolutional neural network quantization method, device, and computer equipment that address the complex implementation and high computational complexity of existing convolutional neural network quantization methods.
The first object of the present invention is to provide a convolutional neural network quantization method, comprising:
S1: training a full-precision model of the convolutional neural network to be quantized, and computing the standard deviation of each layer's weights and of each layer's response distribution in the full-precision model;
S2: estimating the scale factor sα of the full-precision model's weights and responses from the per-layer weight standard deviation, the per-layer response standard deviation, and a hyperparameter α; the hyperparameter α is a constant greater than 0;
S3: for the convolutional neural network to be optimized, building quantization modules comprising a forward computation and a backward gradient propagation function based on the scale factor sα, to obtain the corresponding quantized network;
The forward computation comprises: for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtaining the corresponding low-precision value Q(x) using a quantization function Q;
The backward gradient propagation function comprises: using a straight-through estimator with a custom gradient, so that the gradient matches that of the quantization function;
S4: fine-tuning the quantized network obtained in S3 and, over the different hyperparameter values α, selecting the sα that gives the best accuracy of the fine-tuned quantized network as the best scale factor s*;
S5: retraining the quantized network generated with the best scale factor s*, to obtain the final quantized neural network model.
Optionally, the best scale factor s* is determined by sampling α:
sα = α·σ / T1
where α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of each layer's weights or of each layer's response distribution in the full-precision model, computed from the corresponding full-precision model; and T1 is a threshold related to the quantization bit-width.
A series of scale factors s is obtained by sampling α over a certain range; the full-precision network is then quantized according to each s, the quantized networks are trained, and the optimal scale factor s* is determined from the accuracy of the quantized networks.
Optionally, the corresponding low-precision value Q(x) is obtained using the quantization function Q:
Q(x) = s · clip(round(x / s), T2, T1)
where x is each full-precision floating-point value; s is the quantization scale factor, used to zoom the original x into a range suited to quantization; round(·) is the rounding operation, converting a floating-point number to an integer; and clip(·) is the truncation operation, limiting the range of the fixed-point number after quantization.
Optionally, the backward gradient propagation of the quantization modules is truncated:
∂Q(x)/∂x = 1 if |x| ≤ ασ, 0 otherwise
where α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of each layer's weights or of each layer's response distribution, computed from the corresponding full-precision model; x is each full-precision floating-point value; and s is the quantization scale factor described above, used to zoom the original x into a range suited to quantization.
The above method of the present invention applies simultaneously to the weights and features of the neural network. It reduces the storage occupied by the weights inside the model and, on supporting hardware, enables low-bit fixed-point arithmetic, thereby achieving acceleration.
The second object of the present invention is to provide a convolutional neural network quantization device, comprising:
a scale factor estimation module, used to train the full-precision model of the convolutional neural network to be quantized, compute the standard deviation of each layer's weights and of each layer's response distribution in the full-precision model, and estimate the scale factor sα of the model's weights and responses from those standard deviations and the hyperparameter α; the hyperparameter α is a constant greater than 0;
a quantization module, comprising the following two submodules:
a forward computation submodule, which, for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtains the corresponding low-precision value Q(x) using the quantization function Q;
a backward gradient propagation submodule, which uses a straight-through estimator with a custom gradient so that the gradient matches that of the quantization function;
the quantization module implements, for the convolutional neural network to be optimized, the forward computation and backward gradient propagation function based on the scale factor sα, obtaining the corresponding quantized network;
a best scale factor computation module, which fine-tunes the quantized network obtained by the quantization module and, over the different hyperparameter values α, selects the sα that gives the best accuracy of the fine-tuned quantized network as the best scale factor s*;
a network training module, which retrains the quantized network generated with the best scale factor s*, obtaining the final quantized neural network model.
The third object of the present invention is to provide a computer, comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, the processor being usable, when executing the program, to perform the convolutional neural network quantization method described above.
The fourth object of the present invention is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, can be used to perform the convolutional neural network quantization method described above.
Compared with the prior art, the present invention has at least one of the following beneficial effects:
Embodiments of the present invention take into account the feasibility of efficient fixed-point arithmetic: uniform quantization converts floating-point values into low-bit fixed-point numbers and requires only one additional scaling operation. Compared with other methods, the embodiments balance the compression and acceleration of deep convolutional network models.
Compared with widely used methods, embodiments of the present invention ensure a better theoretical acceleration effect while also preserving the performance of the quantized convolutional neural network model.
Brief description of the drawings
Other features, objects, and advantages of the present invention will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the drawings:
Fig. 1 is a flow chart of the method in one embodiment of the invention;
Fig. 2 is a schematic diagram of the forward quantization function and backward gradient function in one embodiment of the invention, in which the left side shows the case including both positive and negative values, and the right side the case including only nonnegative values;
Fig. 3 is the structure of the full quantized network in one embodiment of the invention;
Fig. 4 is a block diagram of the system modules in one embodiment of the invention.
Specific embodiment
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention, but do not limit it in any way. It should be pointed out that those of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept; all of these fall within the protection scope of the present invention.
Referring to Fig. 1, the flow of the convolutional neural network compression and acceleration method in this embodiment of the present invention comprises the following steps:
S1: train the full-precision model of the convolutional neural network to be quantized, and compute the standard deviation of each layer's weights and of each layer's response distribution. Here the full-precision model is a full-precision deep neural network model, specifically a convolutional neural network model using full-precision floating-point numbers.
In this step, the standard deviation of each layer's weights is computed directly from the data in the full-precision model. To compute the standard deviation of each layer's response distribution, one or more batches of training data are fed into the full-precision model to obtain sampled data of each layer's responses, from which the per-layer response standard deviations are computed.
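As a minimal sketch of this step (NumPy, not the C++/Caffe implementation mentioned later in the text; the layer name and array shapes are hypothetical), the per-layer statistics could be computed as:

```python
import numpy as np

def layer_statistics(weights, responses):
    """Per-layer standard deviations for a full-precision model (step S1).

    weights:   dict layer_name -> weight array, read directly from the model
    responses: dict layer_name -> sampled responses, collected by feeding one
               or more batches of training data through the full-precision model
    """
    weight_std = {name: float(np.std(w)) for name, w in weights.items()}
    response_std = {name: float(np.std(r)) for name, r in responses.items()}
    return weight_std, response_std

# Toy example with synthetic data for a single hypothetical layer
rng = np.random.default_rng(0)
weights = {"conv1": rng.normal(0.0, 0.02, size=(128, 3, 3, 3))}
responses = {"conv1": rng.normal(0.0, 1.5, size=(10, 128, 8, 8))}
w_std, r_std = layer_statistics(weights, responses)
```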
S2: estimate the scale factor sα of the full-precision model's weights and responses from the per-layer weight standard deviation, the per-layer response standard deviation, and the hyperparameter α;
This step can use the following formula (1):
sα = α·σ / T1    (1)
In the above formula, σ is the standard deviation of each layer's weights or response distribution and can be computed from the corresponding full-precision model; T1 is a threshold related to the quantization bit-width; and α is a hyperparameter to be determined, whose optimal value can be found by sampling.
Let α = Δα, 2Δα, …, kΔα and compute the corresponding scale factor s for each value, where Δα is the sampling resolution and k is the number of samples.
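This sampling can be sketched as follows, assuming the relation sα = α·σ/T1 reconstructed from the later derivation ασ ≥ s·T1 (the function name and default bit-width are illustrative):

```python
import numpy as np

def candidate_scale_factors(sigma, delta_alpha, k, n_bits=8):
    """Scale factors for alpha = delta, 2*delta, ..., k*delta (step S2).

    Assumes s_alpha = alpha * sigma / T1, where T1 = 2**(n_bits-1) - 1
    is the largest signed quantized integer.
    """
    t1 = 2 ** (n_bits - 1) - 1
    alphas = delta_alpha * np.arange(1, k + 1)
    return alphas, alphas * sigma / t1

# Six candidates for a layer whose standard deviation is 0.5
alphas, scales = candidate_scale_factors(sigma=0.5, delta_alpha=1.0, k=6)
```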
S3: for the convolutional neural network to be optimized, build quantization modules comprising a forward computation and a backward gradient propagation function based on the scale factor sα, to obtain the corresponding quantized network;
In this step, the forward computation comprises: for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtaining the corresponding low-precision value Q(x) using the quantization function Q.
Specifically, to estimate the corresponding low-bit fixed-point number from a full-precision floating-point number, each full-precision floating-point value x is mapped to a low-precision value Q(x) by the quantization function Q. This embodiment of the present invention uses the following quantization function Q, formula (2):
Q(x) = s · clip(round(x / s), T2, T1)    (2)
Here s is the quantization scale factor, used to zoom the original x into a range suited to quantization; round(·) rounds a floating-point number to an integer; and clip(·) is the truncation operation, limiting the range of the integer after quantization.
The values of T1 and T2 depend on the actual situation. If the parameters and features after quantization are represented by n-bit integers, then in the general case T1 = 2^(n-1) − 1 and T2 = 1 − 2^(n-1), and the quantized values range from 1 − 2^(n-1) to 2^(n-1) − 1. When the network contains ReLU layers and the values to be quantized are nonnegative, one can set T1 = 2^n − 1 and T2 = 0, and the quantized values range from 0 to 2^n − 1.
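Under the common reading Q(x) = s·clip(round(x/s), T2, T1), the quantization function and both threshold settings could be sketched as follows (an illustrative NumPy sketch, not the patent's implementation):

```python
import numpy as np

def quantize(x, s, n_bits=8, nonnegative=False):
    """Uniform quantizer Q(x) = s * clip(round(x / s), T2, T1) (formula (2)).

    Signed case:      T1 = 2**(n-1) - 1, T2 = 1 - 2**(n-1)
    Nonnegative case: T1 = 2**n - 1,     T2 = 0  (e.g. after ReLU)
    """
    if nonnegative:
        t1, t2 = 2 ** n_bits - 1, 0
    else:
        t1, t2 = 2 ** (n_bits - 1) - 1, 1 - 2 ** (n_bits - 1)
    q = np.clip(np.round(x / s), t2, t1)   # integer fixed-point code
    return s * q                           # rescale back to real values

# 0.37 with scale 0.1 rounds to code 4, i.e. 0.4; a huge input saturates.
small = quantize(np.array(0.37), s=0.1)
big = quantize(np.array(100.0), s=0.1)
```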
In this embodiment, the backward gradient propagation function comprises: using a straight-through estimator with a custom gradient, so that the gradient matches that of the quantization function;
here α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of each layer's weights or of each layer's response distribution, computed from the corresponding full-precision model; and x is each full-precision floating-point value.
The above quantization function of this embodiment applies simultaneously to the quantization of network weights and responses. The embodiment therefore reduces the storage occupied by the weights inside the model and, on supporting hardware, enables low-bit fixed-point arithmetic, thereby achieving acceleration. Moreover, because the quantization function uses uniform quantization, both the efficiency of quantization and the precision of low-bit fixed-point arithmetic are ensured.
Since the round operation of the quantization function is not differentiable, automatic differentiation cannot be used when training the quantized network, and a custom gradient must be defined. A common approach is the straight-through estimator (STE):
∂Q(x)/∂x ≈ 1    (3)
where Q is the quantization function of the present example, x is the input of Q, and ∂Q(x)/∂x is the partial derivative of Q with respect to its input x.
Since the quantization function contains a clip operation, a gradient mismatch arises for the values of x truncated by clip(·); the gradient must therefore also be truncated in backpropagation, which is implemented by formula (4):
∂Q(x)/∂x = 1 if |x| ≤ ασ, 0 otherwise    (4)
The operation is illustrated in Fig. 2.
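The truncated straight-through gradient of formula (4), as reconstructed here, can be sketched as follows (NumPy; in an actual training framework this would be the custom backward pass of the quantization module):

```python
import numpy as np

def quantizer_grad(x, alpha, sigma):
    """Surrogate dQ(x)/dx per formula (4): 1 inside the clipping range
    |x| <= alpha*sigma, 0 for values truncated by clip()."""
    return (np.abs(x) <= alpha * sigma).astype(x.dtype)

def backward(grad_out, x, alpha, sigma):
    # Chain rule with the surrogate: pass upstream gradients through
    # un-clipped positions, zero out clipped ones.
    return grad_out * quantizer_grad(x, alpha, sigma)

# alpha*sigma = 1.5: 0.1 is inside the range, 5.0 was clipped.
g = backward(np.ones(2), np.array([0.1, 5.0]), alpha=3.0, sigma=0.5)
```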
S4: fine-tune the quantized network obtained in S3 and, over the different hyperparameter values α, select the sα that gives the best accuracy of the fine-tuned quantized network as the best scale factor s*;
In this step, the quantization module corresponding to each scale factor s is inserted into the full-precision model to be quantized, yielding a quantized network. Each quantized network is fine-tuned for a small number of iterations to obtain a quantized model. Evaluating these quantized models gives the best-performing model and the corresponding best scale factor s*.
Specifically, the optimal scale factor s* can be estimated by minimizing the quantization error E_Q. From the aforementioned quantization function of this embodiment, the quantization error is generated by the round and clip operations:
E_Q = E_R + E_C
where E_R is the error introduced by the round operation and E_C the error introduced by the clip operation. Minimizing the final quantization error requires a balance between the round error and the clip error, so that neither operation produces excessive error. Since the actual distribution of the model parameters is unknown, it can be approximated as a Gaussian distribution for analysis. If the parameters of every layer are assumed to follow the same kind of distribution, the minimization problem can be considered to reach its minimum when every layer's distribution is truncated at the same clipping rate. For a zero-mean Gaussian distribution, P(−ασ ≤ x ≤ ασ) is a fixed value for any α > 0. Here ασ denotes the clipping threshold, which must be no less than the maximum quantized value:
ασ ≥ max(Q(x))
Substituting the specific quantization function gives ασ ≥ s·T1; taking the equality yields the optimal scale factor s* = ασ / T1.
In the embodiment, α takes values such as 1, 2, 3, 4, 5, 6. A small amount of fine-tuning (finetune) restores the accuracy of each quantized model and, while saving computation, gives a good estimate of the post-quantization effect; the preliminary results are then compared to obtain the optimal s*.
This embodiment obtains a series of scale factors s by sampling α over a certain range, quantizes the full-precision network according to each s, and trains the quantized networks; the optimal scale factor s* is determined from the accuracy of the quantized networks. This way of determining s* effectively reduces the complexity of searching the quantization parameters: instead of searching for an optimal scaling factor per layer, only a single optimal hyperparameter α for the whole network must be found. At the same time, the embodiment keeps the quantization error of the quantized model minimal, guaranteeing a good quantization result.
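Step S4 as a whole amounts to a one-dimensional grid search over α; a sketch follows, in which finetune_and_eval is a hypothetical stand-in for inserting the quantization modules, briefly fine-tuning, and measuring validation accuracy:

```python
import numpy as np

def select_best_alpha(alphas, finetune_and_eval):
    """Grid search over sampled alpha values (step S4, sketched).

    finetune_and_eval(alpha) is assumed to build the quantized network
    for the scale factor derived from alpha, fine-tune it for a small
    number of iterations, and return its accuracy.
    """
    accs = [finetune_and_eval(a) for a in alphas]
    best = int(np.argmax(accs))
    return alphas[best], accs[best]

# Toy stand-in whose accuracy peaks at alpha = 4
alpha_best, acc = select_best_alpha(
    [1, 2, 3, 4, 5, 6],
    lambda a: 0.9 - 0.01 * (a - 4) ** 2,
)
```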
S5: retrain the quantized network generated with the best scale factor s* to obtain the final quantized neural network model. Specifically, the quantization modules corresponding to s* are inserted into the network to be quantized, which is then fully trained to finally obtain the best quantized neural network model.
As shown in Fig. 4, corresponding to the above method, an embodiment of the present invention also provides a convolutional neural network quantization device, comprising:
a scale factor estimation module, used to train the full-precision model of the convolutional neural network to be quantized, compute the per-layer standard deviations of the full-precision model, and estimate the scale factor sα of the model's parameters and features from those standard deviations and the hyperparameter α; the hyperparameter α is a constant greater than 0;
a quantization module, comprising the following two submodules:
a forward computation submodule, which, for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtains the corresponding low-precision value Q(x) using the quantization function Q;
a backward gradient propagation submodule, which uses a straight-through estimator with a custom gradient so that the gradient matches that of the quantization function;
the quantization module implements, for the convolutional neural network to be optimized, the forward computation and backward gradient propagation function based on the scale factor sα, obtaining the corresponding quantized network;
a best scale factor computation module, which fine-tunes the quantized network obtained by the quantization module and, over the different hyperparameter values α, selects the sα that gives the best accuracy of the fine-tuned quantized network as the best scale factor s*;
a network training module, which retrains the quantized network generated with the best scale factor s*, obtaining the final quantized neural network model.
The techniques in each module of the convolutional neural network quantization device of the above embodiment can be implemented by the corresponding steps of the above convolutional neural network quantization method, and are not described again here.
Another embodiment of the present invention also provides a computer device comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor; when executing the program, the processor can be used to perform the convolutional neural network quantization method of the above embodiments.
Another embodiment of the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, can be used to perform the convolutional neural network quantization method of the above embodiments. Based on the above description, in a concrete application embodiment of the present invention, the VGG-Small network is quantized using the above method:
The VGG-Small used in this embodiment of the present invention comprises 6 convolutional layers and one fully connected layer. The channel counts of the convolutional layers are 128, 128, 256, 256, 512, and 512, with 3x3 convolution kernels. A MAX pooling is performed after every two convolutional layers, and batch normalization (BatchNormalization) operations and feature range constraints are added. The embodiment quantizes the last 5 convolutional layers of the network, keeping the 1st convolutional layer and the fully connected layer at full precision. The full quantized network structure is shown in Fig. 3.
Here Data denotes the input data layer, Conv a convolutional layer, BatchNorm a batch normalization layer, and QuantizationConv a convolutional layer applying the quantization method of the present invention. MaxPool denotes a maximum pooling layer and InnerProduct a fully connected layer. After convolutional and pooling layers, k is the kernel size and s the stride; the n after a convolutional layer denotes the number of output channels, and the n after the fully connected layer the number of output nodes.
Implementation conditions and result evaluation of the above embodiment:
The code of the embodiment can be implemented in C++ and Python, with Caffe as the deep learning framework. During training, the batch size of each iteration is 100; the optimizer is SGD with momentum; the learning rate starts at 0.01 and is reduced as the number of iterations increases, reaching 2×10^-5 after 100,000 iterations. For the parameters in the objective function, λ is set to 10^4 and α is set to 10^5.
The evaluation of deep neural network model compression generally uses the evaluation metrics of the original network, to reflect how the performance of the model changes after compression. The compression ratio or the bit-width after compression can also be used to reflect the compression and acceleration of the model. Since the acceleration effect of deep neural network models usually attracts more attention, the bit-width after compression and the performance of the quantized network are used here as evaluation metrics.
Table 1 compares the effect of this embodiment of the present invention with existing methods on the VGG-Small network and the Cifar-10 data set.
As can be seen from the above results, the embodiment of the present invention addresses the complex implementation and high computational complexity of existing model quantization methods, and balances the compression and acceleration of deep convolutional network models.
It should be noted that the steps of the method provided by the present invention can be implemented using the corresponding modules, units, etc. of the described device; those skilled in the art can refer to the technical solution of the device to realize the flow of the method, i.e., the embodiments of the device can be regarded as preferred ways of realizing the method, which are not described again here.
Those skilled in the art will appreciate that, besides being realized as pure computer-readable program code, the devices provided by the present invention can achieve the same functions entirely through logic programming of the method steps, in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. The devices provided by the present invention can therefore be regarded as hardware components, and the means included in them for realizing various functions can also be regarded as structures within the hardware component; the means for realizing various functions can even be regarded both as software modules implementing the method and as structures within the hardware component.
Specific embodiments of the present invention have been described above. It is to be understood that the invention is not limited to these particular embodiments; those skilled in the art can make various changes or modifications within the scope of the claims without affecting the substance of the present invention.
Claims (10)
1. A convolutional neural network quantization method, characterized by comprising:
S1: training a full-precision model of the convolutional neural network to be quantized, and computing the standard deviation of each layer's weights and of each layer's response distribution in the full-precision model;
S2: estimating the scale factor sα of the full-precision model's weights and responses from the per-layer weight standard deviation, the per-layer response standard deviation, and a hyperparameter α; the hyperparameter α is a constant greater than 0;
S3: for the convolutional neural network to be optimized, building quantization modules comprising a forward computation and a backward gradient propagation function based on the scale factor sα, to obtain the corresponding quantized network;
the forward computation comprising: for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtaining the corresponding low-precision value Q(x) using a quantization function Q;
the backward gradient propagation function comprising: using a straight-through estimator with a custom gradient, so that the gradient matches that of the quantization function;
S4: fine-tuning the quantized network obtained in S3 and, over the different hyperparameter values α, selecting the sα that gives the best accuracy of the fine-tuned quantized network as the best scale factor s*;
S5: retraining the quantized network generated with the best scale factor s*, to obtain the final quantized neural network model.
2. The convolutional neural network quantization method of claim 1, characterized in that the best scale factor s* is determined by sampling α:
sα = α·σ / T1
where α is the hyperparameter to be determined, a constant greater than 0 obtained by sampling; σ is the standard deviation of each layer's weights or of each layer's response distribution, computed from the corresponding full-precision model; and T1 is a threshold related to the quantization bit-width.
3. The convolutional neural network quantization method according to claim 2, characterized in that a series of scale factors s is obtained by sampling α over a certain range; the full-precision network is then quantized according to each s, the quantized networks are trained, and the optimal scale factor s* is determined from the accuracy of the quantized networks.
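The selection loop of claim 3 can be sketched as follows. The linear relation s = α·σ is an assumption (the patent's formula image is not reproduced in this text), and `finetune_and_eval` is a hypothetical callback standing in for the fine-tune-then-measure-accuracy step of S4.

```python
def select_scale(alphas, sigma, finetune_and_eval):
    """Claim 3 sketch: sample alpha, derive a candidate scale for each
    (s = alpha * sigma is assumed), fine-tune/evaluate the quantized
    network, and keep the scale with the best accuracy."""
    best_s, best_acc = None, float("-inf")
    for a in alphas:
        s = a * sigma
        acc = finetune_and_eval(s)  # caller supplies fine-tuning + eval
        if acc > best_acc:
            best_s, best_acc = s, acc
    return best_s
```

In practice the callback would quantize the full-precision network with scale s, run the fine-tuning of S4, and return validation accuracy.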
4. The convolutional neural network quantization method according to claim 1, characterized in that the corresponding low-precision value Q(x) is obtained using the quantization function Q:
wherein x is each full-precision floating-point value; s is the quantization scale factor; round(·) is a rounding operation that converts a floating-point number to an integer; and clip(·) is a truncation operation that limits the range of the quantized fixed-point number.
5. The convolutional neural network quantization method according to claim 1, characterized in that, in the quantization module, the backward gradient propagation is truncated:
wherein α is the hyperparameter to be determined, obtained by sampling, and is a constant greater than 0; σ is the standard deviation of each layer's weights or of each layer's response distribution of the full-precision model; x is each full-precision floating-point value; and s is the aforementioned quantization scale factor.
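A common form of the truncated straight-through estimator that claim 5 describes passes the upstream gradient through unchanged inside the quantization range and zeroes it outside. The patent's truncation formula is not reproduced here, so the condition |x| ≤ α·σ below is an assumption based on the quantities the claim names:

```python
def ste_backward(x, upstream_grad, alpha, sigma):
    """Truncated straight-through estimator sketch: pass the gradient
    through where the input lies inside [-alpha*sigma, alpha*sigma]
    (assumed truncation bound), zero it elsewhere."""
    bound = alpha * sigma
    return [g if abs(v) <= bound else 0.0
            for v, g in zip(x, upstream_grad)]
```

Zeroing the gradient in the saturated region keeps training consistent with the clipping in the forward pass: weights already clipped cannot be pushed further out of range.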
6. A convolutional neural network quantization device, characterized by comprising:
a scale factor estimation module, which trains a full-precision model of the convolutional neural network to be quantized, computes the standard deviation of each layer of the full-precision model, and estimates the scale factor sα of the full-precision model's weights and responses from the standard deviation of each layer's weights, the standard deviation of each layer's response distribution, and a hyperparameter α, the hyperparameter α being a constant greater than 0;
a quantization module, which comprises the following two submodules:
a forward computation submodule, which, for each full-precision floating-point value x of the full-precision model of the convolutional neural network to be optimized, obtains the corresponding low-precision value Q(x) using a quantization function Q;
a backward gradient propagation submodule, which uses a custom straight-through estimator, so that the gradient matches the gradient of the quantization function;
the quantization module implementing, for the convolutional neural network to be optimized, a forward computation and a backward gradient propagation function based on the scale factor sα, thereby obtaining the corresponding quantized network;
an optimal scale factor computation module, which fine-tunes the quantized network established by the quantization module and, for different hyperparameters α, selects the sα giving the best accuracy of the fine-tuned quantized neural network as the optimal scale factor s*;
a network training module, which retrains the quantized network generated with the optimal scale factor s*, thereby obtaining the final quantized neural network model.
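The two submodules of claim 6's quantization module can be sketched together as one small class. All formulas here are assumptions carried over from the sketches above (uniform quantizer, |x| ≤ bound gradient mask); the patent's own formula images are not reproduced in this text.

```python
class QuantizationModule:
    """Claim 6 sketch: forward submodule quantizes with scale s,
    backward submodule applies a truncated straight-through estimator."""

    def __init__(self, s, bound, bits=8):
        self.s = s                     # quantization scale factor
        self.bound = bound             # assumed truncation bound (alpha * sigma)
        self.T = 2 ** (bits - 1) - 1   # signed fixed-point range limit

    def forward(self, x):
        """Quantize each full-precision value to the fixed-point grid."""
        return [self.s * max(-self.T, min(self.T, round(v / self.s)))
                for v in x]

    def backward(self, x, grad):
        """Pass gradients through inside the range, zero them outside."""
        return [g if abs(v) <= self.bound else 0.0
                for v, g in zip(x, grad)]
```

A training loop would call `forward` on weights and activations during the forward pass and route gradients through `backward` during backpropagation.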
7. The convolutional neural network quantization device according to claim 6, characterized in that, in the optimal scale factor computation module, the optimal scale factor s* is determined by sampling α:
wherein α is the hyperparameter to be determined, obtained by sampling, and is a constant greater than 0; σ is the standard deviation of each layer's weights or of each layer's response distribution of the full-precision model, computed from the corresponding full-precision model; and T1 is a threshold related to the quantization bit-width;
a series of scale factors s is obtained by sampling α over a certain range; the full-precision network is then quantized according to each s, the quantized networks are trained, and the optimal scale factor s* is determined from the accuracy of the quantized networks.
8. The convolutional neural network quantization device according to claim 6, characterized in that, in the quantization module:
the forward computation submodule obtains the corresponding low-precision value Q(x) using the quantization function Q, wherein x is each full-precision floating-point value; s is the quantization scale factor; round(·) is a rounding operation that converts a floating-point number to an integer; and clip(·) is a truncation operation that limits the range of the quantized fixed-point number;
in the backward gradient propagation submodule, the backward gradient propagation is truncated, wherein α is the hyperparameter to be determined, obtained by sampling, and is a constant greater than 0; σ is the standard deviation of each layer's weights or of each layer's response distribution of the full-precision model; x is each full-precision floating-point value; and s is the aforementioned quantization scale factor.
9. A computer, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, can be used to perform the convolutional neural network quantization method according to any one of claims 1-5.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, can be used to perform the convolutional neural network quantization method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910489092.7A CN110363281A (en) | 2019-06-06 | 2019-06-06 | A kind of convolutional neural networks quantization method, device, computer and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110363281A true CN110363281A (en) | 2019-10-22 |
Family
ID=68215700
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910489092.7A Pending CN110363281A (en) | 2019-06-06 | 2019-06-06 | A kind of convolutional neural networks quantization method, device, computer and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110363281A (en) |
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110889204A (en) * | 2019-11-06 | 2020-03-17 | 北京中科胜芯科技有限公司 | Neural network model parameter compression method |
CN110889204B (en) * | 2019-11-06 | 2021-07-30 | 北京中科胜芯科技有限公司 | Neural network model parameter compression method |
CN110942148A (en) * | 2019-12-11 | 2020-03-31 | 北京工业大学 | Adaptive asymmetric quantization deep neural network model compression method |
CN113269320A (en) * | 2020-02-14 | 2021-08-17 | 阿里巴巴集团控股有限公司 | Processing unit, computing device, system on chip, data center and related methods |
CN111275059A (en) * | 2020-02-26 | 2020-06-12 | 腾讯科技(深圳)有限公司 | Image processing method and device and computer readable storage medium |
CN113496274A (en) * | 2020-03-20 | 2021-10-12 | 郑桂忠 | Quantification method and system based on operation circuit architecture in memory |
CN111582229A (en) * | 2020-05-21 | 2020-08-25 | 中国科学院空天信息创新研究院 | Network self-adaptive semi-precision quantized image processing method and system |
CN113762499A (en) * | 2020-06-04 | 2021-12-07 | 合肥君正科技有限公司 | Method for quantizing weight by channels |
CN113762499B (en) * | 2020-06-04 | 2024-04-02 | 合肥君正科技有限公司 | Method for quantizing weights by using multiple channels |
CN113762496A (en) * | 2020-06-04 | 2021-12-07 | 合肥君正科技有限公司 | Method for reducing inference operation complexity of low-bit convolutional neural network |
CN111612147A (en) * | 2020-06-30 | 2020-09-01 | 上海富瀚微电子股份有限公司 | Quantization method of deep convolutional network |
CN111814448B (en) * | 2020-07-03 | 2024-01-16 | 思必驰科技股份有限公司 | Pre-training language model quantization method and device |
CN111814448A (en) * | 2020-07-03 | 2020-10-23 | 苏州思必驰信息科技有限公司 | Method and device for quantizing pre-training language model |
CN111768002A (en) * | 2020-07-10 | 2020-10-13 | 南开大学 | Deep neural network quantization method based on elastic significance |
WO2022012233A1 (en) * | 2020-07-15 | 2022-01-20 | 安徽寒武纪信息科技有限公司 | Method and computing apparatus for quantification calibration, and computer-readable storage medium |
WO2022021834A1 (en) * | 2020-07-29 | 2022-02-03 | 北京迈格威科技有限公司 | Neural network model determination method and apparatus, and electronic device, and medium, and product |
CN112101524A (en) * | 2020-09-07 | 2020-12-18 | 上海交通大学 | Method and system for on-line switching bit width quantization neural network |
CN112381205A (en) * | 2020-09-29 | 2021-02-19 | 北京清微智能科技有限公司 | Neural network low bit quantization method |
CN112580805A (en) * | 2020-12-25 | 2021-03-30 | 三星(中国)半导体有限公司 | Method and device for quantizing neural network model |
CN112651500A (en) * | 2020-12-30 | 2021-04-13 | 深圳金三立视频科技股份有限公司 | Method for generating quantization model and terminal |
CN112396042A (en) * | 2021-01-20 | 2021-02-23 | 鹏城实验室 | Real-time updated target detection method and system, and computer-readable storage medium |
CN112884146A (en) * | 2021-02-25 | 2021-06-01 | 香港理工大学深圳研究院 | Method and system for training model based on data quantization and hardware acceleration |
CN112884146B (en) * | 2021-02-25 | 2024-02-13 | 香港理工大学深圳研究院 | Method and system for training model based on data quantization and hardware acceleration |
CN113537511A (en) * | 2021-07-14 | 2021-10-22 | 中国科学技术大学 | Automatic gradient quantization federal learning framework and method |
CN113537511B (en) * | 2021-07-14 | 2023-06-20 | 中国科学技术大学 | Automatic gradient quantization federal learning device and method |
CN114677548A (en) * | 2022-05-26 | 2022-06-28 | 之江实验室 | Neural network image classification system and method based on resistive random access memory |
CN114677548B (en) * | 2022-05-26 | 2022-10-14 | 之江实验室 | Neural network image classification system and method based on resistive random access memory |
CN117196418A (en) * | 2023-11-08 | 2023-12-08 | 江西师范大学 | Reading teaching quality assessment method and system based on artificial intelligence |
CN117196418B (en) * | 2023-11-08 | 2024-02-02 | 江西师范大学 | Reading teaching quality assessment method and system based on artificial intelligence |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110363281A (en) | A kind of convolutional neural networks quantization method, device, computer and storage medium | |
Ding et al. | Regularizing activation distribution for training binarized deep networks | |
US11270187B2 (en) | Method and apparatus for learning low-precision neural network that combines weight quantization and activation quantization | |
JP2019513265A (en) | Method and apparatus for automatic multi-threshold feature filtering | |
WO2020119318A1 (en) | Self-adaptive selection and design method for convolutional-layer hardware accelerator | |
CN111612147A (en) | Quantization method of deep convolutional network | |
Qin et al. | Distribution-sensitive information retention for accurate binary neural network | |
Raposo et al. | Positnn: Training deep neural networks with mixed low-precision posit | |
WO2024021624A1 (en) | Deep learning-based method and apparatus for skyline query cardinality estimation | |
CN113935489A (en) | Variational quantum model TFQ-VQA based on quantum neural network and two-stage optimization method thereof | |
CN110852417A (en) | Single-depth neural network model robustness improving method for application of Internet of things | |
CN115238893A (en) | Neural network model quantification method and device for natural language processing | |
CN114239799A (en) | Efficient target detection method, device, medium and system | |
WO2022241932A1 (en) | Prediction method based on non-intrusive attention preprocessing process and bilstm model | |
Yu et al. | Boosted dynamic neural networks | |
CN113743593B (en) | Neural network quantization method, system, storage medium and terminal | |
CN116579408A (en) | Model pruning method and system based on redundancy of model structure | |
US20220207374A1 (en) | Mixed-granularity-based joint sparse method for neural network | |
CN113920124A (en) | Brain neuron iterative segmentation method based on segmentation and error guidance | |
CN113158134A (en) | Method and device for constructing non-invasive load identification model and storage medium | |
CN112668639A (en) | Model training method and device, server and storage medium | |
CN115034388B (en) | Determination method and device for quantization parameters of ranking model and electronic equipment | |
CN113160795B (en) | Language feature extraction model training method, device, equipment and storage medium | |
Ma et al. | AxBy-ViT: Reconfigurable Approximate Computation Bypass for Vision Transformers | |
CN115543911B (en) | Method for calculating computing capacity of heterogeneous computing equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191022 |