CN112115825A - Neural network quantification method, device, server and storage medium - Google Patents
Neural network quantification method, device, server and storage medium
- Publication number
- CN112115825A (application CN202010934398.1A)
- Authority
- CN
- China
- Prior art keywords
- weight
- quantization
- original
- neural network
- weights
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Images
Classifications
- G06V20/56 — Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle (G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING › G06V20/00—Scenes; Scene-specific elements › G06V20/50—Context or environment of the image)
- G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches (G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06F—ELECTRIC DIGITAL DATA PROCESSING › G06F18/00—Pattern recognition › G06F18/20—Analysing › G06F18/24—Classification techniques)
- G06N3/045 — Combinations of networks (G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00—Computing arrangements based on biological models › G06N3/02—Neural networks › G06N3/04—Architecture, e.g. interconnection topology)
- G06N3/08 — Learning methods (G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00—Computing arrangements based on biological models › G06N3/02—Neural networks)
Abstract
The invention discloses a neural network quantization method, apparatus, server and storage medium. The quantization method of the neural network comprises the following steps: initializing quantization weights with the original weights; setting an objective function that comprises the angle between the quantization weight and the original weight, a shared weight value and a weight distribution index; solving the objective function to minimize the angle between the quantization weight and the original weight and obtain the shared weight value and the weight distribution index; and obtaining the quantization weight from the shared weight value and the weight distribution index. The method optimizes the neural network quantization problem on the basis of vector direction: by minimizing the angle between the quantization weight and the original weight, the quantization weight preserves as much of the original weight information as possible, thereby reducing the information loss caused by quantization.
Description
Technical Field
The present invention relates to the field of deep neural network technology, and in particular, to a quantization method, apparatus, server and storage medium for a neural network.
Background
In recent years, deep neural networks have greatly advanced the field of autonomous driving, making possible what had seemed out of reach for decades. However, the huge amount of calculation that deep neural networks require limits their application on vehicle-mounted intelligent hardware with restricted computing resources. To address this bottleneck, much research has aimed to reduce the computational overhead of deep neural networks by quantizing the parameters used in their operations, converting floating-point parameters into fixed-point parameters and shortening their bit widths. How to reduce the information loss caused by quantization, however, remains a technical problem to be solved.
Disclosure of Invention
The embodiment of the invention provides a quantization method, a quantization device, a server and a storage medium of a neural network.
The quantization method of the neural network of the embodiment of the invention comprises the following steps:
initializing a quantization weight by using an original weight;
setting an objective function, wherein the objective function comprises an included angle between the quantization weight and the original weight, a shared weight value and a weight distribution index;
solving the objective function to minimize an included angle between the quantization weight and the original weight, and obtaining the shared weight value and the weight distribution index;
and obtaining the quantization weight according to the sharing weight value and the weight distribution index.
In some embodiments, initializing the quantization weights with the original weights includes:
the quantization weights are initialized by minimizing the Euclidean distance to the original weights of each network layer.
In some embodiments, initializing the quantization weights with the original weights includes:
the quantization weights are initialized by applying logarithmic quantization to the original weights of each network layer.
In some embodiments, the quantization method comprises:
batch normalization is added to each network layer to reduce internal covariate shift.
In some embodiments, the angle between the quantization weight and the original weight is expressed as the inner product of the ith kernel vector of the original weights and the quantization weight vector, divided by the product of the lengths of the ith kernel vector and the quantization weight vector.
In some embodiments, solving the objective function comprises:
fixing the sharing weight value, and adjusting the weight distribution index;
and fixing the weight distribution index, and adjusting the sharing weight value.
In some embodiments, obtaining the quantization weight according to the shared weight value and the weight assignment index includes:
and assigning the corresponding shared weight values in sequence to the original weights of each layer according to the weight distribution indexes, obtaining the quantization weights.
The quantization device of the neural network according to the embodiment of the present invention includes:
an initialization module to initialize the quantization weights with the original weights;
the setting module is used for setting an objective function, and the objective function comprises an included angle between the quantization weight and the original weight, a shared weight value and a weight distribution index;
a solving module for solving the objective function to minimize an included angle between the quantization weight and the original weight, and obtain the shared weight value and the weight distribution index;
an allocation module for obtaining the quantization weight according to the sharing weight value and the weight allocation index.
The server according to an embodiment of the present invention includes a memory storing a computer program and a processor executing the program to implement the quantization method of the neural network according to any one of the above embodiments.
The computer readable storage medium of the embodiments of the present invention stores thereon a computer program that, when executed by a processor, implements the steps of the quantization method of the neural network of any of the above embodiments.
The neural network quantization method, apparatus, server and storage medium above optimize the neural network quantization problem on the basis of vector direction: by minimizing the angle between the quantization weight and the original weight, the quantization weight preserves as much of the original weight information as possible, thereby reducing the information loss caused by quantization.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1-7 are flow diagrams of a method of quantifying neural networks in accordance with an embodiment of the present invention;
FIG. 8 is a block diagram of a quantization apparatus of a neural network according to an embodiment of the present invention;
fig. 9 is a block diagram of a server according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
Referring to fig. 1, a method for quantizing a neural network according to an embodiment of the present invention includes:
step S12: initializing a quantization weight by using an original weight;
step S14: setting an objective function, wherein the objective function comprises an included angle between a quantization weight and an original weight, a shared weight value and a weight distribution index;
step S16: solving the objective function to minimize an included angle between the quantization weight and the original weight, and obtaining a shared weight value and a weight distribution index;
step S18: and obtaining the quantization weight according to the sharing weight value and the weight distribution index.
The quantization method of the neural network is a method that optimizes the neural network quantization problem on the basis of vector direction: by minimizing the angle between the quantization weight and the original weight, the quantization weight preserves as much of the original weight information as possible, thereby reducing the information loss caused by quantization.
In the related art, the deep neural network requires a huge amount of calculation, and its computational overhead can be reduced by quantizing the parameters used in deep neural network operations. To limit the information loss caused by quantization, the related art mainly minimizes the Euclidean distance (L2 distance) between the quantization weights and the original deep neural network weights. In addition, to accelerate training, Batch Normalization (BN) is usually added to each layer of the network to reduce internal covariate shift, with the following calculation formula:

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta$$

where $x_i$ is an input value of the batch $B = \{x_{1 \dots m}\}$ before it enters the activation function, $\mu_B$ is the mean of the current input values, $\sigma_B^2$ is the variance of the input values about that mean, $\gamma$ and $\beta$ are parameters adjusted during training, and $\epsilon$ is an extremely small term introduced to avoid a variance of 0. Note that if the input values $x_i$ are scaled by a factor of $N$, the mean and the standard deviation are scaled by $N$ correspondingly, so the result after batch normalization does not change.
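The scale-invariance property just described can be checked numerically. The sketch below (illustrative only; the function name `batch_norm` and the sample values are not from the patent) implements the batch normalization formula in plain Python and verifies that multiplying every input by N leaves the normalized output essentially unchanged:

```python
import math

def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch normalization over a 1-D batch: y_i = gamma * (x_i - mu) / sqrt(var + eps) + beta."""
    m = len(xs)
    mu = sum(xs) / m                           # batch mean
    var = sum((x - mu) ** 2 for x in xs) / m   # batch variance
    return [gamma * (x - mu) / math.sqrt(var + eps) + beta for x in xs]

batch = [0.5, -1.2, 3.0, 0.1]
scaled = [10.0 * x for x in batch]   # scale every input by N = 10
y1 = batch_norm(batch)
y2 = batch_norm(scaled)
# The mean and standard deviation scale with the inputs, so the two
# outputs agree up to the tiny epsilon term.
print(all(abs(a - b) < 1e-3 for a, b in zip(y1, y2)))
```

This is exactly why minimizing only the Euclidean distance between weights is insufficient here: the length component that the L2 criterion preserves is discarded by the normalization anyway.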
That is, in the related art the length information of the input vector does not affect the result, so merely minimizing the Euclidean distance between the quantization weights and the original deep neural network weights cannot effectively avoid the information loss caused by the quantization process.
The quantization method of the neural network according to the embodiment of the invention exploits the fact that the information of an input vector resides mainly in its direction: the quantization weight is optimized by minimizing the angle between the quantization weight and the original weight, so that the quantization weight preserves as much of the original weight information as possible and the information loss caused by quantization is reduced.
It is understood that the neural network includes an input layer, an output layer, and a plurality of hidden layers between them, each containing a plurality of neurons. Adjacent layers are fully connected: every neuron in one layer is connected to every neuron in the next, and an original weight exists on each such connection. By setting and solving the objective function, the angle between the quantization weight and the original weight is minimized to obtain the quantization weight; calculating the neural network with the quantization weights then yields a more accurate result, reducing the information loss caused by quantization.
Specifically, in some embodiments, the angle between the quantization weight and the original weight is expressed through the inner product of the ith kernel vector of the original weights and the corresponding quantization weight vector, divided by the product of their lengths. The size of the angle can thus be represented by their cosine similarity, and minimizing the angle between the quantization weight and the original weight is equivalent to maximizing their cosine similarity. The objective function may therefore be set to:

$$\max_{c_l,\, I_l} \;\sum_i \frac{\langle W_i, \hat{W}_i \rangle}{\lVert W_i \rVert \,\lVert \hat{W}_i \rVert}$$

where $c_l$ denotes the shared weight values after quantization of layer $l$, $I_l$ denotes the weight distribution index, $W_i$ denotes the ith kernel vector of the original weights, and $\hat{W}_i$ denotes the corresponding quantization weight vector determined by $c_l$ and $I_l$; the angle between the two is their inner product divided by the product of their lengths. When the objective function reaches its maximum, the cosine similarity between the quantization weight and the original weight is largest and the corresponding angle is smallest. Quantizing the weights with the shared weight values and weight distribution index obtained at that point completely removes the influence of the length of the quantization weight vector and finds the quantization weight vector whose direction is most similar to that of the original weight vector, i.e. the feature direction extracted by the original weights is preserved and the information loss caused by quantization is reduced.
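As a concrete illustration of this cosine objective, the plain-Python sketch below (the variable names and the sample shared values are illustrative assumptions, not taken from the patent) computes the inner product of an original kernel vector and a quantized vector divided by the product of their lengths, and shows that rescaling the quantized vector does not change the objective:

```python
import math

def cosine_similarity(w, w_hat):
    """Inner product of the two vectors divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(w, w_hat))
    len_w = math.sqrt(sum(a * a for a in w))
    len_q = math.sqrt(sum(b * b for b in w_hat))
    return dot / (len_w * len_q)

w = [0.31, -0.94, 0.52, 0.08]    # original kernel vector (example values)
w_hat = [0.25, -1.0, 0.5, 0.0]   # quantized with shared values {0, +/-0.25, +/-0.5, +/-1.0}
w_hat2 = [2 * v for v in w_hat]  # same direction, twice the length

# The objective depends only on direction, not on length.
print(abs(cosine_similarity(w, w_hat) - cosine_similarity(w, w_hat2)) < 1e-12)
```

Scaling the quantized vector leaves the cosine similarity unchanged, which is the length-invariance the objective function is designed to exploit.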
In one example, the large image classification dataset ImageNet is selected for the experiment. ImageNet is a large color image dataset covering 1000 object classes, with about 1.2 million training images and 50,000 validation images. Because the images in the ImageNet dataset are of non-uniform size, the short edge of every training and validation image is first scaled to 256 pixels to normalize the scale and facilitate subsequent processing. During training or fine-tuning, every image is randomly cropped to 224 x 224 and randomly horizontally flipped before being fed into the network; no other data augmentation is used. When evaluating on the validation set, a single 224 x 224 picture is cropped from the center of each validation image and fed into the network, and the Top-1 and Top-5 classification accuracies are calculated.
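The resize-and-crop geometry described above can be sketched in plain Python. This is a hypothetical helper for the arithmetic only; a real pipeline would use an image library, and the function names here are illustrative:

```python
def short_edge_resize(w, h, target=256):
    """New (width, height) when the shorter edge is scaled to `target`,
    keeping the aspect ratio."""
    if w <= h:
        return target, round(h * target / w)
    return round(w * target / h), target

def center_crop_box(w, h, size=224):
    """Top-left/bottom-right pixel coordinates of a centered size x size crop."""
    left = (w - size) // 2
    top = (h - size) // 2
    return left, top, left + size, top + size

w, h = short_edge_resize(500, 375)   # a typical landscape image
print((w, h))                        # (341, 256): the 375-pixel edge becomes 256
print(center_crop_box(w, h))         # (58, 16, 282, 240)
```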
Please refer to Table 1, which shows the Top-1 and Top-5 classification accuracies on the ImageNet validation set of an AlexNet network quantized with different bit widths. With 6-bit quantization based on the vector angle, the Top-1 accuracy is 59.4% and the Top-5 accuracy is 81.3%, while the original 32-bit network reaches 60.1% Top-1 and 81.9% Top-5; that is, 6-bit quantization using the vector angle achieves performance close to that of the original 32-bit network.
TABLE 1
Bit width | Top-1 accuracy | Top-5 accuracy
4 | 54.9% | 78.2%
5 | 57.9% | 80.1%
6 | 59.4% | 81.3%
32 | 60.1% | 81.9%
Please refer to Table 2, which compares the quantization effect of the quantization method according to the embodiment of the present invention with related technologies. The original network is AlexNet, with a Top-1 classification accuracy of 60.1% and a Top-5 classification accuracy of 81.9%. Rows 3-7 give, in order, the Top-1 and Top-5 accuracies of the Deep Compression (DC) method [1] with 8/5-bit quantization, the Weighted-Entropy-based Quantization (WEQ) method [2] with 5-bit quantization, the Incremental Network Quantization (INQ) method [3] with 5-bit quantization, and the present method with 4-bit and with 5-bit quantization. In neural network quantization, a smaller bit width means less calculation, but the information loss increases and the classification accuracy drops. It can be seen that at the same 5-bit quantization the method of this embodiment reaches a higher classification accuracy, and even at 4-bit quantization, which further reduces the amount of calculation, its accuracy is still higher than that of the other three methods at 5 bits. The method therefore preserves as much of the original weight information as possible and reduces the information loss caused by quantization.
TABLE 2
[1] Song Han, Huizi Mao, William Dally. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. In ICLR, 2016.
[2] Eunhyeok Park, Junwhan Ahn, and Sungjoo Yoo. Weighted-Entropy-based Quantization for Deep Neural Networks. In CVPR, 2017.
[3] Aojun Zhou, et al. Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights. In ICLR, 2017.
Referring to fig. 2, in some embodiments, step S12 includes:
step S122: the quantization weights are initialized by minimizing the Euclidean distance to the original weights of each network layer.
In this manner, initializing quantization weights may be achieved. Specifically, each layer network has an original weight, and the quantization weight can be initialized by minimizing the euclidean distance between the original weight and the quantization weight, and an initial quantization shared weight value, a threshold value for quantization weight index allocation and an allocation index are obtained.
Further, the calculation formula may be:

$$\min_{c_l,\, I_l} \;\sum_i \bigl\lVert W_i - \hat{W}_i \bigr\rVert_2^2$$

where $W_i$ denotes the ith kernel vector of the original weights of layer $l$ and $\hat{W}_i$ denotes the corresponding quantization weight vector determined by the shared weight values $c_l$ and the weight distribution index $I_l$.
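A minimal sketch of such an initialization (an illustrative 1-D k-means-style procedure, not the patent's exact algorithm): alternately assign each original weight to its nearest shared value and recompute each shared value as the mean of its assigned weights, which locally minimizes the Euclidean distance between the original and quantized weights.

```python
def init_by_min_l2(weights, num_levels=4, iters=10):
    """Initialize shared weight values and an assignment index by
    (approximately) minimizing the Euclidean distance between the
    original weights and their quantized versions."""
    lo, hi = min(weights), max(weights)
    # spread the initial shared values evenly over the weight range
    shared = [lo + (hi - lo) * k / (num_levels - 1) for k in range(num_levels)]
    assign = [0] * len(weights)
    for _ in range(iters):
        # assignment step: each weight takes its nearest shared value
        assign = [min(range(num_levels), key=lambda k: (w - shared[k]) ** 2)
                  for w in weights]
        # update step: each shared value becomes the mean of its weights
        for k in range(num_levels):
            members = [w for w, a in zip(weights, assign) if a == k]
            if members:
                shared[k] = sum(members) / len(members)
    return shared, assign

weights = [-0.9, -0.85, -0.1, 0.05, 0.4, 0.95, 1.0]
shared, assign = init_by_min_l2(weights)
quantized = [shared[a] for a in assign]
print(shared)
print(quantized)
```

The resulting `shared` list plays the role of the initial shared weight values and `assign` the role of the initial weight distribution index.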
Referring to fig. 3, in some embodiments, step S12 includes:
step S123: logarithmic quantization is used for the original weights of each layer of the network to initialize the quantization weights.
In this manner, the quantization weights can be initialized. Specifically, logarithmic quantization converts the original weights of each network layer into powers of two (exponential multiples of 2), after which the initialized quantization weights are calculated by formula.
Further, the calculation formula may be:

$$\hat{x}_i = \operatorname{sign}(x_i)\, 2^{\,\mathrm{Quantize}(\log_2 \lvert x_i \rvert)}$$

where $x_i$ is an original weight and $\hat{x}_i$ is the quantization weight after initialization; the Quantize function rounds the logarithmic result to an integer, so each initialized quantization weight is a signed power of two with an integer exponent.
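A plain-Python sketch of this initialization (the function name and the interpretation of Quantize as rounding to the nearest integer are assumptions):

```python
import math

def log_quantize(x):
    """Initialize a quantization weight as a signed power of two:
    sign(x) * 2 ** round(log2(|x|)). The exponent is an integer."""
    if x == 0:
        return 0.0
    k = round(math.log2(abs(x)))   # Quantize: round the log to an integer
    return math.copysign(2.0 ** k, x)

for w in [0.3, -0.7, 1.9, 0.06]:
    print(w, "->", log_quantize(w))
# 0.3 -> 0.25, -0.7 -> -0.5, 1.9 -> 2.0, 0.06 -> 0.0625
```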
Referring to fig. 4 and 5, in some embodiments, the quantization method includes:
step S124: batch normalization is added in each layer network to reduce the internal covariance variation.
Therefore, the internal covariate shift of the neural network can be reduced, accelerating the training of the deep neural network. It can be understood that the data of each network layer is normalized before being input into that layer. In the embodiment shown in fig. 4, the minimum Euclidean distance is used on the original weights of each network layer to initialize the quantization weights, and the batch normalization may include a gamma parameter and a beta parameter; the accuracy of batch normalization can be adjusted by tuning these parameters, thereby reducing the internal covariate shift of the neural network.
Referring to fig. 6, in some embodiments, step S16 includes:
step S162: fixing a sharing weight value, and adjusting a weight distribution index;
step S164: and fixing the weight distribution index and adjusting the sharing weight value.
Thus, the objective function is solved iteratively, yielding the shared weight value and weight distribution index corresponding to the minimized angle between the quantization weight and the original weight. In one example, following the solving approach of the Expectation-Maximization (EM) algorithm, the solution alternates between two steps that are repeated until convergence: first, keep the shared weight values unchanged, vary the thresholds with a greedy algorithm to adjust the weight distribution index, and fix the weight distribution index when the objective function reaches its maximum; second, keep the weight distribution index unchanged, adjust the shared weight values with a global gradient descent method (which may include batch gradient descent, stochastic gradient descent and mini-batch gradient descent), and fix the shared weight values when the objective function reaches its maximum. In this way, the shared weight value and weight distribution index corresponding to the minimized angle between the quantization weight and the original weight can be obtained.
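The two-step alternation can be sketched as follows. This is an illustrative toy version, not the patent's exact procedure: the greedy re-assignment step simply tries every shared value for each coordinate, and the gradient step uses a numeric (finite-difference) derivative in place of the analytic global gradient.

```python
import math

def cosine(w, w_hat):
    """Cosine similarity; a zero quantized vector scores 0."""
    dot = sum(a * b for a, b in zip(w, w_hat))
    len_w = math.sqrt(sum(a * a for a in w))
    len_q = math.sqrt(sum(b * b for b in w_hat)) or 1.0
    return dot / (len_w * len_q)

def solve_alternating(w, shared, steps=50, lr=0.05, h=1e-6):
    """Alternately (1) fix the shared values and greedily re-assign each
    coordinate the shared value that maximizes the cosine objective, then
    (2) fix the assignment and take a numeric gradient-ascent step on the
    shared values."""
    # initial assignment: nearest shared value per coordinate
    assign = [min(range(len(shared)), key=lambda k: (x - shared[k]) ** 2)
              for x in w]
    for _ in range(steps):
        for j in range(len(w)):            # step 1: greedy re-assignment
            def score(k, j=j):
                trial = [shared[a] for a in assign]
                trial[j] = shared[k]
                return cosine(w, trial)
            assign[j] = max(range(len(shared)), key=score)
        for k in range(len(shared)):       # step 2: numeric gradient ascent
            def obj(v, k=k):
                s = shared[:k] + [v] + shared[k + 1:]
                return cosine(w, [s[a] for a in assign])
            grad = (obj(shared[k] + h) - obj(shared[k] - h)) / (2 * h)
            shared[k] += lr * grad
    return shared, assign

w = [0.8, -0.75, 0.1, -0.12]
shared, assign = solve_alternating(w, shared=[-0.6, -0.1, 0.1, 0.6])
w_hat = [shared[a] for a in assign]
print(cosine(w, w_hat) > 0.99)   # the angle to the original weights is small
```

Because the greedy step always keeps the current assignment among its candidates, step 1 can never decrease the objective; step 2 then nudges the shared codebook toward the direction of the original weights.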
Referring to fig. 7, in some embodiments, step S18 includes:
step S182: and distributing corresponding shared weight values to each layer of original weights in sequence according to the weight distribution indexes to obtain the quantization weights.
In this way, the final quantization weights can be obtained. It can be understood that the weight assignment index records the correspondence between the address of each original weight in each network layer and a shared weight value; assigning the shared weight values to those addresses in sequence according to this correspondence yields the final quantization weights.
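The final lookup step can be sketched in a few lines of plain Python (the per-layer structure and the names below are illustrative assumptions, not the patent's data format):

```python
def assign_quantization_weights(layer_indices, layer_shared):
    """For each layer, replace every weight position by the shared value
    its assignment index points to."""
    return {layer: [shared[i] for i in indices]
            for (layer, indices), shared in zip(layer_indices.items(), layer_shared)}

# assignment index per layer: position -> index into that layer's codebook
indices = {"conv1": [2, 0, 1, 1], "fc1": [1, 1, 0]}
codebooks = [[-0.5, 0.0, 0.5], [-0.25, 0.25]]   # shared weight values per layer
print(assign_quantization_weights(indices, codebooks))
# {'conv1': [0.5, -0.5, 0.0, 0.0], 'fc1': [0.25, 0.25, -0.25]}
```

Storing only the small codebook plus the index table, instead of full-precision weights, is what yields the compression.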
Referring to fig. 8, the quantization apparatus 10 of the neural network according to the embodiment of the present invention includes an initialization module 12, a setting module 14, a solving module 16, and an allocation module 18. The initialization module 12 is used to initialize the quantization weights with the original weights. The setting module 14 is configured to set an objective function, where the objective function includes an included angle between the quantization weight and the original weight, a shared weight value, and a weight assignment index. The solving module 16 is configured to solve the objective function to minimize an included angle between the quantization weight and the original weight, and obtain a shared weight value and a weight distribution index. The allocating module 18 is configured to obtain a quantization weight according to the shared weight value and the weight allocation index.
The quantization apparatus 10 of the neural network optimizes the neural network quantization problem on the basis of vector direction; by minimizing the angle between the quantization weight and the original weight, the quantization weight preserves as much of the original weight information as possible, thereby reducing the information loss caused by quantization.
It should be noted that the above explanation of the embodiment and the advantageous effects of the quantization method of the neural network is also applicable to the quantization apparatus 10 of the neural network of the present embodiment and the server and the computer-readable storage medium of the following embodiments, and is not detailed here to avoid redundancy.
Referring to fig. 9, a server 100 according to an embodiment of the invention includes a memory 102 and a processor 104. The memory 102 stores a computer program, and the processor 104 is used for executing the program to implement the quantization method of the neural network of any one of the above embodiments.
The server 100 optimizes the neural network quantization problem on the basis of vector direction; by minimizing the angle between the quantization weight and the original weight, the quantization weight preserves as much of the original weight information as possible, thereby reducing the information loss caused by quantization.
The computer readable storage medium of the embodiments of the present invention stores thereon a computer program, which, when executed by a processor, implements the steps of the quantization method of the neural network of any of the above embodiments.
For example, when the program is executed by a processor, the steps of the following quantization method are implemented:
step S12: initializing a quantization weight by using an original weight;
step S14: setting an objective function, wherein the objective function comprises an included angle between a quantization weight and an original weight, a shared weight value and a weight distribution index;
step S16: solving the objective function to minimize an included angle between the quantization weight and the original weight, and obtaining a shared weight value and a weight distribution index;
step S18: and obtaining the quantization weight according to the sharing weight value and the weight distribution index.
Specifically, the computer-readable storage medium may be provided in a vehicle or a server, and the vehicle may communicate with the server to obtain the corresponding program. Vehicles include, but are not limited to, electric vehicles, hybrid electric vehicles, extended range electric vehicles, fuel vehicles, and the like.
It will be appreciated that the computer program comprises computer program code. The computer program code may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable storage medium may include: any entity or device capable of carrying computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), software distribution medium, and the like. The processor may refer to the processor 104 included in the server 100. The Processor may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc.
In the description herein, references to the description of the terms "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example" or "some examples" or the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a system containing a processing module, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, for instance via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of embodiments of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be implemented by a program instructing the related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If implemented as a software functional module and sold or used as a stand-alone product, the integrated module may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art within the scope of the present invention.
Claims (10)
1. A neural network quantization method, comprising:
initializing a quantization weight by using an original weight;
setting an objective function, wherein the objective function comprises an included angle between the quantization weight and the original weight, a shared weight value and a weight distribution index;
solving the objective function to minimize an included angle between the quantization weight and the original weight, and obtaining the shared weight value and the weight distribution index;
and obtaining the quantization weight according to the shared weight value and the weight distribution index.
2. The quantization method of the neural network of claim 1, wherein initializing quantization weights with original weights comprises:
initializing the quantization weights by minimizing the Euclidean distance to the original weights of each network layer.
3. The quantization method of the neural network of claim 1, wherein initializing quantization weights with original weights comprises:
initializing the quantization weights by applying logarithmic quantization to the original weights of each network layer.
4. The neural network quantization method of claim 2 or 3, wherein the quantization method comprises:
adding batch normalization to each network layer to reduce internal covariate shift.
5. The quantization method of the neural network according to claim 1, wherein the included angle between the quantization weight and the original weight is represented by the inner product of the i-th kernel vector of the original weights and the quantization weight vector, divided by the product of the lengths of the i-th kernel vector and the quantization weight vector.
6. The method of claim 1, wherein solving the objective function comprises:
fixing the sharing weight value, and adjusting the weight distribution index;
and fixing the weight distribution index, and adjusting the sharing weight value.
7. The method of claim 1, wherein obtaining the quantization weight according to the shared weight value and the weight distribution index comprises:
and distributing corresponding shared weight values to the original weights of each layer in sequence according to the weight distribution indexes to obtain the quantization weights.
8. An apparatus for quantizing a neural network, comprising:
an initialization module for initializing the quantization weights with the original weights;
a setting module for setting an objective function, wherein the objective function comprises an included angle between the quantization weight and the original weight, a shared weight value and a weight distribution index;
a solving module for solving the objective function to minimize an included angle between the quantization weight and the original weight, and obtain the shared weight value and the weight distribution index;
an allocation module for obtaining the quantization weight according to the shared weight value and the weight distribution index.
9. A server, characterized by comprising a memory storing a computer program and a processor for executing the program to implement the neural network quantization method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, wherein the program, when executed by a processor, carries out the steps of the neural network quantization method according to any one of claims 1 to 7.
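The pipeline of claims 1, 2, 6 and 7 can be illustrated with a rough sketch. All function names and the candidate shared values are assumptions for illustration, and the alternating update below substitutes a Euclidean (k-means-style) refit for the patent's angle-based objective, which is not disclosed here in closed form:

```python
import numpy as np

def init_assignments(w, levels):
    """Claim 2 (sketch): assign each original weight to the shared value
    that minimizes the per-weight Euclidean distance."""
    return np.argmin(np.abs(w[:, None] - levels[None, :]), axis=1)

def solve(w, levels, iters=10):
    """Claim 6 (sketch): alternate between fixing the shared weight values
    while adjusting the distribution indices, and fixing the indices while
    adjusting the shared values."""
    lv = np.asarray(levels, dtype=float).copy()
    for _ in range(iters):
        idx = init_assignments(w, lv)    # fix shared values, adjust indices
        for k in range(lv.size):         # fix indices, adjust shared values
            members = w[idx == k]
            if members.size:
                lv[k] = members.mean()
    return lv, idx

def quantize(w, idx, lv):
    """Claim 7 (sketch): look up each weight's shared value by its
    distribution index to obtain the quantization weights."""
    return lv[idx]

w = np.array([0.1, 0.2, 0.9, 1.0])
lv, idx = solve(w, np.array([0.0, 1.0]))
q = quantize(w, idx, lv)
```

On this toy layer the two shared values converge to the cluster means and each original weight is replaced by the value its index selects.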
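The logarithmic initialization of claim 3 can be illustrated by rounding each weight's magnitude to the nearest power of two, a common form of log quantization; the exponent range and function name below are assumptions, not the patented parameters:

```python
import numpy as np

def log_quantize(weights, min_exp=-4, max_exp=0):
    """Round each weight's magnitude to the nearest power of two within
    [2**min_exp, 2**max_exp], preserving sign; exact zeros stay zero."""
    w = np.asarray(weights, dtype=float)
    sign = np.sign(w)
    mag = np.abs(w)
    # Guard zeros before taking log2, then clip the rounded exponent.
    exp = np.clip(np.round(np.log2(np.where(mag > 0, mag, 1.0))), min_exp, max_exp)
    q = sign * np.power(2.0, exp)
    return np.where(mag > 0, q, 0.0)
```

Restricting shared values to powers of two lets later multiplications be replaced by bit shifts, which is the usual motivation for this initialization.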
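The quantity in claim 5 is the cosine of the included angle between the i-th kernel vector of the original weights and the quantization weight vector; a minimal sketch (the function name is an assumption):

```python
import numpy as np

def cosine_of_angle(w_i, q_i):
    """Cosine of the angle between an original kernel vector w_i and its
    quantized counterpart q_i: <w, q> / (|w| * |q|)."""
    w = np.asarray(w_i, dtype=float).ravel()
    q = np.asarray(q_i, dtype=float).ravel()
    return float(np.dot(w, q) / (np.linalg.norm(w) * np.linalg.norm(q)))
```

Maximizing this cosine (equivalently, minimizing the included angle) keeps the quantized kernel pointing in the same direction as the original, independent of scale.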
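The batch normalization of claim 4 normalizes each feature over the batch before a learnable scale and shift; a minimal training-time forward pass might look like the following, where gamma, beta, and eps are assumed defaults rather than values from the patent:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch dimension to zero mean and
    unit variance, then apply the scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```

Keeping each layer's input distribution stable in this way is what the claim refers to as reducing internal covariate shift.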
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010934398.1A CN112115825B (en) | 2020-09-08 | 2020-09-08 | Quantification method, device, server and storage medium of neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112115825A true CN112115825A (en) | 2020-12-22 |
CN112115825B CN112115825B (en) | 2024-04-19 |
Family
ID=73803300
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010934398.1A Active CN112115825B (en) | 2020-09-08 | 2020-09-08 | Quantification method, device, server and storage medium of neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112115825B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE19734735A1 (en) * | 1997-08-11 | 1999-02-18 | Peter Prof Dr Lory | Neural network for learning vector quantisation |
CN110472725A (en) * | 2019-07-04 | 2019-11-19 | 北京航空航天大学 | A kind of balance binaryzation neural network quantization method and system |
US20200082269A1 (en) * | 2018-09-12 | 2020-03-12 | Nvidia Corporation | Memory efficient neural networks |
CN110909667A (en) * | 2019-11-20 | 2020-03-24 | 北京化工大学 | Lightweight design method for multi-angle SAR target recognition network |
CN110969251A (en) * | 2019-11-28 | 2020-04-07 | 中国科学院自动化研究所 | Neural network model quantification method and device based on label-free data |
CN111260724A (en) * | 2020-01-07 | 2020-06-09 | 王伟佳 | Example segmentation method based on periodic B spline |
Also Published As
Publication number | Publication date |
---|---|
CN112115825B (en) | 2024-04-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | Training deep neural networks with 8-bit floating point numbers | |
US20200394523A1 (en) | Neural Network Quantization Parameter Determination Method and Related Products | |
US20210286688A1 (en) | Neural Network Quantization Parameter Determination Method and Related Products | |
CN110413255B (en) | Artificial neural network adjusting method and device | |
CN109002889B (en) | Adaptive iterative convolution neural network model compression method | |
WO2020142223A1 (en) | Dithered quantization of parameters during training with a machine learning tool | |
CN109800865B (en) | Neural network generation and image processing method and device, platform and electronic equipment | |
US10872295B1 (en) | Residual quantization of bit-shift weights in an artificial neural network | |
US11704556B2 (en) | Optimization methods for quantization of neural network models | |
US6594392B2 (en) | Pattern recognition based on piecewise linear probability density function | |
EP4008057B1 (en) | Lossless exponent and lossy mantissa weight compression for training deep neural networks | |
WO2021135715A1 (en) | Image compression method and apparatus | |
CN110874627A (en) | Data processing method, data processing apparatus, and computer readable medium | |
CN112085175B (en) | Data processing method and device based on neural network calculation | |
CN111027684A (en) | Deep learning model quantification method and device, electronic equipment and storage medium | |
CN114444686A (en) | Method and device for quantizing model parameters of convolutional neural network and related device | |
CN112115825B (en) | Quantification method, device, server and storage medium of neural network | |
CN112183726A (en) | Neural network full-quantization method and system | |
CN115081542B (en) | Subspace clustering method, terminal equipment and computer readable storage medium | |
Nicodemo et al. | Memory requirement reduction of deep neural networks for field programmable gate arrays using low-bit quantization of parameters | |
CN115829056A (en) | Deployment method and system of machine learning model and readable storage medium | |
CN114492778A (en) | Operation method of neural network model, readable medium and electronic device | |
Lu et al. | A very compact embedded CNN processor design based on logarithmic computing | |
Madadum et al. | A resource-efficient convolutional neural network accelerator using fine-grained logarithmic quantization | |
CN113902114A (en) | Quantization method, device and system of neural network model, electronic device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||