CN109902745A - CNN-based low-precision training and 8-bit integer quantization inference method - Google Patents

CNN-based low-precision training and 8-bit integer quantization inference method

Info

Publication number
CN109902745A
CN109902745A
Authority
CN
China
Prior art keywords
quantization
integers
integer
low precision
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910154088.5A
Other languages
Chinese (zh)
Inventor
严敏佳
王永松
刘丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Kang Qiao Electronic LLC
University of Electronic Science and Technology of China
Original Assignee
Chengdu Kang Qiao Electronic LLC
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Kang Qiao Electronic LLC, University of Electronic Science and Technology of China filed Critical Chengdu Kang Qiao Electronic LLC
Priority to CN201910154088.5A priority Critical patent/CN109902745A/en
Publication of CN109902745A publication Critical patent/CN109902745A/en
Pending (current legal status)

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a CNN-based low-precision training and 8-bit integer quantization inference method. The main steps are as follows. Low-precision model training: model training is carried out with a 16-bit floating-point low-precision fixed-point algorithm, obtaining a model for target detection. Weight quantization: an 8-bit integer quantization scheme is proposed, and the weight parameters of the convolutional neural network are quantized layer by layer from 16-bit floating point to 8-bit integers. 8-bit integer quantized inference: activation values are quantized to 8-bit integer data, i.e. every layer of the CNN receives int8 quantized inputs and produces int8 quantized outputs. The invention first trains the model with the 16-bit floating-point low-precision fixed-point algorithm to obtain the weights, then quantizes them to 8-bit integer data for forward inference. Compared with performing 8-bit integer quantized inference directly on weights trained with a 32-bit floating-point algorithm, this optimizes the inference process of the convolutional layers and effectively reduces the precision loss caused by low-bit fixed-point quantized inference.

Description

CNN-based low-precision training and 8-bit integer quantization inference method
Technical field
The invention belongs to the technical field of convolutional neural networks, and in particular relates to a CNN-based low-precision training and 8-bit integer quantization inference method.
Background art
Convolutional neural networks (CNNs) have achieved outstanding results in fields such as image classification, target detection and face recognition. However, because of the complexity of the network structures and their computation latency, realizing real-time forward inference of CNNs on embedded platforms whose storage and computing resources are relatively scarce requires compressing the model size of the neural network and improving its computational efficiency under the condition of controlled precision loss.
A commonly used approach is to quantize the weights and/or activation values of a CNN, converting the data from 32-bit floating point to lower-bit integers. Existing quantization methods, however, still fall short in the trade-off between precision and computational efficiency. Many quantization methods compress the network to varying degrees and save storage resources, but cannot effectively improve computational efficiency on the hardware platform. A large body of literature quantizes only the weights, which effectively solves the shortage of storage resources on hardware platforms but rarely addresses computational efficiency. Binarized neural networks (BNN), ternary weight networks (TWN) and XNOR-net implement multiplication by shift operations and improve computational efficiency on hardware platforms, but quantizing both weights and activation values to 1-bit or 2-bit representations usually causes a sharp drop in precision. This places very strict demands on the performance headroom of the model under the quantization scheme, and is unsuitable for lightweight models with simple network structures intended for deployment on embedded platforms.
Low-bit data representations save hardware resources and greatly simplify the design of hardware accelerators. Most research, however, performs high-precision 32-bit floating-point model training with GPU acceleration and applies low-precision quantization only in forward inference to accelerate the forward inference speed of CNNs. With extremely low-bit data representations, the heavy loss of parameter precision causes a marked decline in target detection accuracy, so training a low-precision model is particularly important.
Summary of the invention
The invention aims to overcome the above defects of the prior art by proposing a CNN-based low-precision training and 8-bit integer quantization inference method, solving the problems that existing quantization methods suffer large precision loss and insufficient computational efficiency.
In order to solve the above technical problems, the technical solution of the invention is realized as follows:
A CNN-based low-precision training and 8-bit integer quantization inference method comprises the following steps:
Low-precision model training: model training is carried out with a 16-bit floating-point low-precision fixed-point algorithm, obtaining a model for target detection, i.e. the weights;
Weight quantization: an 8-bit integer quantization scheme is proposed, and the weight parameters of the convolutional neural network are quantized layer by layer from 16-bit floating point to 8-bit integers;
8-bit integer quantized inference: activation values are quantized to 8-bit integer data, i.e. every layer of the CNN receives int8 quantized inputs and produces int8 quantized outputs.
Further, low-precision model training includes training the model on a large server with GPU acceleration, with the data in the computation process saved as 16-bit floating point.
Further, in the 16-bit floating-point data used in the computation process, 2 bits hold the integer part and 14 bits hold the fractional part; rounding is used to retain 14 bits of precision of the floating-point data.
Further, weight quantization includes proposing the quantization scheme q = Round((x - a)/(b - a) × 255 - 127), where x denotes the floating-point data and a, b respectively denote the minimum and maximum of the data in the array to be quantized, i.e. a := min(x_i), b := max(x_i). The quantized value q is then obtained with the rounding function Round().
Further, the weights are divided into a series of arrays by layer, the extreme values of each weight array are found, and the data within the same array are scaled in equal proportion to 8-bit integers.
Further, the 8-bit integer quantized inference process includes the following steps:
(a) BN algorithm preprocessing: the mean and variance of the input samples are computed before the convolution operation and normalization preprocessing is carried out, saving the computation time of the BN algorithm;
(b) integer convolution operation: floating-point multiplications are converted into integer multiplications with the 8-bit integer quantization scheme, fitting the result of the floating-point multiplication as closely as possible while the integer multiplication significantly improves computational efficiency;
(c) activation function optimization: the activation region [a, b] of each convolutional layer is chosen layer by layer, and the optimized activation function maps the convolution result into the known region [a, b], saving the computation time of quantizing the activation values.
Further, the above inference process is computed entirely with integer data.
The invention has the following advantages and positive effects:
The invention first trains the model with the 16-bit floating-point low-precision fixed-point algorithm to obtain the weights, then quantizes them to 8-bit integer data for forward inference. Compared with performing 8-bit integer quantized inference directly on weights trained with a 32-bit floating-point algorithm, this effectively reduces the precision loss caused by low-bit fixed-point quantized inference. In addition, since the computational bottleneck of convolutional neural networks lies in the convolutional layers, an 8-bit integer quantization scheme is proposed to improve computational efficiency and the inference process of the convolutional layers is optimized with it: the BN algorithm preprocessing is carried out first, reducing the computation time of the BN algorithm, and the convolution operation follows; then thousands of quantized inference experiments are run on the validation set to determine the activation region of each convolutional layer, saving the time overhead of having to find the extreme values of the activations on the fly after every activation function evaluation during inference. The method fits the floating-point forward inference process and improves the computational efficiency of the convolutional layers under the premise of controlled precision loss.
Description of the drawings
Fig. 1 is the tiny-YOLO network structure;
Fig. 2 is the flow chart of applying the invention to the tiny-YOLO network for model training and forward inference optimization;
Fig. 3a is the flow chart of 8-bit integer quantized inference without BN preprocessing;
Fig. 3b is the flow chart of the 8-bit integer quantized inference with BN preprocessing according to the invention;
Fig. 4 is the overall flow of the invention.
Specific embodiment
It should be noted that, provided they do not conflict, the embodiments of the invention and the features in the embodiments can be combined with each other.
A detailed description of specific embodiments of the invention is given below.
The technical solution of the invention is divided into two stages. In the first stage, model training is carried out with the 16-bit floating-point low-precision fixed-point algorithm, obtaining a model for target detection, i.e. the weights. In the second stage, the weights are quantized with the 8-bit integer quantization scheme and the activation values are quantized to 8-bit integer data, realizing 8-bit integer quantized inference.
The specific steps are as follows:
A. The 16-bit floating-point low-precision fixed-point algorithm uses the <2.14> format, i.e. the integer part is represented with 2 bits and the fractional part with 14 bits, and 32-bit floating-point data are converted into 16-bit data by rounding. The model training parameters contain a large number of values close to 0; retaining 14 bits of precision prevents many parameters from being converted to 0, which would lose gradient information. The training error of this method essentially fits that of the 32-bit floating-point algorithm and does not significantly reduce the convergence rate.
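As an illustration of the <2.14> representation described above, the following sketch (assuming NumPy; the helper names are illustrative, not from the patent) converts 32-bit floats to 16-bit <2.14> fixed point by rounding and shows that values near 0 survive with 14 fractional bits:

```python
import numpy as np

FRACT_BITS = 14  # <2.14>: 2 integer bits (including sign), 14 fractional bits

def to_fixed_2_14(x):
    # Scale by 2^14 and round to nearest (the patent specifies round half up;
    # np.rint rounds half to even, which differs only at exact ties),
    # then clamp to the 16-bit two's-complement range, i.e. values in [-2, 2).
    q = np.rint(np.asarray(x, dtype=np.float32) * (1 << FRACT_BITS))
    return np.clip(q, -32768, 32767).astype(np.int16)

def from_fixed_2_14(q):
    # Recover the floating-point value a <2.14> word represents.
    return q.astype(np.float32) / (1 << FRACT_BITS)

w = np.array([6e-5, -1.5, 0.7321], dtype=np.float32)
print(from_fixed_2_14(to_fixed_2_14(w)))  # 6e-5 rounds to 2^-14, not to 0
```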
B. Weight quantization: the weights obtained by model training are converted into 8-bit integer data. The following formula gives the mathematical definition of the 8-bit integer quantization scheme, i.e. the correspondence between the int8 representation of a value (the quantized value is denoted q) and its original floating-point representation (the original 32-bit floating-point value is denoted x):
q = Round((x - a)/(b - a) × 255 - 127)
The 16-bit floating-point data are first scaled to floating-point values in [-127, 128], where a and b respectively denote the minimum and maximum of the data in the array to be quantized, i.e. a := min(x_i), b := max(x_i); the quantized value q is then obtained by rounding. The 16-bit floating-point numbers are thereby scaled in equal proportion to integers in the range [-127, 128].
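A minimal sketch of this per-layer weight quantization and its inverse, assuming NumPy (the function names are illustrative). Since the target range [-127, 128] exceeds int8 by one on the positive side, the sketch keeps the quantized values in a wider integer type:

```python
import numpy as np

def quantize_weights(w):
    # a, b are the extreme values of the array to be quantized.
    a, b = float(w.min()), float(w.max())
    # Scale [a, b] linearly onto [-127, 128], then round.
    q = np.rint((w - a) / (b - a) * 255.0 - 127.0)
    return q.astype(np.int16), a, b  # 128 does not fit in int8

def dequantize(q, a, b):
    # Inverse mapping: x = (b - a)/255 * (q + 127) + a.
    return (b - a) / 255.0 * (q.astype(np.float32) + 127.0) + a

w = np.random.randn(16, 3, 3, 3).astype(np.float32)  # one conv layer's weights
q, a, b = quantize_weights(w)
print(np.abs(dequantize(q, a, b) - w).max())  # error bounded by half a step
```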
C. Batch normalization (BN) preprocessing is carried out with the quantized weights and the input data of the convolutional layer. The BN algorithm is usually applied as a separate operation block after the convolution of each convolutional layer; the mathematical expression of the BN algorithm is:
Y = γ·(W·X - μ)/√(δ² + ε) + β
where W is the weight matrix of the convolutional layer, X is the input of the convolutional layer, i.e. the feature map matrix, and the scaling factor γ and the bias parameter β are the reconstruction parameters learned by the BN layer, which allow the network to recover the feature distribution learned during training. ε is a very small constant, μ is the mean of the feature map and δ² is the variance of the feature map.
The following formula can be derived:
Y = (γ·W/√(δ² + ε))·X + (β - γ·μ/√(δ² + ε))
which simplifies to:
Y = W_BN·X + β_BN
Defining W_BN = γ·W/√(δ² + ε) and β_BN = β - γ·μ/√(δ² + ε), W_BN and β_BN are computed first before each convolution operation and quantized to 8-bit integer data.
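A sketch of this BN folding, under the usual convention that γ, β, μ and δ² are per-output-channel quantities (an assumption; the formulas above are written in matrix form):

```python
import numpy as np

def fold_bn(W, gamma, beta, mu, var, eps=1e-5):
    # W_BN = gamma * W / sqrt(var + eps); the per-channel statistics are
    # broadcast over each output filter of W, shaped (out_ch, in_ch, k, k).
    inv_std = gamma / np.sqrt(var + eps)
    W_bn = W * inv_std[:, None, None, None]
    # beta_BN = beta - gamma * mu / sqrt(var + eps)
    beta_bn = beta - mu * inv_std
    return W_bn, beta_bn  # both then quantized to 8-bit integers as in step B
```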
D. The W_BN computed in step C is convolved with the input (feature map) of the convolutional layer, i.e. W_BN⊛X. Using the properties of matrix operations, the convolution is converted into inner products of pairs of vectors, which facilitates a hardware implementation of the vector inner product with the DSP computing units of an FPGA. In the implementation the convolution result is stored as int32, then the result is quantized to int8 and the β_BN computed in step C is added.
Any output value w_i of the convolution operation is obtained as the inner product of two vectors υ_i = [x_i1, x_i2, ..., x_in] and μ_i = [y_i1, y_i2, ..., y_in]:
w_i = Σ_{j=1..n} x_ij·y_ij
where n is determined by the kernel size and by c, the number of channels of the convolutional layer.
The mathematical definition of the quantization scheme in step B can be rewritten as x = (b - a)/255 × (q + 127) + a. Substituting this into the inner product formula, with S1 = (b1 - a1)/255 and S2 = (b2 - a2)/255 for the two quantized vectors, gives:
w_i = Σ_j [S1·(q_xj + 127) + a1]·[S2·(q_yj + 127) + a2]
which can be reduced to:
w_i = S1·S2·Σ_j (q_xj + 127)(q_yj + 127) + S1·a2·Σ_j (q_xj + 127) + S2·a1·Σ_j (q_yj + 127) + n·a1·a2
The matrix multiplication is computed with the simplified formula, in which only the constants built from S1, S2, a1 and a2 are floating point; they depend only on the extreme values of the data and can be precomputed, achieving the goal of converting the floating-point convolution into an integer convolution.
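The expansion above can be checked numerically. The sketch below (assuming NumPy; illustrative names) recovers the floating-point inner product of two dequantized vectors from a single all-integer inner product plus constants that depend only on the extreme values and can therefore be precomputed:

```python
import numpy as np

def integer_dot(qx, a1, b1, qy, a2, b2):
    n = qx.size
    s1, s2 = (b1 - a1) / 255.0, (b2 - a2) / 255.0
    px = qx.astype(np.int64) + 127  # shift quantized values into [0, 255]
    py = qy.astype(np.int64) + 127
    acc = int(px @ py)              # the all-integer inner product
    # The remaining terms use only precomputable floating-point constants.
    return s1 * s2 * acc + s1 * a2 * px.sum() + s2 * a1 * py.sum() + n * a1 * a2

# Equals sum((s1*(qx+127) + a1) * (s2*(qy+127) + a2)), the inner product
# of the two dequantized vectors, up to floating-point rounding.
```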
E. Activation function optimization: the operation result of step D is activated, obtaining the output of the convolutional layer. Thousands of quantized inference experiments are run on the validation set; the pre-activation values in the inference process are collected starting from the first convolutional layer, the data distribution curve is fitted, and an appropriate activation range [a, b] is chosen for each convolutional layer in turn, keeping the precision loss as small as possible. Taking the Leaky activation function as an example, the Leaky_n activation function is designed to map the activation result into the known region [a, b]:
Leaky_n(x) = min(max(Leaky(x), a), b)
i.e. the Leaky output is clipped to the known region [a, b].
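A sketch of the two activations, assuming the darknet convention of a 0.1 negative slope for Leaky (the slope is an assumption, as is reading "mapped into the known region [a, b]" as a clip of the Leaky output):

```python
import numpy as np

def leaky(x, slope=0.1):
    # Standard Leaky activation: identity for x > 0, slope * x otherwise.
    return np.where(x > 0, x, slope * x)

def leaky_n(x, a, b, slope=0.1):
    # Leaky_n: Leaky followed by a clip into the fixed region [a, b], so no
    # per-image min/max search is needed when quantizing the activations.
    return np.clip(leaky(x, slope), a, b)
```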
The following mainly illustrates the improvements of the invention in model training and integer inference, taking the tiny-YOLO network applied to real-time vehicle detection in the target detection field as an example.
The tiny-YOLO network structure, together with the weight size and input feature map size of each convolutional layer, is shown in Fig. 1. tiny-YOLO has 15 layers in total, consisting of 9 convolutional layers and 6 pooling layers. The network structure is simple and the parameter count relatively small, so it is easier to deploy on an embedded platform for real-time target detection.
In the specific embodiment of the invention the picture size is unified to 224*224 pixels. The input of the first convolutional layer is the pixel matrix, in RGB format, of the picture to be detected; after the BN processing, convolution operation and activation of the convolutional layer, a series of feature maps is output, and the first pooling layer produces new feature maps. Each subsequent layer reads the output of the previous layer as its feature map and performs its operations. The last layer produces the target detection result; the weight size is related to the target detection classes, and this example detects only the class "vehicle".
Before carrying out target detection with a CNN, the model needs to be trained on a large server with GPU acceleration, and the trained weights are then used on the application platform to complete the target detection task. The invention focuses on training a low-precision model and on accelerating the forward inference process of the CNN.
The flow chart of the invention carrying out model training and forward inference optimization with the tiny-YOLO network is shown in Fig. 2. The specific implementation steps are:
1. 16-bit low-precision model training
Model training uses the darknet deep learning framework, with the input picture size standardized to 224*224. The 16-bit low-precision model training scheme is adopted: the integer part of a floating-point number is represented with 2 bits, the fractional part with 14 bits, and rounding is used to retain 14 bits of precision of the floating-point data.
Model training steps:
(1) First train the classification network with the ImageNet dataset; the number of iterations is 1,600,000.
(2) Train the detection network with the public BDD100K driving dataset; the number of iterations is 400,000. Since the tiny-YOLO model structure is simple and its generalization ability poor, target detection results are unsatisfactory when the training set and test set come from different distributions; the COCO dataset commonly used to train detection networks is therefore replaced with the BDD100K dataset.
(3) Fine-tune the model with the custom DDS dataset; the number of iterations is 400,000. The DDS dataset was built for the actual application scenario from road-condition video ahead of a driving vehicle collected in domestic cities, by sampling key frames of the video and then annotating the classes.
2. Weight quantization: quantize the trained weights W from 16-bit floating point to 8-bit integer data with the 8-bit integer quantization scheme.
Steps 3-10 describe the process of completing one target detection task with the integer quantized inference scheme.
3. Input a validation-set picture and obtain the pixel matrix of the image; the values are integers in the interval [0, 255] and are saved directly as 8-bit integer data, serving as the input sample X of the first convolutional layer.
Steps 4-8 describe the integer inference process of a convolutional layer. The process of applying the 8-bit integer quantization scheme directly to inference is shown in Fig. 3a; the 8-bit integer quantized inference process with BN preprocessing is shown in Fig. 3b.
4. Compute the mean and variance of the convolutional layer input sample X:
μ = (1/m)·Σ_{i=1..m} X_i,  δ² = (1/m)·Σ_{i=1..m} (X_i - μ)²
where m is the size of the mini-batch of data.
5. Compute W_BN = γ·W/√(δ² + ε) and β_BN = β - γ·μ/√(δ² + ε) and save them as 8-bit integer data, completing the BN preprocessing.
6. Compute W_BN⊛X with the integer convolution method, save the convolution result as 32-bit integers, and add the offset parameter β_BN computed in step 5.
7. Activate with the Leaky activation function, collecting the activation values of the convolutional layer while each picture is detected and fitting the data distribution function of the activation values.
8. Find the maximum and minimum of the activation values obtained in step 7 and quantize the activation values to 8-bit integers, obtaining the output feature map of the convolutional layer.
9. Use the output feature map of step 8 as the input feature map of the pooling layer and perform the max-pooling operation, generating new feature maps.
10. Repeat steps 4-9 according to the network structure of tiny-YOLO to obtain the detection results on the validation set, and compute the mean average precision (mAP) as the reference for the subsequent activation function optimization.
11. Choose a suitable activation region [a, b] for the first convolutional layer from the data distribution function obtained in step 7, and map the convolution result into the known region [a, b] with the activation function Leaky_n. Repeat steps 3-10 to obtain the target detection index mAP after the activation function of the first convolutional layer has been modified, keeping the mAP loss below 0.1%; step 8, i.e. the process of finding the extreme values of the activations, can then be omitted.
Repeat step 11 to choose the activation regions of convolutional layers 2-9 in turn, modifying their activation functions Leaky_n accordingly, with the final mAP loss kept below 1%.
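A sketch of how the activation region [a, b] for one layer might be chosen from the activations collected in step 7. The percentile-based coverage rule is an assumption standing in for the patent's fitted distribution curve; the acceptance test against the mAP loss threshold stays outside the function:

```python
import numpy as np

def choose_activation_region(collected, coverage=0.999):
    # collected: activation values gathered over many validation-set inferences.
    lo = (1.0 - coverage) / 2.0
    a = float(np.quantile(collected, lo))        # lower edge of the region
    b = float(np.quantile(collected, 1.0 - lo))  # upper edge of the region
    return a, b

# Accept [a, b] for a layer only if re-running steps 3-10 with Leaky_n
# keeps the mAP loss for that layer below 0.1%.
```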
12. Input a test-set picture and complete the target detection task with the 8-bit integer quantized inference scheme with optimized activation functions, obtaining the detection result.
Combining the above, the overall flow of the invention is as shown in Fig. 4.
In conclusion the present invention is to obtain weight with the low precision fixed point algorithm training pattern of 16 floating types, then measure first It turns to 8 integer datas and carries out forward inference, the weight that compared to 32 floating type algorithm training patterns obtain directly carries out 8 Position integer quantifies reasoning, effectively reduces low level fixed point quantization reasoning bring loss of significance.
In addition, to improve computational efficiency, proposing 8 integer quantities since the Calculation bottleneck of convolutional neural networks is convolutional layer Change scheme, and using the reasoning process of 8 integer quantization scheme optimization convolutional layers, the pretreatment of BN algorithm is first carried out, BN is reduced The calculating of algorithm is time-consuming, then carries out convolution algorithm.
The active region for testing each determining convolutional layer in the enterprising line number thousand times quantizations reasoning of verifying collection later, saves Quantify activation value in reasoning process after each activation primitive operation to require first to seek the time overhead that activation value is most worth in real time.It should Method is fitted floating-point arithmetic forward inference process, under the premise of controlling loss of significance, improves convolutional layer computational efficiency.
It is obvious to a person skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be realized in other specific forms without departing from its spirit or essential attributes.
Therefore, in all respects, the embodiments are to be considered illustrative and not restrictive. The scope of the invention is indicated by the appended claims rather than by the foregoing description, and all changes falling within the meaning and scope of equivalents of the claims are therefore intended to be embraced by the invention. Any reference sign in a claim shall not be construed as limiting the claim concerned.
In addition, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. This manner of description is adopted merely for clarity; those skilled in the art should take the specification as a whole, and the technical solutions in the various embodiments may also be suitably combined to form other embodiments understandable to those skilled in the art.

Claims (7)

1. A CNN-based low-precision training and 8-bit integer quantization inference method, characterized by comprising the following steps:
low-precision model training: model training is carried out with a 16-bit floating-point low-precision fixed-point algorithm, obtaining a model for target detection;
weight quantization: an 8-bit integer quantization scheme is proposed, and the weight parameters of the convolutional neural network are quantized layer by layer from 16-bit floating point to 8-bit integers;
8-bit integer quantized inference: activation values are quantized to 8-bit integer data, i.e. every layer of the CNN receives int8 quantized inputs and produces int8 quantized outputs.
2. The CNN-based low-precision training and 8-bit integer quantization inference method according to claim 1, characterized in that: low-precision model training includes training the model on a large server with GPU acceleration, with the data in the computation process saved as 16-bit floating point.
3. The CNN-based low-precision training and 8-bit integer quantization inference method according to claim 2, characterized in that: in the 16-bit floating-point data used in the computation process, 2 bits hold the integer part and 14 bits hold the fractional part, and rounding is used to retain 14 bits of precision of the floating-point data.
4. The CNN-based low-precision training and 8-bit integer quantization inference method according to claim 1, characterized in that weight quantization includes proposing the quantization scheme q = Round((x - a)/(b - a) × 255 - 127), where x denotes the floating-point data and a, b respectively denote the minimum and maximum of the data in the array to be quantized, i.e. a := min(x_i), b := max(x_i);
the quantized value q is then obtained with the rounding function Round().
5. The CNN-based low-precision training and 8-bit integer quantization inference method according to claim 4, characterized in that: the weights are divided into a series of arrays by layer, the extreme values of each weight array are found, and the data within the same array are scaled in equal proportion to 8-bit integers.
6. The CNN-based low-precision training and 8-bit integer quantization inference method according to claim 1, characterized in that the 8-bit integer quantized inference process includes the following steps:
(a) BN algorithm preprocessing: the mean and variance of the input samples are computed before the convolution operation and normalization preprocessing is carried out;
(b) integer convolution operation: floating-point multiplications are converted into integer multiplications with the 8-bit integer quantization scheme;
(c) activation function optimization: the activation region [a, b] of each convolutional layer is chosen layer by layer, and the optimized activation function maps the convolution result into the known region [a, b].
7. The CNN-based low-precision training and 8-bit integer quantization inference method according to any one of claims 1 to 6, characterized in that: the inference process is computed entirely with integer data.
CN201910154088.5A 2019-03-01 2019-03-01 CNN-based low-precision training and 8-bit integer quantization inference method Pending CN109902745A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910154088.5A CN109902745A (en) 2019-03-01 2019-03-01 CNN-based low-precision training and 8-bit integer quantization inference method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910154088.5A CN109902745A (en) 2019-03-01 2019-03-01 CNN-based low-precision training and 8-bit integer quantization inference method

Publications (1)

Publication Number Publication Date
CN109902745A true CN109902745A (en) 2019-06-18

Family

ID=66946069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910154088.5A Pending CN109902745A (en) 2019-03-01 2019-03-01 A kind of low precision training based on CNN and 8 integers quantization inference methods

Country Status (1)

Country Link
CN (1) CN109902745A (en)

Cited By (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110298438A (en) * 2019-07-05 2019-10-01 北京中星微电子有限公司 The method of adjustment and adjustment device of neural network model
CN110309877A (en) * 2019-06-28 2019-10-08 北京百度网讯科技有限公司 A kind of quantization method, device, electronic equipment and the storage medium of feature diagram data
CN110322414A (en) * 2019-07-05 2019-10-11 北京探境科技有限公司 A kind of image data based on AI processor quantifies antidote and system online
CN110659734A (en) * 2019-09-27 2020-01-07 中国科学院半导体研究所 Low bit quantization method for depth separable convolution structure
CN110674924A (en) * 2019-08-22 2020-01-10 苏州浪潮智能科技有限公司 Deep learning inference automatic quantification method and device
CN110852416A (en) * 2019-09-30 2020-02-28 成都恒创新星科技有限公司 CNN accelerated computing method and system based on low-precision floating-point data expression form
CN110852434A (en) * 2019-09-30 2020-02-28 成都恒创新星科技有限公司 CNN quantization method, forward calculation method and device based on low-precision floating point number
CN110929862A (en) * 2019-11-26 2020-03-27 陈子祺 Fixed-point neural network model quantization device and method
CN111178258A (en) * 2019-12-29 2020-05-19 浪潮(北京)电子信息产业有限公司 Image identification method, system, equipment and readable storage medium
CN111178087A (en) * 2019-12-20 2020-05-19 沈阳雅译网络技术有限公司 Neural machine translation decoding acceleration method based on discrete attention mechanism
CN111260022A (en) * 2019-11-22 2020-06-09 中国电子科技集团公司第五十二研究所 Method for fixed-point quantization of complete INT8 of convolutional neural network
CN111680716A (en) * 2020-05-09 2020-09-18 浙江大华技术股份有限公司 Identification comparison method and device, computer equipment and storage medium
CN111723934A (en) * 2020-06-24 2020-09-29 北京紫光展锐通信技术有限公司 Image processing method and system, electronic device and storage medium
CN111767984A (en) * 2020-06-09 2020-10-13 云知声智能科技股份有限公司 8-bit integer full-quantization inference method and device based on fixed displacement
CN111950716A (en) * 2020-08-25 2020-11-17 云知声智能科技股份有限公司 Quantification method and system for optimizing int8
CN111985495A (en) * 2020-07-09 2020-11-24 珠海亿智电子科技有限公司 Model deployment method, device, system and storage medium
CN112288744A (en) * 2020-08-24 2021-01-29 西安电子科技大学 SAR image change detection method based on integer reasoning quantification CNN
CN112308226A (en) * 2020-08-03 2021-02-02 北京沃东天骏信息技术有限公司 Quantization of neural network models, method and apparatus for outputting information
WO2021037174A1 (en) * 2019-08-29 2021-03-04 杭州海康威视数字技术股份有限公司 Neural network model training method and apparatus
WO2021036905A1 (en) * 2019-08-27 2021-03-04 安徽寒武纪信息科技有限公司 Data processing method and apparatus, computer equipment, and storage medium
CN112508125A (en) * 2020-12-22 2021-03-16 无锡江南计算技术研究所 Efficient full-integer quantization method of image detection model
CN112651500A (en) * 2020-12-30 2021-04-13 深圳金三立视频科技股份有限公司 Method for generating quantization model and terminal
WO2021068469A1 (en) * 2019-10-11 2021-04-15 百度在线网络技术(北京)有限公司 Quantization and fixed-point fusion method and apparatus for neural network
CN113032007A (en) * 2019-12-24 2021-06-25 阿里巴巴集团控股有限公司 Data processing method and device
CN113052868A (en) * 2021-03-11 2021-06-29 奥比中光科技集团股份有限公司 Cutout model training and image cutout method and device
WO2021128293A1 (en) * 2019-12-27 2021-07-01 华为技术有限公司 Model training method and apparatus, and storage medium and program product
CN113095472A (en) * 2020-01-09 2021-07-09 北京君正集成电路股份有限公司 Method for reducing precision loss of convolutional neural network through forward reasoning in quantization process
CN113343949A (en) * 2021-08-03 2021-09-03 中国航空油料集团有限公司 Pedestrian detection model training method for universal embedded platform
CN113392973A (en) * 2021-06-25 2021-09-14 广东工业大学 AI chip neural network acceleration method based on FPGA
CN113537340A (en) * 2021-07-14 2021-10-22 深圳思悦创新有限公司 Yolo target detection model compression method, system and storage medium
CN113762499A (en) * 2020-06-04 2021-12-07 合肥君正科技有限公司 Method for quantizing weight by channels
CN113762500A (en) * 2020-06-04 2021-12-07 合肥君正科技有限公司 Training method for improving model precision of convolutional neural network during quantification
CN113762452A (en) * 2020-06-04 2021-12-07 合肥君正科技有限公司 Method for quantizing PRELU activation function
WO2022007880A1 (en) * 2020-07-09 2022-01-13 北京灵汐科技有限公司 Data accuracy configuration method and apparatus, neural network device, and medium
WO2022031764A1 (en) * 2020-08-04 2022-02-10 Nvidia Corporation Hybrid quantization of neural networks for edge computing applications
CN114191267A (en) * 2021-12-06 2022-03-18 南通大学 Light-weight intelligent method and system for assisting blind person in going out in complex environment
US11397579B2 (en) 2018-02-13 2022-07-26 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11437032B2 (en) 2017-09-29 2022-09-06 Shanghai Cambricon Information Technology Co., Ltd Image processing apparatus and method
US11442786B2 (en) 2018-05-18 2022-09-13 Shanghai Cambricon Information Technology Co., Ltd Computation method and product thereof
US11513586B2 (en) 2018-02-14 2022-11-29 Shanghai Cambricon Information Technology Co., Ltd Control device, method and equipment for processor
US11544059B2 (en) 2018-12-28 2023-01-03 Cambricon (Xi'an) Semiconductor Co., Ltd. Signal processing device, signal processing method and related products
US11609760B2 (en) 2018-02-13 2023-03-21 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11630666B2 (en) 2018-02-13 2023-04-18 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11676028B2 (en) 2019-06-12 2023-06-13 Shanghai Cambricon Information Technology Co., Ltd Neural network quantization parameter determination method and related products
US11675676B2 (en) 2019-06-12 2023-06-13 Shanghai Cambricon Information Technology Co., Ltd Neural network quantization parameter determination method and related products
US11703939B2 (en) 2018-09-28 2023-07-18 Shanghai Cambricon Information Technology Co., Ltd Signal processing device and related products
US11762690B2 (en) 2019-04-18 2023-09-19 Cambricon Technologies Corporation Limited Data processing method and related products
US11789847B2 (en) 2018-06-27 2023-10-17 Shanghai Cambricon Information Technology Co., Ltd On-chip code breakpoint debugging method, on-chip processor, and chip breakpoint debugging system
CN111008694B (en) * 2019-12-02 2023-10-27 许昌北邮万联网络技术有限公司 Depth convolution countermeasure generation network-based data model quantization compression method
US11847554B2 (en) 2019-04-18 2023-12-19 Cambricon Technologies Corporation Limited Data processing method and related products
WO2024031989A1 (en) * 2022-08-11 2024-02-15 山东浪潮科学研究院有限公司 Memory optimization method and system for deep learning reasoning of embedded device
US11966583B2 (en) 2018-08-28 2024-04-23 Cambricon Technologies Corporation Limited Data pre-processing method and device, and related computer device and storage medium
US12001955B2 (en) 2019-08-23 2024-06-04 Anhui Cambricon Information Technology Co., Ltd. Data processing method, device, computer equipment and storage medium

Cited By (85)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11437032B2 (en) 2017-09-29 2022-09-06 Shanghai Cambricon Information Technology Co., Ltd Image processing apparatus and method
US11630666B2 (en) 2018-02-13 2023-04-18 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11620130B2 (en) 2018-02-13 2023-04-04 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11740898B2 (en) 2018-02-13 2023-08-29 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11720357B2 (en) 2018-02-13 2023-08-08 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11709672B2 (en) 2018-02-13 2023-07-25 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11704125B2 (en) 2018-02-13 2023-07-18 Cambricon (Xi'an) Semiconductor Co., Ltd. Computing device and method
US11507370B2 (en) 2018-02-13 2022-11-22 Cambricon (Xi'an) Semiconductor Co., Ltd. Method and device for dynamically adjusting decimal point positions in neural network computations
US11397579B2 (en) 2018-02-13 2022-07-26 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11663002B2 (en) 2018-02-13 2023-05-30 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11609760B2 (en) 2018-02-13 2023-03-21 Shanghai Cambricon Information Technology Co., Ltd Computing device and method
US11513586B2 (en) 2018-02-14 2022-11-29 Shanghai Cambricon Information Technology Co., Ltd Control device, method and equipment for processor
US11442785B2 (en) 2018-05-18 2022-09-13 Shanghai Cambricon Information Technology Co., Ltd Computation method and product thereof
US11442786B2 (en) 2018-05-18 2022-09-13 Shanghai Cambricon Information Technology Co., Ltd Computation method and product thereof
US11789847B2 (en) 2018-06-27 2023-10-17 Shanghai Cambricon Information Technology Co., Ltd On-chip code breakpoint debugging method, on-chip processor, and chip breakpoint debugging system
US11966583B2 (en) 2018-08-28 2024-04-23 Cambricon Technologies Corporation Limited Data pre-processing method and device, and related computer device and storage medium
US11703939B2 (en) 2018-09-28 2023-07-18 Shanghai Cambricon Information Technology Co., Ltd Signal processing device and related products
US11544059B2 (en) 2018-12-28 2023-01-03 Cambricon (Xi'an) Semiconductor Co., Ltd. Signal processing device, signal processing method and related products
US11934940B2 (en) 2019-04-18 2024-03-19 Cambricon Technologies Corporation Limited AI processor simulation
US11762690B2 (en) 2019-04-18 2023-09-19 Cambricon Technologies Corporation Limited Data processing method and related products
US11847554B2 (en) 2019-04-18 2023-12-19 Cambricon Technologies Corporation Limited Data processing method and related products
US11676028B2 (en) 2019-06-12 2023-06-13 Shanghai Cambricon Information Technology Co., Ltd Neural network quantization parameter determination method and related products
US11676029B2 (en) 2019-06-12 2023-06-13 Shanghai Cambricon Information Technology Co., Ltd Neural network quantization parameter determination method and related products
US11675676B2 (en) 2019-06-12 2023-06-13 Shanghai Cambricon Information Technology Co., Ltd Neural network quantization parameter determination method and related products
CN110309877A (en) * 2019-06-28 2019-10-08 北京百度网讯科技有限公司 A kind of quantization method, device, electronic equipment and the storage medium of feature diagram data
CN110309877B (en) * 2019-06-28 2021-12-07 北京百度网讯科技有限公司 Feature map data quantization method and device, electronic equipment and storage medium
CN110298438A (en) * 2019-07-05 2019-10-01 北京中星微电子有限公司 The method of adjustment and adjustment device of neural network model
CN110322414A (en) * 2019-07-05 2019-10-11 北京探境科技有限公司 A kind of image data based on AI processor quantifies antidote and system online
CN110298438B (en) * 2019-07-05 2024-04-26 北京中星微电子有限公司 Neural network model adjusting method and device
CN110322414B (en) * 2019-07-05 2021-08-10 北京探境科技有限公司 Image data online quantitative correction method and system based on AI processor
CN110674924A (en) * 2019-08-22 2020-01-10 苏州浪潮智能科技有限公司 Deep learning inference automatic quantification method and device
CN110674924B (en) * 2019-08-22 2022-06-03 苏州浪潮智能科技有限公司 Deep learning inference automatic quantification method and device
US12001955B2 (en) 2019-08-23 2024-06-04 Anhui Cambricon Information Technology Co., Ltd. Data processing method, device, computer equipment and storage medium
WO2021036905A1 (en) * 2019-08-27 2021-03-04 安徽寒武纪信息科技有限公司 Data processing method and apparatus, computer equipment, and storage medium
CN112446461A (en) * 2019-08-29 2021-03-05 杭州海康威视数字技术股份有限公司 Neural network model training method and device
WO2021037174A1 (en) * 2019-08-29 2021-03-04 杭州海康威视数字技术股份有限公司 Neural network model training method and apparatus
CN110659734A (en) * 2019-09-27 2020-01-07 中国科学院半导体研究所 Low bit quantization method for depth separable convolution structure
CN110659734B (en) * 2019-09-27 2022-12-23 中国科学院半导体研究所 Low bit quantization method for depth separable convolution structure
CN110852434A (en) * 2019-09-30 2020-02-28 成都恒创新星科技有限公司 CNN quantization method, forward calculation method and device based on low-precision floating point number
CN110852434B (en) * 2019-09-30 2022-09-23 梁磊 CNN quantization method, forward calculation method and hardware device based on low-precision floating point number
CN110852416B (en) * 2019-09-30 2022-10-04 梁磊 CNN hardware acceleration computing method and system based on low-precision floating point data representation form
CN110852416A (en) * 2019-09-30 2020-02-28 成都恒创新星科技有限公司 CNN accelerated computing method and system based on low-precision floating-point data expression form
WO2021068469A1 (en) * 2019-10-11 2021-04-15 百度在线网络技术(北京)有限公司 Quantization and fixed-point fusion method and apparatus for neural network
CN111260022A (en) * 2019-11-22 2020-06-09 中国电子科技集团公司第五十二研究所 Method for fixed-point quantization of complete INT8 of convolutional neural network
CN111260022B (en) * 2019-11-22 2023-09-05 中国电子科技集团公司第五十二研究所 Full INT8 fixed-point quantization method for convolutional neural network
CN110929862A (en) * 2019-11-26 2020-03-27 陈子祺 Fixed-point neural network model quantization device and method
CN110929862B (en) * 2019-11-26 2023-08-01 陈子祺 Fixed-point neural network model quantification device and method
CN111008694B (en) * 2019-12-02 2023-10-27 许昌北邮万联网络技术有限公司 Depth convolution countermeasure generation network-based data model quantization compression method
CN111178087B (en) * 2019-12-20 2023-05-09 沈阳雅译网络技术有限公司 Neural machine translation decoding acceleration method based on discrete type attention mechanism
CN111178087A (en) * 2019-12-20 2020-05-19 沈阳雅译网络技术有限公司 Neural machine translation decoding acceleration method based on discrete attention mechanism
CN113032007B (en) * 2019-12-24 2024-06-11 阿里巴巴集团控股有限公司 Data processing method and device
CN113032007A (en) * 2019-12-24 2021-06-25 阿里巴巴集团控股有限公司 Data processing method and device
WO2021128293A1 (en) * 2019-12-27 2021-07-01 华为技术有限公司 Model training method and apparatus, and storage medium and program product
CN111178258A (en) * 2019-12-29 2020-05-19 浪潮(北京)电子信息产业有限公司 Image identification method, system, equipment and readable storage medium
CN111178258B (en) * 2019-12-29 2022-04-22 浪潮(北京)电子信息产业有限公司 Image identification method, system, equipment and readable storage medium
CN113095472A (en) * 2020-01-09 2021-07-09 北京君正集成电路股份有限公司 Method for reducing precision loss of convolutional neural network through forward reasoning in quantization process
CN111680716A (en) * 2020-05-09 2020-09-18 浙江大华技术股份有限公司 Identification comparison method and device, computer equipment and storage medium
CN111680716B (en) * 2020-05-09 2023-05-12 浙江大华技术股份有限公司 Identification comparison method, device, computer equipment and storage medium
CN113762499A (en) * 2020-06-04 2021-12-07 合肥君正科技有限公司 Method for quantizing weight by channels
CN113762500A (en) * 2020-06-04 2021-12-07 合肥君正科技有限公司 Training method for improving model precision of convolutional neural network during quantification
CN113762500B (en) * 2020-06-04 2024-04-02 合肥君正科技有限公司 Training method for improving model precision during quantization of convolutional neural network
CN113762499B (en) * 2020-06-04 2024-04-02 合肥君正科技有限公司 Method for quantizing weights by using multiple channels
CN113762452B (en) * 2020-06-04 2024-01-02 合肥君正科技有限公司 Method for quantizing PRELU activation function
CN113762452A (en) * 2020-06-04 2021-12-07 合肥君正科技有限公司 Method for quantizing PRELU activation function
CN111767984A (en) * 2020-06-09 2020-10-13 云知声智能科技股份有限公司 8-bit integer full-quantization inference method and device based on fixed displacement
CN111723934A (en) * 2020-06-24 2020-09-29 北京紫光展锐通信技术有限公司 Image processing method and system, electronic device and storage medium
CN111985495A (en) * 2020-07-09 2020-11-24 珠海亿智电子科技有限公司 Model deployment method, device, system and storage medium
WO2022007880A1 (en) * 2020-07-09 2022-01-13 北京灵汐科技有限公司 Data accuracy configuration method and apparatus, neural network device, and medium
CN111985495B (en) * 2020-07-09 2024-02-02 珠海亿智电子科技有限公司 Model deployment method, device, system and storage medium
CN112308226B (en) * 2020-08-03 2024-05-24 北京沃东天骏信息技术有限公司 Quantization of neural network model, method and apparatus for outputting information
CN112308226A (en) * 2020-08-03 2021-02-02 北京沃东天骏信息技术有限公司 Quantization of neural network models, method and apparatus for outputting information
WO2022031764A1 (en) * 2020-08-04 2022-02-10 Nvidia Corporation Hybrid quantization of neural networks for edge computing applications
CN112288744B (en) * 2020-08-24 2023-04-07 西安电子科技大学 SAR image change detection method based on integer reasoning quantification CNN
CN112288744A (en) * 2020-08-24 2021-01-29 西安电子科技大学 SAR image change detection method based on integer reasoning quantification CNN
CN111950716A (en) * 2020-08-25 2020-11-17 云知声智能科技股份有限公司 Quantification method and system for optimizing int8
CN112508125A (en) * 2020-12-22 2021-03-16 无锡江南计算技术研究所 Efficient full-integer quantization method of image detection model
CN112651500A (en) * 2020-12-30 2021-04-13 深圳金三立视频科技股份有限公司 Method for generating quantization model and terminal
CN112651500B (en) * 2020-12-30 2021-12-28 深圳金三立视频科技股份有限公司 Method for generating quantization model and terminal
CN113052868A (en) * 2021-03-11 2021-06-29 奥比中光科技集团股份有限公司 Cutout model training and image cutout method and device
CN113392973B (en) * 2021-06-25 2023-01-13 广东工业大学 AI chip neural network acceleration method based on FPGA
CN113392973A (en) * 2021-06-25 2021-09-14 广东工业大学 AI chip neural network acceleration method based on FPGA
CN113537340A (en) * 2021-07-14 2021-10-22 深圳思悦创新有限公司 Yolo target detection model compression method, system and storage medium
CN113343949A (en) * 2021-08-03 2021-09-03 中国航空油料集团有限公司 Pedestrian detection model training method for universal embedded platform
CN114191267A (en) * 2021-12-06 2022-03-18 南通大学 Light-weight intelligent method and system for assisting blind person in going out in complex environment
WO2024031989A1 (en) * 2022-08-11 2024-02-15 山东浪潮科学研究院有限公司 Memory optimization method and system for deep learning reasoning of embedded device

Similar Documents

Publication Publication Date Title
CN109902745A (en) CNN-based low-precision training and 8-bit integer quantization inference method
US11924334B2 (en) Quantum neural network
US20210089922A1 (en) Joint pruning and quantization scheme for deep neural networks
CN111860982A (en) Wind power plant short-term wind power prediction method based on VMD-FCM-GRU
CN110909926A (en) TCN-LSTM-based solar photovoltaic power generation prediction method
CN107679618A (en) A kind of static policies fixed point training method and device
CN107688849A (en) A kind of dynamic strategy fixed point training method and device
CN110120041A (en) Pavement crack image detecting method
CN108446766A (en) A kind of method of quick trained storehouse own coding deep neural network
CN107705556A (en) A kind of traffic flow forecasting method combined based on SVMs and BP neural network
CN103077408B (en) Method for converting seabed sonar image into acoustic substrate classification based on wavelet neutral network
CN115759237A (en) End-to-end deep neural network model compression and heterogeneous conversion system and method
CN112183742A (en) Neural network hybrid quantization method based on progressive quantization and Hessian information
WO2022241932A1 (en) Prediction method based on non-intrusive attention preprocessing process and bilstm model
CN115311506A (en) Image classification method and device based on quantization factor optimization of resistive random access memory
CN115223158A (en) License plate image generation method and system based on adaptive diffusion prior variation self-encoder
CN103354073B (en) A kind of LCD color deviation correction method
Doherty et al. Comparative study of activation functions and their impact on the YOLOv5 object detection model
Li et al. Weight-dependent gates for differentiable neural network pruning
CN116843544A (en) Method, system and equipment for super-resolution reconstruction by introducing hypersonic flow field into convolutional neural network
CN117439069A (en) Electric quantity prediction method based on neural network
CN116579408A (en) Model pruning method and system based on redundancy of model structure
CN117152427A (en) Remote sensing image semantic segmentation method and system based on diffusion model and knowledge distillation
CN116109868A (en) Image classification model construction and small sample image classification method based on lightweight neural network
WO2023019899A1 (en) Real-time pruning method and system for neural network, and neural network accelerator

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20190618)