This application claims priority to the previously filed Chinese patent application No. 201610663201.9, entitled "A Method for Optimizing an Artificial Neural Network," and Chinese patent application No. 201610663563.8, entitled "A Deep Processing Unit for Implementing an Artificial Neural Network."
Embodiments
Part of the content of this application was previously published in the inventor Song Yao's academic article "Going Deeper with Embedded FPGA Platform for Convolutional Neural Network" (February 2016). This application incorporates the content of that article and makes further improvements on its basis.
In this application, the improvements that the present invention brings to CNNs are mainly illustrated by taking image processing as an example. The solution of this application is applicable to various artificial neural networks, including deep neural networks (DNNs), recurrent neural networks (RNNs), and convolutional neural networks (CNNs). The following description takes a CNN as an example.
Basic concepts of CNNs
CNNs achieve state-of-the-art performance in a wide range of vision-related tasks. To help understand the CNN-based image classification algorithms analyzed in this application, we first describe the basics of CNNs, and then introduce the ImageNet dataset and existing CNN models.
As shown in Fig. 1, a typical CNN consists of a series of layers that are executed in order.
The parameters of a CNN model are called "weights". The first layer of a CNN reads the input image and outputs a series of feature maps. Each following layer reads the feature maps produced by the previous layer and outputs new feature maps. Finally, a classifier outputs the probability of each category to which the input image may belong. The CONV layer (convolutional layer) and the FC layer (fully connected layer) are the two basic layer types in a CNN; a CONV layer is usually followed by a pooling layer.
In this application, for a given CNN layer, f_j^in denotes the j-th input feature map, f_i^out denotes the i-th output feature map, and b_i denotes the bias term of the i-th output map.
For a CONV layer, n_in and n_out denote the numbers of input and output feature maps, respectively.
For an FC layer, n_in and n_out denote the lengths of the input and output feature vectors, respectively.
Definition of the CONV layer (Convolutional layer): a CONV layer takes a series of feature maps as input and produces output feature maps by convolving the input with convolution kernels.
A nonlinear layer, i.e., a nonlinear activation function, is usually attached to the CONV layer and applied to each element of the output feature maps.
A CONV layer can be expressed by Expression 1:

f_i^out = Σ_{j=1}^{n_in} ( f_j^in ⊗ g_{i,j} ) + b_i,  1 ≤ i ≤ n_out   (1)
where g_{i,j} is the convolution kernel applied to the j-th input feature map to produce the i-th output feature map.
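As an illustrative sketch only, the computation of Expression 1 can be written in Python/NumPy as follows; the function signature, the shapes, and the "valid" (no-padding) convolution are assumptions made for this example.

```python
import numpy as np

def conv_layer(f_in, g, b):
    """Expression 1: f_i^out = sum_j (f_j^in conv g_ij) + b_i.

    f_in: (n_in, H, W) input feature maps
    g:    (n_out, n_in, k, k) convolution kernels g_ij
    b:    (n_out,) bias terms b_i
    Returns (n_out, H-k+1, W-k+1) output feature maps.
    """
    n_out, n_in, k, _ = g.shape
    H, W = f_in.shape[1:]
    f_out = np.zeros((n_out, H - k + 1, W - k + 1))
    for i in range(n_out):
        for j in range(n_in):
            # slide the k x k kernel over input map j and accumulate
            for y in range(H - k + 1):
                for x in range(W - k + 1):
                    f_out[i, y, x] += np.sum(f_in[j, y:y+k, x:x+k] * g[i, j])
        f_out[i] += b[i]          # add the bias of output map i
    return f_out
```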
Definition of the FC layer (Fully-Connected layer): an FC layer applies a linear transformation to the input feature vector:
f^out = W · f^in + b   (2)
where W is an n_out × n_in transformation matrix and b is the bias term. Note that for an FC layer, the input is not a combination of several two-dimensional feature maps, but a single feature vector. Therefore, in Expression 2, the parameters n_in and n_out actually correspond to the lengths of the input and output feature vectors.
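As an illustrative sketch, Expression 2 maps directly onto a matrix-vector product; the sizes below are assumptions chosen for the example.

```python
import numpy as np

n_in, n_out = 1024, 256                # illustrative vector lengths
W = np.random.rand(n_out, n_in)        # n_out x n_in transformation matrix
b = np.random.rand(n_out)              # bias term
f_in = np.random.rand(n_in)            # input feature vector
f_out = W @ f_in + b                   # Expression 2: output feature vector
```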
Pooling layer: usually connected to a CONV layer, it outputs the maximum or the average value of each subarea of each feature map. Max pooling can be expressed by Expression 3:

f_i^out(x, y) = max_{0 ≤ m, n < p} f_i^in(x·p + m, y·p + n)   (3)
where p is the size of the pooling kernel. This nonlinear "down-sampling" not only reduces the feature map size and the computation for the next layer, but also provides a form of translation invariance.
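A minimal sketch of Expression 3 in the same style, assuming non-overlapping p × p windows and dimensions divisible by p:

```python
import numpy as np

def max_pool(f_in, p):
    """Expression 3: p x p max pooling applied to each feature map.

    f_in: (n, H, W) feature maps; H and W are assumed divisible by p.
    """
    n, H, W = f_in.shape
    f_out = np.zeros((n, H // p, W // p))
    for i in range(n):
        for y in range(H // p):
            for x in range(W // p):
                # maximum over the p x p subarea
                f_out[i, y, x] = f_in[i, y*p:(y+1)*p, x*p:(x+1)*p].max()
    return f_out
```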
A CNN can be used to perform image classification in the forward inference process. But before a CNN is used for any task, it should first be trained on a dataset. Recent research has shown that a CNN model trained on a large dataset for a given task can be used for other tasks and achieves high accuracy with minor adjustments of the network weights; such minor adjustment is called "fine-tuning" (fine-tune). The training of a CNN is mostly realized on large servers. For the embedded FPGA platform, we focus on accelerating the inference process of a CNN.
Fig. 2 shows the complete technical solution proposed for accelerating CNNs, from the perspectives of the processing flow and the hardware architecture. The left side of Fig. 2 shows the artificial neural network model, i.e., the target to be optimized. The middle of Fig. 2 illustrates how the CNN model is compressed, quantized, and compiled so as to reduce memory footprint and the amount of computation while minimizing the loss of accuracy. The right side of Fig. 2 shows the dedicated hardware provided for the compressed CNN.
Dynamic quantization scheme for serial neural networks
Fig. 3 shows more details of the quantization step of Fig. 2.
For a fixed-point number, its value can be expressed as follows:

value = Σ_{i=0}^{bw−1} B_i · 2^{−f_l} · 2^i   (4)

where bw is the bit width of the number, B_i is the i-th bit, and f_l is the fractional length, which may be negative.
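A minimal sketch of this representation, assuming signed two's-complement numbers with round-to-nearest and saturation (details that are not specified above):

```python
import numpy as np

def quantize(x, bw, fl):
    """Quantize floating-point data x to bw-bit fixed-point numbers with
    fractional length fl (Expression 4), returned as the float values
    that the fixed-point numbers represent."""
    step = 2.0 ** (-fl)                    # value of one least-significant bit
    q_min = -(2.0 ** (bw - 1)) * step      # most negative representable value
    q_max = (2.0 ** (bw - 1) - 1) * step   # most positive representable value
    return np.clip(np.round(x / step) * step, q_min, q_max)
```

For example, with bw = 8 and fl = 5, the representable values run from −4.0 to 3.96875 in steps of 0.03125.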
In order to obtain the highest accuracy while converting floating-point numbers into fixed-point numbers, we propose a dynamic-precision data quantization strategy and an automatic workflow.
Unlike previous static-precision quantization strategies, in the proposed data quantization flow, f_l changes dynamically across different layers and feature map sets while remaining static within one layer, so as to minimize the truncation error of each layer.
As shown in Fig. 3, the quantization flow proposed in this application mainly consists of two phases: the weight quantization phase and the data quantization phase.
The purpose of the weight quantization phase is to find the optimal f_l for the weights of one layer, as in Expression 5:

f_l = argmin_{f_l} Σ | W_float − W(bw, f_l) |   (5)

where W is a weight and W(bw, f_l) denotes the fixed-point format of W under the given bw and f_l.
Optionally, the dynamic range of the weights of each layer is analyzed first, for example by sampling. Afterwards, f_l is initialized so as to avoid data overflow. In addition, we search for the optimal f_l in the neighborhood of the initial f_l.
Optionally, in the weight fixed-point quantization step, the optimal f_l is found in another way, as in Expression 6:

f_l = argmin_{f_l} Σ_i k_i · | W_float − W(bw, f_l) |_i   (6)

where i denotes a position among the bw bits and k_i is the weight given to that position. In the manner of Expression 6, different positions are given different weights, and the optimal f_l is then computed.
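A minimal sketch of the neighborhood search of Expression 5, reusing the quantize() helper sketched above; the search radius is an assumption, and the optional error-weighting array only loosely mirrors the position weighting of Expression 6:

```python
import numpy as np

def find_weight_fl(W, bw, fl_init, radius=2, err_weights=None):
    """Search around fl_init for the fl that minimizes the total
    quantization error of the weights W (Expression 5)."""
    best_fl, best_err = fl_init, float("inf")
    for fl in range(fl_init - radius, fl_init + radius + 1):
        err = np.abs(W - quantize(W, bw, fl))
        if err_weights is not None:
            err = err * err_weights        # weighted error, cf. Expression 6
        total = err.sum()
        if total < best_err:
            best_fl, best_err = fl, total
    return best_fl
```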
The data quantization phase aims to find the optimal f_l for the feature map sets between two layers of the CNN model. In this phase, the CNN can be run on a training dataset (benchmark). The training dataset may be, for example, data set 0.
Optionally, the weight quantization of all CONV layers and FC layers of the CNN is completed first, and data quantization is then carried out. At this point, the training dataset is input to the CNN whose weights have been quantized and is processed layer by layer through the CONV layers and FC layers, so as to obtain the input feature maps of each layer.
For the input feature maps of each layer, a greedy algorithm is used to compare, layer by layer, the data of the fixed-point CNN model against that of the floating-point CNN model, so as to reduce the loss of accuracy. The optimization target of each layer is shown in Expression 7:

f_l = argmin_{f_l} Σ | x^+_float − x^+(bw, f_l) |   (7)

In Expression 7, A denotes the computation of one layer (e.g., a certain CONV layer or FC layer), x denotes the input, and with x^+ = A·x, x^+ denotes the output of this layer. It is worth noting that for a CONV layer or an FC layer, the direct result x^+ has a longer bit width than the given standard; it therefore needs to be truncated when the optimal f_l is selected. Finally, the whole data quantization configuration is generated.
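A minimal sketch of this greedy, layer-by-layer selection, reusing the quantize() helper sketched above; the layer interface (plain callables) and the candidate list for f_l are assumptions for this example:

```python
import numpy as np

def data_quantization(layers, x, bw, fl_candidates):
    """Expression 7: for each layer in series, pick the fl whose
    truncated fixed-point output stays closest to the floating-point
    output, then feed the truncated result to the next layer."""
    fls, x_float, x_fixed = [], x, x
    for layer in layers:
        ref = layer(x_float)               # floating-point reference x+
        out = layer(x_fixed)               # result computed on quantized inputs
        errs = [np.abs(ref - quantize(out, bw, fl)).sum() for fl in fl_candidates]
        fl = fl_candidates[int(np.argmin(errs))]
        fls.append(fl)
        x_float, x_fixed = ref, quantize(out, bw, fl)   # truncate x+
    return fls
```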
According to another embodiment of the invention, in the data fixed-point quantization step, the optimal f_l is found in another way, as in Expression 8:

f_l = argmin_{f_l} Σ_i k_i · | x^+_float − x^+(bw, f_l) |_i   (8)

where i denotes a position among the bw bits and k_i is the weight of that position. Similarly to Expression 6, different positions are given different weights, and the optimal f_l is then computed.
The above data quantization step yields the optimal f_l.
In addition, weight quantization and data quantization may alternate. In terms of the flow order of data processing, the convolutional layers (CONV layers) and fully connected layers (FC layers) of the ANN are in a series relationship, and the training dataset is processed successively by the CONV layers and FC layers of the ANN to obtain the respective feature map sets.
Specifically, the weight quantization step and the data quantization step alternate according to this series relationship: after the fixed-point quantization of the current layer is completed in the weight quantization step, and before the fixed-point quantization of the next layer begins, the data quantization step is performed on the feature map set output by the current layer.
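A minimal sketch of this alternation, reusing the quantize() and find_weight_fl() helpers sketched above; the layer objects (with a W attribute and a forward() method) are assumptions for this example:

```python
import numpy as np

def alternating_quantization(layers, x, bw, fl_candidates):
    """Alternate per layer: quantize the layer's weights first, then
    quantize the feature maps it outputs, before moving on to the
    next layer in the series."""
    x_float, x_fixed = x, x
    for layer in layers:
        # 1) weight fixed-point quantization of the current layer
        w_fl = find_weight_fl(layer.W, bw, fl_init=0)
        layer.W = quantize(layer.W, bw, w_fl)
        # 2) data quantization of the current layer's output
        ref = layer.forward(x_float)
        out = layer.forward(x_fixed)
        d_fl = min(fl_candidates,
                   key=lambda fl: float(np.abs(ref - quantize(out, bw, fl)).sum()))
        x_float, x_fixed = ref, quantize(out, bw, d_fl)
```

Note that this sketch computes the reference with the already-quantized weights in place, which is one simplified reading of the flow described above.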
The above scheme of the layer-by-layer variable-precision fixed-point method and apparatus applies to simple, branchless neural networks.
Fig. 4 shows a purely serial neural network: any layer "Layer N" of the neural network has one and only one predecessor layer and one and only one successor layer. The basic flow is: for the input neural network, go from the input to the output, minimizing the error layer by layer by means of a function and deciding the precision of each layer, until the last layer.
The positioning method shown in Fig. 5: the most suitable fixed-point position is found in a layer-by-layer manner.
It can be seen that the method of Fig. 5 requires generating the fixed-point neural network online. "Online" means that some typical pictures are selected and this series of pictures is tested; only while these pictures are being tested do the intermediate results become known.
Because the scheme of Fig. 5 adopts the layer-by-layer fixed-point approach, a testing tool that supports fixed-point neural networks is needed: the input of the tool is the output that has already passed through the fixed-point quantization of the previous layer, and the output of the tool is the result of this layer of the fixed-point network.
Fixed-point dynamic quantization scheme for complex networks
The scheme of Fig. 5 propagates the fixed-point method layer by layer, so the fixed-point quantization of each layer depends on the layer before it. It has no way to handle network structures in which branches exist and branches merge.
The scheme of Fig. 5 therefore does not apply to currently popular networks (GoogLeNet, SqueezeNet, etc.). In the method of Fig. 5, the fixed-point operation of each layer relies on the fixed-point result of the previous layer, which places considerable restrictions on the network structure.
Figs. 6a-6c show GoogLeNet, an example of a complex neural network, in which the network has multiple branches and contains both series and parallel relationships; Fig. 6c is the input of the GoogLeNet model, and Fig. 6a is the output of the GoogLeNet model. For more information on the complex network GoogLeNet shown in Fig. 6, refer to the article "Going Deeper with Convolutions" by Christian Szegedy et al.
For a network with branches (such as GoogLeNet), there is a concatenation (CONCAT) operation over multiple layers: the outputs of multiple upper layers are connected as the inputs of the CONCAT.
The CONCAT operation connects (concatenates) the data of the input layers along the channel dimension into a new layer, which is then output to the next layer. For example, suppose the CONCAT has two input layers, input A and input B: the feature map size of input A is W×H with C1 channels, and the feature map size of input B is W×H with C2 channels. After the CONCAT layer, the feature map dimension is W×H×(C1+C2).
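A minimal sketch of this channel-wise concatenation; the sizes are assumptions chosen for the example:

```python
import numpy as np

W, H, C1, C2 = 28, 28, 64, 32          # illustrative dimensions
a = np.random.rand(W, H, C1)           # input A: W x H x C1
b = np.random.rand(W, H, C2)           # input B: W x H x C2
out = np.concatenate([a, b], axis=2)   # CONCAT along the channel dimension
assert out.shape == (W, H, C1 + C2)    # result: W x H x (C1 + C2)
```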
In the example shown in Fig. 7, the CONCAT layer has four inputs: a 1×1 convolutional layer, a 3×3 convolutional layer, a 5×5 convolutional layer, and a 3×3 max pooling layer. The CONCAT layer concatenates these four inputs and provides one output. A complex neural network with branches needs the CONCAT operation, and therefore a corresponding CONCAT layer exists in the neural network model.
Fig. 8 shows an example of the operation performed by a CONCAT layer.
A BLOB (binary large object) is a container that can store binary files. In computing, a BLOB is often a database field type used for storing binary files. A BLOB can be understood as a single large file; a typical BLOB is a picture or an audio file. Because of their size, BLOBs need to be handled in a special way (e.g., uploaded, downloaded, or stored in a database).
In the embodiments of the present invention, a BLOB can be understood as a four-dimensional data structure. A CONCAT layer concatenates the BLOBs BLOB1, BLOB2, ..., BLOBn output by the multiple layers of the previous stage into one output.
Further, when the CONCAT operation is implemented in hardware, the merging of the branches is realized by changing the position (memory address) of each input BLOB1, BLOB2, ..., BLOBn in memory.
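One way to picture this, sketched here under the assumption of a channel-first four-dimensional BLOB layout: each branch writes its result directly into its own channel slice of a single pre-allocated output buffer, so the merge requires no extra copy:

```python
import numpy as np

n, C1, C2, H, W = 1, 64, 32, 28, 28        # illustrative BLOB dimensions
buf = np.empty((n, C1 + C2, H, W))         # one buffer holds the merged BLOB
view1 = buf[:, :C1]                        # memory region assigned to BLOB1
view2 = buf[:, C1:]                        # memory region assigned to BLOB2
view1[...] = np.random.rand(n, C1, H, W)   # branch 1 writes in place
view2[...] = np.random.rand(n, C2, H, W)   # branch 2 writes in place
```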
As shown in Fig. 8, the fixed-point configuration information of BLOBs 1, 2, 3, ..., n may be inconsistent. In actual hardware, however, the fixed-point configurations of all inputs of a CONCAT layer are required to be consistent. If the fixed-point configuration information is inconsistent, it causes a data conflict at the CONCAT layer, and the neural network can no longer run layer by layer to the next layer.
As shown in Fig. 9, in order to solve the above problem, we use a new method to determine the fixed-point positions of the input ranges of the neural network.
In the method shown in Fig. 9, the CNN (convolutional neural network) is a neural network with branches, comprising at least: the 1st, 2nd, ..., n-th convolutional layers (CONV layers); the 1st, 2nd, ..., m-th fully connected layers (FC layers); and the 1st, 2nd, ..., l-th CONCAT layers, where n, m, and l are positive integers.
The weight parameter flow shown in the left branch of Fig. 9 is roughly the same as in Fig. 5. Unlike the method shown in Fig. 5, the data quantization flow in the right branch of Fig. 9 comprises the following steps.
First, the numerical range of the output of each layer of the CNN (each of the CONV layers, FC layers, and CONCAT layers) is estimated, the values being floating-point numbers.
According to one embodiment of the present invention, the first step includes: supplying input data to the CNN, where the input data is processed by the n convolutional layers (CONV layers), the m fully connected layers (FC layers), and the l CONCAT layers of the CNN to obtain the output of each layer.
Second, the numerical range of the above outputs is quantized from floating-point numbers into fixed-point numbers.
The above step quantizes the output of every layer from floating-point into fixed-point numbers, where the quantization range is chosen dynamically for each layer's output and remains constant within said layer.
According to one embodiment of the present invention, the optimal f_l can be calculated in the manner of Expression 7 or 8, thereby determining the fixed-point quantization range.
Third, based on the fixed-point quantization range of the output of a CONCAT layer, the fixed-point quantization range of each input of said CONCAT layer is modified.
The third step includes: determining each CONCAT layer in the CNN, where each CONCAT layer merges the outputs of multiple layers of its previous stage into its own output. For example, multiple sub-networks can be identified in the network model of the CNN, each sub-network taking a CONCAT layer as its last layer, so that processing can be performed with the sub-network as the unit.
According to one embodiment of the present invention, the third step further includes: for the outputs of the multiple previous-stage layers of a CONCAT layer, comparing the fixed-point quantization range of the output of each previous-stage layer with the fixed-point quantization range of the output of the CONCAT layer. If they are not the same, the fixed-point quantization range of that input is modified to the fixed-point quantization range of the output of the CONCAT layer.
According to one embodiment of the present invention, the third step further includes: if some previous-stage input of a CONCAT layer is itself another CONCAT layer, said other CONCAT layer is taken as another sub-network and the third step is performed on it, iteratively.
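A minimal sketch of this third step; the Layer structure and field names are assumptions for this example, and the final lines reproduce the situation of Fig. 10 described below:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Layer:
    name: str
    out_fl: int                       # fl of this layer's output
    is_concat: bool = False
    inputs: List["Layer"] = field(default_factory=list)

def unify_concat_fl(concat: Layer) -> None:
    """Force every input of a CONCAT layer to adopt the fl of the
    CONCAT output; iterate when an input is itself a CONCAT layer."""
    for src in concat.inputs:                  # previous-stage layers
        if src.out_fl != concat.out_fl:        # compare the two ranges
            src.out_fl = concat.out_fl         # modify the input's range
        if src.is_concat:                      # nested CONCAT: recurse
            unify_concat_fl(src)

conv3 = Layer("CONV3", out_fl=5)
conv4 = Layer("CONV4", out_fl=3)
concat = Layer("CONCAT", out_fl=4, is_concat=True, inputs=[conv3, conv4])
unify_concat_fl(concat)            # now conv3.out_fl == conv4.out_fl == 4
```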
As shown in the left branch of Fig. 9, according to one embodiment of the present invention, the method further includes a weight fixed-point quantization step: the weights of each of the CONV layers, FC layers, and CONCAT layers are quantized from floating-point into fixed-point numbers.
In addition, the weight quantization flow of the left branch of Fig. 9 and the data quantization flow of the right branch may be performed simultaneously, or may be performed alternately.
For example, before the data fixed-point quantization step is performed, the weight fixed-point quantization step is first completed for all CONV layers, FC layers, and CONCAT layers.
Alternatively, the weight quantization step and the data quantization step may alternate. Following the order of input data processing, after the weight quantization step completes the fixed-point quantization of the current layer among the convolutional layers (CONV layers), fully connected layers (FC layers), and CONCAT layers, and before the fixed-point quantization of the next layer starts, the data quantization step is performed on the output of said current layer.
According to one embodiment of the present invention, a fourth step is additionally included: after the third step, outputting the fixed-point quantization range of the output of each of the CONV layers, FC layers, and CONCAT layers.
Fig. 10 shows an example, based on an embodiment of the present invention, of adjusting the fixed-point quantization configuration of the previous-stage inputs according to the CONCAT layer.
In the example of Fig. 10, the CONCAT layer has two inputs, the convolutional layers CONV3 and CONV4, respectively; the input of the CONV3 layer is CONV2, and the input of the CONV2 layer is CONV1.
Following the flow of Fig. 9, the first step includes: inputting data into the neural network, obtaining the output data of each layer, and determining the numerical range of each layer's output. For example, Fig. 10 shows the numerical range of the output of each CONV layer and of the CONCAT layer, e.g., following a Gaussian distribution.
In the second step, the numerical range of each layer's output is quantized from floating-point into fixed-point numbers. For example, referring to Expression 4, assume that floating-point numbers are quantized into 8-bit fixed-point numbers, i.e., bw = 8. For example, the output of the CONV3 layer is fixed-point quantized with f_l = 5, the output of the CONV4 layer with f_l = 3, and the output of the CONCAT layer with f_l = 4.
In the third step, based on the fixed-point configuration information of the output of the CONCAT layer, the fixed-point quantization range of each input of the CONCAT layer, CONV3 and CONV4, is modified.
For example, the fixed-point quantization range of the output of CONV3 (f_l = 5) is compared with the fixed-point quantization range of the output of the CONCAT layer (f_l = 4). The two differ, so the fixed-point quantization range of the output of CONV3 is modified to the fixed-point quantization range of the output of the CONCAT layer. Accordingly, the fixed-point quantization range of the output of CONV3 is modified to f_l = 4.
Next, the fixed-point quantization range of the output of CONV4 (f_l = 3) is compared with the fixed-point quantization range of the output of the CONCAT layer (f_l = 4). The two differ, so the fixed-point quantization range of the output of CONV4 is modified to the fixed-point quantization range of the output of the CONCAT layer. Accordingly, the fixed-point quantization range of the output of CONV4 is modified to f_l = 4.
If the CONCAT layer has other inputs, they are modified in a similar manner.
In addition, if some input of a CONCAT layer CONCAT1 is itself another CONCAT layer CONCAT2, an iterative operation is performed. First, CONCAT2 is regarded as a previous-stage input and is modified according to the fixed-point configuration of the output of CONCAT1; then, taking the modified CONCAT2 as the output in turn, the fixed-point configuration of each previous-stage input of CONCAT2 is modified.
Moreover, it should be understood that the solution of the present invention applies to complex artificial neural networks of various forms and is by no means limited to branched artificial neural networks with CONCAT cascade operations. In addition, the CONCAT operation should also be understood in a broad sense, i.e., as an operation that combines different sub-networks (or network branches) into one network.
In addition, "multiple" in the description and claims of the present invention means two or more.
It should be noted that the embodiments in this specification are described in a progressive manner; the emphasis of each embodiment is on its differences from the other embodiments, and for the identical or similar parts among the embodiments, reference may be made to one another.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may also be implemented in other ways. The apparatus embodiments described above are merely schematic. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions, and operations of the apparatuses, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, and said module, program segment, or part of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified function or action, or by a combination of dedicated hardware and computer instructions.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention. It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined and explained in subsequent drawings.
The foregoing are merely specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any person familiar with the technical field can readily conceive of changes or substitutions within the technical scope disclosed by the present invention, and these shall all be covered by the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be defined by the scope of the claims.