CN109389212A - Reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks - Google Patents
Reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks
Download PDF
Info
- Publication number
- CN109389212A
- Authority
- CN
- China
- Prior art keywords
- activation
- quantization
- reconfigurable
- pooling
- unit
- Prior art date
- 2018-12-30
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Abstract
The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks of the invention comprises: several reconfigurable activation-quantization-pooling processing units, for executing the activation, quantization, and pooling operations, reconfigurable to operate in either an activation-quantization mode or an activation-quantization-pooling mode; a memory controller, for controlling the data transfer between the reconfigurable activation-quantization-pooling units and the memory under different configurations; and a memory, for buffering the convolutional-layer result data required by the pooling operation. The software optimization reduces the activation, quantization, and related steps of a low-bit-width convolutional neural network to a single step, removing redundant computation without changing the original function. Advantages: mapping the three steps of activation, quantization, and pooling onto the same hardware unit in a reconfigurable manner reduces hardware area; through hardware-software co-optimization, the system achieves small area, low power consumption, and high flexibility.
Description
Technical field
The invention belongs to the field of hardware acceleration of artificial-intelligence algorithms, and in particular relates to a reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks.
Background art
Low-bit-width convolutional neural networks generally denote convolutional neural networks quantized to 4 bits or fewer. Unlike conventional convolutional neural networks, their weights and image input data can be represented with only a few bits; examples include binarized networks, ternary networks, and other low-bit-width quantized neural networks. The weights and image input data of a binarized network are represented with only 0 or 1; the weights of a ternary network are represented with only 0 or 1, while its image input data is represented as -1, 0, or 1. Many other low-bit-width quantized neural networks use a fixed bit combination to express a particular number, e.g., the 2-bit pattern "01" expressing the value 0.5.
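As a minimal illustration of such a bit-combination encoding, the Python sketch below maps 2-bit codes to values. The patent gives only the single example of "01" encoding 0.5, so the uniform four-level table here is an assumption, chosen to match the k = 2 quantizer used later in the description.

```python
# Hypothetical 2-bit code-to-value table for a low-bit-width network.
# The uniform levels {0, 1/3, 2/3, 1} are an illustrative assumption;
# the patent's own example ("01" encoding 0.5) belongs to a different scheme.
CODE_TO_VALUE = {
    0b00: 0.0,
    0b01: 1.0 / 3.0,
    0b10: 2.0 / 3.0,
    0b11: 1.0,
}

def decode_2bit(code: int) -> float:
    """Return the real value represented by a 2-bit code."""
    return CODE_TO_VALUE[code & 0b11]

assert decode_2bit(0b10) == 2.0 / 3.0
```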
In a low-bit-width convolutional neural network, besides the convolutional layers, activation layers, and pooling layers found in a conventional network, a dedicated quantization operation is usually designed to re-quantize the generated image output data back to the originally chosen bit width.
In recent years, hardware designs for such low-bit-width convolutional neural networks have become increasingly common. A convolutional layer is usually processed by executing the following operations in sequence: convolution, batch normalization, activation, quantization, and pooling (some convolutional layers have no pooling operation); a fully-connected layer is usually processed by executing the fully-connected computation, batch normalization, activation, and quantization in sequence. Such serial operation, however, reduces processing efficiency and incurs additional hardware overhead, and thus cannot well meet the needs of practical applications.
Summary of the invention
The object of the invention is to overcome the above deficiencies of the prior art by providing a reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks that effectively improves the flexibility of the activation, quantization, and pooling operations while reducing power consumption and hardware overhead. It is realized by the following technical scheme:
The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks receives convolutional-layer result data and comprises:
Several reconfigurable activation-quantization-pooling processing units, for executing the activation, quantization, and pooling operations, reconfigurable to operate in either an activation-quantization mode or an activation-quantization-pooling mode;
A memory controller, for controlling the data transfer between the reconfigurable activation-quantization-pooling units and the memory under different configurations;
A memory, for buffering the convolutional-layer result data required by the pooling operation.
A further design of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks is that, in the activation-quantization mode, data is transferred from the convolution processing units to the reconfigurable activation-quantization-pooling processing units, which output the result data directly after processing; in the activation-quantization-pooling mode, the convolutional-layer result data is received and first stored in the memory, then, under the control of the memory controller, passed to the reconfigurable activation-quantization-pooling processing units for processing, and the processing result is stored back to the memory.
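A minimal Python sketch of this routing, under illustrative assumptions: the `Mode` enum, the `route` function, and the read/write/process interfaces of the memory and the fused unit are not from the patent; they only mirror the two data flows described above.

```python
from enum import Enum

class Mode(Enum):
    ACT_QUANT = 0        # activation-quantization mode
    ACT_QUANT_POOL = 1   # activation-quantization-pooling mode

class _Mem:
    # Stand-in for the memory and its controller (interface assumed).
    def __init__(self):
        self.buf = None
    def write(self, d):
        self.buf = d
    def read(self):
        return self.buf

class _Unit:
    # Stand-in for a fused activation-quantization(-pooling) unit.
    def process(self, d):
        return d

def route(mode: Mode, conv_result, memory: _Mem, unit: _Unit):
    if mode is Mode.ACT_QUANT:
        # Results stream straight through the unit and out.
        return unit.process(conv_result)
    # Pooling mode: stage results in memory first, process them under
    # memory-controller control, and store the result back.
    memory.write(conv_result)
    result = unit.process(memory.read())
    memory.write(result)
    return result

print(route(Mode.ACT_QUANT_POOL, [0.9, 0.7, 0.3, 0.1], _Mem(), _Unit()))
```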
A further design of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks is that the activation function in the reconfigurable activation-quantization-pooling processing unit is given by formula (1):
x_o = min(|x_i|, 1)    (1)
where x_i denotes the data after convolution and x_o denotes the activation value.
The quantization function in the reconfigurable activation-quantization-pooling processing unit is given by formula (2):
x_o = round((2^k - 1) · x_i) / (2^k - 1)    (2)
where k denotes the bit width after quantization, x_i here denotes the activation value, and x_o denotes the quantized value.
The corresponding pooling kernel size is 2×2, as in formula (3):
x_o(i, j) = max(x(2i, 2j), x(2i, 2j+1), x(2i+1, 2j), x(2i+1, 2j+1))    (3)
where i and j denote the coordinate position in the single-channel input image, x here denotes the quantized values, and x_o denotes the pooled value.
A further design of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks is that the workflow of the system includes the following steps:
First, the operating mode is determined. If the operating mode is activation-quantization, the activation functions and quantization methods of different low-bit convolutional neural networks are analyzed to determine their characteristics and parameters; the overlapping, redundant part of the output ranges of the activation function and the quantization method is then identified and simplified away.
If the operating mode is activation-quantization-pooling, the pooling kernel size is analyzed on the basis of the activation-quantization algorithm optimization; after optimization, the pooling operation is merged into the activation-quantization operation to form a new fused activation-quantization-pooling operation.
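One observation behind this merging step (consistent with the description, though not stated there in these terms): because the quantization function is monotone non-decreasing, max pooling commutes with it, so the fused unit only has to quantize the largest activation of each region. A quick Python check:

```python
import numpy as np

def activate(x):
    return np.minimum(np.abs(x), 1.0)

def quantize(x, k=2):
    return np.round((2**k - 1) * x) / (2**k - 1)

# max(quantize(a)) == quantize(max(a)) for activated values a, since
# quantization is monotone; this is what lets the pooling step fold into
# the activation-quantization step.
rng = np.random.default_rng(1)
for _ in range(1000):
    region = activate(rng.uniform(-2.0, 2.0, 4))   # one activated 2x2 region
    assert quantize(region.max()) == quantize(region).max()
```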
A further design of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks is that the memory supports ping-pong operation: one part of the memory stores the incoming data of the convolutional layer, while the other part stores the data needed by the reconfigurable activation-quantization-pooling processing units.
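A minimal Python sketch of such a ping-pong buffer; the bank granularity and the write/read/swap interface are illustrative assumptions.

```python
class PingPongBuffer:
    """While one bank receives incoming convolutional-layer data, the other
    feeds the fused processing units; the roles swap each round, so execution
    is never interrupted."""

    def __init__(self):
        self.banks = [[], []]
        self.write_bank = 0              # bank currently receiving conv results

    def write(self, data):
        self.banks[self.write_bank].append(data)

    def read_other(self):
        # The processing units drain the opposite bank.
        return self.banks[1 - self.write_bank]

    def swap(self):
        # At the end of a round, clear the drained bank and exchange roles.
        self.banks[1 - self.write_bank].clear()
        self.write_bank = 1 - self.write_bank

buf = PingPongBuffer()
buf.write("conv tile 0")                 # bank 0 fills
buf.swap()
buf.write("conv tile 1")                 # bank 1 fills while bank 0 drains
assert buf.read_other() == ["conv tile 0"]
```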
A further design of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks is that the reconfigurable activation-quantization-pooling processing unit comprises three stage units, namely a first stage unit, a second stage unit, and a third stage unit. Each stage unit includes a comparator, gates, and a register. The two comparator inputs of the first stage unit are the external image input data and threshold 3; the two comparator inputs of the second stage unit are the data output by the first stage unit and threshold 2; the two comparator inputs of the third stage unit are the data output by the second stage unit and threshold 1.
The advantages of the invention are as follows:
The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks of the invention targets the characteristics of low-bit-width convolutional neural networks and realizes hardware-software co-optimization of activation, quantization, and pooling for multiple network types. The design method offers high flexibility, low computational complexity, small area, and low power consumption.
Brief description of the drawings
Fig. 1 is a block diagram of the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks.
Fig. 2 is a schematic diagram of the reconfigurable activation-quantization-pooling processing unit.
Fig. 3 is a schematic diagram of the configuration of the reconfigurable activation-quantization-pooling processing unit.
Fig. 4 is a schematic diagram of unit operation in the activation-quantization-pooling mode.
Fig. 5 is a schematic diagram of unit operation in the activation-quantization mode.
Specific embodiments
The scheme of the present invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks of this embodiment comprises multiple reconfigurable activation-quantization-pooling processing units, a memory controller, and a memory. The reconfigurable activation-quantization-pooling units execute the activation, quantization, and pooling operations; the memory controller controls the data transfer between the reconfigurable activation-quantization-pooling units and the memory under different configurations; the memory buffers the convolutional-layer result data required by the pooling operation.
In Fig. 1, the dashed line indicates the data flow in the activation-quantization mode: data is transferred from the convolution processing units to the reconfigurable activation-quantization-pooling processing units, and the result data is output directly after processing. The solid line indicates the data flow in the activation-quantization-pooling mode: the convolutional-layer result data transferred to the module is first stored in the memory and, under the control of the memory controller, passed to the reconfigurable activation-quantization-pooling processing units; the processing result is stored back to the memory. The memory supports ping-pong operation, to guarantee that execution is never interrupted.
The specific use of the design method is illustrated below with a concrete low-bit-width convolutional neural network. Its image input data bit width is 2 bits and its weight bit width is 1 bit; its activation function and quantization function are, respectively:
x_o = min(|x_i|, 1)    (4)
x_o = round((2^k - 1) · x_i) / (2^k - 1)    (5)
and its pooling kernel size is 2×2:
x_o(i, j) = max(x(2i, 2j), x(2i, 2j+1), x(2i+1, 2j), x(2i+1, 2j+1))    (6)
The system first optimizes the software algorithm according to the parameter k = 2 of this specific low-bit-width convolutional neural network. Analysis shows that the output range of the activation function is [0, 1], and that the input range of the quantization function is the output range of the activation function. The thresholds of the quantization function are 1/6, 1/2, and 5/6; after quantization, the output falls on one of the four values 0, 1/3, 2/3, and 1. The activation and quantization functions can therefore be replaced by the following series of comparisons: if the input is greater than 5/6, the quantized value is 1; if the input is greater than 1/2 and less than or equal to 5/6, the quantized value is 2/3; if the input is greater than 1/6 and less than or equal to 1/2, the quantized value is 1/3; and if the input is less than or equal to 1/6, the quantized value is 0.
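These comparisons can be checked against the rounding-based quantizer in a few lines of Python. This is a sketch; behavior exactly at the threshold values depends on the tie-breaking convention of the rounding, so the check skips inputs landing on a threshold.

```python
import numpy as np

def act_quant_reference(x, k=2):
    # Reference: activation min(|x|, 1) followed by rounding to 2^k levels.
    a = np.minimum(np.abs(x), 1.0)
    return np.round((2**k - 1) * a) / (2**k - 1)

def act_quant_fused(x):
    # Fused activation-quantization for k = 2, rewritten as the series of
    # comparisons given above (thresholds 1/6, 1/2, 5/6).
    a = min(abs(x), 1.0)
    if a > 5/6:
        return 1.0
    if a > 1/2:
        return 2/3
    if a > 1/6:
        return 1/3
    return 0.0

rng = np.random.default_rng(0)
for v in rng.uniform(-2.0, 2.0, 10000):
    a = min(abs(v), 1.0)
    if min(abs(a - t) for t in (1/6, 1/2, 5/6)) < 1e-9:
        continue   # ties at the thresholds depend on the rounding convention
    assert np.isclose(act_quant_fused(v), act_quant_reference(v))
```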
The system then optimizes the hardware. Because the bit width of the image input data of this network is 2 bits, the number of stages of the multi-stage pipelined architecture is 3, as shown in Fig. 2. Each stage unit contains one comparator, several gates, and a register. The two comparator inputs of stage unit 1 are the external image input data and threshold 3; the two comparator inputs of stage unit 2 are the data output by stage unit 1 and threshold 2; the two comparator inputs of stage unit 3 are the data output by stage unit 2 and threshold 1. The configuration word in the figure sets the operating mode of the unit: when the configuration word is 1, the unit operates in the activation-quantization-pooling mode; when the configuration word is 0, the unit operates in the activation-quantization mode. The specific configuration values are shown in Fig. 3.
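A behavioral (not cycle-accurate) Python model of this unit may make the two modes concrete. The gate and enable timing of Fig. 4 is abstracted away; the thresholds are assumed to be the k = 2 values 5/6, 1/2, 1/6 standing in for thresholds 3, 2, 1, and the thermometer decoding into quantization levels is an illustrative assumption.

```python
from dataclasses import dataclass
from typing import List

THRESHOLDS = (5/6, 1/2, 1/6)    # stage 1, stage 2, stage 3 ("thresholds 3, 2, 1")
LEVELS = (1.0, 2/3, 1/3, 0.0)   # quantized value selected by the first hit

def stage_quantize(x: float) -> float:
    """Pass x through the three comparator stages; the first threshold it
    exceeds determines the quantization level (thermometer decoding)."""
    a = min(abs(x), 1.0)        # activation folded into the comparison chain
    for level, th in zip(LEVELS, THRESHOLDS):
        if a > th:
            return level
    return LEVELS[-1]

@dataclass
class FusedUnit:
    config_word: int = 0        # 0: activation-quantization; 1: with pooling

    def process(self, pixels: List[float]) -> List[float]:
        if self.config_word == 0:
            # Activation-quantization mode: every input yields an output.
            return [stage_quantize(p) for p in pixels]
        # Pooling mode: only the largest quantized value of the 2x2 region
        # survives; in hardware, stages holding a higher comparison result
        # gate off later (smaller) pixels, the shutdown behavior of Fig. 4.
        return [max(stage_quantize(p) for p in pixels)]

# Usage mirroring the Fig. 4 walkthrough: a > th3 > b > th2 > c > th1 > d.
a, b, c, d = 0.9, 0.7, 0.3, 0.1
unit = FusedUnit(config_word=1)
print(unit.process([a, b, c, d]))    # [1.0]: quantization result of pixel a
unit.config_word = 0
print(unit.process([a, b, c, d]))    # [1.0, 0.666..., 0.333..., 0.0]
```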
The specific execution process of the unit in the activation-quantization-pooling mode is shown in Fig. 4. The comparator outputs 1 when its upper operand is greater than its lower operand, and 0 otherwise. Suppose (a, b, c, d) are the four pixel values of a 2×2 region of the image, and the relation between the three thresholds and these four values is: a > threshold 3 > b > threshold 2 > c > threshold 1 > d. At time step 1, pixel value a is compared with threshold 3; comparison 1 is set to 1 and latched by enable 1, and quantized value 4, the quantization result of a, is stored in register 1.
At time step 2, pixel b enters stage unit 1. Because enable 1 was set to 1 at the previous time step, comparison 1 remains 1 regardless of the relation between b and threshold 3, so the value in register 1 does not change. At the same time, quantized value 4 in register 1 is passed into stage unit 2 and compared with threshold 2; although it is greater than threshold 2, gate 2-2 is controlled by enable 1 from stage unit 1 and remains 0, so the number stored in register 2 is still the result of the previous stage unit's comparison with threshold 3. And so on: when the output enable signal goes high, the threshold-3 comparison result held in register 3 is output as the pooling result of this region. In Fig. 4, a gray background marks stage units that have been shut down.
The specific execution process of the unit in the activation-quantization mode is shown in Fig. 5. After the configuration word is set to 0, the comparison signal is no longer affected by the enable signal of the same stage unit at the previous time step, but is still controlled by the enable signal of the previous stage unit. Because in this mode the quantized value of every input must be output, there is no need, unlike in the activation-quantization-pooling mode, to shut down the processing of the smaller values within a region.
Whenever the input of some stage unit is greater than its threshold, the enable signal is set to 1 so as to shut down subsequent operations in both directions: vertically, the processing of the remaining image input data; horizontally, the operations of the remaining stage units on the current image input data.
The system is thus optimized from both the software and the hardware angle: based on a multi-stage pipelined architecture, supported by reconfigurable technology, and guided by stage-shutdown low-power techniques, it reduces the power consumption and area of the activation-quantization-pooling module while improving its flexibility.
The above are only preferred embodiments of the present invention and do not limit the invention in any other form. Any person skilled in the art may use the technical content disclosed above to make changes or modifications into equivalent embodiments. However, any simple modification, equivalent variation, or adaptation of the above embodiments made according to the technical essence of the present invention, without departing from the technical scheme of the present invention, still falls within the protection scope of the technical scheme of the present invention.
Claims (6)
1. A reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks, receiving convolutional-layer result data, characterized by comprising:
several reconfigurable activation-quantization-pooling processing units, for executing the activation, quantization, and pooling operations, reconfigurable to operate in either an activation-quantization mode or an activation-quantization-pooling mode;
a memory controller, for controlling the data transfer between the reconfigurable activation-quantization-pooling units and the memory under different configurations;
a memory, for buffering the convolutional-layer result data required by the pooling operation.
2. The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks according to claim 1, characterized in that: in the activation-quantization mode, data is transferred from the convolution processing units to the reconfigurable activation-quantization-pooling processing units, which output the result data directly after processing; in the activation-quantization-pooling mode, the convolutional-layer result data is received and first stored in the memory, then, under the control of the memory controller, passed to the reconfigurable activation-quantization-pooling processing units for processing, and the processing result is stored back to the memory.
3. The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks according to claim 1, characterized in that: the activation function in the reconfigurable activation-quantization-pooling processing unit is given by formula (1):
x_o = min(|x_i|, 1),    (1)
where x_i denotes the data after convolution and x_o denotes the activation value;
the quantization function in the reconfigurable activation-quantization-pooling processing unit is given by formula (2):
x_o = round((2^k - 1) · x_i) / (2^k - 1),    (2)
where k denotes the bit width after quantization, x_i here denotes the activation value, and x_o denotes the quantized value; and the corresponding pooling kernel size is 2×2, as in formula (3):
x_o(i, j) = max(x(2i, 2j), x(2i, 2j+1), x(2i+1, 2j), x(2i+1, 2j+1)),    (3)
where i and j denote the coordinate position in the single-channel input image, x here denotes the quantized values, and x_o denotes the pooled value.
4. The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks according to claim 1, characterized in that: the workflow of the system includes the following steps: first, the operating mode is determined; if the operating mode is activation-quantization, the activation functions and quantization methods of different low-bit convolutional neural networks are analyzed to determine their characteristics and parameters, and the overlapping, redundant part of the output ranges of the activation function and the quantization method is then identified and simplified away; if the operating mode is activation-quantization-pooling, the pooling kernel size is analyzed on the basis of the activation-quantization algorithm optimization, and after optimization the pooling operation is merged into the activation-quantization operation to form a new activation-quantization-pooling operation.
5. The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks according to claim 4, characterized in that: the memory supports ping-pong operation, one part of the memory storing the incoming data of the convolutional layer while the other part stores the data needed by the reconfigurable activation-quantization-pooling processing units.
6. The reconfigurable activation-quantization-pooling system for low-bit-width convolutional neural networks according to claim 1, characterized in that: the reconfigurable activation-quantization-pooling processing unit comprises three stage units, namely a first stage unit, a second stage unit, and a third stage unit, each stage unit including a comparator, gates, and a register; the two comparator inputs of the first stage unit are the external image input data and threshold 3; the two comparator inputs of the second stage unit are the data output by the first stage unit and threshold 2; the two comparator inputs of the third stage unit are the data output by the second stage unit and threshold 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811646433.9A CN109389212B (en) | 2018-12-30 | 2018-12-30 | Reconfigurable activation quantization pooling system for low-bit-width convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811646433.9A CN109389212B (en) | 2018-12-30 | 2018-12-30 | Reconfigurable activation quantization pooling system for low-bit-width convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109389212A true CN109389212A (en) | 2019-02-26 |
CN109389212B CN109389212B (en) | 2022-03-25 |
Family
ID=65430886
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811646433.9A Active CN109389212B (en) | 2018-12-30 | 2018-12-30 | Reconfigurable activation quantization pooling system for low-bit-width convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109389212B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017124645A1 (en) * | 2016-01-20 | 2017-07-27 | 北京中科寒武纪科技有限公司 | Apparatus for processing floating point number |
CN108364061A (en) * | 2018-02-13 | 2018-08-03 | 北京旷视科技有限公司 | Arithmetic device, arithmetic execution apparatus, and arithmetic execution method |
CN108510067A (en) * | 2018-04-11 | 2018-09-07 | 西安电子科技大学 | Convolutional neural network quantization method based on engineering implementation |
CN108647779A (en) * | 2018-04-11 | 2018-10-12 | 复旦大学 | Reconfigurable computing unit for low-bit-width convolutional neural networks |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112189205A (en) * | 2019-02-27 | 2021-01-05 | 华为技术有限公司 | Neural network model processing method and device |
WO2020172829A1 (en) * | 2019-02-27 | 2020-09-03 | 华为技术有限公司 | Method and apparatus for processing neural network model |
CN111767204A (en) * | 2019-04-02 | 2020-10-13 | 杭州海康威视数字技术股份有限公司 | Overflow risk detection method, device and equipment |
CN111767204B (en) * | 2019-04-02 | 2024-05-28 | 杭州海康威视数字技术股份有限公司 | Overflow risk detection method, device and equipment |
CN110222815B (en) * | 2019-04-26 | 2021-09-07 | 上海酷芯微电子有限公司 | Configurable activation function device and method suitable for deep learning hardware accelerator |
CN110222815A (en) * | 2019-04-26 | 2019-09-10 | 上海酷芯微电子有限公司 | Configurable activation function device and method suitable for deep learning hardware accelerators |
CN110390385B (en) * | 2019-06-28 | 2021-09-28 | 东南大学 | BNRP-based configurable parallel general convolutional neural network accelerator |
CN110390385A (en) * | 2019-06-28 | 2019-10-29 | 东南大学 | Configurable parallel general-purpose convolutional neural network accelerator based on BNRP |
CN110718211A (en) * | 2019-09-26 | 2020-01-21 | 东南大学 | Keyword recognition system based on hybrid compressed convolutional neural network |
CN113762496A (en) * | 2020-06-04 | 2021-12-07 | 合肥君正科技有限公司 | Method for reducing inference operation complexity of low-bit convolutional neural network |
CN113762496B (en) * | 2020-06-04 | 2024-05-03 | 合肥君正科技有限公司 | Method for reducing inference operation complexity of low-bit convolutional neural networks |
WO2023004800A1 (en) * | 2021-07-30 | 2023-02-02 | 华为技术有限公司 | Neural network post-processing method and apparatus, chip, electronic device, and storage medium |
CN114169513A (en) * | 2022-02-11 | 2022-03-11 | 深圳比特微电子科技有限公司 | Neural network quantization method and device, storage medium and electronic equipment |
CN114169513B (en) * | 2022-02-11 | 2022-05-24 | 深圳比特微电子科技有限公司 | Neural network quantization method and device, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109389212B (en) | 2022-03-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||