CN111832719A - Fixed point quantization convolution neural network accelerator calculation circuit - Google Patents

Fixed point quantization convolution neural network accelerator calculation circuit

Info

Publication number
CN111832719A
CN111832719A (application number CN202010736970.3A)
Authority
CN
China
Prior art keywords
input
adder
quantization
neural network
multiplier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010736970.3A
Other languages
Chinese (zh)
Inventor
贺雅娟
周航
蔡卢麟
朱飞宇
候博文
张波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202010736970.3A
Publication of CN111832719A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Complex Calculations (AREA)

Abstract

A fixed-point quantization convolutional neural network accelerator calculation circuit belongs to the technical field of integrated circuits. N input channel processing units process the input data of N input channels: in each channel, the input feature maps are multiplied by the corresponding weights and the products are rapidly added to obtain the convolution result of that channel. The partial accumulation adding unit accumulates the convolution results of the N input channels and outputs the sum to the quantization activation unit. The quantization activation unit sequentially performs bias accumulation, multiplication by the approximate multiplier value, arithmetic right shift, function activation, addition of the zero-point data and limiting of the output bit width to obtain the output of the convolutional neural network accelerator calculation circuit. The circuit accelerates the computation of a fixed-point quantized convolutional neural network without noticeable loss of accuracy, has low power consumption and small circuit area, and is suitable for convolutional neural network systems requiring integer quantization.

Description

Fixed point quantization convolution neural network accelerator calculation circuit
Technical Field
The invention belongs to the field of integrated circuits, and relates to a computing circuit of a fixed-point quantization convolutional neural network accelerator.
Background
Convolutional Neural Networks (CNNs) have enjoyed great success in the field of image recognition owing to their excellent predictive performance. Nevertheless, modern CNNs with high inference accuracy typically have a large model size and high computational complexity, which makes deployment on data centers or edge devices difficult, especially in application scenarios requiring low resource consumption or low response delay. To facilitate the application of complex CNNs, the emerging field of model compression focuses on reducing the model size and execution time of CNNs with minimal loss of accuracy.
Network parameters can be quantized down to 1 bit: XNOR-Net and related network variants achieve a 32× compression of the network parameters, greatly reducing model capacity as well as memory and bandwidth overhead. However, the marked drop in inference accuracy makes it difficult for current binary networks to satisfy practical applications. Ternary networks use 2 bits to represent model parameters, extending the set of quantized values to {-1, 0, +1}; this helps preserve inference accuracy and gives them better application potential than binary networks. Compared with binary and ternary networks, INT8 quantization has a much wider parameter space and therefore retains inference accuracy better, especially for complex networks. For this reason, INT8 quantization has been widely adopted in industry, for example in the TensorFlow-Lite and TensorRT platforms.
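For readers unfamiliar with the INT8 schemes mentioned above, the sketch below illustrates the affine mapping they are built on, in which a real value r is represented by an 8-bit integer q via r ≈ S·(q − Z) with a real scale S and an integer zero point Z. This is an illustrative Python model only; the scale and zero-point values are made-up examples, not parameters of the patented circuit.

```python
# Illustrative sketch of TensorFlow-Lite-style INT8 affine quantization:
# a real value r is represented as r ~= S * (q - Z), with q an int8 code,
# S a positive real scale and Z an integer zero point.
# The scale/zero-point values below are arbitrary examples.

def quantize(r, scale, zero_point):
    """Map a real value to an int8 code, clamped to [-128, 127]."""
    q = round(r / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    """Recover the approximate real value from its int8 code."""
    return scale * (q - zero_point)

scale, zero_point = 0.05, 10          # example parameters
for r in (-3.2, 0.0, 1.7):
    q = quantize(r, scale, zero_point)
    print(r, q, dequantize(q, scale, zero_point))
```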
Although fixed-point quantization reduces the complexity of the convolution computation, deep neural networks still require a very large number of operations. By exploiting the error tolerance of convolutional neural networks, approximate computing can be introduced at the hardware level to further lower power consumption, so that the low-power requirements of embedded systems can be met. In a convolutional neural network, the convolutional layers account for more than ninety percent of the total computation. Introducing approximation into the multiply-accumulate operations, for example by designing the multipliers as approximate multipliers, reduces the power consumption of the calculation circuit without introducing significant error into the prediction system.
Disclosure of Invention
Aiming at the power consumption caused by the large number of operations in the convolutional layers of a fixed-point quantized convolutional neural network, the invention provides a fixed-point quantization convolutional neural network accelerator calculation circuit that carries out these convolutional-layer operations, accelerates the computation of the fixed-point quantized network without noticeable loss of accuracy, and has the characteristics of low power consumption and small circuit area.
The technical scheme adopted by the invention is as follows:
a fixed point quantization convolution neural network accelerator calculation circuit comprises N input channel processing units, a partial accumulation adding unit and a quantization activation unit, wherein N is a positive integer;
the input channel processing unit is used for processing the input data of one input channel, the input data of the input channel comprising a plurality of input feature maps and a plurality of weights; the input channel processing unit comprises an approximate multiplier array and a fast addition unit, the approximate multiplier array being used for multiplying the plurality of input feature maps by the corresponding plurality of weights in the input data of one input channel; the fast addition unit is used for compressing and then adding the partial product results output by the approximate multiplier array to obtain the convolution result of the input channel;
the partial accumulation adding unit is used for accumulating all convolution results of the N input channels output by the N input channel processing units respectively and outputting the result to the quantization activation unit;
the quantization activation unit includes:
a first adder, the first input end of which is connected with the convolution accumulation results of the N input channels output by the partial product accumulation unit, and the second input end of which is connected with the offset data;
a multiplier, a first multiplier input end of which is connected with the output end of the first adder, and a second multiplier input end of which is connected with an approximate multiplier;
the input end of the arithmetic right shift unit is connected with the output end of the multiplier;
the input end of the function activation unit is connected with the output end of the arithmetic right shift unit;
a second adder, a first input end of which is connected with the output end of the function activation unit, and a second input end of which is connected with zero data;
and the input end of the amplitude limiting unit is connected with the output end of the second adder and is used for limiting the output bit number of the second adder to a specified bit number, and the output end of the amplitude limiting unit is used as the output end of the convolutional neural network accelerator calculation circuit.
Specifically, the approximate multiplier array includes a plurality of approximate multipliers. The number of approximate multipliers in the array of each input channel processing unit, and the number of input feature maps and weights in the input data of each input channel, are determined by the convolution kernel of the convolutional neural network: for an M × M convolution kernel, M being a positive integer, each input channel processing unit is provided with M × M approximate multipliers forming the approximate multiplier array, and with M × M input feature map inputs and M × M weight inputs for receiving the input data of one input channel.
Specifically, the approximate multiplier generates its partial products using Booth encoding. Sign-bit extension is applied to the partial products obtained after Booth encoding to form a partial product array, which is divided into three parts according to bit weight: the lowest-weight part is approximately compressed with OR gates; the highest-weight part is exactly compressed with 3-2 and 4-2 compressors; in the remaining part, the sign bit is ORed with the corresponding partial product and the result is approximately compressed with a 4-2 compressor. The compressed partial products are added by a third adder to obtain the partial product result output by the approximate multiplier array.
Specifically, the fast addition unit includes a Wallace tree and a fourth adder. The Wallace tree performs sign-bit extension on the partial product results output by the approximate multiplier array and then compresses them in three stages using 4-2 and 3-2 compressors; the fourth adder adds the partial products remaining after the three compression stages to obtain the convolution result of the input channel.
Specifically, the partial product accumulation unit includes a fifth adder, a data selector and an intermediate result register,
the first input end of the fifth adder is connected with convolution results of N input channels output by the N input channel processing units respectively, the second input end of the fifth adder is connected with the output end of the intermediate result register, and the output end of the fifth adder is connected with the input end of the data selector;
the data selector outputs the output data of the fifth adder to the intermediate result register while the fifth adder has not yet completed accumulating the convolution results of the N input channels, and outputs the output data of the fifth adder to the quantization activation unit after the fifth adder has completed accumulating the convolution results of the N input channels.
Specifically, the convolutional neural network adopts INT8 data-type quantization, and the quantization scheme follows the TensorFlow quantization specification.
Specifically, the approximate multiplier value applied to the second multiplier input of the multiplier, the right-shift amount of the arithmetic right shift unit, and the zero-point data applied to the second input of the second adder are obtained from the TensorFlow quantization algorithm; the function activation unit uses ReLU activation, with the activation expression:
$$f(x)=\begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases}$$
the invention has the beneficial effects that: according to the invention, by optimizing the calculation unit, the memory space occupied by the weight and the characteristic value is effectively reduced, so that more data can be transmitted under the same bandwidth, the throughput rate and the energy efficiency are improved, the calculation speed of the convolutional neural network with fixed point quantization is accelerated on the premise of not generating obvious precision loss, the characteristics of low power consumption and small circuit area are realized, and the method is suitable for the convolutional neural network system needing shaping quantization.
Drawings
Fig. 1 is an overall structural diagram of a computation circuit of a fixed-point quantization convolutional neural network accelerator according to an embodiment of the present invention.
Fig. 2 is a radix-4 Booth coding circuit used in an embodiment of an approximate multiplier in a fixed-point quantization convolutional neural network accelerator calculation circuit provided by the invention.
Fig. 3 is a partial product generated by an approximate multiplier in a computation circuit of a fixed-point quantization convolutional neural network accelerator according to an embodiment of the present invention after Booth encoding is performed.
Fig. 4 is a schematic diagram of a specific operation process of an approximate multiplier in a computation circuit of a fixed-point quantization convolutional neural network accelerator according to an embodiment of the present invention.
FIG. 5 is a block diagram of a 3-2 compressor and a 4-2 compressor used in a computation circuit of a convolutional neural network accelerator with fixed point quantization according to the present invention.
Fig. 6 is a detailed structural diagram of a Wallace tree employed in an embodiment of the computation circuit of the convolutional neural network accelerator with fixed point quantization according to the present invention.
Fig. 7 is a detailed structural diagram of a quantization activation unit in a computation circuit of a convolutional neural network accelerator with fixed point quantization according to the present invention.
Detailed Description
The technical solution of the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The present invention provides a convolutional neural network accelerator calculation circuit based on fixed-point quantization. It is described below taking a convolutional neural network with INT8 integer quantization and the TensorFlow quantization specification as an example, but the specific quantization type and quantization scheme should not be construed as limiting the invention. The convolutional neural network is quantized to the INT8 data type, and on this basis the accelerator calculation circuit provided by the invention is applied. After integer quantization, the memory space occupied by the weights is reduced to 1/4 of the original, so more data can be transmitted under the same bandwidth; moreover, the computation becomes fixed-point, which further improves throughput and energy efficiency. In this embodiment the inputs of the accelerator calculation circuit are the weights and feature maps as INT8 data, together with the shift amount and the bias.
As shown in Fig. 1, the fixed-point quantization convolutional neural network accelerator calculation circuit provided by the present invention includes N input channel processing units, a partial accumulation adding unit and a quantization activation unit, where N is a positive integer determined by the number of input channels and can be chosen according to factors such as available resources.
Each input channel processing unit processes the input data of one input channel, which comprises a number of input feature map values and a number of weights. Each input channel processing unit comprises an approximate multiplier array and a fast addition unit. The approximate multiplier array contains a plurality of approximate multipliers that multiply the feature map values by the corresponding weights of that channel. The number of approximate multipliers in each array, and the number of feature map values and weights in the input data of each channel, are determined by the convolution kernel of the convolutional neural network: for an M × M convolution kernel, M being a positive integer, each input channel processing unit is provided with M × M approximate multipliers forming the array, and with M × M feature map inputs and M × M weight inputs for receiving the input data of one channel. For example, for a 3 × 3 convolution kernel, the number of feature map inputs, weight inputs and approximate multipliers in each input channel processing unit is 9. Because of the INT8 integer quantization, each feature map input and each weight input is 8 bits wide; each approximate multiplier takes two 8-bit signed operands and outputs a 16-bit signed result. Each input channel processing unit therefore has nine 8-bit feature map inputs and nine 8-bit weight inputs, so the feature map and weight input data of each channel are 9 × 8 bits each and the output data are 9 × 16 bits; for other quantization schemes the bit widths change accordingly. This embodiment takes 3 × 3 convolution kernels and N parallel input channels as an example, which should not be construed as limiting the invention; the circuit can be extended to support several convolution kernel sizes, e.g. 3 × 3, 5 × 5, 7 × 7, 11 × 11 and so on, by increasing the number of feature map inputs, weight inputs and 8 × 8 approximate multipliers accordingly. In addition, this embodiment uses fixed-point two's-complement representation, and the range of the input feature values and weights is [-128, 127].
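The following Python sketch is a purely behavioral model of the dataflow just described: each input channel processing unit forms nine products for one 3 × 3 window and sums them, and the per-channel results are then accumulated over the N channels. Exact integer arithmetic stands in for the approximate multipliers and the Wallace-tree/adder hardware, and the data values are made-up examples.

```python
# Behavioral sketch (not the hardware) of the dataflow in Fig. 1:
# each input-channel processing unit multiplies nine 8-bit feature
# values by nine 8-bit weights and sums them; the partial accumulation
# adding unit then adds the per-channel results together.

def channel_unit(features, weights):
    """One input-channel processing unit for a 3x3 window (9 MACs)."""
    assert len(features) == len(weights) == 9
    return sum(f * w for f, w in zip(features, weights))   # 9 products, fast-added

def accumulate_channels(feature_windows, weight_windows):
    """Accumulate the convolution results over N input channels."""
    acc = 0                                   # models the 32-bit intermediate register
    for feats, wts in zip(feature_windows, weight_windows):
        acc += channel_unit(feats, wts)       # fifth adder accumulates channel results
    return acc                                # sent on to the quantization activation unit

# Example with N = 2 channels and made-up INT8 data in [-128, 127]
fmaps = [[1, -2, 3, 4, -5, 6, 7, -8, 9], [10, 11, -12, 13, 14, -15, 16, 17, -18]]
wts   = [[2, 2, 2, -1, -1, -1, 0, 0, 1],  [1, -1, 1, -1, 1, -1, 1, -1, 1]]
print(accumulate_channels(fmaps, wts))
```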
In this embodiment, the 8-bit × 8-bit approximate multiplier generates its partial products using radix-4 Booth encoding; it can be replaced by other kinds of approximate multipliers according to the required prediction accuracy and power consumption. The Booth encoding circuit that generates the partial products is shown in Fig. 2, where Bn and An denote the bits of the two M-bit (8-bit in this embodiment) signed binary operands; for example, B0 is bit 0 of the binary representation of B, and A7 is bit 7 of the binary representation of A. For convenience, the operands are written as A = a7 a6 … a0 and B = b7 b6 … b0. The expression for A × B is derived as follows:
$$A \times B = \sum_{i=0}^{3} \left(-2\,b_{2i+1} + b_{2i} + b_{2i-1}\right) \cdot A \cdot 4^{i}$$
where $b_{-1} = 0$. Each term in the sum is called a partial product. The multiplier B in the formula is scanned as a sequence of overlapping 3-bit groups, and each partial product is one of -2A, -A, -0, +0, +A and +2A.
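The decomposition above can be checked numerically. The sketch below (illustrative only, not the patent circuit) extracts the overlapping 3-bit groups of an 8-bit two's-complement multiplier B with b(-1) = 0, maps each group to a Booth digit in {-2, -1, 0, +1, +2}, and verifies exhaustively that the weighted sum of the resulting partial products reproduces A × B.

```python
# Illustrative check of the radix-4 Booth decomposition above:
# overlapping groups {b(2i+1), b(2i), b(2i-1)} of the multiplier B,
# with b(-1) = 0, each yield a digit d_i = -2*b(2i+1) + b(2i) + b(2i-1),
# and A * B = sum_i d_i * A * 4**i.

def booth_digits(B, width=8):
    """Radix-4 Booth digits of a two's-complement 'width'-bit multiplier B."""
    bits = [0] + [(B >> k) & 1 for k in range(width)]    # prepend b(-1) = 0
    digits = []
    for i in range(width // 2):
        b_lo, b_mid, b_hi = bits[2 * i], bits[2 * i + 1], bits[2 * i + 2]
        digits.append(-2 * b_hi + b_mid + b_lo)          # one of -2, -1, 0, +1, +2
    return digits

def booth_product(A, B, width=8):
    """Rebuild A*B from the Booth digits: each digit selects 0, +-A or +-2A."""
    return sum(d * A * 4 ** i for i, d in enumerate(booth_digits(B, width)))

# Exhaustive check over the signed 8-bit range used in the embodiment
assert all(booth_product(A, B) == A * B
           for A in range(-128, 128) for B in range(-128, 128))
```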
Each partial product is generated in the circuit from the Booth-encoded output signals, whose logical expressions are as follows:
Neg = $b_{2i+1}$
$X1 = b_{2i} \oplus b_{2i-1}$
$X2 = (b_{2i+1} \oplus b_{2i}) \wedge \overline{(b_{2i} \oplus b_{2i-1})}$
$Z = \overline{(b_{2i+1} \oplus b_{2i})} \wedge \overline{(b_{2i} \oplus b_{2i-1})}$
neg, X1, X2, and Z are encoded by 3 adjacent multipliers { b2i +1, b2i,2b2i-1 }. Neg is the sign offset bit. When the partial product result is-2A, -A and-0, the multiplicand needs to be added with 1, Neg has the effect of complementing the multiplicand, and the value of Neg is 1. When the partial product is 2A, A and 0, the complement of the multiplicand is equal to the original code, and the Neg bit is 0. The Z signal is to prevent the multiplicand from shifting left when the partial product is 0 and-0. Neg, X1, X2, Z produce a circuit as shown in the left circuit in FIG. 2.
$$PP_{ij} = \left[(a_j \wedge X1) \vee (a_{j-1} \wedge X2)\right] \oplus Neg$$
PPij is the partial product bit in row i and column j; it is formed by logically combining the multiplicand bits aj, aj-1 with the Booth-encoded signals derived from {b2i+1, b2i, b2i-1}, as shown in the right-hand circuit of Fig. 2.
The partial product array generated after sign-bit extension is shown in Fig. 3, and the approximate compression scheme in Fig. 4. The basic idea of compressing the partial products generated by radix-4 Booth encoding in the proposed approximate multiplier is to compress the low-order columns with approximate compressors and the high-order columns with exact compressors. Booth encoding produces 5 partial products, and the partial product array is divided into three parts: the first part (the high 8 bits) is compressed exactly, while the second and third parts (the low 8 bits) are compressed approximately. As shown in Fig. 4, the lowest-weight part, i.e. the third part, is compressed with multi-input single-output OR gates, directly producing the approximate compression result P0-P5; the highest-weight part, i.e. the first part, is compressed with 3-2 and 4-2 compressors; in the remaining second part, P30 is ORed with Neg3 and the result is compressed with a 4-2 compressor. After the first round of compression two rows of partial products remain, and a third adder finally produces the 16-bit multiplication result. In this embodiment, taking a 3 × 3 convolution kernel and assuming a single parallel input channel, the input data are nine 8-bit feature values and nine 8-bit weights; nine approximate multiplications are carried out simultaneously by nine identical approximate multipliers, giving nine 16-bit results. Because a convolutional neural network has a certain tolerance to noise and the convolution operation accumulates the results of many multiplications, the classification prediction is not significantly affected if the accumulated error is close to 0; even if the accumulated error is not 0, the loss of accuracy is acceptable because the convolutional neural network is itself fault-tolerant.
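As a rough illustration of where the approximation error comes from, the following simplified model OR-compresses the lowest bit columns of a set of aligned rows and adds the remaining columns exactly. It deliberately ignores the signed Booth partial-product array, the three-part split and the compressor structure of Figs. 3-4; the rows and the column split are made-up examples.

```python
# Simplified, unsigned model of the compression idea: the 'low_cols'
# lowest bit columns are reduced with one OR gate per column (so 1+1
# is approximated as 1 and carries out of these columns are dropped),
# while the higher columns are added exactly.

def approx_column_sum(rows, low_cols):
    """OR-compress the low columns, add the high columns exactly."""
    low = 0
    for col in range(low_cols):
        if any((r >> col) & 1 for r in rows):      # one OR gate per low column
            low |= 1 << col
    high = sum(r >> low_cols for r in rows) << low_cols   # exact compression
    return high + low

rows = [0b01101101, 0b00110110, 0b01011011]        # made-up partial-product rows
exact = sum(rows)
approx = approx_column_sum(rows, low_cols=4)
print(exact, approx, exact - approx)               # exact sum, approximate sum, error
```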
The fast addition unit compresses and then adds the partial product results output by the approximate multiplier array to obtain the convolution result of one input channel. As shown in Fig. 1, the fast addition unit in this embodiment comprises a Wallace tree and a fourth adder, where the fourth adder may be a ripple-carry adder; the Wallace tree and the fourth adder rapidly add the 16-bit results output by the nine approximate multipliers of the corresponding input channel, and the Wallace tree is built from a number of 3-2 and 4-2 compressors. As shown in Fig. 6, taking the compression of the results of a 3 × 3 convolution kernel as an example, the Wallace tree rapidly accumulates nine 16-bit operands. To prevent data overflow, all 16-bit operands are sign-extended to 20 bits, and the nine operands are then compressed in three stages using 4-2 and 3-2 compressors: the first stage uses 40 4-2 compressors; the second stage compresses the partial sums with 19 4-2 compressors and one 3-2 compressor; the third stage uses 20 3-2 compressors. The final addition stage uses the fourth adder to obtain the final 20-bit result. Although this embodiment takes the compression of nine 16-bit operands as an example, the number of operands to be compressed can be increased when the convolution kernel is enlarged, and this should not be construed as limiting the invention.
Fig. 5 shows the structure of the 3-2 compressor and the 4-2 compressor. The 3-2 compressor is simply a full adder, with the logical relation P1 + P2 + P3 = Sum + 2 × Cout. The 4-2 compressor is built from 3-2 compressors, and its logical relation is P1 + P2 + P3 + P4 + P5 = Sum + 2 × Carry + 2 × Cout, where P5 is the carry input. Both compressors are used in the approximate multiplier and in the Wallace tree.
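The two identities quoted above can be checked with a small behavioral model. The sketch below (gate-level details omitted) models the 3-2 compressor as a full adder and the 4-2 compressor as two cascaded full adders, which is one common way to build a 4-2 compressor; the internal structure of Fig. 5 may differ.

```python
# Behavioral model of the two compressors used in the multiplier and
# Wallace tree.  The 3-2 compressor is a full adder; the 4-2 compressor
# is modeled here as two cascaded full adders (one common construction).
from itertools import product

def compressor_3_2(p1, p2, p3):
    s = p1 ^ p2 ^ p3
    carry = (p1 & p2) | (p2 & p3) | (p1 & p3)
    return s, carry                       # p1 + p2 + p3 == s + 2*carry

def compressor_4_2(p1, p2, p3, p4, cin):
    s1, cout = compressor_3_2(p1, p2, p3)
    s, carry = compressor_3_2(s1, p4, cin)
    return s, carry, cout                 # p1+p2+p3+p4+cin == s + 2*(carry + cout)

# Exhaustive check of both identities
for bits in product((0, 1), repeat=3):
    s, c = compressor_3_2(*bits)
    assert sum(bits) == s + 2 * c
for bits in product((0, 1), repeat=5):
    s, carry, cout = compressor_4_2(*bits)
    assert sum(bits) == s + 2 * (carry + cout)
```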
In this embodiment the convolutional layer has several groups of input channels corresponding to several convolution kernels. The N input channel processing units output the convolution results of the N input channels, and the partial accumulation adding unit adds the convolution results of the different input channels. While the convolution results of the N input channels have not yet all been accumulated, the 32-bit intermediate result is stored in the intermediate result register; once the convolution results of a group of N input channels have all been accumulated, a 32-bit signed number is output and sent to the quantization activation unit. As shown in Fig. 1, the partial accumulation adding unit comprises a plurality of adders for adding the convolution results of different input channels; a 32-bit accumulator, i.e. the fifth adder, for accumulating the convolution results of the N input channels; and a data selector and an intermediate result register for judging whether the results of all channels have been accumulated. If so, the output data of the fifth adder are passed to the quantization activation unit; otherwise they are stored in the intermediate result register, which can be implemented with SRAM or registers.
As shown in Fig. 7, the quantization activation unit comprises a first adder, a multiplier, an arithmetic right shift unit, a function activation unit, a second adder and a clipping (amplitude limiting) unit; its input is the accumulated result of one point of the output feature map. The first adder performs the bias accumulation: its first input receives the accumulated convolution result of the N input channels, its second input receives the bias data, and its output is connected to the first input of the multiplier; in this embodiment the bias data are 32 bits wide. The multiplier in this embodiment is a 32-bit × 32-bit fixed-point multiplier whose second input receives the approximate multiplier value; both inputs are 32-bit signed numbers and the output is a 64-bit signed number. The right-shift amount of the arithmetic right shift unit and the approximate multiplier value applied to the second multiplier input are computed in advance by the quantization algorithm, which converts the floating-point scaling factors of the input feature map, the weights and the output feature map into fixed-point data, namely the approximate multiplier value and the right-shift amount. The right-shift amount is an 8-bit value and is usually greater than 31; the approximate multiplier value is a 32-bit signed number obtained by shifting a fraction between 0.5 and 1 left by 31 bits. After the multiplication and the shift, a 32-bit intermediate result is obtained; it passes through the function activation unit, which serves as the activation function at the algorithm level, and the second adder then adds the 8-bit zero-point data. The second adder is a 32-bit fixed-point adder used to accumulate the zero point computed in advance by the quantization algorithm; the zero-point data are likewise obtained in advance by the quantization algorithm and lie in the range [-128, 127]. After the final clipping unit the output data are limited to an 8-bit signed number with output range [-128, 127]. The function activation unit can use ReLU activation, with the activation expression:
$$f(x)=\begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases}$$
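For reference, the whole quantization activation data path of Fig. 7 can be modeled behaviorally as below, in the style of TensorFlow-Lite fixed-point requantization. The bias, multiplier, shift and zero-point values are made-up examples, the shift here simply truncates (the actual rounding behavior of the circuit is not specified above), and the code is a sketch rather than the patented implementation.

```python
# Behavioral sketch of the quantization activation unit of Fig. 7:
# bias add -> multiply by a 32-bit fixed-point multiplier -> arithmetic
# right shift -> ReLU -> add output zero point -> clamp to int8.
# Multiplier/shift/zero-point values are made-up; the real values come
# from the TensorFlow quantization parameters as described in the text.

def quant_activate(acc, bias, multiplier_q31, right_shift, zero_point):
    x = acc + bias                            # first adder (32-bit bias accumulation)
    x = x * multiplier_q31                    # 32 x 32 -> 64-bit product
    x = x >> right_shift                      # arithmetic right shift (truncating here)
    x = max(x, 0)                             # ReLU activation
    x = x + zero_point                        # second adder (output zero point)
    return max(-128, min(127, x))             # clipping unit: signed 8-bit output

# Example: a scale of 0.6 represented as round(0.6 * 2**31), total shift 31 + 9
m = round(0.6 * 2 ** 31)
print(quant_activate(acc=123456, bias=789, multiplier_q31=m,
                     right_shift=31 + 9, zero_point=-5))
```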
In summary, compared with a conventional floating-point neural network acceleration circuit or an integer neural network accelerator calculation circuit, the approximate-computing-based fixed-point quantization convolutional neural network accelerator calculation circuit provided by the invention adds approximate multipliers and uses a Wallace tree, instead of an adder-tree structure, to compress the different dot products. It effectively reduces the memory space occupied by the weights and feature values, allows more data to be transmitted under the same bandwidth, improves throughput and energy efficiency, accelerates the computation of the fixed-point quantized convolutional neural network with only a small loss of prediction accuracy, and reduces area and power consumption.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and the invention is not limited to the specifically described embodiments and examples. Those skilled in the art can make various other changes and combinations based on the teachings of the present invention without departing from its spirit, and such changes and combinations remain within the scope of the invention.

Claims (7)

1. A fixed point quantization convolution neural network accelerator calculation circuit is characterized by comprising N input channel processing units, a partial accumulation adding unit and a quantization activation unit, wherein N is a positive integer;
the input channel processing unit is used for processing input data of one input channel, and the input data of the input channel comprises a plurality of input feature maps and a plurality of weights; the input channel processing unit comprises an approximate multiplier array and a quick addition unit, wherein the approximate multiplier array is used for correspondingly multiplying a plurality of input feature maps and a plurality of weights in input data of one input channel; the fast addition unit is used for compressing and then adding partial product results output by the approximate multiplier array to obtain a convolution result of an input channel;
the partial accumulation adding unit is used for accumulating all convolution results of the N input channels output by the N input channel processing units respectively and outputting the result to the quantization activation unit;
the quantization activation unit includes:
a first adder, the first input end of which is connected with the convolution accumulation results of the N input channels output by the partial product accumulation unit, and the second input end of which is connected with the offset data;
a multiplier, a first multiplier input end of which is connected with the output end of the first adder, and a second multiplier input end of which is connected with an approximate multiplier;
the input end of the arithmetic right shift unit is connected with the output end of the multiplier;
the input end of the function activation unit is connected with the output end of the arithmetic right shift unit;
a second adder, a first input end of which is connected with the output end of the function activation unit, and a second input end of which is connected with zero data;
and the input end of the amplitude limiting unit is connected with the output end of the second adder and is used for limiting the output bit number of the second adder to a specified bit number, and the output end of the amplitude limiting unit is used as the output end of the convolutional neural network accelerator calculation circuit.
2. The fixed-point quantization convolutional neural network accelerator calculation circuit of claim 1, wherein the approximation multiplier array comprises a plurality of approximation multipliers, the number of approximation multipliers of the approximation multiplier array in each of the input channel processing units, the number of input feature maps in the input data of each of the input channels, and the number of weights are determined by convolution kernels of the convolutional neural network, M is a positive integer for M × M convolution kernels, M × M approximation multipliers are provided in each of the input channel processing units to constitute the approximation multiplier array, and M × M input feature map inputs and M × M weight inputs are provided for receiving input data of one input channel.
3. The fixed-point quantization convolutional neural network accelerator calculating circuit as claimed in claim 2, wherein the approximate multiplier generates partial products by using booth encoding, the partial products obtained after booth encoding are added to sign bit expansion to form a partial product array, the partial product array is divided into three parts according to the weight, wherein the part with the lowest weight is approximately compressed by using an or gate, the part with the highest weight is accurately compressed by using a 3-2 compressor and a 4-2 compressor, the remaining part performs or operation on the sign bit and the corresponding partial product and then performs approximate compression by using a 4-2 compressor, and the partial products after compression are added by using a third adder to obtain the partial product result output by the approximate multiplier array.
4. The convolutional neural network accelerator calculating circuit for fixed point quantization as claimed in claim 1, wherein the fast adding unit comprises a wallace tree and a fourth adder, the wallace tree performs sign bit expansion on the partial product results output by the approximate multiplier array and then performs three times of compression by a 4-2 compressor and a 3-2 compressor respectively, and the fourth adder adds the partial products subjected to the three times of compression by the wallace tree to obtain the convolution result of the input channel.
5. The fixed-point quantized convolutional neural network accelerator computation circuit of claim 1, wherein the partial accumulation addition unit comprises a fifth adder, a data selector, and an intermediate result register,
the first input end of the fifth adder is connected with convolution results of N input channels output by the N input channel processing units respectively, the second input end of the fifth adder is connected with the output end of the intermediate result register, and the output end of the fifth adder is connected with the input end of the data selector;
the data selector outputs the output data of the fifth adder to the intermediate result register while the fifth adder has not yet completed accumulating the convolution results of the N input channels, and outputs the output data of the fifth adder to the quantization activation unit after the fifth adder has completed accumulating the convolution results of the N input channels.
6. The fixed-point quantized convolutional neural network accelerator computation circuit of any of claims 1 to 5, wherein the convolutional neural network employs quantization of INT8 data type, and the quantization scheme selects the quantization specification of TensorFlow.
7. The convolutional neural network accelerator computing circuit for fixed-point quantization of claim 6, wherein the approximate multiplier value connected to the second multiplier input terminal of the multiplier, the right shift number of the arithmetic right shift unit, and the zero data connected to the second input terminal of the second adder are obtained according to the TensorFlow quantization algorithm, the function activation unit is activated by ReLU, and the activation expression is:
$$f(x)=\begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases}$$
CN202010736970.3A 2020-07-28 2020-07-28 Fixed point quantization convolution neural network accelerator calculation circuit Pending CN111832719A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010736970.3A CN111832719A (en) 2020-07-28 2020-07-28 Fixed point quantization convolution neural network accelerator calculation circuit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010736970.3A CN111832719A (en) 2020-07-28 2020-07-28 Fixed point quantization convolution neural network accelerator calculation circuit

Publications (1)

Publication Number Publication Date
CN111832719A true CN111832719A (en) 2020-10-27

Family

ID=72925737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010736970.3A Pending CN111832719A (en) 2020-07-28 2020-07-28 Fixed point quantization convolution neural network accelerator calculation circuit

Country Status (1)

Country Link
CN (1) CN111832719A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112230884A (en) * 2020-12-17 2021-01-15 季华实验室 Target detection hardware accelerator and acceleration method
CN112434801A (en) * 2020-10-30 2021-03-02 西安交通大学 Convolution operation acceleration method for carrying out weight splitting according to bit precision
CN112734023A (en) * 2021-02-02 2021-04-30 中国科学院半导体研究所 Reconfigurable circuit applied to activation function of recurrent neural network
CN112766477A (en) * 2021-01-13 2021-05-07 天津智模科技有限公司 Neural network operation circuit
CN112965931A (en) * 2021-02-22 2021-06-15 北京微芯智通科技合伙企业(有限合伙) Digital integration processing method based on CNN cell neural network structure
CN113469327A (en) * 2021-06-24 2021-10-01 上海寒武纪信息科技有限公司 Integrated circuit device for executing advance of revolution
CN113554163A (en) * 2021-07-27 2021-10-26 深圳思谋信息科技有限公司 Convolutional neural network accelerator
CN114723031A (en) * 2022-05-06 2022-07-08 北京宽温微电子科技有限公司 Computing device
CN114819129A (en) * 2022-05-10 2022-07-29 福州大学 Convolution neural network hardware acceleration method of parallel computing unit
CN115879530A (en) * 2023-03-02 2023-03-31 湖北大学 Method for optimizing array structure of RRAM (resistive random access memory) memory computing system
CN115982529A (en) * 2022-12-14 2023-04-18 北京登临科技有限公司 Convolution operation structure, convolution operation array and related equipment
CN116048455A (en) * 2023-03-07 2023-05-02 南京航空航天大学 Insertion type approximate multiplication accumulator
CN116151340A (en) * 2022-12-26 2023-05-23 辉羲智能科技(上海)有限公司 Parallel random computing neural network system and hardware compression method and system thereof
CN116720563A (en) * 2022-09-19 2023-09-08 荣耀终端有限公司 Method and device for improving fixed-point neural network model precision and electronic equipment
CN117910421A (en) * 2024-03-15 2024-04-19 南京美辰微电子有限公司 Dynamic approximate circuit calculation deployment method and system based on neural network

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109542393A (en) * 2018-11-19 2019-03-29 电子科技大学 A kind of approximation 4-2 compressor and approximate multiplier
US20190212981A1 (en) * 2018-01-09 2019-07-11 Samsung Electronics Co., Ltd. Neural network processing unit including approximate multiplier and system on chip including the same
CN110363279A (en) * 2018-03-26 2019-10-22 华为技术有限公司 Image processing method and device based on convolutional neural networks model
CN110705702A (en) * 2019-09-29 2020-01-17 东南大学 Dynamic extensible convolutional neural network accelerator
CN110780845A (en) * 2019-10-17 2020-02-11 浙江大学 Configurable approximate multiplier for quantization convolutional neural network and implementation method thereof

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190212981A1 (en) * 2018-01-09 2019-07-11 Samsung Electronics Co., Ltd. Neural network processing unit including approximate multiplier and system on chip including the same
CN110363279A (en) * 2018-03-26 2019-10-22 华为技术有限公司 Image processing method and device based on convolutional neural networks model
CN109542393A (en) * 2018-11-19 2019-03-29 电子科技大学 A kind of approximation 4-2 compressor and approximate multiplier
CN110705702A (en) * 2019-09-29 2020-01-17 东南大学 Dynamic extensible convolutional neural network accelerator
CN110780845A (en) * 2019-10-17 2020-02-11 浙江大学 Configurable approximate multiplier for quantization convolutional neural network and implementation method thereof

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BENOIT JACOB等: "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference", 《2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *
CHULIANG GUO等: "A Reconfigurable Approximate Multiplier for Quantized CNN Applications", 《2020 ASP-DAC》 *
FASIH UD DIN FARRUKH等: "Power Efficient Tiny Yolo CNN Using Reduced Hardware Resources Based on Booth Multiplier and WALLACE Tree Adders", 《IEEE OPEN JOURNAL OF CIRCUITS AND SYSTEMS ( VOLUME: 1)》 *
朱智洋: "基于近似计算与数据调度的 CNN 加速", 《中国优秀硕士学位论文全文数据库》 *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112434801A (en) * 2020-10-30 2021-03-02 西安交通大学 Convolution operation acceleration method for carrying out weight splitting according to bit precision
CN112434801B (en) * 2020-10-30 2022-12-09 西安交通大学 Convolution operation acceleration method for carrying out weight splitting according to bit precision
CN112230884A (en) * 2020-12-17 2021-01-15 季华实验室 Target detection hardware accelerator and acceleration method
CN112766477A (en) * 2021-01-13 2021-05-07 天津智模科技有限公司 Neural network operation circuit
CN112766477B (en) * 2021-01-13 2023-05-30 天津智模科技有限公司 Neural network operation circuit
CN112734023A (en) * 2021-02-02 2021-04-30 中国科学院半导体研究所 Reconfigurable circuit applied to activation function of recurrent neural network
CN112734023B (en) * 2021-02-02 2023-10-13 中国科学院半导体研究所 Reconfigurable circuit applied to activation function of cyclic neural network
CN112965931A (en) * 2021-02-22 2021-06-15 北京微芯智通科技合伙企业(有限合伙) Digital integration processing method based on CNN cell neural network structure
CN113469327A (en) * 2021-06-24 2021-10-01 上海寒武纪信息科技有限公司 Integrated circuit device for executing advance of revolution
CN113469327B (en) * 2021-06-24 2024-04-05 上海寒武纪信息科技有限公司 Integrated circuit device for performing rotation number advance
CN113554163A (en) * 2021-07-27 2021-10-26 深圳思谋信息科技有限公司 Convolutional neural network accelerator
CN113554163B (en) * 2021-07-27 2024-03-29 深圳思谋信息科技有限公司 Convolutional neural network accelerator
CN114723031A (en) * 2022-05-06 2022-07-08 北京宽温微电子科技有限公司 Computing device
CN114723031B (en) * 2022-05-06 2023-10-20 苏州宽温电子科技有限公司 Computing device
CN114819129A (en) * 2022-05-10 2022-07-29 福州大学 Convolution neural network hardware acceleration method of parallel computing unit
CN116720563A (en) * 2022-09-19 2023-09-08 荣耀终端有限公司 Method and device for improving fixed-point neural network model precision and electronic equipment
CN116720563B (en) * 2022-09-19 2024-03-29 荣耀终端有限公司 Method and device for improving fixed-point neural network model precision and electronic equipment
CN115982529B (en) * 2022-12-14 2023-09-08 北京登临科技有限公司 Convolution operation structure, convolution operation array and related equipment
CN115982529A (en) * 2022-12-14 2023-04-18 北京登临科技有限公司 Convolution operation structure, convolution operation array and related equipment
CN116151340B (en) * 2022-12-26 2023-09-01 辉羲智能科技(上海)有限公司 Parallel random computing neural network system and hardware compression method and system thereof
CN116151340A (en) * 2022-12-26 2023-05-23 辉羲智能科技(上海)有限公司 Parallel random computing neural network system and hardware compression method and system thereof
CN115879530A (en) * 2023-03-02 2023-03-31 湖北大学 Method for optimizing array structure of RRAM (resistive random access memory) memory computing system
CN116048455B (en) * 2023-03-07 2023-06-02 南京航空航天大学 Insertion type approximate multiplication accumulator
CN116048455A (en) * 2023-03-07 2023-05-02 南京航空航天大学 Insertion type approximate multiplication accumulator
CN117910421A (en) * 2024-03-15 2024-04-19 南京美辰微电子有限公司 Dynamic approximate circuit calculation deployment method and system based on neural network

Similar Documents

Publication Publication Date Title
CN111832719A (en) Fixed point quantization convolution neural network accelerator calculation circuit
CN106909970B (en) Approximate calculation-based binary weight convolution neural network hardware accelerator calculation device
US20210349692A1 (en) Multiplier and multiplication method
US10872295B1 (en) Residual quantization of bit-shift weights in an artificial neural network
CN111488133B (en) High-radix approximate Booth coding method and mixed-radix Booth coding approximate multiplier
CN112434801B (en) Convolution operation acceleration method for carrying out weight splitting according to bit precision
CN113283587B (en) Winograd convolution operation acceleration method and acceleration module
CN114647399B (en) Low-energy-consumption high-precision approximate parallel fixed-width multiplication accumulation device
CN116400883A (en) Floating point multiply-add device capable of switching precision
CN111008698B (en) Sparse matrix multiplication accelerator for hybrid compression cyclic neural networks
CN115982528A (en) Booth algorithm-based approximate precoding convolution operation method and system
CN110955403B (en) Approximate base-8 Booth encoder and approximate binary multiplier of mixed Booth encoding
CN113902109A (en) Compression method and device for regular bit serial computation of neural network
CN110825346B (en) Low logic complexity unsigned approximation multiplier
CN116205244B (en) Digital signal processing structure
CN110659014B (en) Multiplier and neural network computing platform
Yang et al. A low-power approximate multiply-add unit
Kumar et al. Complex multiplier: implementation using efficient algorithms for signal processing application
CN114065923A (en) Compression method, system and accelerating device of convolutional neural network
CN113360131A (en) Logarithm approximate multiplication accumulator for convolutional neural network accelerator
CN116151340B (en) Parallel random computing neural network system and hardware compression method and system thereof
CN116126283B (en) Resource occupancy rate optimization method of FPGA convolution accelerator
CN116402106B (en) Neural network acceleration method, neural network accelerator, chip and electronic equipment
Suzuki et al. ProgressiveNN: Achieving Computational Scalability with Dynamic Bit-Precision Adjustment by MSB-first Accumulative Computation
WO2023078364A1 (en) Operation method and apparatus for matrix multiplication

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20201027