CN111611529B - Multi-bit convolution operation module with variable capacitance, current integration and charge sharing - Google Patents


Info

Publication number
CN111611529B
CN111611529B (application CN202010261165.XA)
Authority
CN
China
Prior art keywords
convolution operation
bit
current
capacitor
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010261165.XA
Other languages
Chinese (zh)
Other versions
CN111611529A
Inventor
阿隆索·莫尔加多
刘洪杰
Current Assignee
Shenzhen Jiutian Ruixin Technology Co ltd
Original Assignee
Shenzhen Jiutian Ruixin Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Jiutian Ruixin Technology Co ltd
Priority to CN202010261165.XA
Publication of CN111611529A
Application granted
Publication of CN111611529B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/15 Correlation function computation including computation of convolution operations
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present invention relates to an analog operation module, and more particularly to an analog operation module for convolution operations, which provides a set of analog multipliers and accumulators (MACs). Current integration in capacitors implements the multiplication of two multi-bit binary numbers in the convolution process: within a common clock period, binary (halving) scaling of the capacitor values realizes the change of multiplier or multiplicand bit weight in the multiplication stage, and charge sharing among the capacitors realizes the addition stage. The module therefore performs convolution at high speed with a single common unit clock, and area can be traded for delay or speed. The approach applies to multi-bit convolutions with an adjustable number of binary bits and can realize a general convolution of two or more inputs. In particular, an array of bias operation cells may be added. The invention can serve as a neural-network convolution operation unit, or as a unit of in-memory or near-memory computing implemented in operation-accelerator hardware.

Description

Multi-bit convolution operation module with variable capacitance, current integration and charge sharing
Technical Field
The present invention relates to an analog computing module, and more particularly, to an analog computing module for convolution operations.
Background
For quantization at low signal-to-noise ratios, analog computation is more efficient than conventional digital computation, so digital quantities are often converted to analog quantities before operating on them. This matters especially for neural networks: in medium- and large-scale hardware implementations, data is traditionally stored on disk and must be fetched into memory before computation, a process that requires a large amount of I/O to conventional memory storage and therefore dominates power consumption. Analog in-memory and near-memory computing instead moves the operation to where the data resides, greatly increasing operation speed, saving storage area, and reducing data-transfer and computation power. The present invention provides an effective implementation of ultra-low-power analog in-memory or near-memory computing.
The recent paper "A Mixed-Signal Binarized Convolutional-Neural-Network Accelerator Integrating Dense Weight Storage and Multiplication for Reduced Data Movement", Symp. VLSI Circuits, pp. 141-142, 2018, proposes binary in-memory / near-memory analog computation for 1-bit binary multiplication and demonstrates efficient performance: a static random-access memory (SRAM) cell stores a 1-bit weight and performs the convolution operation with a mixed-signal input, greatly improving compute capability and reducing storage area. However, the analog operation circuit in that work does not implement changes of multiplier or multiplicand bit weight; its 1-bit multiplication is limited to the first-layer input and cannot be used for convolution analog operation on multi-bit binary numbers.
Very few multi-bit schemes implement a change of the multiplier or multiplicand bit weight, as in:
(1) "In-Memory Computation of a Machine-Learning Classifier in a Standard 6T SRAM Array", JSSC, pp. 915-924, 2017; (2) "A 481pJ/decision 3.4M decision/s multifunctional deep in-memory inference processor using standard 6T SRAM array", arXiv:1610.07501, 2016; (3) "A Microprocessor implemented in 65nm CMOS with Configurable and Bit-scalable Accelerator for Programmable In-memory Computing", arXiv:1811.04047, 2018; (4) "A Twin-8T SRAM Computation-In-Memory Macro for Multiple-Bit CNN-Based Machine Learning", ISSCC, pp. 396-398, 2018; (5) "A 42pJ/Decision 3.12TOPS/W Robust In-Memory Machine Learning Classifier with On-Chip Training", ISSCC, pp. 490-491, 2018;
but these multi-bit schemes are implemented with modulated control buses in the current domain, capacitive charge sharing, pulse-width modulation (PWM), modified SRAM cells, or complex digital matrix-vector processing near memory. In all of these implementations, the multi-bit analog multiplier-accumulator relies on very complex digital processing control; yet for quantization at low signal-to-noise ratios, conventional digital computation is far less efficient than analog computation, so multi-bit operation under digital processing control incurs large computational energy consumption.
The binarized convolution proposed in CN201910068644 realizes potential changes by modulating a control bus in the SRAM during the XOR stage, but the scheme that patent teaches requires complicated digital processing control, places high demands on the control module, and consumes excessive energy. There is therefore a need in the art for a solution that applies analog convolution operation to low signal-to-noise signals to achieve ultra-low power consumption.
Disclosure of Invention
In view of the above, the present invention is directed to a module for multi-bit binary convolution analog operation based on capacitance-variable current integration and charge sharing, with ultra-low power consumption, a compact structure, and fast operation. It supports general convolution of two or more inputs with an adjustable number of binary bits, and can in particular serve as a neural-network convolution operation unit or as a unit of analog in-memory computing implemented in operation-accelerator hardware.
Beyond these advantages, a matrix-cell implementation of the module is well suited to convolution operation elements in or near memory: it reduces the power of memory-access-related processes and makes the physical matrix implementation more compact. To achieve the above purpose, the following technical scheme is adopted:
based on the two stages of multiplication and addition of convolution operation, the invention provides a multi-bit convolution operation module based on capacitance-capacity variable current integration and charge sharing. The module comprises: at least one digital input x i At least one digital-to-analog converter (Digital to Analog Converter, DAC) inputs said digital input x i Conversion to current Ix according to a given number of bits i Transmitting in a circuit; at least one weight w ji When the weight is expressed as a binary number, w ji,k A value at its kth bit; a convolution array of a plurality of convolution units 102, the convolution array performing a multiplication and addition of the convolution operation, at least one output y j
y_j = Σ_i x_i * w_ji = Σ_i Σ_k x_i * w_ji,k * 2^(k-1)   (Equation 1)
Further, the current Ix_i is the digital input x_i converted according to the given bit width of the DAC; the current Ix_i is mirrored or copied into the convolution array, each current Ix_i corresponding to j*k convolution operation units;
in particular, the convolution operation array has a scale of i x j x k, and each convolution operation unit (i, j, k) includes a current Ix i A switch, at least one control signal, node a ji,k At least one capacitor.
In particular, the same current Ix_i integrates into capacitors whose values decrease by halves, so that the voltages across the capacitors differ according to the capacitance values. For a weight w_ji, w_ji,k is the value of the k-th bit of w_ji in binary representation, k ∈ [1, B].
Further, the currents within the same j-k plane are identical, and the convolution array accepts multi-bit inputs; the current Ix_i can be scaled in the DAC, and the currents Ix_i reach the switches simultaneously, i.e. for the capacitors integrating a current Ix_i, integration starts at the same time and ends at the same time.
Further, for each current Ix_i, the capacitance along the k direction of the corresponding operation array halves bit by bit. For a binary weight w_ji (j being the index of the j-th window), w_ji,k is the value of w_ji at the k-th bit, each w_ji,k corresponding to one convolution operation unit, k ∈ [1, B]; the capacitance of the unit for one bit is twice that of the next higher bit. For example, the capacitance of the lowest-bit (k = 1) unit is Cu, and the capacitance of the highest-bit (k = B) unit is Cu/2^(B-1).
Further, when the switch is closed, the current Ix_i integrates into capacitors of different capacitance.
In particular, switch closure may be controlled by a control signal that is either always on or always off: when w_ji,k = 1, the control signal in the convolution operation unit is always on and the switch is closed; when w_ji,k = 0, the control signal in the unit is always off and the switch is open.
Further, as above, assume w_ji,1 = w_ji,B = 1. After the currents integrate in the capacitors for the same time, the stored charges are equal, and the voltage across the k = B capacitor is 2^(B-1) times the voltage across the k = 1 capacitor.
Further, the voltage at node a_ji,k is the result of the multiplication x_i * w_ji,k * 2^(k-1); its value is determined by the capacitance connected to the node and by w_ji,k. The 1*k convolution operation units corresponding to x_i carry out the computation of x_i * w_ji.
Further, y_j is obtained, for a given j, by connecting all nodes a_ji,k of an i*k plane. Owing to the discharge characteristic of the capacitors, the capacitors in different convolution operation units share charge through the connected nodes; after charge sharing ends, the charge in each capacitor is the same while the total charge integrated from the currents Ix_i in the multiplication stage is unchanged, and the accumulated voltage at the combined node is the output of the convolution operation. One i*k plane of convolution operation units thus performs the multiplication of one convolution kernel with the input matrix in the convolution process.
Further, for a neural-network convolution operation unit, a bias is generally added. In the invention, the offset b_j is converted into a fixed current I_b, input in addition to the given currents Ix_i, and computed in an added array of bias operation units of scale j*k; each bias operation unit (j, k) comprises the current I_b, a switch, a node a_j,k, and at least one capacitor.
Further, the offset b_j of y_j is the accumulated voltage over all nodes a_j,k of the 1*k group of units.
Further, to mitigate kickback or transient effects on the current mirror, the switch is a dummy switch, a current-steering element, or a non-switching element.
Further, when the combined node is connected, an attenuation capacitor C_att is also connected, thereby adjusting the full-scale range of the accumulated voltage so that it is scaled into a given range and meets the input range of the analog-to-digital converter.
The invention also provides a multi-bit convolution analog operation method based on current integration with variable current values and charge sharing, comprising: a DAC converts a digital input x_i, at a given number of bits, into an analog current Ix_i transmitted in the circuit. w_ji,k is the value of the weight w_ji at the k-th bit, k ∈ [1, B], where B is the highest binary bit; each bit w_ji,k corresponds to one convolution operation unit, and w_ji,k is 0 or 1. The convolution operation units along the k direction are arranged from the low bit to the high bit of w_ji, and their capacitance halves bit by bit: the capacitance of each unit is 1/2 that of the previous bit, the k = 1 capacitance being C_u and the k = B capacitance being C_u/2^(B-1). Before the current Ix_i flows through the switch in a convolution operation unit, a control signal controls the switch: when the control signal is on, the switch closes and the current Ix_i integrates into the capacitor through node a_ji,k, so that the node voltage a_ji,k above the capacitor in units of different k is related to the unit capacitance and scales as 2^(k-1); when the control signal is off, the charge integrated from Ix_i is 0 and the node voltage a_ji,k is 0. The node voltage is thus the multiplication result x_i * w_ji,k * 2^(k-1). After the currents Ix_i integrate in the capacitors for the same time, all internal nodes a_ji,k of the convolution operation units of one i*k plane are shorted, realizing charge sharing among the capacitors of the units, and the resulting voltage at the combined node is the convolution output result y_j.
Further, to accommodate the digital input x_i, the resolution of the DAC is adjusted before the digital input x_i is converted in the DAC.
Further, before connecting the ADC at the output y_j, attenuation capacitors are connected in parallel to adjust the full-scale range of the accumulated voltage, so that the voltage swing at the combined node stays within the analog-to-digital converter's input range.
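The multiplication step of the method above can be sketched as a behavioral model (not a circuit simulation): with a hypothetical DAC LSB current I_LSB, integration time T and unit capacitance C_U (illustrative values, not taken from the patent), a unit's node voltage V = Ix_i * T / C_k reproduces the stated multiplication result x_i * w_ji,k * 2^(k-1) up to the constant factor I_LSB * T / C_U:

```python
I_LSB = 1e-6      # hypothetical DAC LSB current, 1 uA (assumption)
T     = 1e-8      # integration time, 10 ns, one clock period (assumption)
C_U   = 1e-12     # unit capacitance C_u, 1 pF (assumption)

def node_voltage(x, w, k):
    """Voltage at node a_ji,k after integration.

    Ix_i = x * I_LSB (DAC output), C_k = C_U / 2^(k-1);
    the switch is closed only when bit w_ji,k = 1.
    """
    w_bit = (w >> (k - 1)) & 1
    if w_bit == 0:
        return 0.0                    # switch open: no charge integrated
    C_k = C_U / 2 ** (k - 1)          # capacitance halves per bit
    return x * I_LSB * T / C_k        # V = Ix_i * T / C_k

scale = I_LSB * T / C_U               # constant volts-per-unit factor
x, w, B = 5, 6, 3                     # w = 0b110, bits (k = 1..3): 0, 1, 1
total = sum(node_voltage(x, w, k) for k in range(1, B + 1))
assert abs(total - scale * x * w) < 1e-12   # sum of node voltages tracks x * w
```

The check at the end confirms that summing the per-bit node voltages recovers the product x_i * w_ji scaled by the constant factor, which is the relation the method relies on.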
Drawings
FIG. 1 is a schematic diagram of a multiplication stage circuit implementation in an embodiment of the present invention;
FIG. 2 is a schematic diagram of an embodiment of the addition-stage circuit based on charge redistribution (the ADC is not shown; where y_j must be converted to a digital output, an ADC can be added before each output y_j);
FIG. 3 is a schematic diagram illustrating an implementation of adding a bias operation unit to a convolution operation according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating an accumulation process in an array of bias cells according to an embodiment of the invention.
Description of the main reference signs
Module 10
Digital-to-analog converter 101
Convolution operation unit 102
Switch 1021
Capacitor 1022
Control signal 103
Offset operation array 104
Offset operation unit 1041
Attenuation capacitor 105
Digital input x_i
Weight w_ji
Current Ix_i
Detailed Description
The present invention will now be described in further detail with reference to the drawings and embodiments, so that its purpose, principle, technical solution and advantages are more clearly understood.
It should be understood that the embodiments described here are for illustration only; the invention may be practiced otherwise than as specifically described, and persons of ordinary skill in the art may adapt it without departing from its spirit or scope. The invention is therefore not limited to the specific embodiments disclosed below.
Referring to FIG. 1, consider a general convolution operation:
the multi-bit binary numbers x_i form the input matrix, i running from 1 to N; the weights w_ji form the convolution kernel, j denoting the corresponding j-th window once i is fixed. When the input forms an n*n matrix and the convolution kernel is an m*m weight matrix, j ranges from 1 to n-m+1 (n > m, as the window moves); the outputs are y_j, and all y_j together form the convolution result, i.e. one layer of extracted neural-network features;
the w is ji When expressed as a binary number of bits, w ji,k Is w ji A value at the kth bit; two multi-bit binary Σx i *w ji The convolution operation process of (1) is divided into two stages:
multiplication stage: input x i Multiplied by weight w ji Multiplying each bit of the number 2 by the bit weight 2 (k-1) I.e. x i *w ji,k *2 (k-1) And w is ji,k 0 or 1.
Addition stage: the results of the multiplication stage are accumulated and summed to obtain the output y_j.
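The two stages amount to the identity x_i * w_ji = Σ_k x_i * w_ji,k * 2^(k-1). A short Python sketch (a purely arithmetic model of the decomposition; the bit width B and the sample values are illustrative) verifies it:

```python
def conv_two_stage(xs, ws, B=4):
    """Behavioral model of the two-stage convolution.

    Multiplication stage: each product x_i * w_ji is split into
    per-bit partial products x_i * w_ji,k * 2^(k-1).
    Addition stage: all partial products are accumulated into y_j.
    """
    y = 0
    for x, w in zip(xs, ws):
        for k in range(1, B + 1):          # k in [1, B], low bit first
            w_bit = (w >> (k - 1)) & 1     # w_ji,k: 0 or 1
            y += x * w_bit * 2 ** (k - 1)  # partial product of bit k
    return y

xs = [3, 1, 2]          # digital inputs x_i (illustrative)
ws = [5, 7, 2]          # 4-bit binary weights w_ji (illustrative)
assert conv_two_stage(xs, ws) == sum(x * w for x, w in zip(xs, ws))  # 3*5 + 1*7 + 2*2 = 26
```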
When the module of the invention is used for neural-network convolution, the weight matrix of the multiplication stage is shared: under a fixed convolution-kernel size, as j runs from 1 to n-m+1, every output y_j uses the same kernel, i.e. w_1i = w_2i = w_3i = ... = w_ji.
For this multi-bit binary convolution, the invention must realize the change of bit weight as the multiplicand of the multiplication stage multiplies each bit of the multiplier, and the accumulation of the multiplication results in the addition stage.
An embodiment of the invention provides an operation module 10 that realizes the above multi-bit convolution based on capacitance-variable current integration and charge accumulation. The module 10 comprises: at least one digital input x_i; at least one digital-to-analog converter 101 (DAC) that converts the digital input into a current Ix_i transmitted in the circuit; at least one weight w_ji, where w_ji,k denotes the value of its binary k-th bit; and a convolution operation array of scale i*j*k formed of convolution operation units 102, each unit 102 (i, j, k) comprising the current Ix_i, a switch 1021, at least one control signal 103, a node a_ji,k, and at least one capacitor 1022. The negative terminal of the capacitor 1022 is grounded, and the capacitor 1022 must be reset to a given DC voltage before the convolution operation. The array performs the multiplication and addition of the convolution and produces at least one output y_j.
In the multiplication stage, the capacitance 1022 halves in the direction of increasing bit k. In this embodiment, the matrix-cell implementation of the convolution operation unit 102 for convolution in or near memory reduces the power of memory-access-related processes and makes the physical matrix implementation more compact. Specifically, referring to FIG. 1, the digital-to-analog converter 101 of the module 10 converts a multi-bit binary digital input x_i of a given bit width into an analog current Ix_i; the DAC resolution is related to the digital input x_i, and in some embodiments the DAC resolution may be adjusted before conversion. The current Ix_i is mirrored or copied by a current mirror into the j*k convolution operation units 102 corresponding to the same i; along the k direction the units are ordered from the low to the high bit w_ji,k of the weight w_ji, with the capacitance 1022 halving as the bit index increases. Current integration in the units along the j direction can thus start and end simultaneously for the different i-k planes. In a further embodiment, the current Ix_i produced by the DAC can be scaled in the DAC before transmission in the circuit, keeping the current below a given threshold and reducing transmission power loss. The current Ix_i then reaches the switch 1021, which, to mitigate kickback or transient effects on the current mirror, may be a dummy switch, a current-steering element, or a non-switching element such as a dummy load.
Although the decreasing capacitance allocation could be realized by connecting multiple unit capacitors 1022 in parallel, giving the physical implementation high speed and a common unit clock, that approach is prone to interconnect capacitive parasitics (interconnection capacitive parasitics). In the module 10 of the invention, the capacitance 1022 of each 1*j*k operation array for a current Ix_i therefore halves directly in the direction of increasing k: for the weight w_ji, the capacitor 1022 in the unit 102 of one bit is directly half that of the previous bit, so the module 10 retains high speed and a common unit clock during operation. For a binary weight w_ji (j indexing the j-th window), w_ji,k is the value of w_ji at the k-th bit, each w_ji,k corresponding to one unit 102, k ∈ [1, B] with B the highest binary bit; the capacitance 1022 of one unit is twice that of the next bit, e.g. for k = 1, 2, 3 the capacitances are Cu, 1/2*Cu and 1/4*Cu, so the capacitance 1022 in the highest-bit (k = B) unit 102 is Cu/2^(B-1). In this embodiment, since no precise control of the integration time is required, the weight w_ji no longer needs complex digital processing control: current integration is governed by a control signal 103 that is either always on or always off. Specifically, when bit w_ji,k is 1, the control signal 103 is on, the switch is closed, and the current integrates into the capacitor 1022, producing a voltage across it.
When bit w_ji,k is 0, the control signal 103 is off, the switch is open, no current enters the capacitor for integration, and the voltage across the capacitor remains 0 at all times.
For example, let w_ji,1 = w_ji,2 = w_ji,3 = ... = 1 (with the same subscripts i, j). For k = 1, 2, 3, the capacitances 1022 in the units 102 are Cu, 1/2*Cu and 1/4*Cu, the capacitance of the k-th unit being Cu/2^(k-1). After the current Ix_i integrates for the same time, the charge integrated in the three capacitors 1022 is the same; with equal charge, the voltage across a capacitor is inversely proportional to its capacitance, so the voltages across the capacitors 1022 are U, 2U and 4U, and the voltage across the capacitor 1022 in the k = B unit 102 is 2^(B-1) times that in the k = 1 unit 102. This realizes the change of bit weight as each bit of the weight w_ji (the multiplier) multiplies the input x_i (the multiplicand). Note that all-ones w_ji was assumed above only for convenience in describing the capacitance-voltage relation; in fact, regardless of whether w_ji,k is 0 or 1, the current Ix_i supplied to each unit 102 is the same. A unit with w_ji,k = 0 integrates no current, while a unit with w_ji,k = 1 integrates into a capacitor of value Cu/2^(k-1); the capacitance 1022 halves bit by bit in every unit 102, independent of whether w_ji,k is 0 or 1.
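The U, 2U, 4U example can be checked with a three-line model: the same charge Q = Ix_i * T lands in capacitors Cu, Cu/2 and Cu/4, so the voltages double per bit (unit values are illustrative, with Cu = Q = 1):

```python
Q = 1.0                                              # same integrated charge Ix_i * T in every unit
caps = [1.0 / 2 ** (k - 1) for k in range(1, 4)]     # Cu, Cu/2, Cu/4 (Cu = 1)
volts = [Q / c for c in caps]                        # V = Q / C

assert volts == [1.0, 2.0, 4.0]                      # U, 2U, 4U: doubling per bit
```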
The voltage at node a_ji,k is the multiplication result x_i * w_ji,k * 2^(k-1); its value is determined by the value of the capacitor 1022 connected to the node and by w_ji,k, and is therefore the voltage across the capacitor 1022. Since the negative terminal of the capacitor 1022 is grounded, the voltage across the capacitor 1022 equals the voltage at the node a_ji,k connected to its positive plate.
In the addition stage, the convolution output is obtained through sharing of the charge stored in the capacitors. After all convolution operation units 102 of the invention complete the current integration of the multiplication stage, then for j = 1 the k units corresponding to x_1 have completed one x_1 * w_11 operation: splitting the binary multiplication, the input x_1 is multiplied by each bit w_ji,k of the weight w_11 together with that bit's weight 2^(k-1), and the results are added. Likewise, the k units corresponding to x_i complete one x_i * w_i1 operation, so for j = 1 the whole i*1*k array over i ∈ N completes the multiplication of one convolution window. The node voltage a_ji,k of each unit of the i*1*k array is the stepwise product x_i * w_ji,k * 2^(k-1). After the multiplication completes, the capacitors 1022 are shorted: shorting the nodes a_ji,k above all capacitors 1022 of the array for j = 1, the capacitors 1022 in different units 102 share charge through the connected nodes owing to their discharge characteristic. After charge sharing ends, the charge in each capacitor 1022 is equal while the total charge integrated from the currents in the multiplication stage is unchanged, and the accumulated voltage at the combined node is the convolution output, namely y_1. Likewise, shorting the nodes a_ji,k of the i*j*k array for the other j, i.e. connecting their capacitors 1022 in parallel, yields the other corresponding outputs y_j of Equation 1 below. Note that for a convolutional neural network with weight-matrix sharing, the convolution kernels of different windows are identical: when computing the convolution results of different windows, the multiplicand weight matrices w_ji are the same, w_1i = w_2i = w_3i = ... = w_ji, reducing the number of parameters involved in the operation.
y_j = Σ_i x_i * w_ji = Σ_i Σ_k x_i * w_ji,k * 2^(k-1)   (Equation 1)
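The addition stage rests on conservation of the total integrated charge when the capacitors are shorted together: capacitors at voltages V_k connected in parallel settle to V = ΣQ_k / ΣC_k. A minimal sketch of this charge-redistribution step (capacitor values and initial voltages are illustrative; this is a behavioral model, not a circuit simulation):

```python
def share_charge(caps, volts):
    """Connect capacitors in parallel; return the combined-node voltage.

    The total charge Q = sum(C_k * V_k) is conserved, so the combined
    voltage is Q / sum(C_k).
    """
    q_total = sum(c * v for c, v in zip(caps, volts))
    return q_total / sum(caps)

caps = [1.0, 0.5, 0.25]       # binary-weighted capacitors (units of Cu)
volts = [1.0, 2.0, 4.0]       # node voltages after the multiplication stage

v_comb = share_charge(caps, volts)
q_before = sum(c * v for c, v in zip(caps, volts))
q_after = sum(c * v_comb for c in caps)
assert abs(q_before - q_after) < 1e-12   # total charge is conserved
```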
Optionally, the output signal is converted. After the convolution operation array completes the accumulation of the analog multiplications, the output is an analog signal; when a digital output signal is required, an analog-to-digital converter (ADC) is added before the output, and the obtained output y_j is a digital signal. For example, when the convolution operation module is applied in a convolutional neural network, the digital output y_j can re-enter a convolution operation array as the digital input of the next layer's convolution. Moreover, if the accumulated voltage swings beyond or sits too high in the ADC input range, the unit capacitance C_u of the multiplication stage of FIG. 1 can be increased to solve the problem effectively; but this increases the number of capacitors required per group of convolution operation units 102 and thus the physical area, which is disadvantageous for miniaturization. It is therefore preferable, when connecting the combined node, to also connect an additional attenuation capacitor C_att to the combined node, adjusting the full-scale range of the accumulated voltage so that it is scaled into a given range and meets the ADC input range. Whenever y_j is output, the attenuation capacitor 105 is used, the node a_att,j above it being connected to the original nodes a_ji,k; this solution uses the physically realized area of the module more efficiently.
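The effect of the attenuation capacitor C_att can be sketched in the same charge-conservation terms: connecting an initially discharged C_att to the combined node scales the accumulated voltage down by the factor ΣC / (ΣC + C_att), which can keep the swing inside the ADC input range (all values are illustrative):

```python
def attenuate(v_comb, c_total, c_att):
    """Combined-node voltage after connecting a discharged C_att.

    The charge Q = c_total * v_comb redistributes over c_total + c_att.
    """
    return v_comb * c_total / (c_total + c_att)

c_total = 1.75            # total array capacitance at the combined node (assumption)
v_comb = 2.0              # accumulated voltage before attenuation (assumption)
c_att = 1.75              # attenuation capacitor 105, chosen equal here

v_scaled = attenuate(v_comb, c_total, c_att)
assert v_scaled == 1.0    # C_att equal to the array capacitance halves the swing
```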
FIGS. 3 and 4 show an embodiment with the bias operation unit 1041 when the convolution operation unit 102 of the invention is used for convolutional neural-network computation. Adding a bias b to the convolution operation makes it more efficient and accurate; typically a binary offset b_j is added to a given output y_j. The corresponding convolution output y_j then changes from Equation 1 to Equation 2 below.
y_j = Σ_i x_i * w_ji + b_j = Σ_i Σ_k x_i * w_ji,k * 2^(k-1) + Σ_k b_j,k * 2^(k-1)   (Equation 2)
FIG. 3 illustrates how this additional function is added in the multiplication stage. Since the bias bits are quantized in the same manner as the weights in FIG. 1 or 2, the bias is implemented as a fixed current I_b input in addition to the given currents Ix_i.
In the invention, the offset b_j is converted into a fixed current I_b input in addition to the currents Ix_i, and computed separately in added bias operation units 1041. The bias operation units form a bias operation array 104 of scale j*k, each bias operation unit 1041 (j, k) comprising the current I_b, a switch 1021, a node a_j,k and at least one capacitor 1022, the current I_b integrating in the capacitor 1022.
Each current I in the bias operation array 104 of the present invention b The capacitance 1022 value of each corresponding 1.k bias operation array decreases in the increasing direction of k by 1/2, i.e. the capacitance 1022 value in the next bias operation unit 1041 is directly made half of the previous bitThe capacitance capacity in the offset operation unit 1041 corresponding to the kth bit is C u /2 (k-1) The capacitances 1022 in the bias operation units 1041 corresponding to the same k bits of different weights are the same. Similarly, the switch 1021 in the bias operation array 104 is controlled by the control signal 103 which is always on or always off, b j,k When 1 is set, the control signal 103 in the corresponding bias operation unit 1041 is in an on state, and the bias current I b Integrating, b into the capacitance 1022 in the bias arithmetic unit 1041 j,k When the value is 0, the control signal 103 in the corresponding bias operation unit 1041 is in an off state, and the bias current I b Not into the capacitor 1022. With the convolution operation array, the voltage across the capacitor 1022 is the result of the multiplication stage of the offset operation unit 1041.
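As a numerical check of this per-bit weighting (a sketch under our own assumptions, with the unit quantities I_b, T and C_u normalized to 1 so that the k = 1 node integrates to exactly one unit voltage):

```python
# Sketch (not from the patent): the fixed bias current I_b integrated for
# time T onto the bit-k capacitor C_u / 2**(k-1) gives a node voltage of
# b_jk * 2**(k-1) * (I_b * T / C_u) -- the capacitor scaling supplies the
# binary weight of each bias bit.

def bias_node_voltage(b_jk: int, k: int, i_b: float = 1.0,
                      t: float = 1.0, c_u: float = 1.0) -> float:
    c_k = c_u / 2 ** (k - 1)      # capacitance halves for each higher bit
    return b_jk * i_b * t / c_k   # V = I*T / C while the switch is closed

# With b_jk = 1, bits k = 1..4 integrate to 1, 2, 4, 8 times the unit voltage.
print([bias_node_voltage(1, k) for k in (1, 2, 3, 4)])  # [1.0, 2.0, 4.0, 8.0]
```

The same fixed current and integration time thus yield binary-weighted voltages purely through the capacitance ratios, which is the defining feature of the variable-capacitance scheme.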
Fig. 4 illustrates that, during the accumulation stage, the additional capacitors 1022 take part in the charge sharing and node accumulation.
Similarly, the k unit nodes a_j,k corresponding to a given j are shorted. Because of the discharging property of the capacitors 1022, the shorted capacitors 1022 in the array share their charges; after sharing, the stored charge in each capacitor 1022 is the same while the total charge is unchanged, and the obtained voltage of the combined node is the sum of the voltages of the multiplication-result nodes a_ji,k in the multiplication stage. That is, y_j accumulates the voltage sum of all nodes a_j,k of the 1×k group of units for the bias b_j.
The invention also provides a multi-bit convolution analog operation method based on current integration and charge sharing with variable capacitance values, which comprises the following steps: a DAC converts a digital input x_i of a given number of bits into a current Ix_i, an analog signal transmitted in the circuit; w_ji,k is the value of the weight w_ji at the kth bit, k ∈ [1, B], where B is the highest bit of the binary representation, each bit w_ji,k corresponds to one convolution operation unit, and w_ji,k is 0 or 1; the convolution operation units in the k direction are arranged according to the bits w_ji,k of the weight w_ji from low to high; the capacitance value in the k-direction convolution operation units decreases by 1/2 per bit, the capacitance value in the next-bit convolution operation unit being 1/2 of that of the previous bit, with the capacitance of k=1 being C_u and the capacitance of the k=B convolution operation unit being C_u/2^(B-1); before the current Ix_i flows through the switch in a convolution operation unit, a control signal controls the on/off of the current Ix_i: when the control signal is on, the switch is closed and the current Ix_i integrates in the capacitor through the node a_ji,k, the voltage of the node a_ji,k above the capacitor in the convolution operation units of different k being related to the capacitance in the unit; when the control signal is off, the charge integrated by the current Ix_i is 0 and the voltage of the node a_ji,k above the capacitor is 0; the voltage is the multiplication result x_i*w_ji,k*2^(k-1). After the currents Ix_i have integrated in the capacitors for the same integration time, all the internal nodes a_ji,k of the convolution operation units of one i×k plane are shorted, charge sharing between the capacitors of each convolution operation unit is realized, and the obtained voltage of the combined node is the convolution output result y_j.
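The steps above can be sketched as a behavioral model (our own simplification, not circuit-level: the unit voltage is normalized to 1, and the combined-node voltage is taken, as the text states, to be the sum of the per-bit node voltages):

```python
# Behavioral sketch of the method (our own simplification): each bit-k unit
# integrates Ix_i onto C_u / 2**(k-1) when w_ji,k = 1, giving a node voltage
# of x_i * w_ji,k * 2**(k-1); the accumulation stage sums the nodes of one
# i-by-k plane to produce y_j = sum_i x_i * w_ji.

def bits_lsb_first(w: int, B: int):
    """Bits w_ji,k of the B-bit weight w, from the lowest bit (k = 1) up."""
    return [(w >> (k - 1)) & 1 for k in range(1, B + 1)]

def convolution_output(xs, ws, B=4):
    """xs: digital inputs x_i; ws: integer weights w_ji of the jth window."""
    y_j = 0.0
    for x_i, w_ji in zip(xs, ws):
        for k, w_bit in enumerate(bits_lsb_first(w_ji, B), start=1):
            # multiplication stage: node voltage of unit (i, j, k)
            y_j += x_i * w_bit * 2 ** (k - 1)
    return y_j  # accumulation stage: shorted combined-node voltage

print(convolution_output([1, 2, 3], [5, 0, 7]))  # 5*1 + 0*2 + 7*3 = 26.0
```

Since Σ_k w_ji,k·2^(k-1) = w_ji, the per-bit products recombine into the full multi-bit product, which is what the binary-weighted capacitors implement in the analog domain.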
In further embodiments, before the DAC converts the digital input x_i, the resolution of the DAC is adjusted to accommodate the varying number of bits of the digital input x_i. In addition, before the ADC that outputs y_j is connected, an attenuation capacitor is connected in parallel to adjust the full-scale range of the accumulated voltage so that the accumulated voltage swing at the combining node is below the analog-to-digital converter input range.
It should be noted that, in the above embodiments, the included modules are divided only according to functional logic; the division is not limited to the above as long as the corresponding functions can be implemented. In addition, the specific names of the functional units are only for distinguishing them from each other and are not used to limit the protection scope of the present invention.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (11)

1. A multi-bit convolution operation module with variable capacitance, current integration and charge sharing, characterized by comprising:
at least one digital input x_i, at least one digital-to-analog converter DAC, at least one weight w_ji, a convolution operation array comprising a plurality of convolution operation units, and at least one output y_j;
the digital input x_i is converted by the DAC, according to a given number of bits, into a current Ix_i, an analog signal transmitted in the circuit;
for the weight w_ji, j indicates that the weight belongs to the jth window; w_ji,k is the value of the weight w_ji at bit k, 0 or 1; each w_ji,k corresponds to one convolution operation unit, k ∈ [1, B], where B is the highest bit of the binary representation;
the scale of the convolution operation array is i×j×k, the i direction is the input direction, the j direction is the convolution window direction, and the convolution operation units in the k direction are arranged according to the bits w_ji,k of the weight w_ji from low to high; each convolution operation unit (i, j, k) comprises a current Ix_i, a switch, at least one control signal, a node a_ji,k, and at least one capacitor; the capacitance value in the k-direction convolution operation units decreases by 1/2 per bit, the capacitance value in the latter convolution operation unit being 1/2 of that of the former, with the capacitance of k=1 being C_u and the capacitance of the k=B convolution operation unit being C_u/2^(B-1);
the control signal controls the on/off of the current Ix_i: when the control signal is on, the switch is closed and the current Ix_i integrates in the capacitor through the node a_ji,k, the voltage of the node a_ji,k above the capacitor in the convolution operation units of different k being related to the capacitance in the unit; when the control signal is off, the charge integrated by the current Ix_i is 0 and the voltage of the node a_ji,k above the capacitor is 0; the voltage is the multiplication result x_i*w_ji,k*2^(k-1);
the y_j is obtained by shorting all the internal nodes a_ji,k of the convolution operation units of one i×k plane, so that charge sharing takes place between the capacitors in each convolution operation unit; the obtained voltage of the combined node is the output of the convolution operation.
2. The computing module of claim 1, wherein the combined voltage of the 1×k convolution operation units corresponding to the digital input x_i is the result of x_i·w_ji; the voltage at the combined node of the convolution operation units of one i×k plane is Σ_i x_i·w_ji, and the output y_j completes the convolution of the convolution kernel with the input matrix.
3. The computing module of claim 2, wherein the control signal is always on or always off.
4. The computing module of claim 3, wherein the current Ix_i is mirrored or copied into the convolution operation array so that the corresponding currents of a j×k plane are identical; the current Ix_i may be scaled in the digital-to-analog converter.
5. The computing module of claim 4, wherein the 1×k operation units corresponding to each current Ix_i are arranged in the k direction, the capacitance of the next capacitor being half that of the previous one; after the currents in the capacitors have passed through the same integration time, for adjacent convolution operation units whose w_ji,k are both 1, the voltage across the capacitor at the latter position is 2 times that at the former position; the integration of all capacitors starts and ends simultaneously.
6. The computing module of claim 5, wherein the capacitors may be replaced with resistors, the resistance value of the k-direction resistors increasing by a factor of 2 per bit.
7. The computing module of any one of claims 1 to 6, wherein the switch is a virtual switch or a current transformer that reduces kickback or transient effects on a current mirror.
8. The computing module of claim 7, wherein, when the accumulated voltage swing at the combining node is higher than the input range of the analog-to-digital converter ADC, an attenuation capacitor is connected in parallel before the output y_j is connected to the analog-to-digital converter ADC, so as to adjust the full-scale range of the accumulated voltage.
9. The computing module of claim 8, wherein the convolution operation array may add a bias, comprising:
a bias operation array consisting of a plurality of bias operation units, the scale of the bias operation array being j×k, each bias operation unit (j, k) comprising a current I_b, a switch, at least one control signal, a node a_j,k, and at least one capacitor;
the current I_b is a fixed current input in addition to the currents Ix_i and is transmitted in the circuit; each I_b corresponds to k bias operation units;
b_j,k is the value of the kth bit of the multi-bit binary bias b_j, 0 or 1; the capacitance in the k-direction bias operation units decreases by 1/2 per bit; for the bias bits b_j,k of the bias b_j, the voltage of the combined node is
Σ(k=1..B) b_j,k · 2^(k-1)
where B is the highest bit of the binary representation;
y_j accumulates the voltage sum of all nodes a_j,k of the 1×k group of units.
10. A multi-bit convolution analog operation method based on current integration and charge sharing with variable capacitance values, characterized by comprising the following steps:
a DAC converts a digital input x_i of a given number of bits into a current Ix_i, an analog signal transmitted in the circuit;
w_ji,k is the value of the weight w_ji at the kth bit, k ∈ [1, B], where B is the highest bit of the binary representation; each bit w_ji,k corresponds to one convolution operation unit and w_ji,k is 0 or 1; the convolution operation units in the k direction are arranged according to the bits w_ji,k of the weight w_ji from low to high, where j indicates that the weight belongs to the jth window;
the capacitance value in the k-direction convolution operation units decreases by 1/2 per bit, the capacitance value in the next-bit convolution operation unit being 1/2 of that of the previous bit, with the capacitance of k=1 being C_u and the capacitance of the k=B convolution operation unit being C_u/2^(B-1);
before the current Ix_i flows through the switch in a convolution operation unit, a control signal controls the on/off of the current Ix_i: when the control signal is on, the switch is closed and the current Ix_i integrates in the capacitor through the node a_ji,k, the voltage of the node a_ji,k above the capacitor in the convolution operation units of different k being related to the capacitance in the unit; when the control signal is off, the charge integrated by the current Ix_i is 0 and the voltage of the node a_ji,k above the capacitor is 0; the voltage is the multiplication result x_i*w_ji,k*2^(k-1);
after the currents Ix_i have integrated in the capacitors for the same integration time, all the internal nodes a_ji,k of the convolution operation units of one i×k plane are shorted, charge sharing between the capacitors of each convolution operation unit is realized, and the obtained voltage of the combined node is the convolution output result y_j.
11. The operation method of claim 10, wherein, before the analog-to-digital converter ADC that outputs the y_j is connected, an attenuation capacitor is connected in parallel to adjust the full-scale range of the accumulated voltage so that the accumulated voltage swing at the combining node is below the input range of the analog-to-digital converter ADC.
CN202010261165.XA 2020-04-03 2020-04-03 Multi-bit convolution operation module with variable capacitance, current integration and charge sharing Active CN111611529B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010261165.XA CN111611529B (en) 2020-04-03 2020-04-03 Multi-bit convolution operation module with variable capacitance, current integration and charge sharing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010261165.XA CN111611529B (en) 2020-04-03 2020-04-03 Multi-bit convolution operation module with variable capacitance, current integration and charge sharing

Publications (2)

Publication Number Publication Date
CN111611529A CN111611529A (en) 2020-09-01
CN111611529B true CN111611529B (en) 2023-05-02

Family

ID=72199328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010261165.XA Active CN111611529B (en) 2020-04-03 2020-04-03 Multi-bit convolution operation module with variable capacitance, current integration and charge sharing

Country Status (1)

Country Link
CN (1) CN111611529B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113314174B (en) * 2021-05-06 2023-02-03 安徽大学 Circuit structure for column shift multi-bit multiplication binary decomposition operation of SRAM array
CN115048075A (en) * 2022-04-27 2022-09-13 北京大学 SRAM (static random Access memory) storage and calculation integrated chip based on capacitive coupling

Citations (3)

Publication number Priority date Publication date Assignee Title
CN110008440A (en) * 2019-04-15 2019-07-12 合肥恒烁半导体有限公司 A kind of convolution algorithm and its application based on analog matrix arithmetic element
CN110288510A (en) * 2019-06-11 2019-09-27 清华大学 A kind of nearly sensor vision perception processing chip and Internet of Things sensing device
CN110543933A (en) * 2019-08-12 2019-12-06 北京大学 Pulse type convolution neural network based on FLASH memory array

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
JP6703265B2 (en) * 2016-06-27 2020-06-03 富士通株式会社 Neural network device and method for controlling neural network device

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN110008440A (en) * 2019-04-15 2019-07-12 合肥恒烁半导体有限公司 A kind of convolution algorithm and its application based on analog matrix arithmetic element
CN110288510A (en) * 2019-06-11 2019-09-27 清华大学 A kind of nearly sensor vision perception processing chip and Internet of Things sensing device
CN110543933A (en) * 2019-08-12 2019-12-06 北京大学 Pulse type convolution neural network based on FLASH memory array

Also Published As

Publication number Publication date
CN111611529A (en) 2020-09-01

Similar Documents

Publication Publication Date Title
CN111144558B (en) Multi-bit convolution operation module based on time-variable current integration and charge sharing
US20220351761A1 (en) Sub-cell, Mac array and Bit-width Reconfigurable Mixed-signal In-memory Computing Module
CN110209375B (en) Multiply-accumulate circuit based on radix-4 coding and differential weight storage
US20180211154A1 (en) Multiplier accumurator, network unit, and network apparatus
CN111611529B (en) Multi-bit convolution operation module with variable capacitance, current integration and charge sharing
US11809837B2 (en) Integer matrix multiplication based on mixed signal circuits
CN107545305B (en) CMOS (complementary metal oxide semiconductor) process-based digital-analog mixed charge domain neuron circuit
CN115048075A (en) SRAM (static random Access memory) storage and calculation integrated chip based on capacitive coupling
CN113627601B (en) Subunit, MAC array and bit width reconfigurable analog-digital mixed memory internal computing module
US5563544A (en) Computational circuit
CN111611528B (en) Multi-bit convolution operation module with variable current value, current integration and charge sharing
Lee et al. A charge-sharing based 8t sram in-memory computing for edge dnn acceleration
CN114330694A (en) Circuit and method for realizing convolution operation
CN115691613B (en) Charge type memory internal calculation implementation method based on memristor and unit structure thereof
Gopal et al. An ultra low-energy DAC for successive approximation ADCs
Yu et al. A 4-bit mixed-signal MAC array with swing enhancement and local kernel memory
US5600270A (en) Computational circuit
EP3674991A1 (en) Multibit neural network
US20220416801A1 (en) Computing-in-memory circuit
CN115906976A (en) Full-analog vector matrix multiplication memory computing circuit and application thereof
CN113741857A (en) Multiply-accumulate operation circuit
Lim et al. AA-ResNet: Energy efficient all-analog ResNet accelerator
CN114822638A (en) Computing device and computing method
CN112784971A (en) Neural network operation circuit based on digital-analog hybrid neurons
Youssefi et al. Efficient mixed-signal synapse multipliers for multi-layer feed-forward neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant