CN111611529A - Current integration and charge sharing multi-bit convolution operation module with variable capacitance capacity - Google Patents

Current integration and charge sharing multi-bit convolution operation module with variable capacitance capacity

Info

Publication number
CN111611529A
CN111611529A
Authority
CN
China
Prior art keywords
convolution operation
current
bit
operation unit
capacitor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010261165.XA
Other languages
Chinese (zh)
Other versions
CN111611529B (en)
Inventor
阿隆索·莫尔加多
刘洪杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Jiutian Ruixin Technology Co ltd
Original Assignee
Shenzhen Jiutian Ruixin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jiutian Ruixin Technology Co ltd filed Critical Shenzhen Jiutian Ruixin Technology Co ltd
Priority to CN202010261165.XA priority Critical patent/CN111611529B/en
Publication of CN111611529A publication Critical patent/CN111611529A/en
Application granted granted Critical
Publication of CN111611529B publication Critical patent/CN111611529B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/15Correlation function computation including computation of convolution operations
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Analogue/Digital Conversion (AREA)

Abstract

The invention relates to an analog operation module, in particular to an analog module for convolution operations, and provides a group of analog multipliers and accumulators (MACs). The module performs convolution at higher speed with a common unit clock, and area can be traded for delay or acceleration. The concept applies to multi-bit convolutions with an adjustable binary bit width and can implement a general convolution with two or more inputs. In particular, an array of offset (bias) operation units may be added. The invention can serve as a neural network convolution operation unit, or as an in-memory or near-memory operation unit realized in operation-accelerator hardware.

Description

Current integration and charge sharing multi-bit convolution operation module with variable capacitance capacity
Technical Field
The present invention relates to analog computation modules, and particularly to an analog computation module for convolution operations and an analog computation method for convolution operations.
Background
For quantization at low signal-to-noise ratios, analog operation is more efficient than conventional digital operation, so digital quantities are usually converted into analog quantities before computation. This is especially true for neural networks: compared with medium- and large-scale digital hardware implementations, the operation energy of an analog implementation is lower. In a conventional system, data are stored on disk and must be fetched into memory before computation, and this process requires a large amount of I/O to conventional memory, which usually dominates power consumption. With analog in-memory or near-memory operation, the computation is moved to where the data reside and executed locally, which greatly increases operation speed, saves storage area, and reduces data-transfer and operation power. The present invention provides an effective realization of ultra-low-power analog in-memory or near-memory operation.
The recent paper "A Mixed-Signal Binary Weighted Storage and Multiplication for Reduced Data Movement", Symp. VLSI Circuits, pp. 141-142, 2018, proposes a binarized in-memory or near-memory analog operation scheme with efficient performance: a static random access memory (SRAM) cell stores a 1-bit weight and performs a mixed-signal convolution on the input, greatly improving operation capability and reducing storage area. However, the analog operation circuit in this background document does not involve changing the bit width of the multiplier or multiplicand weights; it is limited to 1-bit multiplication inputs in the first-order layer and cannot be used for convolution analog operation on multi-bit binary numbers.
Very few multi-bit operations involve changes in the weight bits of the multiplier or multiplicand, as in the following articles:
(1) "In-Memory Computation of a Machine-Learning Classifier in a Standard 6T SRAM Array", JSSC, pp. 915-924, 2017; (2) "A 481pJ/decision 3.4Mdecision/s Multifunctional Deep In-Memory Inference Processor Using Standard 6T SRAM Array", arXiv:1610.07501, 2016; (3) "A Microprocessor Implemented in 65nm CMOS with Configurable and Bit-scalable Accelerator for Programmable In-memory Computing", arXiv:1811.04047, 2018; (4) "A Twin-8T SRAM Computation-In-Memory Macro for Multiple-Bit CNN-Based Machine Learning", ISSCC, pp. 396-398, 2018; (5) "A 42pJ/Decision 3.12TOPS/W Robust In-Memory Machine Learning Classifier with On-Chip Training", ISSCC, pp. 490-491, 2018.
However, these multi-bit operations are implemented by modulating a control bus in the current domain, by capacitive charge sharing, by pulse-width modulation (PWM), by modifying SRAM cells, or by complex digital matrix-vector processing combined with near-memory operation. In all of these implementations, the multi-bit analog multiplier-accumulator relies on very complicated digital processing control; yet for quantization at low signal-to-noise ratios, conventional digital operation consumes far more energy than analog operation, so multi-bit operation under such digital control incurs a large operation energy cost.
In the XOR stage of the binarized convolution proposed in CN201910068644, the potential change is realized by modulating a control bus in the SRAM; however, the scheme taught by that application requires complex digital processing control, places high demands on the control module, and consumes excessive energy. A solution that achieves ultra-low power consumption by applying analog convolution operation to low signal-to-noise-ratio signals is therefore needed in the art.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a multi-bit binary convolution analog operation module based on current integration with variable capacitor capacitance and charge sharing, which has ultra-low power consumption, a compact structure and a fast operation speed, supports general convolution of two or more inputs with an adjustable binary bit width, and can in particular serve as a neural network convolution operation unit or as a unit for analog in-memory operation realized in operation-accelerator hardware.
In addition to these advantages, implementing the module on matrix cells is well suited to in-memory or near-memory convolution operation units, reducing the power of memory-access-related processes and making the physical matrix implementation more compact. To achieve the above objects, the following technical solution is adopted:
Based on the above, the invention provides a multi-bit convolution operation module based on current integration with variable capacitor capacitance and charge sharing. The module includes: at least one digital input x_i; at least one digital-to-analog converter (DAC) that converts the digital input x_i, according to a given number of bits, into a current Ix_i transmitted in the circuit; at least one weight w_ji, where, when the weight is expressed as a binary number, w_ji,k is its value at the k-th bit; a convolution operation array composed of a plurality of convolution operation units 102, the array completing the multiplication and addition of the convolution operation; and at least one output y_j:
y_j = ∑_{i=1}^{N} x_i · w_ji
Further, the current Ix_i is obtained by the DAC converting the digital input x_i according to the DAC's given number of bits; the current Ix_i is mirrored or copied into the convolution array, one current Ix_i corresponding to j × k convolution operation units.
In particular, the convolution operation array has size i × j × k, and each convolution operation unit (i, j, k) includes the current Ix_i, a switch, at least one control signal, a node a_ji,k and at least one capacitor.
In particular, for the same current Ix_i, as the capacitance value decreases bit by bit by 1/2, the charge integrated in each capacitor is the same but the resulting voltage across the capacitor varies with the capacitance value. For the weight w_ji, w_ji,k is the value of the binary representation of w_ji at the k-th bit, k ∈ [1, B].
Further, the currents in the same j × k plane are identical, so the convolution operation array accepts multi-bit signal inputs. The current Ix_i can be scaled in the DAC, and the current Ix_i reaches every switch at the same time, i.e., for every capacitor into which the current Ix_i is integrated, integration starts at the same time and ends at the same time.
Further, for each current Ix_i, the capacitance of the corresponding operation array decreases bit by bit by 1/2 in the k direction. For a binary weight w_ji (j indicating the weight index of the j-th window), w_ji,k is the value of w_ji at the k-th bit, each w_ji,k corresponds to one convolution operation unit, k ∈ [1, B], and the capacitance of one convolution operation unit is 2 times that of the next. For example, if the lowest bit k = 1 has capacitance Cu in its convolution operation unit, then the highest bit k = B has capacitance Cu/2^(B-1) in its convolution operation unit.
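As an illustration of this capacitance ladder, the sketch below (Python; the function name and the example unit value are illustrative and not taken from the patent) lists the capacitor value assigned to each weight bit under the rule C_k = Cu/2^(k-1).

```python
# Illustrative sketch only: per-bit capacitance assignment C_k = Cu / 2^(k-1),
# with bit k = 1 (LSB) using the full unit capacitance Cu and bit k = B (MSB)
# using Cu / 2^(B-1). Names and values are examples, not the patent's.

def bit_capacitances(cu_farads, num_bits):
    """Capacitor value for each weight bit k = 1..B (LSB first)."""
    return [cu_farads / 2 ** (k - 1) for k in range(1, num_bits + 1)]

if __name__ == "__main__":
    Cu = 1.0e-15  # example unit capacitance (1 fF), chosen only for illustration
    for k, c in enumerate(bit_capacitances(Cu, 4), start=1):
        print(f"bit k={k}: C = {c:.2e} F")
```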
Further, when the switch is closed, the current Ix_i enters and is integrated in capacitors of different capacitance.
In particular, the closing of the switch is controlled by a control signal that is either always on or always off: when w_ji,k = 1, the control signal in that convolution operation unit is always on and the switch is closed; when w_ji,k = 0, the control signal in that convolution operation unit is always off and the switch is open.
Further, as described above, assume w_ji,1 = w_ji,B = 1. After the currents have been integrated in the capacitors for the same integration time, the amount of charge stored in each capacitor is the same, and the voltage across the capacitor at bit k = B is 2^(B-1) times the voltage across the capacitor at bit k = 1.
Further, the voltage at node a_ji,k is the result of the multiplication x_i · w_ji,k · 2^(k-1); its value is determined by the capacitance of the capacitor connected to the node and by w_ji,k. The 1 × k convolution operation units corresponding to x_i perform the operation x_i · w_ji.
Further, y_j is obtained from all the a_ji,k of a given j, i.e. of an i × k plane. Owing to capacitor discharge, the capacitors in different convolution operation units share charge through their connected nodes; after charge sharing is finished the charge in each capacitor is equalized, while the total charge obtained from integrating the currents Ix_i in the multiplication stage is unchanged. The accumulated voltage at the combined node is the output of the convolution operation, and one i × k plane of convolution operation units performs the multiplication of a convolution kernel with the input matrix in the convolution process.
Further, for a neural network convolution operation unit an offset (bias) is typically added. In the invention, the offset b_j is converted into an additional fixed current I_b input alongside the given currents Ix_i, and additional offset operation units are added to perform this operation independently. The offset operation unit array has size j × k, and each offset operation unit (j, k) includes the current I_b, a switch, a node a_j,k and at least one capacitor.
Further, the offset b_j of y_j is accumulated as the sum of the voltages at all nodes a_j,k of the 1 × k offset units.
Further, to reduce kickback or transient effects on the current mirror, the switch may be a virtual switch, a current device, or another non-switching element.
Further, an attenuation capacitor C_att is added when the combined nodes are connected, so that the full-scale range of the accumulated voltage is adjusted and the accumulated voltage is scaled into a range that satisfies the input range of the analog-to-digital converter.
The invention also provides a multi-bit convolution analog operation method based on current integration with variable capacitor capacitance and charge sharing, comprising the following steps. The DAC converts the digital input x_i, according to a given number of bits, into an analog current Ix_i transmitted in the circuit. w_ji,k is the value of the weight w_ji at the k-th bit, k ∈ [1, B], where B is the highest bit of the binary number; each bit w_ji,k corresponds to one convolution operation unit and w_ji,k is 0 or 1. The convolution operation units in the k direction are arranged from the low bit to the high bit of the weight w_ji. The capacitance values in the k-direction convolution operation units decrease by 1/2 bit by bit: the capacitance of the next-bit convolution operation unit is 1/2 of the previous one, the capacitor at k = 1 has capacitance C_u, and the capacitor at k = B has capacitance C_u/2^(B-1). Before the current Ix_i flows through the switch in a convolution operation unit, a control signal controls the integration of Ix_i: when the control signal is on, the switch is closed and the current Ix_i enters through node a_ji,k and is integrated in the capacitor, so that in convolution operation units of different k the voltage at the node a_ji,k above the capacitor changes with the unit capacitance by a factor of 2^(k-1); when the control signal is off, the charge integrated from Ix_i is 0 and the voltage at node a_ji,k above the capacitor is 0. The node voltage is the multiplication result x_i · w_ji,k · 2^(k-1). After the currents Ix_i have been integrated in the capacitors for the same time, all nodes a_ji,k in the convolution operation units of one i × k plane are shorted, the capacitors in the convolution operation units share charge, and the resulting voltage at the combined node is the convolution output result y_j.
Further, to accommodate changes in the number of bits of the digital input x_i, the resolution of the DAC is adjusted before the DAC converts the digital input x_i.
Further, before the output y_j is connected to the ADC, an attenuation capacitor is connected in parallel to adjust the full-scale range of the accumulated voltage, so that the accumulated voltage swing at the combined node is lower than the input range of the analog-to-digital converter.
Drawings
FIG. 1 is a schematic diagram of a multiplication phase circuit implementation according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an implementation of the addition-stage circuit based on charge redistribution according to an embodiment of the present invention (the ADC is not shown; if the output y_j needs to be converted to a digital output, an ADC can be added before each output y_j);
FIG. 3 is a schematic diagram illustrating an implementation of adding an offset unit to a convolution operation according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating an accumulation process in an offset cell array according to an embodiment of the present invention.
Description of the main elements
Module 10
Digital-to-analog converter 101
Convolution operation unit 102
Switch 1021
Capacitor 1022
Control signal 103
Offset operation array 104
Offset operation unit 1041
Attenuation capacitor 105
Digital input x_i
Weight w_ji
Current Ix_i
Detailed Description
In order to make the objects, principles, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings and embodiments.
It is to be understood that the specific embodiments described herein are for purposes of illustration, but the invention may be practiced otherwise than as specifically described and that there may be variations which will occur to those skilled in the art without departing from the spirit of the invention and therefore the scope of the invention is not limited to the specific embodiments disclosed below.
Referring to fig. 1, for one general convolution operation as follows:
binary number x of multiple bitsiAn input matrix of i from 1 to N; weight wjiA formed convolution kernel, j represents a corresponding jth window after i is determined; when the input forms an input matrix of n x n and the convolution kernel is a weight matrix of m x m, j is 1-n-m +1 (n)>m, the window moves); the output is yjAll of yjForming a convolution operation result, namely extracting a layer of neural network features;
w isjiWhen represented as a binary number of multiple bits, wji,kIs wjiValue at k-th bit, two multi-bit binary ∑ xi*wjiThe convolution operation process is divided into two stages:
a multiplication stage: input xiMultiplied by a weight wjiEach bit of (a) is multiplied by the bit weight of the bit 2(k-1)I.e. xi*wji,k*2(k-1)And wji,kIs 0 or 1.
And (3) addition stage: accumulating and summing the result of each multiplication operation in the multiplication stage to obtain an output yj
When the inventive module is used for convolution calculation of a neural network, the multiplication stage weight wjiThe weight matrix is constructed to share that when j changes from 1 to n-m +1, the output yj is w under the size determination of the convolution kernelj1=wj2=wj3=...=wji。
For the above-mentioned convolution operation with multi-bit binary, the present invention needs to solve the bit weight change and the addition phase of the accumulation of the multiplication result when the multiplicand is multiplied by each bit of the multiplier in the multiplication phase.
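The arithmetic that these two stages must reproduce can be checked with a short sketch (pure Python, no circuit modelling; function names are illustrative, not taken from the patent): the multiplication stage produces one partial product x_i · w_ji,k · 2^(k-1) per bit, and the addition stage sums them back into x_i · w_ji.

```python
# Minimal arithmetic check of the two-stage decomposition described above.

def weight_bits(w, num_bits):
    """Bits w_k of a non-negative weight, k = 1..B with k = 1 the LSB."""
    return [(w >> (k - 1)) & 1 for k in range(1, num_bits + 1)]

def multiply_stage(x, w, num_bits):
    """Partial products x * w_k * 2^(k-1), one per convolution operation unit."""
    return [x * wk * 2 ** (k - 1) for k, wk in enumerate(weight_bits(w, num_bits), 1)]

def addition_stage(partial_products):
    """Accumulate the partial products of the multiplication stage."""
    return sum(partial_products)

if __name__ == "__main__":
    x, w, B = 5, 11, 4            # example input, weight and bit width
    assert addition_stage(multiply_stage(x, w, B)) == x * w
    print(multiply_stage(x, w, B), "->", x * w)
```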
The embodiment of the invention provides an operation module 10 that realizes the above multi-bit convolution operation based on current integration with variable capacitor capacitance and charge accumulation. The module 10 comprises: at least one digital input x_i; at least one digital-to-analog converter (DAC) 101 that converts the digital input into a current Ix_i transmitted in the circuit; at least one weight w_ji, where, when the weight is expressed as a binary number, w_ji,k denotes its value at the k-th bit; and a convolution operation array composed of a plurality of convolution operation units 102, the array having size i × j × k. Each convolution operation unit 102 (i, j, k) includes the current Ix_i, a switch 1021, at least one control signal 103, a node a_ji,k and at least one capacitor 1022; the negative terminal of the capacitor 1022 is grounded, and the capacitor 1022 must be reset to a given DC voltage before the convolution operation. The array performs the multiplication and addition of the convolution operation and produces at least one output y_j.
In the multiplication stage, the capacitance 1022 decreases by 1/2 as the bit index k increases. In this embodiment, the convolution operation unit 102 for in-memory or near-memory convolution is implemented on matrix cells, which reduces the power of memory-access-related processes and makes the physical matrix implementation more compact. Specifically, referring to FIG. 1, the digital-to-analog converter 101 of the module 10 converts the digital input x_i, with a given number of binary bits, into an analog current Ix_i; in some embodiments, the resolution of the DAC may be adjusted before conversion to match the number of bits of the digital input x_i. The current Ix_i is mirrored or copied by a current mirror into the j × k convolution operation units 102 corresponding to the same i; in the k direction, the convolution operation units are arranged from the low bit to the high bit of w_ji,k, and the capacitance 1022 decreases by 1/2 with the weight bit position. The current integration of cells in different i × k planes (the j direction) can therefore start simultaneously and end simultaneously. In a further embodiment, the current Ix_i produced by the DAC can be scaled in the DAC before being transmitted in the circuit, as required, so that the current value does not exceed a given threshold and the transmission power loss is reduced. The current Ix_i then reaches the switch 1021, which may be a virtual switch, a current device, or a non-switching element such as a current or virtual load, to mitigate kickback or transient effects on the current mirror.
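A simple behavioral view of this input conversion and optional scaling is sketched below; the LSB current value and the threshold are placeholder numbers chosen for illustration, not values taken from the patent.

```python
# Behavioral sketch of the DAC step: the digital input x_i sets the current
# Ix_i proportionally, and the current can be scaled down in the DAC so that
# it stays below a chosen threshold. All numeric values are placeholders.

def dac_current(x_i, i_lsb=1.0e-9, i_max=100.0e-9):
    """Return (Ix_i, scale) with Ix_i = scale * x_i * i_lsb kept <= i_max."""
    ix = x_i * i_lsb
    scale = 1.0 if ix <= i_max else i_max / ix
    return ix * scale, scale

if __name__ == "__main__":
    for x in (3, 50, 200):
        ix, s = dac_current(x)
        print(f"x_i={x}: Ix_i={ix:.2e} A (scale={s:.3f})")
```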
The decreasing capacitance allocation for the capacitors 1022 could be implemented by connecting several unit capacitors 1022 in parallel, so that the physical implementation keeps a higher speed with the same unit clock; however, this approach tends to introduce interconnect capacitive parasitics. In the module 10 of the invention, for each current Ix_i, the capacitance 1022 of the corresponding 1 × j × k operation array decreases by 1/2 as k increases: the capacitor 1022 in the convolution operation unit 102 of the next weight bit is directly made half of that of the previous bit, so that the module 10 keeps a higher speed and the same unit clock during operation. For a binary weight w_ji (j indicating the weight index of the j-th window), w_ji,k is the value of w_ji at the k-th bit, each w_ji,k corresponds to one convolution operation unit 102, and k ∈ [1, B], where B is the highest bit of the binary number. The capacitance 1022 of one convolution operation unit 102 is 2 times that of the next bit: for k = 1, 2, 3 the capacitance values in the corresponding convolution operation units 102 are Cu, 1/2 Cu and 1/4 Cu, respectively, and the capacitance 1022 in the convolution operation unit 102 of the highest bit k = B is Cu/2^(B-1). In this embodiment, since there is no precise requirement on the current integration time, the weight w_ji does not require complex digital processing control: the integration of the current is controlled by the control signal 103, which is always on or always off. Specifically, when bit w_ji,k is 1, the control signal 103 is on and the switch is closed, so the current is integrated in the capacitor 1022 and produces a voltage across it; when bit w_ji,k is 0, the control signal 103 is off and the switch is open, so the current does not enter the capacitor for integration and the voltage across the capacitor remains 0.
For example, assume w_ji,1 = w_ji,2 = w_ji,3 = 1 (with the same subscripts i and j). For k = 1, 2 and 3 the capacitances of the capacitors 1022 in the convolution operation units 102 are Cu, 1/2 Cu and 1/4 Cu respectively, and in general the capacitance of the capacitor 1022 in the k-th convolution operation unit 102 is Cu/2^(k-1). After the current Ix_i has been integrated in the three capacitors 1022 for the same time, the amounts of charge integrated from Ix_i in the three capacitors are the same; since for equal charge the voltage across a capacitor 1022 is inversely proportional to its capacitance, the voltages across the corresponding capacitors 1022 are U, 2U and 4U, i.e., the voltage of the capacitor 1022 in the k-th convolution operation unit 102 is 2^(k-1) times the voltage of the capacitor 1022 in the unit with k = 1. This realizes the multiplication of the input x_i by each bit of the weight w_ji (the multiplier), i.e., the change of the multiplicand with the weight bit. It should be noted that w_ji,k = 1 was assumed above only to illustrate the relationship between capacitance and voltage; in fact, regardless of whether w_ji,k is 0 or 1, the current Ix_i has the same integration time in every convolution operation unit 102. The only difference is that w_ji,k = 0 corresponds to integrating a current of value 0 in the convolution operation unit 102, while w_ji,k = 1 corresponds to integrating the current in the capacitance Cu/2^(k-1); the capacitance 1022 in each convolution operation unit 102 decreases bit by bit by 1/2 and does not depend on whether w_ji,k is 0 or 1.
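The relationship used in this example (same integrated charge, capacitance halved per bit, voltage doubled per bit) can be reproduced with a few lines; the current, time and capacitance values below are placeholders, not values taken from the patent.

```python
# Sketch of the multiplication-stage voltages for one input x_i: with the same
# integrated charge Q = Ix_i * T in every enabled unit, the node voltage
# V_k = Q / C_k doubles for each higher bit because C_k = Cu / 2^(k-1).
# Numeric values are placeholders chosen only for illustration.

Cu = 1.0e-15          # unit capacitance (example)
Ix = 0.5e-9           # mirrored input current Ix_i (example)
T = 1.0e-6            # common integration time (example)
w_bits = [1, 1, 1]    # w_ji,1..w_ji,3, as in the example above

Q = Ix * T            # charge integrated in every unit whose switch is closed
for k, wk in enumerate(w_bits, start=1):
    Ck = Cu / 2 ** (k - 1)
    Vk = wk * Q / Ck  # 0 V when w_ji,k = 0 (no integration)
    print(f"k={k}: C={Ck:.2e} F, node voltage a_ji,k = {Vk:.3f} V")
# The voltages are U, 2U, 4U with U = Q/Cu (0.5, 1.0, 2.0 V for these numbers).
```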
The voltage at the node a_ji,k is the result of the multiplication x_i · w_ji,k · 2^(k-1); its value is determined by the capacitance of the capacitor 1022 connected to the node and by w_ji,k, and it therefore equals the voltage across the capacitor 1022. With the negative terminal of the capacitor 1022 grounded, the positive plate of the capacitor 1022 is connected to the node a_ji,k and carries its voltage.
In the addition stage, the convolution output is obtained by sharing the charge stored in the capacitors. After all convolution operation units 102 have completed the current integration of the multiplication stage, for j = 1 the k units corresponding to x_1 have completed one x_1 · w_11 operation: decomposing the binary multiplication x_1 · w_11, the input x_1 is multiplied by each bit w_ji,k of the weight w_11 together with that bit's weight 2^(k-1), and the partial results are then added. Likewise, the k units corresponding to x_i complete one x_i · w_1i operation, so that for j = 1 and i ∈ N, all the 1 × i × k arrays together complete the multiplications of one convolution window. In the i × 1 × k array of the invention, the node a_ji,k of each unit holds the voltage of one multiplication step x_i · w_ji,k · 2^(k-1). After the multiplication is completed, the capacitors 1022 are shorted together: shorting the nodes a_ji,k above all capacitors 1022 in the array corresponding to j = 1 causes the capacitors 1022 in the different convolution operation units 102 to share charge through their connected nodes, owing to capacitor discharge. After charge sharing is finished the charge in each capacitor 1022 is equalized, while the total charge obtained from current integration in the multiplication stage is unchanged, and the accumulated voltage at the combined node is the output of the convolution operation, i.e., the output y_1. Similarly, shorting the nodes a_ji,k of the i × k arrays corresponding to the other j, by connecting their capacitors 1022 in parallel, yields the other outputs y_j according to Equation 1 below. Note that with weight-matrix sharing in a convolutional neural network the convolution kernels of different windows are identical, i.e., when computing the convolution results of different windows the multiplicand (the weight matrix formed by w_ji) is the same, w_1i = w_2i = w_3i = ... = w_ji, which reduces the number of parameters involved in the operation.
y_j = ∑_{i=1}^{N} x_i · w_ji    (Equation 1)
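Numerically, Equation 1 is the dot product recovered from all the per-bit partial results. The sketch below checks that summing the per-node multiplication results x_i · w_ji,k · 2^(k-1) over one i × k plane gives ∑_i x_i · w_ji; it models the stated input/output relationship of the addition stage, not the charge-sharing circuit itself, and all names and values are illustrative.

```python
# Behavioral check of the addition stage for one window j (Equation 1).

def y_j_from_nodes(xs, ws, num_bits):
    total = 0
    for x, w in zip(xs, ws):                       # loop over inputs i
        for k in range(1, num_bits + 1):           # loop over weight bits k
            w_k = (w >> (k - 1)) & 1
            total += x * w_k * 2 ** (k - 1)        # node a_ji,k contribution
    return total

if __name__ == "__main__":
    xs = [3, 7, 2, 5]        # example inputs x_i of one window
    ws = [11, 4, 9, 6]       # example weights w_ji of the convolution kernel
    B = 4
    assert y_j_from_nodes(xs, ws, B) == sum(x * w for x, w in zip(xs, ws))
    print("y_j =", y_j_from_nodes(xs, ws, B))
```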
Optionally, the output signal is converted. After the convolution operation array has performed the accumulation of the analog multiplications, the output is an analog signal; when a digital output is required, an analog-to-digital converter (ADC) is added before the output, and the resulting output y_j is a digital signal. For example, when the convolution module is applied to a convolutional neural network, the digital output y_j can be used as the digital input to the convolution operation array for the convolution of the next network layer. Furthermore, if the accumulated voltage swings above, or is too high for, the input range of the analog-to-digital converter, the unit capacitance C_u used in the multiplication stage of FIG. 1 could be increased; however, the number of capacitors required by each group of convolution operation units 102 would then grow and a larger physical area would be needed, which is unfavorable for device miniaturization. It is therefore contemplated that an additional attenuation capacitor C_att be connected to the combined node at the same time the nodes are combined, so as to adjust the full-scale range of the accumulated voltage and scale it into a range that satisfies the input range of the analog-to-digital converter. Whenever y_j is output, the node a_att,j above the attenuation capacitor 105 is connected to the original nodes a_ji,k; this solution makes more efficient use of the physically realized module area.
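The effect of the attenuation capacitor on the full-scale range can be estimated with simple charge conservation: assuming an initially discharged C_att is connected in parallel with the combined array capacitance C_total, the combined voltage is scaled by C_total/(C_total + C_att). The sketch below is an idealized estimate with placeholder values, not a value or formula quoted from the patent.

```python
# Idealized estimate of the attenuation-capacitor scaling (charge conservation
# when an initially discharged C_att is connected in parallel with the shared
# array capacitance). Values are placeholders for illustration only.

def attenuated_voltage(v_combined, c_total, c_att):
    """Voltage after connecting C_att: charge Q = V*C_total spreads over C_total + C_att."""
    return v_combined * c_total / (c_total + c_att)

if __name__ == "__main__":
    C_total = 8.0e-15   # total capacitance of one i x k plane (example)
    C_att = 8.0e-15     # attenuation capacitor sized for a 2x reduction (example)
    V = 1.6             # accumulated voltage before attenuation (example)
    print(f"scaled voltage = {attenuated_voltage(V, C_total, C_att):.3f} V")
```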
FIGS. 3 and 4 show an embodiment in which an offset operation unit 1041 is added when the convolution operation unit 102 is used for convolutional neural network operation. Adding an offset b to the convolution operation makes it more efficient and accurate, and typically a binary offset b_j is added to a given output y_j. The corresponding convolution output y_j then changes from Equation 1 to Equation 2 below.
y_j = ∑_{i=1}^{N} x_i · w_ji + b_j    (Equation 2)
FIG. 3 illustrates how this additional functionality is added in the multiplication stage. Since the bias bits are quantized in a manner similar to the weights in FIG. 1 or FIG. 2, the bias is implemented as an additional fixed-current input I_b alongside the given currents Ix_i.
In the invention, the offset b_j is converted into an additional fixed current I_b input alongside the given currents Ix_i, and additional offset operation units 1041 are added to perform this operation independently. The offset operation units 1041 form an offset operation array 104 of size j × k, and each offset operation unit 1041 (j, k) includes the current I_b, a switch 1021, a node a_j,k and at least one capacitor 1022, the current I_b being integrated in the capacitor 1022.
In the offset operation array 104 of the invention, for each current I_b the capacitance 1022 of the corresponding 1 × k offset units decreases by 1/2 as k increases, i.e., the capacitor 1022 in the next offset operation unit 1041 is directly made half of that of the previous bit, and the capacitance in the offset operation unit 1041 of the k-th bit is Cu/2^(k-1); the capacitors 1022 in the offset operation units 1041 of the same bit k are the same for different weights. As in the convolution array, the switch 1021 in the offset operation array 104 is controlled by a control signal 103 that is always on or always off: when b_j,k = 1, the control signal 103 in the offset operation unit 1041 is on and the bias current I_b is integrated in the capacitor 1022 of the offset operation unit 1041; when b_j,k = 0, the control signal 103 in the offset operation unit 1041 is off and the bias current I_b is not integrated in the capacitor 1022. As with the convolution operation array, the voltage across the capacitor 1022 is the multiplication-stage result of the offset operation unit 1041.
Fig. 4 illustrates that during the accumulation phase, an additional capacitor 1022 needs to be added for charge sharing and node accumulation.
Similarly, shorting the k unit nodes a_j,k corresponding to a given j causes the capacitors 1022 in the shorted array to share charge, owing to capacitor discharge. After sharing is finished the amount of charge stored in each capacitor 1022 is equalized while the total charge is unchanged, and the voltage of the resulting combined node is the sum of the voltages at the multiplication-stage result nodes, i.e., the offset b_j of y_j is accumulated as the sum of the voltages at all nodes a_j,k of the 1 × k offset units.
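Functionally, the offset array adds the binary value b_j = ∑_k b_j,k · 2^(k-1) (expressed in the units set by I_b) on top of the window result, giving Equation 2. A minimal arithmetic sketch of that behavior follows; names and values are illustrative, I_b is taken as one input unit, and no circuit-level detail is modelled.

```python
# Behavioral sketch of Equation 2: the offset array contributes the binary
# value b_j = sum_k b_j,k * 2^(k-1) on top of the window dot product.

def offset_value(b_bits):
    """b_j from its bits b_j,k, k = 1..B with k = 1 the LSB."""
    return sum(bk * 2 ** (k - 1) for k, bk in enumerate(b_bits, start=1))

def y_j_with_offset(xs, ws, b_bits):
    return sum(x * w for x, w in zip(xs, ws)) + offset_value(b_bits)

if __name__ == "__main__":
    xs, ws = [3, 7, 2, 5], [11, 4, 9, 6]
    b_bits = [1, 0, 1, 0]                 # b_j = 5
    print("y_j =", y_j_with_offset(xs, ws, b_bits))
```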
The invention also provides a multi-bit convolution analog operation method based on current integration with variable capacitor capacitance and charge sharing, comprising the following steps: the DAC converts the digital input x_i, according to a given number of bits, into an analog current Ix_i transmitted in the circuit; w_ji,k is the value of the weight w_ji at the k-th bit, k ∈ [1, B], where B is the highest bit of the binary number, each bit w_ji,k corresponds to one convolution operation unit and w_ji,k is 0 or 1; the convolution operation units in the k direction are arranged from the low bit to the high bit of the weight w_ji; the capacitance values in the k-direction convolution operation units decrease by 1/2 bit by bit, the capacitance of the next-bit convolution operation unit being 1/2 of the previous one, the capacitor at k = 1 having capacitance C_u and the capacitor at k = B having capacitance C_u/2^(B-1); before the current Ix_i flows through the switch in a convolution operation unit, a control signal controls its integration: when the control signal is on, the switch is closed and the current Ix_i enters through the node a_ji,k and is integrated in the capacitor, so that in convolution operation units of different k the voltage at the node a_ji,k above the capacitor changes with the unit capacitance by a factor of 2^(k-1); when the control signal is off, the charge integrated from Ix_i is 0 and the voltage at the node a_ji,k above the capacitor is 0; the node voltage is the multiplication result x_i · w_ji,k · 2^(k-1); after the currents Ix_i have been integrated in the capacitors for the same time, all nodes a_ji,k in the convolution operation units of one i × k plane are shorted, the capacitors in the convolution operation units share charge, and the resulting voltage of the combined node is the convolution output result y_j.
In a further embodiment, the resolution of the DAC is adjusted to accommodate changes in the number of bits of the digital input x_i before the DAC converts the digital input x_i. In addition, before the output y_j is connected to the ADC, an attenuation capacitor is connected in parallel to adjust the full-scale range of the accumulated voltage, so that the accumulated voltage swing at the combined node is lower than the input range of the analog-to-digital converter.
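Putting the steps of the method together, the following behavioral model takes the relationships stated above at face value (DAC current proportional to x_i, node voltage Ix_i · T · w_ji,k · 2^(k-1)/Cu, and an output proportional to the sum of the node voltages of one i × k plane). It is an idealized sketch with placeholder values and illustrative names, not a charge-sharing circuit simulation.

```python
# Idealized end-to-end sketch of the method for one window j, following the
# stated relationships: Ix_i = x_i * I_LSB, node voltage
# a_ji,k = Ix_i * T * w_ji,k * 2^(k-1) / Cu, output proportional to their sum.
# Placeholder values; not a circuit-level simulation.

I_LSB, T, CU = 1.0e-11, 1.0e-6, 1.0e-15  # example DAC LSB current, time, unit cap
B = 4                                     # weight bit width

def window_output(xs, ws):
    """Combined-node result for one window, in volts (proportional to y_j)."""
    total_v = 0.0
    for x, w in zip(xs, ws):
        ix = x * I_LSB                              # DAC conversion
        for k in range(1, B + 1):
            w_k = (w >> (k - 1)) & 1                # control signal on/off
            c_k = CU / 2 ** (k - 1)                 # per-bit capacitance
            total_v += w_k * ix * T / c_k           # node voltage a_ji,k
    return total_v

if __name__ == "__main__":
    xs, ws = [3, 7, 2, 5], [11, 4, 9, 6]
    v = window_output(xs, ws)
    ideal = sum(x * w for x, w in zip(xs, ws))
    # v equals ideal * (I_LSB * T / CU), i.e. it is proportional to y_j
    print(f"analog result {v:.3f} V  <->  digital y_j = {ideal}")
```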
It should be noted that, in the foregoing embodiment, each included module is only divided according to functional logic, but is not limited to the above division as long as the corresponding function can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (11)

1. A multi-bit convolution operation module with variable capacitor capacitance, current integration and charge sharing, characterized by comprising:
at least one digital input x_i, at least one digital-to-analog converter (DAC), at least one weight w_ji, a convolution operation array composed of a plurality of convolution operation units, and at least one output y_j;
the digital input x_i is converted by the DAC, according to a given number of bits, into an analog current Ix_i transmitted in the circuit;
for the weight w_ji, j indicates that the weight belongs to the j-th window, and w_ji,k is the value of the weight w_ji at the k-th bit, 0 or 1; each w_ji,k corresponds to one convolution operation unit, k ∈ [1, B], where B is the highest bit of the binary number;
the convolution operation array has size i × j × k, the i direction being the input direction and the j direction the convolution-window direction, and the convolution operation units in the k direction are arranged from the low bit to the high bit of the weight w_ji; each convolution operation unit (i, j, k) comprises the current Ix_i, a switch, at least one control signal, a node a_ji,k and at least one capacitor; the capacitance values in the k-direction convolution operation units decrease bit by bit by 1/2, the capacitance of the next-bit convolution operation unit being 1/2 of the previous one, the capacitor at k = 1 having capacitance C_u and the capacitor at k = B having capacitance C_u/2^(B-1);
the control signal controls the integration of the current Ix_i: when the control signal is on, the switch is closed and the current Ix_i enters through the node a_ji,k and is integrated in the capacitor, so that in convolution operation units of different k the voltage at the node a_ji,k above the capacitor changes with the unit capacitance by a factor of 2^(k-1); when the control signal is off, the charge integrated from Ix_i is 0 and the voltage at the node a_ji,k above the capacitor is 0; the node voltage is the multiplication result x_i · w_ji,k · 2^(k-1);
y_j is obtained by shorting all nodes a_ji,k in the convolution operation units of one i × k plane and sharing the charge among the capacitors of the convolution operation units; the resulting voltage of the combined node is the output result of the convolution operation.
2. The operation module of claim 1, wherein the combined voltage of the 1 × k convolution operation units corresponding to one digital input x_i is the result of x_i · w_ji, the voltage of the combined node of the convolution operation units of one i × k plane is the result of ∑ x_i · w_ji, and the output y_j completes the operation of the convolution process between the convolution kernel and the input matrix.
3. The operation module of claim 2, wherein the control signal is always on or always off.
4. The operation module of claim 3, wherein the current Ix_i is mirrored or copied into the convolution operation array, the corresponding j × k plane carries the same current, and the current Ix_i may be scaled in the digital-to-analog converter.
5. The operation module of claim 4, wherein, for each of said currents Ix_i, in the k direction of the corresponding 1 × k operation units the capacitance of the next-bit capacitor is half of that of the previous bit; after the currents in the capacitors have been integrated for the same time, between adjacent convolution operation units whose w_ji,k is 1 the voltage across the next-bit capacitor is 2 times that of the previous bit; and the integration of all the capacitors starts and ends simultaneously.
6. The operation module of claim 5, wherein the capacitors are replaced by resistors, and the resistance values of the resistors in the k direction increase bit by bit by a factor of 2.
7. The operation module of any of claims 1-6, wherein the switch is a virtual switch, a current device or another non-switching element, to reduce kickback or transient effects on the current mirror.
8. The operation module of claim 7, wherein, when the accumulated voltage swing at the combined node is higher than, or too high for, the input range of the analog-to-digital converter, an attenuation capacitor is connected in parallel before the output y_j is connected to the analog-to-digital converter, to adjust the full-scale range of the accumulated voltage.
9. The operation module of claim 8, wherein an offset can be added to the convolution operation array, comprising:
an offset operation array composed of a plurality of offset operation units, the offset operation unit array having size j × k, each offset operation unit (j, k) comprising a current I_b, a switch, at least one control signal, a node a_j,k and a capacitor;
the current I_b is an additional input alongside the currents Ix_i and is transmitted in the circuit, one current I_b corresponding to k offset operation units;
b_j,k is the k-th bit of the multi-bit binary offset b_j, with value 0 or 1; the capacitance in the k-direction offset operation units decreases by 1/2, and the bits b_j,k of the offset b_j are arranged from low to high; by shorting all nodes a_ji,k and a_j,k in the convolution and offset operation units of one i × k plane, the charge in the capacitors is shared to obtain the voltage of the combined node; the voltage of the combined node is the result of
y_j = ∑_{i=1}^{N} x_i · w_ji + b_j;
the offset b_j of y_j is accumulated as the sum of the voltages at all nodes a_j,k of the 1 × k offset units.
10. A multi-bit convolution analog operation method based on current integration with variable capacitor capacitance and charge sharing, characterized by comprising the following steps:
the DAC converts the digital input x_i, according to a given number of bits, into an analog current Ix_i transmitted in the circuit;
w_ji,k is the value of the weight w_ji at the k-th bit, k ∈ [1, B], where B is the highest bit of the binary number; each bit w_ji,k corresponds to one convolution operation unit and w_ji,k is 0 or 1; the convolution operation units in the k direction are arranged from the low bit to the high bit of the weight w_ji;
the capacitance values in the k-direction convolution operation units decrease by 1/2 bit by bit, the capacitance of the next-bit convolution operation unit being 1/2 of the previous one, the capacitor at k = 1 having capacitance C_u and the capacitor at k = B having capacitance C_u/2^(B-1);
before the current Ix_i flows through the switch in a convolution operation unit, a control signal controls its integration: when the control signal is on, the switch is closed and the current Ix_i enters through the node a_ji,k and is integrated in the capacitor, so that in convolution operation units of different k the voltage at the node a_ji,k above the capacitor changes with the unit capacitance by a factor of 2^(k-1); when the control signal is off, the charge integrated from Ix_i is 0 and the voltage at the node a_ji,k above the capacitor is 0; the node voltage is the multiplication result x_i · w_ji,k · 2^(k-1);
after the currents Ix_i have been integrated in the capacitors for the same time, all nodes a_ji,k in the convolution operation units of one i × k plane are shorted, the capacitors in the convolution operation units share charge, and the resulting voltage of the combined node is the convolution output result y_j.
11. The method of claim 10, wherein, before the output y_j is connected to the ADC, an attenuation capacitor is connected in parallel to adjust the full-scale range of the accumulated voltage, so that the accumulated voltage swing at the combined node is lower than the input range of the analog-to-digital converter.
CN202010261165.XA 2020-04-03 2020-04-03 Multi-bit convolution operation module with variable capacitance, current integration and charge sharing Active CN111611529B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010261165.XA CN111611529B (en) 2020-04-03 2020-04-03 Multi-bit convolution operation module with variable capacitance, current integration and charge sharing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010261165.XA CN111611529B (en) 2020-04-03 2020-04-03 Multi-bit convolution operation module with variable capacitance, current integration and charge sharing

Publications (2)

Publication Number Publication Date
CN111611529A true CN111611529A (en) 2020-09-01
CN111611529B CN111611529B (en) 2023-05-02

Family

ID=72199328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010261165.XA Active CN111611529B (en) 2020-04-03 2020-04-03 Multi-bit convolution operation module with variable capacitance, current integration and charge sharing

Country Status (1)

Country Link
CN (1) CN111611529B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170368682A1 (en) * 2016-06-27 2017-12-28 Fujitsu Limited Neural network apparatus and control method of neural network apparatus
CN110008440A (en) * 2019-04-15 2019-07-12 合肥恒烁半导体有限公司 A kind of convolution algorithm and its application based on analog matrix arithmetic element
CN110288510A (en) * 2019-06-11 2019-09-27 清华大学 A kind of nearly sensor vision perception processing chip and Internet of Things sensing device
CN110543933A (en) * 2019-08-12 2019-12-06 北京大学 Pulse type convolution neural network based on FLASH memory array

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113314174A (en) * 2021-05-06 2021-08-27 安徽大学 Circuit structure for column shift multi-bit multiplication binary decomposition operation of SRAM array
CN113314174B (en) * 2021-05-06 2023-02-03 安徽大学 Circuit structure for column shift multi-bit multiplication binary decomposition operation of SRAM array
WO2023207441A1 (en) * 2022-04-27 2023-11-02 北京大学 Sram storage and computing integrated chip based on capacitive coupling

Also Published As

Publication number Publication date
CN111611529B (en) 2023-05-02

Similar Documents

Publication Publication Date Title
CN111144558B (en) Multi-bit convolution operation module based on time-variable current integration and charge sharing
CN110209375B (en) Multiply-accumulate circuit based on radix-4 coding and differential weight storage
CN111431536B (en) Subunit, MAC array and bit width reconfigurable analog-digital mixed memory internal computing module
US11809837B2 (en) Integer matrix multiplication based on mixed signal circuits
CN111611529B (en) Multi-bit convolution operation module with variable capacitance, current integration and charge sharing
CN115048075A (en) SRAM (static random Access memory) storage and calculation integrated chip based on capacitive coupling
CN113364462B (en) Analog storage and calculation integrated multi-bit precision implementation structure
CN113627601A (en) Subunit, MAC array and analog-digital mixed memory computing module with reconfigurable bit width
CN112181895A (en) Reconfigurable architecture, accelerator, circuit deployment and data flow calculation method
CN111611528B (en) Multi-bit convolution operation module with variable current value, current integration and charge sharing
CN114330694A (en) Circuit and method for realizing convolution operation
CN115879530B (en) RRAM (remote radio access m) memory-oriented computing system array structure optimization method
CN115691613B (en) Charge type memory internal calculation implementation method based on memristor and unit structure thereof
Yu et al. A 4-bit mixed-signal MAC array with swing enhancement and local kernel memory
US20220416801A1 (en) Computing-in-memory circuit
CN113741857A (en) Multiply-accumulate operation circuit
Zhang et al. An energy-efficient mixed-signal parallel multiply-accumulate (MAC) engine based on stochastic computing
CN115906976A (en) Full-analog vector matrix multiplication memory computing circuit and application thereof
Lin et al. A reconfigurable in-SRAM computing architecture for DCNN applications
CN111988031B (en) Memristor memory vector matrix operator and operation method
CN113672854A (en) Memory operation method based on current mirror and storage unit, convolution operation method and device and application of convolution operation method and device
CN112784971A (en) Neural network operation circuit based on digital-analog hybrid neurons
CN115658013B (en) ROM in-memory computing device of vector multiply adder and electronic equipment
Yin et al. A 65nm 8b-Activation 8b-Weight SRAM-Based Charge-Domain Computing-in-Memory Macro Using A Fully-Parallel Analog Adder Network and A Single-ADC Interface
CN116402106B (en) Neural network acceleration method, neural network accelerator, chip and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant