CN114911453A - Multi-bit multiply-accumulate full digital memory computing device - Google Patents

Publication number
CN114911453A
Authority
CN
China
Prior art keywords
sram
accumulator
computing device
sram array
accumulate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210844251.2A
Other languages
Chinese (zh)
Other versions
CN114911453B (en)
Inventor
乔树山
曹景楠
尚德龙
周玉梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Nanjing Intelligent Technology Research Institute
Original Assignee
Zhongke Nanjing Intelligent Technology Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-07-19
Filing date
2022-07-19
Publication date
2022-08-16
Application filed by Zhongke Nanjing Intelligent Technology Research Institute
Priority to CN202210844251.2A
Publication of CN114911453A
Application granted
Publication of CN114911453B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Static Random-Access Memory (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)
  • Tests Of Electronic Circuits (AREA)

Abstract

The invention relates to a multi-bit multiply-accumulate all-digital memory computing device. The device comprises a driving module, an SRAM array, an accumulator and a clock control module. The driving module is connected with the SRAM array; the SRAM array is connected with the clock control module; and the SRAM array is connected with the accumulator, which determines a calculation result in a clock cycle based on the weight data stored in the SRAM array and the input data. Because the device is a digital circuit formed from the driving module, the SRAM array, the accumulator and the clock control module, it avoids the problems of analog circuits and achieves full-precision calculation. In addition, by suitably configuring the accumulator, the invention can support inputs of any bit width and can send full-precision data to an external circuit without loss, thereby largely solving the problem that the analog-intensive circuits of the prior art are susceptible to external interference.

Description

Multi-bit multiply-accumulate full-digital memory computing device
Technical Field
The invention relates to the technical field of electronic devices, in particular to a multi-bit multiply-accumulate all-digital memory computing device.
Background
Against the background of rapidly growing data volumes in edge computing, the separation of computation and data in the traditional von Neumann architecture is no longer suitable, and it readily gives rise to the memory wall and power wall problems. To handle the large data volumes and high throughput of convolutional neural networks, in-memory computing architectures have emerged.
The current mainstream in-memory computing architectures are still analog-intensive circuits, which are susceptible to external interference and suffer from significant precision problems.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides a multi-bit multiply-accumulate all-digital memory computing device.
In order to achieve the purpose, the invention provides the following scheme:
a multi-bit multiply-accumulate all-digital memory computing device, comprising: a driving module, an SRAM array, an accumulator and a clock control module;
the driving module is connected with the SRAM array; the driving module is used for providing word line excitation and input excitation for the SRAM array; the SRAM array stores weight data;
the driving module and the accumulator are both connected with the clock control module; the clock control module is used for controlling the clock period of the SRAM array;
the SRAM array is connected with the accumulator; the accumulator is used for determining a calculation result in a clock cycle based on the weight data and the input data stored in the SRAM array.
Preferably, the SRAM array comprises: a plurality of SRAM modules;
each of the SRAM modules includes: an adder tree array and a plurality of SRAM cells; the plurality of SRAM cells are all connected with the adder tree array;
the adder tree array is connected to the accumulator.
Preferably, the SRAM cell comprises: a multiplier and a 6T SRAM;
a first input of the multiplier is for receiving the input data; the second input end of the multiplier is connected with a weight storage point Q in the 6T SRAM; the output of the multiplier is connected to the adder tree array.
Preferably, the adder tree array comprises: a plurality of adders.
Preferably, the accumulator is a digital shift adder.
Preferably, the number of the SRAM modules is 64.
Preferably, the number of the SRAM cells is 64.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
according to the multi-bit multiply-accumulate all-digital memory computing device provided by the invention, a digital circuit is formed from the driving module, the SRAM array, the accumulator and the clock control module, so the problems of analog circuits are avoided and full-precision calculation is achieved. In addition, by suitably configuring the accumulator, the invention can support inputs of any bit width and can send full-precision data to an external circuit without loss, thereby largely solving the problem that the analog-intensive circuits of the prior art are susceptible to external interference.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a multi-bit multiply-accumulate all-digital memory computing device provided by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a multi-bit multiply-accumulate all-digital memory computing device which can improve the computing precision.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in fig. 1, the multi-bit multiply-accumulate all-digital memory computing device provided by the present invention includes a driving module, an SRAM array, an accumulator and a clock control module.
The driving module is connected with the SRAM array. The driving module and the accumulator are both connected with the clock control module, so that the whole circuit runs correctly under timing control. The SRAM array is connected with the accumulator. In the present invention, the driving module is preferably an SRAM WL driver & input activation driver, which provides word line (WL) excitation and input excitation to the whole SRAM array and provides different excitations at different stages.
The SRAM read-write control of the SRAM array is connected with the bit lines (BL) and the complementary bit lines (BLB); it drives BL and BLB when the SRAM array is in storage mode and provides the data transmission path.
To further improve calculation accuracy, the accumulator in the present invention is preferably a digital shift adder, which in each clock cycle accumulates the data registered in the previous clock cycle with the data of the current clock cycle. The clock period is controlled by the clock control module described above.
Further, in the present invention, the SRAM array may be a 64 × 64 SRAM array; that is, it includes 64 SRAM modules, each of which includes an adder tree array and 64 SRAM cells. Each SRAM cell may consist of, but is not limited to, a basic 6T SRAM cell and a 1-bit multiplier. With this arrangement of the SRAM array, different weight mode configurations can be realized for different weight precisions. As shown in fig. 1, BL(0)-BL(63) represent the bit lines of the 64 SRAM modules, and BLB(0)-BLB(63) represent the complementary bit lines of the 64 SRAM modules. WL(0)-WL(63) represent the word lines of the 64 SRAM cells. IN0(0)-IN63(0) represent the inputs to the multipliers in the first row of SRAM cells, IN0(1)-IN63(1) represent the inputs to the multipliers in the second row of SRAM cells, IN0(0)-IN0(63) represent the inputs to the multipliers in the first column of SRAM cells, IN1(0)-IN1(63) represent the inputs to the multipliers in the second column of SRAM cells, and so on.
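The signal naming in fig. 1 can be made concrete with a small sketch. It assumes, from the row and column examples given above, that a signal written INc(r) belongs to the cell in column (SRAM module) c and row r; the helper names `input_name`, `row_inputs` and `column_inputs` are hypothetical and do not appear in the patent.

```python
# Assumed geometry from the description: 64 SRAM modules (columns),
# each holding 64 SRAM cells (rows).
NUM_MODULES = 64        # one bit-line pair BL(c)/BLB(c) per module
CELLS_PER_MODULE = 64   # one word line WL(r) per cell row


def input_name(column, row):
    """Name of the multiplier input for the cell at (column, row).

    Assumption: INc(r) means column c, row r, as inferred from the text.
    """
    return f"IN{column}({row})"


def row_inputs(row):
    """All multiplier inputs along one row of SRAM cells."""
    return [input_name(c, row) for c in range(NUM_MODULES)]


def column_inputs(column):
    """All multiplier inputs along one column (one SRAM module)."""
    return [input_name(column, r) for r in range(CELLS_PER_MODULE)]


if __name__ == "__main__":
    print(row_inputs(0)[0], "...", row_inputs(0)[-1])        # first row
    print(column_inputs(1)[0], "...", column_inputs(1)[-1])  # second column
```

Under this reading, the first row runs from IN0(0) to IN63(0) and the second column from IN1(0) to IN1(63), matching the ranges listed in the description.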
The output of the adders in each column of the adder tree array is connected with the accumulator, which in every clock cycle shifts the registered result and adds it to the result of the next beat.
The gates of the two switching transistors in the 6T SRAM cell are connected with WL; the source of one switching transistor is connected with BL and the source of the other with BLB; the drain of one switching transistor is connected with weight storage point Q and the drain of the other with weight storage point QB. The two inverters in the 6T SRAM cell are connected end to end, i.e. cross-coupled.
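The cell wiring described above can be summarised by a logic-level model. This is only a behavioural sketch, not a circuit simulation: the class name `SixTCell` and its methods are hypothetical, and precharge, sensing and timing are deliberately ignored.

```python
class SixTCell:
    """Logic-level sketch of a 6T SRAM cell (illustrative, not from the patent).

    The cross-coupled inverters are modelled as one stored bit Q with QB
    always its complement; the two switching transistors are modelled as a
    gate that connects Q/QB to BL/BLB only while WL is high.
    """

    def __init__(self, q=0):
        self.q = q

    @property
    def qb(self):
        # The inverter pair keeps QB as the complement of Q.
        return 1 - self.q

    def write(self, wl, bl, blb):
        """Store BL into Q when WL is high and BL/BLB are complementary."""
        if wl == 1 and bl != blb:
            self.q = bl

    def read(self, wl):
        """Return (Q, QB) on (BL, BLB) when WL is high, otherwise None."""
        return (self.q, self.qb) if wl == 1 else None


if __name__ == "__main__":
    cell = SixTCell()
    cell.write(wl=1, bl=1, blb=0)   # write a weight bit of 1
    cell.write(wl=0, bl=0, blb=1)   # WL low: the cell holds its value
    print(cell.read(wl=1))
```

In calculation mode the multiplier taps storage point Q directly, so in this model it would simply read `cell.q` without raising WL.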
One input of the multiplier receives the weight data from weight storage point Q, and the other input receives the input data; the multiplier performs bitwise multiplication. The adders in the adder tree array receive the outputs of the multipliers in the SRAM cells: the outputs of every two multipliers enter one adder, the output of that adder enters a next-stage adder for further addition, and so on.
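The multiplier and pairwise adder tree can be sketched as follows. The sketch assumes the 1-bit multiplier behaves as a logical AND, which is the usual realisation of a 1-bit by 1-bit product but is not stated in the source; the function names are hypothetical.

```python
def bit_multiply(input_bit, weight_bit):
    """1-bit product of an input bit and the weight bit at storage point Q.

    Assumption: implemented as a logical AND.
    """
    return input_bit & weight_bit


def adder_tree(values):
    """Sum values by repeated pairwise addition, one tree level per pass."""
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        if len(level) % 2:          # pad odd-sized levels with a zero
            level.append(0)
        # Every two outputs enter one adder; its output feeds the next stage.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def module_partial_sum(input_bits, weight_bits):
    """Partial sum produced by one SRAM module for one bit of the inputs."""
    products = [bit_multiply(x, w) for x, w in zip(input_bits, weight_bits)]
    return adder_tree(products)


if __name__ == "__main__":
    print(module_partial_sum([1, 0, 1, 1], [1, 1, 0, 1]))
```

With 64 cells per module the tree has six levels (32, 16, 8, 4, 2 and 1 adders), and the partial sum for one input bit ranges from 0 to 64.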
Based on the above, the multi-bit multiply-accumulate all-digital memory computing device provided by the invention can have two operation modes: a storage mode and a calculation mode. The operation process of each operation mode is as follows:
A. Storage mode: the storage mode operates like a conventional SRAM array. Row and column selection is completed through the SRAM read-write control to write data; BL and BLB are then precharged; finally the SRAM WL driver opens the SRAM cell to read data, and the read result is output through the SRAM read-write control, completing one read-write operation.
B. Calculation mode: before the calculation mode starts, the weights are written into the SRAM array through the storage-mode process, and the input data are fed to one input of the multiplier by the input activation driver. The other input of the multiplier is directly connected with weight storage point Q of the 6T SRAM, so the bitwise multiplication starts as soon as the weights have been written and the input data have been sent.
For example, in the first clock cycle, the bitwise multiplication results are fed into the adders of each column of the adder tree array, the sum is obtained through the adders, and the sum is fed into the accumulator for storage. When the second clock cycle arrives, the input activation driver feeds the higher-weight input bit into the multiplier of each SRAM cell, and the accumulator shifts the previously stored data of the last beat to the left and adds the result arriving in this cycle.
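The per-cycle shift-and-add can be written as a recurrence. The following is a sketch under assumptions the source does not state: unsigned N-bit inputs x_i, 1-bit weights w_i, M cells per module, and input bits presented most-significant bit first. That is the ordering under which a left shift of the stored value followed by an unshifted add gives every bit its correct weight; the source does not pin the bit order down unambiguously. The symbols S_b and A_k are introduced here for illustration.

```latex
% x_i[b] is bit b of input x_i; S_b is the adder-tree output for bit b.
x_i = \sum_{b=0}^{N-1} 2^{b}\, x_i[b], \qquad
S_b = \sum_{i=0}^{M-1} w_i\, x_i[b]

% Accumulator contents after cycle k (left shift = multiply by 2):
A_1 = S_{N-1}, \qquad A_k = 2A_{k-1} + S_{N-k} \quad (k = 2, \dots, N)

% After N cycles the accumulator holds the full-precision dot product:
A_N = \sum_{b=0}^{N-1} 2^{b} S_b = \sum_{i=0}^{M-1} w_i\, x_i
```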
For any input precision, this procedure is repeated until all input bits have been processed, which yields the final calculation result, i.e. the final output. Because the multipliers and the adder tree array in the SRAM array are not clocked, they form combinational logic, and all data for one input bit can be computed within a single clock cycle.
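The whole calculation mode for one SRAM module can be simulated in a few lines. This is a minimal sketch, not the patented circuit: it assumes unsigned inputs, 1-bit weights, and most-significant-bit-first input order (chosen so that the left shift weights each bit correctly), and the function name `bit_serial_mac` is hypothetical.

```python
import random


def bit_serial_mac(inputs, weight_bits, input_width):
    """Bit-serial multiply-accumulate for one SRAM module (illustrative).

    inputs       -- unsigned integers, one per SRAM cell (assumption)
    weight_bits  -- 1-bit weights stored at each cell's point Q (assumption)
    input_width  -- number of input bits, i.e. number of clock cycles
    """
    acc = 0
    for b in reversed(range(input_width)):          # MSB first (assumption)
        input_bits = [(x >> b) & 1 for x in inputs]
        # Combinational part: 1-bit multipliers followed by the adder tree.
        partial = sum(x & w for x, w in zip(input_bits, weight_bits))
        # Clocked part: shift the registered value left, add the new result.
        acc = (acc << 1) + partial
    return acc


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.randrange(256) for _ in range(64)]    # 8-bit inputs
    ws = [rng.randint(0, 1) for _ in range(64)]     # 1-bit weights
    result = bit_serial_mac(xs, ws, input_width=8)
    reference = sum(x * w for x, w in zip(xs, ws))  # direct dot product
    print(result == reference)
```

Changing `input_width` changes only the number of cycles, not the hardware being modelled, which is how the description's claim of supporting arbitrary input precision shows up in this sketch.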
The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (7)

1. A multi-bit multiply-accumulate all-digital memory computing device, comprising: a driving module, an SRAM array, an accumulator and a clock control module;
the driving module is connected with the SRAM array; the driving module is used for providing word line excitation and input excitation for the SRAM array; the SRAM array stores weight data;
the driving module and the accumulator are both connected with the clock control module; the clock control module is used for controlling the clock period of the SRAM array;
the SRAM array is connected with the accumulator; the accumulator is used for determining a calculation result in a clock cycle based on the weight data and the input data stored in the SRAM array.
2. The multi-bit multiply-accumulate all-digital memory computing device of claim 1, wherein the SRAM array comprises: a plurality of SRAM modules;
each of the SRAM modules includes: an adder tree array and a plurality of SRAM cells; the plurality of SRAM cells are all connected with the adder tree array;
the adder tree array is connected to the accumulator.
3. The multi-bit multiply-accumulate all-digital memory computing device of claim 2, wherein the SRAM cell comprises: a multiplier and a 6T SRAM;
a first input of the multiplier is for receiving the input data; the second input end of the multiplier is connected with a weight storage point Q in the 6T SRAM; the output of the multiplier is connected to the adder tree array.
4. The multi-bit multiply-accumulate all-digital memory computing device of claim 2, wherein the adder tree array comprises: a plurality of adders.
5. The multi-bit multiply-accumulate all-digital memory computing device of claim 1, wherein the accumulator is a digital shift adder.
6. The multi-bit multiply-accumulate all-digital memory computing device of claim 2, wherein the number of SRAM modules is 64.
7. The multi-bit multiply-accumulate all-digital memory computing device according to claim 2, wherein the number of SRAM cells is 64.
CN202210844251.2A 2022-07-19 2022-07-19 Multi-bit multiply-accumulate full-digital memory computing device Active CN114911453B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210844251.2A CN114911453B (en) 2022-07-19 2022-07-19 Multi-bit multiply-accumulate full-digital memory computing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210844251.2A CN114911453B (en) 2022-07-19 2022-07-19 Multi-bit multiply-accumulate full-digital memory computing device

Publications (2)

Publication Number Publication Date
CN114911453A true CN114911453A (en) 2022-08-16
CN114911453B CN114911453B (en) 2022-10-04

Family

ID=82772525

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210844251.2A Active CN114911453B (en) 2022-07-19 2022-07-19 Multi-bit multiply-accumulate full-digital memory computing device

Country Status (1)

Country Link
CN (1) CN114911453B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112509620A (en) * 2020-11-30 2021-03-16 安徽大学 Data reading circuit based on balance pre-charging and group decoding
US20210081175A1 (en) * 2019-09-17 2021-03-18 Anaflash Inc. Multiply-Accumulate Unit
CN112711394A (en) * 2021-03-26 2021-04-27 南京后摩智能科技有限公司 Circuit based on digital domain memory computing
CN113703718A (en) * 2021-10-14 2021-11-26 中科南京智能技术研究院 Multi-bit memory computing device with variable weight
US20220012016A1 (en) * 2021-09-24 2022-01-13 Intel Corporation Analog multiply-accumulate unit for multibit in-memory cell computing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘甲: "基于最小二乘参数辨识的矿井漏电保护系统", 《现代机械》 *
樊迪等: "FPGA中适用于低位宽乘累加的DSP块", 《复旦学报(自然科学版)》 *

Also Published As

Publication number Publication date
CN114911453B (en) 2022-10-04

Similar Documents

Publication Publication Date Title
CN109979503B (en) Static random access memory circuit structure for realizing Hamming distance calculation in memory
CN110633069B (en) Multiplication circuit structure based on static random access memory
CN113035251B (en) Digital memory computing array device
CN110058839B (en) Circuit structure based on static random access memory internal subtraction method
CN112151091A (en) 8T SRAM unit and memory computing device
CN110176264B (en) High-low bit merging circuit structure based on internal memory calculation
CN114546335B (en) Memory computing device for multi-bit input and multi-bit weight multiplication accumulation
CN112992232B (en) Multi-bit positive and negative single-bit memory computing unit, array and device
CN112884140B (en) Multi-bit memory internal computing unit, array and device
CN110970071B (en) Memory cell of low-power consumption static random access memory and application
US20220269483A1 (en) Compute in memory accumulator
CN116364137A (en) Same-side double-bit-line 8T unit, logic operation circuit and CIM chip
CN113077050B (en) Digital domain computing circuit device for neural network processing
CN112233712B (en) 6T SRAM (static random Access memory) storage device, storage system and storage method
CN114038492A (en) Multi-phase sampling memory computing circuit
CN114911453B (en) Multi-bit multiply-accumulate full-digital memory computing device
CN114895869B (en) Multi-bit memory computing device with symbols
CN114944180B (en) Weight-configurable pulse generating device based on copy column
CN114882921B (en) Multi-bit computing device
CN113258910B (en) Computing device based on pulse width modulation
KR102555621B1 (en) In-memory computation circuit and method
CN115424645A (en) Computing device, memory controller and method of performing computations in memory
CN115223619A (en) Memory computing circuit
CN114816327B (en) Adder and full-digital memory computing device
CN114647398B (en) Carry bypass adder-based in-memory computing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant