CN115588446A - Memory operation circuit, memory calculation circuit and chip thereof - Google Patents

Publication number
CN115588446A
Authority
CN
China
Prior art keywords
circuit
memory
alu1
alu2
storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211244850.7A
Other languages
Chinese (zh)
Inventor
蔺智挺
范星
吴秀龙
彭春雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University
Priority to CN202211244850.7A
Publication of CN115588446A
Legal status: Pending

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C11/00 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
    • G11C11/21 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements
    • G11C11/34 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices
    • G11C11/40 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors
    • G11C11/41 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming static cells with positive feedback, i.e. cells not needing refreshing or charge regeneration, e.g. bistable multivibrator or Schmitt trigger
    • G11C11/413 Auxiliary circuits, e.g. for addressing, decoding, driving, writing, sensing, timing or power reduction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57 Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Static Random-Access Memory (AREA)

Abstract

The invention belongs to the technical field of integrated circuits, and particularly relates to a memory operation circuit, and to an SRAM in-memory computing circuit and a chip with TCAM and logic operation functions. Each basic memory operation circuit comprises two memory cells T1 and T2 for storing data and two arithmetic logic units ALU1 and ALU2. ALU1 and ALU2 each comprise a control terminal, a first input terminal, a second input terminal and an output terminal. ALU1 is an AND gate when its control terminal is connected to VSS, and an XNOR gate when its control terminal is connected to VDD. ALU2 is an XOR gate when its control terminal is connected to VSS, and an OR gate when its control terminal is connected to VDD. The inputs of ALU1 are connected to an external input IN and to the memory cell T1; the inputs of ALU2 are connected to the output of ALU1 and to the memory cell T2. The invention solves the problem that conventional SRAM memory circuits have difficulty performing a composite Boolean logic operation in a single cycle, and improves the operational performance and stability of the memory circuit.

Description

Memory operation circuit, memory calculation circuit and chip thereof
Technical Field
The invention belongs to the technical field of integrated circuits, and particularly relates to a memory operation circuit, and to an SRAM in-memory computing circuit and a chip that use the memory operation circuit as their basic circuit and have TCAM and logic operation functions.
Background
Machine learning, image recognition and edge computing are data processing tasks with a large computational scale, and most existing artificial intelligence applications are built on these technologies. The continuous development and widespread use of artificial intelligence has exposed a bottleneck in data processing devices based on the traditional von Neumann architecture: such devices cannot effectively balance computational efficiency against power consumption, and an increase in computational efficiency generally causes a significant increase in operating power consumption.
In this context, the concept of computing-in-memory (CIM) was proposed. Because in-memory computation is performed within the memory array and does not require data to be transferred from the memory to the processor, it reduces the access energy consumed during computation, increases throughput, and improves computation speed and energy efficiency. In short, in-memory computing technology breaks through the von Neumann bottleneck and the memory wall of the traditional computing architecture, and is of great significance for the computing field.
Given the fast read speed of static random-access memory (SRAM) and its good compatibility with advanced logic processes, SRAM-based in-memory computing was among the first approaches to attract the attention of researchers in China and abroad. An SRAM-based in-memory computing design retains the basic read and write functions of the SRAM and additionally realizes computing functions by setting bit line voltages, changing the cell structure and modifying the peripheral circuits. Because the array cell structure is repetitive, simple and repetitive operations are easy to implement in memory.
Boolean operations are simple, repetitive operations, so implementing them in SRAM in-memory computing is very promising. A composite Boolean operation requires multiple operands, so performing it in an SRAM memory often requires several word lines of the memory array to be turned on simultaneously. However, turning on several word lines simultaneously can cause read disturb and other problems. Moreover, existing SRAM memory circuits almost always need two or more cycles to perform a composite Boolean logic operation, which complicates timing control during circuit operation. In summary, conventional SRAM memory circuits lack a circuit structure that can perform a composite Boolean logic operation in a single cycle.
Disclosure of Invention
In order to solve the problem that conventional SRAM memory circuits have difficulty performing a composite Boolean logic operation in a single cycle, the invention provides a memory operation circuit, together with an SRAM in-memory computing circuit and a chip that use the memory operation circuit as their basic circuit and have TCAM and logic operation functions.
The invention is realized by adopting the following technical scheme:
A memory operation circuit comprises two memory cells T1 and T2 for storing data, and two arithmetic logic units ALU1 and ALU2; ALU1 and ALU2 each comprise a control terminal, a first input terminal, a second input terminal and an output terminal.
The arithmetic logic unit ALU1 is an AND gate when its control terminal is connected to VSS, and an XNOR gate when its control terminal is connected to VDD. The arithmetic logic unit ALU2 is an XOR gate when its control terminal is connected to VSS, and an OR gate when its control terminal is connected to VDD.
The first input terminal of ALU1 is connected to Q1, one of the storage nodes of the memory cell T1; the second input terminal of ALU1 is connected to an independent input signal line IN; the output terminal of ALU1 is Y1.
The first input terminal of ALU2 is connected to the output terminal Y1 of ALU1; the second input terminal of ALU2 is connected to Q2, one of the storage nodes of the memory cell T2; the output terminal of ALU2 is Y2.
The output terminal Y1 of ALU1 serves as the operation result output when ALU1 performs the AND/XNOR operation. The output terminal Y2 of ALU2 has three uses: (1) as the operation result output when ALU2 alone performs the XOR/OR operation; (2) as the operation result output when T1, T2, ALU1 and ALU2 jointly perform the four composite Boolean logic operations; (3) as the match result output when T1, T2, ALU1 and ALU2 implement the TCAM addressing operation.
As a further improvement of the invention, the ALU1 circuit comprises two PMOS transistors PM3 and PM4 and two NMOS transistors NM5 and NM6. The source of PM3 is connected to the control signal SD1. The gate of PM3, the source of NM5 and the gate of NM6 are connected together and to the input signal line IN. The gate of PM4, the gate of NM5 and the source of NM6 are connected together and to the storage node Q1 of the memory cell T1. The drain of PM4, the drain of NM5 and the drain of NM6 are connected together as the output node Y1. The drain of PM3 is connected to the source of PM4.
As a further improvement of the invention, the ALU2 circuit comprises two PMOS transistors PM7 and PM8 and two NMOS transistors NM11 and NM12. The gate of PM7, the source of PM8 and the gate of NM11 are connected together and to the output node Y1 of ALU1. The gate of PM8, the source of PM7 and the gate of NM12 are connected together and to the storage node Q2 of the memory cell T2. The drain of PM8, the drain of PM7 and the drain of NM11 are connected together as the output node Y2 of ALU2. The source of NM11 is connected to the drain of NM12, and the source of NM12 is connected to the control signal SD2.
As a further improvement of the invention, the memory cells T1 and T2 are 6T memory cells comprising six transistors. The 6T memory cell comprises two PMOS transistors PM1 and PM2 and four NMOS transistors NM1, NM2, NM3 and NM4. PM1 and NM1 form one inverter, PM2 and NM2 form another inverter, and NM3 and NM4 serve as pass transistors. The sources of PM1 and PM2 are connected to VDD, and the sources of NM1 and NM2 are connected to VSS. The drain of PM1, the drain of NM1, the gate of PM2 and the gate of NM2 are connected together as the storage node Q1 and to the drain of NM3; the gate of NM3 is connected to the word line WL, and the source of NM3 is connected to the bit line BL. The drain of PM2, the drain of NM2, the gate of PM1 and the gate of NM1 are connected together as the storage node QB1 and to the drain of NM4; the gate of NM4 is connected to the word line WL, and the source of NM4 is connected to the bit line BLB.
In the memory operation circuit provided by the invention, the achievable operation functions are as follows:
1. The memory cell T1 and the arithmetic logic unit ALU1 serve as the basic circuit for the AND operation and the XNOR operation in the memory operation circuit. IN and Q1 are the two operands and Y1 is the output result.
When SD1 = VSS, ALU1 performs the AND operation, namely:
Y1=AND(IN,Q1)
When SD1 = VDD, ALU1 performs the XNOR operation, namely:
Y1=XNOR(IN,Q1)
2. The memory cell T2 and the arithmetic logic unit ALU2 serve as the basic circuit for the XOR operation and the OR operation in the memory operation circuit. Y1 and Q2 are the two operands and Y2 is the output result.
When SD2 = VSS, ALU2 performs the XOR operation, namely:
Y2=XOR(Y1,Q2)
When SD2 = VDD, ALU2 performs the OR operation, namely:
Y2=OR(Y1,Q2)
3. The memory cells T1 and T2 and the arithmetic logic units ALU1 and ALU2 together serve as the basic circuit for four composite Boolean logic operations, in which IN, Q1 and Q2 are the three operands and Y2 is the output result. Substituting the two modes of ALU1 into the two modes of ALU2 gives the four expressions (· is AND, + is OR, ⊕ is XOR, ⊙ is XNOR):
Y2=(IN·Q1)⊕Q2 (SD1 = VSS, SD2 = VSS)
Y2=(IN·Q1)+Q2 (SD1 = VSS, SD2 = VDD)
Y2=(IN⊙Q1)⊕Q2 (SD1 = VDD, SD2 = VSS)
Y2=(IN⊙Q1)+Q2 (SD1 = VDD, SD2 = VDD)
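The three functions above can be sketched behaviorally. The following Python model is only an illustration under stated assumptions and is not part of the disclosed circuit: the names alu1, alu2, composite and truth_table are hypothetical, and VSS and VDD are modeled as logic 0 and logic 1.

```python
from itertools import product

# Modeling assumption: the control levels VSS and VDD are logic 0 and logic 1.
VSS, VDD = 0, 1


def alu1(in_bit: int, q1: int, sd1: int) -> int:
    """ALU1 as described in the text: AND when SD1 = VSS, XNOR when SD1 = VDD."""
    if sd1 == VSS:
        return in_bit & q1
    return 1 - (in_bit ^ q1)


def alu2(y1: int, q2: int, sd2: int) -> int:
    """ALU2 as described in the text: XOR when SD2 = VSS, OR when SD2 = VDD."""
    if sd2 == VSS:
        return y1 ^ q2
    return y1 | q2


def composite(in_bit: int, q1: int, q2: int, sd1: int, sd2: int) -> int:
    """Y2 of the cascade: ALU2 applied to the output Y1 of ALU1 and to Q2."""
    return alu2(alu1(in_bit, q1, sd1), q2, sd2)


def truth_table(sd1: int, sd2: int) -> list:
    """Rows (IN, Q1, Q2, Y2) for one setting of the two control signals."""
    return [(i, a, b, composite(i, a, b, sd1, sd2))
            for i, a, b in product((0, 1), repeat=3)]


if __name__ == "__main__":
    for sd1, sd2 in product((VSS, VDD), repeat=2):
        print(f"SD1={sd1} SD2={sd2}:", [row[3] for row in truth_table(sd1, sd2)])
```

Each of the four control settings selects one composite operation, and the result is available from a single evaluation of the cascade, which is the single-cycle property the text emphasizes.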
In the memory operation circuit provided by the invention, the TCAM addressing operation is implemented as follows:
The control signals of the memory operation circuit comprising T1, T2, ALU1 and ALU2 are set to SD1 = VSS and SD2 = VSS, so that Y2 = (IN·Q1)⊕Q2. The two-bit binary number formed by the node data Q1 and Q2 stored in the two memory cells T1 and T2 represents one of three TCAM states: "10" represents TCAM state 1, "11" represents TCAM state 0, and "01" represents TCAM state X. The two-bit number formed by Q1 and Q2 is the target data, and IN is the search data. With "10" stored, Y2 equals IN; with "11" stored, Y2 is the complement of IN; with "01" stored, Y2 is always 1. An output Y2 of "1" indicates that the target data matches the search data, and an output Y2 of "0" indicates that it does not.
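The matching rule can be checked with a short sketch. It assumes the Y2 = (IN·Q1)⊕Q2 behavior that follows from SD1 = VSS and SD2 = VSS; the "11" code is treated here as a stored 0 because Y2 is then 1 only for IN = 0. The names ENCODING and tcam_cell_match are hypothetical.

```python
# (Q1, Q2) codes for the three TCAM states, following the encoding in the text.
# Assumption: "11" stands for a stored 0, since it matches only IN = 0.
ENCODING = {"1": (1, 0), "0": (1, 1), "X": (0, 1)}


def tcam_cell_match(stored: str, search_bit: int) -> int:
    """Return Y2 for one T1/T2 pair: 1 means match, 0 means mismatch."""
    q1, q2 = ENCODING[stored]
    y1 = search_bit & q1   # ALU1 acts as AND because SD1 = VSS
    return y1 ^ q2         # ALU2 acts as XOR because SD2 = VSS


if __name__ == "__main__":
    for stored in ("1", "0", "X"):
        for search_bit in (0, 1):
            print(stored, search_bit, tcam_cell_match(stored, search_bit))
```

The "X" state matches both search values because Q1 = 0 forces Y1 to 0, leaving Y2 equal to Q2 = 1.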
The invention also provides an SRAM in-memory computing circuit with TCAM and logic operation functions. This large-scale circuit comprises an SRAM memory array, a logic unit array, a timing control circuit, input signal lines, bit line pairs, a precharge circuit, word lines, a word line driving module, a row decoding module and a column output circuit.
The SRAM memory array is composed of 4N² identical memory cells arranged in a 2N × 2N array; each memory cell contains two complementary storage nodes Q and QB.
The logic unit array is composed of arithmetic logic units in one-to-one correspondence with the memory cells. In the logic unit array, the arithmetic logic units corresponding to the memory cells in odd-numbered columns are all ALU1, and those corresponding to the memory cells in even-numbered columns are all ALU2.
The timing control circuit is used for generating the clock signals required by the functional modules.
The input signal lines are connected to the arithmetic logic units ALU1 in each column and are used for inputting the corresponding input signal IN to each ALU1.
The bit line pairs comprise 2N pairs of bit lines BL and BLB; all memory cells in a column are connected to the same pair of bit lines BL and BLB.
The precharge circuit is used for performing precharge operation on bit lines BL and BLB connected with each column of memory cells in the SRAM memory array.
The word lines WL are used to input corresponding word line signals to the respective memory cells in the SRAM memory array.
The word line driving module is used for controlling the on or off of a word line WL connected with each memory unit in the memory array.
The row decoding module is used for decoding the input signal and controlling the word line driving module according to the decoding result.
The column output circuit connects the bit lines BL and BLB of each column of memory cells in the SRAM memory array to sense amplifiers (SA), and outputs either the data stored in the memory cells of any column or the calculation results of the arithmetic logic units.
In particular, two memory cells T1 and T2 located in two adjacent columns of the same row of the SRAM memory array, together with their corresponding ALU1 and ALU2, constitute one memory operation circuit as described above and implement the complete function of the basic circuit.
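As an illustration of how adjacent column pairs in one row could hold a multi-symbol TCAM word, the sketch below lays a target word out as alternating Q1/Q2 bits and evaluates each pair with SD1 = SD2 = VSS. The (Q1, Q2) codes, with "11" read as a stored 0, and the names row_layout, search_row and word_matches are assumptions. How the per-pair outputs are combined into one word-level result is not described in this section, so a plain logical AND is assumed.

```python
# Assumed (Q1, Q2) codes per stored symbol; "11" is read as a stored 0.
ENCODING = {"1": (1, 0), "0": (1, 1), "X": (0, 1)}


def row_layout(target: str) -> list:
    """Flatten a target word into one row: odd columns hold Q1, even columns Q2."""
    bits = []
    for symbol in target:
        bits.extend(ENCODING[symbol])
    return bits


def search_row(stored_bits: list, search_word: str) -> list:
    """Per-pair Y2 outputs for a search word, with SD1 = SD2 = VSS."""
    outputs = []
    for k, ch in enumerate(search_word):
        q1, q2 = stored_bits[2 * k], stored_bits[2 * k + 1]
        outputs.append((int(ch) & q1) ^ q2)
    return outputs


def word_matches(target: str, search_word: str) -> bool:
    """Assumed combination rule: the word matches if every pair outputs 1."""
    return all(search_row(row_layout(target), search_word))


if __name__ == "__main__":
    print(search_row(row_layout("1001"), "1001"))
    print(word_matches("100X", "1000"), word_matches("100X", "1001"))
```

A four-symbol word such as 1001 or 100X therefore occupies eight columns of one row under this layout.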
The number of sense amplifiers in the SRAM in-memory computing circuit is 4N. The BL or BLB of each column of memory cells serves as one input of the corresponding sense amplifier, and the other input of the sense amplifier is a reference level Vref; when the level of BL or BLB is higher than the reference level Vref, the sense amplifier outputs a high level, and otherwise it outputs a low level.
The invention also provides a memory chip, which is an integrated circuit obtained by packaging the above SRAM in-memory computing circuit with TCAM and logic operation functions. The interfaces of the memory chip include at least: a power interface, a ground interface, a charging interface, a control signal interface, an input signal interface, a switching signal interface, a word line interface and an output signal interface.
The power interface VDD is used for connecting to a power supply. The ground interface VSS is used for grounding. The charging interface PRE is used for inputting a control signal that adjusts the charging state of each bit line. The control signal interface SD is used for inputting the control signals that adjust the operating states of the arithmetic logic units. The input signal interface IN is used for inputting the corresponding input signal to each arithmetic logic unit ALU1. The switching signal interface SW is used for inputting a switching signal that adjusts the access states of the memory cells and the arithmetic logic units on the bit lines, thereby switching the circuit between two working modes: normal read/write and in-memory operation.
The word line interface WL is used for inputting the corresponding word line signal to each memory cell; the word line signal adjusts the access state of each memory cell on its bit lines. The output signal interface Y is used for reading the data stored in each memory cell or the logic operation result of each arithmetic logic unit.
The technical scheme provided by the invention has the following beneficial effects:
The invention provides a new memory operation circuit structure that uses only a few transistors and can support data storage with reading and writing, simple logic operations, composite Boolean logic operations and TCAM addressing operations. The circuit can be applied to a large-scale SRAM memory circuit to obtain an SRAM in-memory computing circuit with TCAM addressing and composite logic operation functions. One outstanding advantage of the circuit is that a composite Boolean logic operation can be completed in a single cycle; in addition, by arranging the signal lines appropriately, the functions operate without conflict, which improves the stability of the circuit. The circuit thereby overcomes problems of existing memory circuits such as read disturb and the need for multiple cycles to perform a composite Boolean logic operation.
Drawings
Fig. 1 is a schematic circuit diagram of a memory operation circuit provided in embodiment 1 of the present invention.
Fig. 2 is a detailed circuit diagram of the 6T memory cell of fig. 1.
Fig. 3 shows a possible circuit configuration of the ALU1 used in fig. 1.
Fig. 4 shows a possible circuit configuration of the ALU2 used in fig. 1.
Fig. 5 is a complete circuit diagram of a memory operation circuit in embodiment 1 of the present invention when the specific circuit structures in fig. 2 to 4 are adopted.
Fig. 6 is a circuit architecture diagram of an SRAM memory computing circuit with TCAM and logic operation functions provided in embodiment 2 of the present invention.
Fig. 7 is an operation schematic diagram of the circuit provided in embodiment 2 of the present invention when implementing TCAM addressing operation.
Fig. 8 shows the results of 5000 Monte Carlo simulations of the memory chip executing the TCAM operation when the target data is 1001.
Fig. 9 shows the results of 5000 Monte Carlo simulations of the memory chip executing the TCAM operation when the target data is 100X.
Fig. 10 is a statistical graph of the average power consumption of the memory chip provided in embodiment 2 when performing the four composite logic operations under different process corners.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1
The memory operation circuit of this embodiment, as shown in fig. 1, comprises two memory cells T1 and T2 for storing data, and two arithmetic logic units ALU1 and ALU2; ALU1 and ALU2 each comprise a control terminal, a first input terminal, a second input terminal and an output terminal.
In the basic circuit provided in this embodiment, the arithmetic logic unit ALU1 is an AND gate when its control terminal is connected to VSS, and an XNOR gate when its control terminal is connected to VDD. The arithmetic logic unit ALU2 is an XOR gate when its control terminal is connected to VSS, and an OR gate when its control terminal is connected to VDD.
The first input terminal of ALU1 is connected to Q1, one of the storage nodes of the memory cell T1; the second input terminal of ALU1 is connected to an independent input signal line IN; the output terminal of ALU1 is Y1.
The first input terminal of ALU2 is connected to the output terminal Y1 of ALU1; the second input terminal of ALU2 is connected to Q2, one of the storage nodes of the memory cell T2; the output terminal of ALU2 is Y2.
The output terminal Y1 of ALU1 serves as the operation result output when ALU1 performs the AND/XNOR operation. The output terminal Y2 of ALU2 has three uses: (1) as the operation result output when ALU2 alone performs the XOR/OR operation; (2) as the operation result output when T1, T2, ALU1 and ALU2 jointly perform the four composite Boolean logic operations; (3) as the match result output when T1, T2, ALU1 and ALU2 implement the TCAM addressing operation.
Based on the foregoing circuit configuration, when the arithmetic logic units ALU1 and ALU2 do not participate, the memory cells T1 and T2 can be used as a conventional memory circuit storing 1 or 2 bits of data. When the memory cells T1 and T2 cooperate with the arithmetic logic units ALU1 and ALU2, the memory operation circuit can also implement the following operation functions:
1. The memory cell T1 and the arithmetic logic unit ALU1 serve as the basic circuit for the AND operation and the XNOR operation in the memory operation circuit. IN and Q1 are the two operands and Y1 is the output result.
When SD1 = VSS, ALU1 performs the AND operation, i.e.:
Y1=AND(IN,Q1)
When SD1 = VDD, ALU1 performs the XNOR operation, i.e.:
Y1=XNOR(IN,Q1)
2. The memory cell T2 and the arithmetic logic unit ALU2 serve as the basic circuit for the XOR operation and the OR operation in the memory operation circuit. Y1 and Q2 are the two operands and Y2 is the output result.
When SD2 = VSS, ALU2 performs the XOR operation, i.e.:
Y2=XOR(Y1,Q2)
When SD2 = VDD, ALU2 performs the OR operation, i.e.:
Y2=OR(Y1,Q2)
3. The memory cells T1 and T2 and the arithmetic logic units ALU1 and ALU2 together serve as the basic circuit for four composite Boolean logic operations, in which IN, Q1 and Q2 are the three operands and Y2 is the output result. Substituting the two modes of ALU1 into the two modes of ALU2 gives the four expressions (· is AND, + is OR, ⊕ is XOR, ⊙ is XNOR):
Y2=(IN·Q1)⊕Q2 (SD1 = VSS, SD2 = VSS)
Y2=(IN·Q1)+Q2 (SD1 = VSS, SD2 = VDD)
Y2=(IN⊙Q1)⊕Q2 (SD1 = VDD, SD2 = VSS)
Y2=(IN⊙Q1)+Q2 (SD1 = VDD, SD2 = VDD)
Specifically, the truth table of the first composite Boolean logic operation Y2=(IN·Q1)⊕Q2 is as follows:
Table 1: Truth table of composite Boolean logic operation one
Y2=(IN·Q1)⊕Q2
IN  Q1  Q2  Y2
0   0   0   0
0   0   1   1
0   1   0   0
0   1   1   1
1   0   0   0
1   0   1   1
1   1   0   1
1   1   1   0
The truth table of the second composite Boolean logic operation Y2=(IN·Q1)+Q2 is as follows:
Table 2: Truth table of composite Boolean logic operation two
Y2=(IN·Q1)+Q2
IN  Q1  Q2  Y2
0   0   0   0
0   0   1   1
0   1   0   0
0   1   1   1
1   0   0   0
1   0   1   1
1   1   0   1
1   1   1   1
The truth table of the third composite Boolean logic operation Y2=(IN⊙Q1)⊕Q2, where ⊙ denotes XNOR, is as follows:
Table 3: Truth table of composite Boolean logic operation three
Y2=(IN⊙Q1)⊕Q2
IN  Q1  Q2  Y2
0   0   0   1
0   0   1   0
0   1   0   0
0   1   1   1
1   0   0   0
1   0   1   1
1   1   0   1
1   1   1   0
The truth table of the fourth composite Boolean logic operation Y2=(IN⊙Q1)+Q2 is as follows:
Table 4: Truth table of composite Boolean logic operation four
Y2=(IN⊙Q1)+Q2
IN  Q1  Q2  Y2
0   0   0   1
0   0   1   1
0   1   0   0
0   1   1   1
1   0   0   0
1   0   1   1
1   1   0   1
1   1   1   1
In particular, the simple circuit provided by this embodiment, composed of the memory cells T1 and T2 and the arithmetic logic units ALU1 and ALU2, can also provide an addressing function. In the memory operation circuit provided in this embodiment, the TCAM addressing operation is implemented as follows:
The control signals of the memory operation circuit comprising T1, T2, ALU1 and ALU2 are set to SD1 = VSS and SD2 = VSS, so that Y2 = (IN·Q1)⊕Q2. The two-bit binary number formed by the node data Q1 and Q2 stored in the two memory cells T1 and T2 represents one of three TCAM states: "10" represents TCAM state 1, "11" represents TCAM state 0, and "01" represents TCAM state X. The two-bit number formed by Q1 and Q2 is the target data, and IN is the search data. With "10" stored, Y2 equals IN; with "11" stored, Y2 is the complement of IN; with "01" stored, Y2 is always 1. An output Y2 of "1" indicates that the target data matches the search data, and an output Y2 of "0" indicates that it does not.
The foregoing describes the functions of each basic unit in the memory arithmetic circuit in the embodiment, so that the circuit functions can be realized by designing, connecting and using a circuit with corresponding functions. That is, the memory units T1 and T2, and the arithmetic logic units ALU1 and ALU2 in the present embodiment are basic functional units of the memory arithmetic circuit. The elements and circuit connections of the basic functional units are not limited to a particular form.
Specifically, the memory cells T1 and T2 in this embodiment may adopt memory cell circuits with different numbers of transistors, such as conventional 6T, 8T, 10T or 12T cells. This embodiment takes a 6T memory cell comprising six transistors as an example and designs the corresponding memory operation circuit; the 6T memory cell has two complementary storage nodes Q1 and QB1. As shown in fig. 2, the 6T memory cell comprises two PMOS transistors PM1 and PM2 and four NMOS transistors NM1, NM2, NM3 and NM4. PM1 and NM1 form one inverter, PM2 and NM2 form another inverter, and NM3 and NM4 serve as pass transistors. The sources of PM1 and PM2 are connected to VDD, and the sources of NM1 and NM2 are connected to VSS. The drain of PM1, the drain of NM1, the gate of PM2 and the gate of NM2 are connected together as the storage node Q1 and to the drain of NM3; the gate of NM3 is connected to the word line WL, and the source of NM3 is connected to the bit line BL. The drain of PM2, the drain of NM2, the gate of PM1 and the gate of NM1 are connected together as the storage node QB1 and to the drain of NM4; the gate of NM4 is connected to the word line WL, and the source of NM4 is connected to the bit line BLB.
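The storage behavior of such a cell can be abstracted in a few lines. The sketch below is a purely digital illustration under simplifying assumptions: the class name SixTCell and its methods are hypothetical, and precharge, read disturb and all analog effects are ignored.

```python
class SixTCell:
    """Digital abstraction of a 6T cell: two cross-coupled inverters hold Q and QB."""

    def __init__(self, q: int = 0) -> None:
        self.q = q

    @property
    def qb(self) -> int:
        # The second inverter always holds the complement of Q.
        return 1 - self.q

    def write(self, wl: int, bl: int, blb: int) -> None:
        """With WL high and BL/BLB driven to complementary values, Q follows BL."""
        if wl == 1 and bl != blb:
            self.q = bl

    def read(self, wl: int):
        """With WL high the pass transistors expose (Q, QB); otherwise nothing is read."""
        return (self.q, self.qb) if wl == 1 else None


if __name__ == "__main__":
    cell = SixTCell()
    cell.write(wl=1, bl=1, blb=0)
    print(cell.read(wl=1))
```

In the memory operation circuit the node Q is additionally wired directly to an ALU input, so a logic operation can use the stored bit without going through the pass transistors.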
ALU1 and ALU2 are special gate circuits designed for and applied in this embodiment. Specifically, as shown in fig. 3, the ALU1 circuit provided in this embodiment comprises two PMOS transistors PM3 and PM4 and two NMOS transistors NM5 and NM6. The source of PM3 is connected to the control signal SD1. The gate of PM3, the source of NM5 and the gate of NM6 are connected together and to the input signal line IN. The gate of PM4, the gate of NM5 and the source of NM6 are connected together and to the storage node Q1 of the memory cell T1. The drain of PM4, the drain of NM5 and the drain of NM6 are connected together as the output node Y1. The drain of PM3 is connected to the source of PM4.
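The ALU1 netlist can be sanity-checked at switch level. The sketch below treats every transistor as an ideal switch (an NMOS conducts when its gate is 1, a PMOS when its gate is 0) and ignores threshold-voltage drops. It assumes that NM5 is gated by Q1 and passes IN, and that NM6 is gated by IN and passes Q1; the function name alu1_switch_level is hypothetical.

```python
def alu1_switch_level(in_bit: int, q1: int, sd1: int) -> int:
    """Resolve Y1 from the conducting paths of PM3/PM4/NM5/NM6 (ideal switches)."""
    drivers = []
    if in_bit == 0 and q1 == 0:
        # PM3 (gate IN) and PM4 (gate Q1) both conduct: series path from SD1 to Y1.
        drivers.append(sd1)
    if q1 == 1:
        # NM5 (gate Q1) conducts and passes IN onto Y1.
        drivers.append(in_bit)
    if in_bit == 1:
        # NM6 (gate IN) conducts and passes Q1 onto Y1.
        drivers.append(q1)
    if len(set(drivers)) != 1:
        raise ValueError("Y1 is floating or driven to conflicting levels")
    return drivers[0]


if __name__ == "__main__":
    for sd1 in (0, 1):
        print("SD1 =", sd1, [alu1_switch_level(i, q, sd1)
                             for i in (0, 1) for q in (0, 1)])
```

Under these assumptions the only input combination that involves SD1 is IN = Q1 = 0, which is exactly where the AND and XNOR truth tables differ.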
As shown in fig. 4, the ALU2 circuit comprises two PMOS transistors PM7 and PM8 and two NMOS transistors NM11 and NM12. The gate of PM7, the source of PM8 and the gate of NM11 are connected together and to the output node Y1 of ALU1. The gate of PM8, the source of PM7 and the gate of NM12 are connected together and to the storage node Q2 of the memory cell T2. The drain of PM8, the drain of PM7 and the drain of NM11 are connected together as the output node Y2 of ALU2. The source of NM11 is connected to the drain of NM12, and the source of NM12 is connected to the control signal SD2.
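The ALU2 connections can be checked in the same idealized way. The sketch again treats each transistor as an ideal switch and ignores threshold-voltage drops; the function name alu2_switch_level is hypothetical.

```python
def alu2_switch_level(y1: int, q2: int, sd2: int) -> int:
    """Resolve Y2 from the conducting paths of PM7/PM8/NM11/NM12 (ideal switches)."""
    drivers = []
    if y1 == 0:
        # PM7 (gate Y1, source Q2) conducts and passes Q2 onto Y2.
        drivers.append(q2)
    if q2 == 0:
        # PM8 (gate Q2, source Y1) conducts and passes Y1 onto Y2.
        drivers.append(y1)
    if y1 == 1 and q2 == 1:
        # NM11 (gate Y1) and NM12 (gate Q2) both conduct: series path from SD2.
        drivers.append(sd2)
    if len(set(drivers)) != 1:
        raise ValueError("Y2 is floating or driven to conflicting levels")
    return drivers[0]


if __name__ == "__main__":
    for sd2 in (0, 1):
        print("SD2 =", sd2, [alu2_switch_level(y, q, sd2)
                             for y in (0, 1) for q in (0, 1)])
```

Here SD2 only reaches the output when Y1 = Q2 = 1, the single row in which XOR and OR disagree, which is consistent with the two modes stated for ALU2.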
Based on the above design, the complete circuit comprising T1, T2, ALU1 and ALU2 provided by this embodiment is shown in fig. 5. It should be emphasized that the scheme of fig. 5 is only one implementation of the memory operation circuit claimed by the invention and does not limit the scope of protection. For example, ALU1 and ALU2 in fig. 3 and fig. 4 are just one form of gate circuit with the corresponding configurable functions given in this embodiment; other embodiments may use circuits composed of different elements to realize the same circuit function.
Example 2
The scheme provided by embodiment 1 is a basic circuit for implementing a memory function, a simple logic operation, a complex boolean logic operation, and a TCAM addressing function. In addition to embodiment 1, this embodiment further provides a large-scale memory computing circuit including a large number of basic circuits as in embodiment 1.
Specifically, as shown in fig. 6, the SRAM memory computing circuit with TCAM and logic operation functions provided in this embodiment includes an SRAM memory array, a logic cell array (Embedded Multiplexed ALU), a timing control circuit (Timing Control), an input signal line (IN), bit line pairs (BL, BLB), a precharge circuit (Precharge Control), word lines (WL), a word line driving module, a row decoding module (Row Decoder), and a column output circuit (Column Output Circuit).
The SRAM memory array is composed of $4N^2$ identical memory cells arranged as a 2N × 2N array; each memory cell contains 2 complementary storage nodes Q and QB. The SRAM memory array in this embodiment is built from 6T memory cells and further includes sense amplifiers connected to the respective bit lines. For example, in this embodiment there are 4N sense amplifiers: the BL or BLB of each column of memory cells serves as one input of a sense amplifier, and the other input of the sense amplifier is the reference level Vref. When the level of BL or BLB is higher than the reference level Vref, the sense amplifier outputs a high level; otherwise, it outputs a low level. A sense amplifier is mainly used to detect and amplify a small signal (a voltage signal or a current signal). In an SRAM, the sense amplifier detects and amplifies the small-swing signal on a bit line of a memory cell, which improves the speed and accuracy of data reads.
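The array sizes implied by the parameter N can be tabulated with a short sketch. The function name `array_dimensions` and its dictionary keys are hypothetical, and the bit line pair count assumes one BL/BLB pair per column.

```python
def array_dimensions(n: int) -> dict:
    """Sizes of a 2N x 2N SRAM array as a function of N (illustrative helper)."""
    side = 2 * n
    return {
        "rows": side,
        "columns": side,
        "memory_cells": 4 * n * n,   # 2N x 2N cells
        "bit_line_pairs": side,      # one BL/BLB pair per column (assumption)
        "sense_amplifiers": 4 * n,   # one per BL and one per BLB
    }


print(array_dimensions(32))
```

For N = 32 this corresponds to a 64 × 64 array.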
The logic cell array is composed of a plurality of arithmetic logic units corresponding to the memory cells one to one. In the logic unit array, the operation logic units corresponding to the memory units in the odd-numbered columns are all operation logic units ALU1, and the operation logic units corresponding to the memory units in the even-numbered columns are all operation logic units ALU2.
The timing control circuit generates the clock signals required by the functional modules. The input signal line is connected to the arithmetic logic units ALU1 in each column and supplies a corresponding input signal IN to each arithmetic logic unit ALU1. The bit line pairs comprise 2N pairs of bit lines BL and BLB; the memory cells in each column are connected to the same pair of bit lines BL and BLB.
The precharge circuit precharges the bit lines BL and BLB connected to each column of memory cells in the SRAM memory array. During an operation or a data read, each bit line is first precharged to a high level; if the bit line voltage then stays high, the output is 1, and if the bit line is discharged, the output is 0.
The word lines WL supply the corresponding word line signals to the memory cells in the SRAM memory array. The word line driving module turns on or off the word line WL connected to each memory cell in the memory array. To select the memory cells and/or arithmetic logic units of any row for operation, only the corresponding word line signal needs to be applied.
The row decoding module decodes the input signal and controls the word line driving module according to the decoding result. The column output circuit is connected, through the sense amplifiers SA, to the bit lines BL and BLB of each column of memory cells in the SRAM memory array, and outputs the stored data of the memory cells in any column or the calculation results of the arithmetic logic units.
In particular, two memory cells T1 and T2 and corresponding ALU1 and ALU2 located in two adjacent columns in the same row in the SRAM memory array constitute the memory operation circuit as described above, and can implement the complete function of the basic circuit.
This embodiment differs from embodiment 1 in that, in a conventional large-scale 6T-SRAM memory circuit, the memory cells in any two adjacent columns are used to build the complete memory operation circuit described in embodiment 1. The functions of embodiment 1, including simple logic operations, composite Boolean logic operations, and TCAM addressing operations, can therefore be realized on the SRAM memory circuit.
The scheme of this embodiment is described in detail below with reference to the circuit architecture of fig. 6. In fig. 6, the original 6T-SRAM memory array is 64 × 64. Because the scheme in this embodiment uses two memory cells in adjacent columns as the unit circuits of one basic circuit, the resulting "calculation unit array" is actually 64 × 32: each adjacent odd column and even column of the original memory cell array together form one column of the new functional circuit array. In the new array, the basic circuit at any row and column includes 2 identical 6T SRAM cells, 1 arithmetic logic unit ALU1, and 1 arithmetic logic unit ALU2. In particular, in the present embodiment, all transistors in the 6T cells, ALU1, and ALU2 have the same size.
As shown in FIG. 6, the 2 identical 6T SRAM cells share 1 word line WL<63>. The left 6T cell is connected to bit lines BL1<31> and BLB1<31>, and its storage nodes are denoted Q1 and QB1; the right 6T cell is connected to bit lines BL2<31> and BLB2<31>, and its storage nodes are denoted Q2 and QB2. ALU1 is connected to the input signal line IN<63> and also to bit lines BL1<31> and BLB1<31>; ALU2 is connected to bit lines BL2<31> and BLB2<31>.
The storage node Q1 of the left 6T cell and IN<63> are input to ALU1; the output of ALU1 and the storage node Q2 of the right 6T cell are input to ALU2; and the output of ALU2 is connected to the bit line BL2<31>, which forms a logical operation on the three operands IN<63>, Q1, and Q2. The bit line BL2<31> and the reference voltage Vref serve as the two input signals of the sense amplifier SA.
It should be noted that for a row of compute units, their 6T cells are all connected to the same word line WL and their ALU1 is connected to the same input IN. For the compute units of a column, the 6T units and ALU1 to their left are all connected to bit lines BL1, BLB1; the 6T cells and ALU2 to their right are connected to bit lines BL2, BLB2. The bit line BL2<31> and the reference voltage Vref are used as two input signals of SA, wherein SA outputs high level when the voltage on the bit line is higher than the reference voltage, and SA outputs low level when the voltage on the bit line is lower than the reference voltage.
For example, for all the compute units in row 64, their 6T cells are connected to the same word line WL<63>, and their ALU1 units are connected to the same input IN<63>; for all compute units in column 32, the 6T cells and ALU1 units on their left are connected to bit lines BL1<31> and BLB1<31>, and the 6T cells and ALU2 units on their right are connected to bit lines BL2<31> and BLB2<31>.
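The wiring convention in this example can be summarised by a small lookup sketch. The helper names are hypothetical, and the sketch assumes that indices in angle brackets are zero-based (so that row 64 uses WL<63>) and that the first column of the original array holds a T1/ALU1 cell.

```python
ROWS, UNIT_COLUMNS = 64, 32  # size of the "calculation unit array" in this embodiment


def compute_unit_signals(row: int, unit_col: int) -> dict:
    """Signal names seen by the compute unit at zero-based (row, unit_col)."""
    if not (0 <= row < ROWS and 0 <= unit_col < UNIT_COLUMNS):
        raise ValueError("index outside the 64 x 32 compute unit array")
    return {
        "WL": f"WL<{row}>",  # shared by both 6T cells of the unit
        "IN": f"IN<{row}>",  # drives ALU1 only
        "left": (f"BL1<{unit_col}>", f"BLB1<{unit_col}>"),   # T1 and ALU1
        "right": (f"BL2<{unit_col}>", f"BLB2<{unit_col}>"),  # T2 and ALU2
    }


def unit_of_original_column(col: int) -> tuple:
    """Map a zero-based column of the original 64 x 64 array to (unit column, role)."""
    role = "T1/ALU1" if col % 2 == 0 else "T2/ALU2"
    return col // 2, role


print(compute_unit_signals(63, 31))
print(unit_of_original_column(63))
```

The second helper reflects the pairing of each odd column with the adjacent even column into a single compute unit column.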
Based on this circuit structure, the storage function is still implemented in the same way as in the original SRAM memory cell: a memory cell is located through a word line and a bit line, and the data in its storage node is read out. The following describes in detail how the four composite Boolean logic operations and the TCAM addressing operation are implemented.
1. Realizing four composite Boolean logic operation functions
The memory computing circuit provided in this embodiment includes a large number (64 × 32 sets) of unit structures of the basic memory computing circuit shown in fig. 1. By multiplexing the ground terminal and the power supply terminal in the two embedded arithmetic logic units ALU1 and ALU2, the AND/XNOR operation and the XOR/OR operation can be respectively realized in one period, and four kinds of composite Boolean logic operations can be realized by combining with the two SRAM 6T storage units.
For ALU1, the operands are the external input IN and the storage node Q1 of the first 6T cell, and the output is Y1. When SD1 = VSS, an AND operation is implemented, and when SD1 = VDD, an XNOR operation is implemented. For ALU2, the operands are the output Y1 of ALU1 and the storage node Q2 of the second 6T cell, and the output is Y2. When SD2 = VSS, an XOR operation is implemented, and when SD2 = VDD, an OR operation is implemented.
The two embedded arithmetic logic units ALU1 and ALU2, in combination with the two SRAM 6T memory cells, can implement the following four composite Boolean logic operations:
When SD1 = VSS and SD2 = VSS, the circuit implements $Y_2 = (IN \cdot Q_1) \oplus Q_2$.
When SD1 = VSS and SD2 = VDD, the circuit implements $Y_2 = (IN \cdot Q_1) + Q_2$.
When SD1 = VDD and SD2 = VSS, the circuit implements $Y_2 = (IN \odot Q_1) \oplus Q_2$.
When SD1 = VDD and SD2 = VDD, the circuit implements $Y_2 = (IN \odot Q_1) + Q_2$.
Here $\odot$ denotes XNOR and $\oplus$ denotes XOR.
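The four SD1/SD2 configurations can be enumerated with a short behavioural sketch that chains the ALU1 and ALU2 functions described above. The function names and the encoding of VSS as 0 and VDD as 1 are assumptions for illustration; the sketch models logic values only, not voltage levels.

```python
from itertools import product

LEVEL = ("VSS", "VDD")  # assumed encoding: index 0 is VSS, index 1 is VDD


def alu1_logic(in_bit: int, q1: int, sd1: int) -> int:
    # SD1 = VSS selects AND; SD1 = VDD selects XNOR.
    return (in_bit & q1) if sd1 == 0 else 1 - (in_bit ^ q1)


def alu2_logic(y1: int, q2: int, sd2: int) -> int:
    # SD2 = VSS selects XOR; SD2 = VDD selects OR.
    return (y1 ^ q2) if sd2 == 0 else (y1 | q2)


def composite(in_bit: int, q1: int, q2: int, sd1: int, sd2: int) -> int:
    """Y2 as a function of the three operands and the two control signals."""
    return alu2_logic(alu1_logic(in_bit, q1, sd1), q2, sd2)


# One row per mode; outputs are ordered by (IN, Q1, Q2) = 000, 001, ..., 111.
for sd1, sd2 in product((0, 1), repeat=2):
    outputs = [composite(i, q1, q2, sd1, sd2) for i, q1, q2 in product((0, 1), repeat=3)]
    print(f"SD1={LEVEL[sd1]} SD2={LEVEL[sd2]} Y2={outputs}")
```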
The truth tables for the four composite Boolean logic operations are shown in tables 1-4 of embodiment 1. In practice, depending on the inputs, the value of Y2 may be a strong 1, a weak 1, a strong 0, or a weak 0, so the reference voltage of the SA must be set appropriately for the calculation result to be read correctly. It can be seen from the above that the memory computing circuit designed in this embodiment implements composite Boolean logic operations in a single cycle by using the two memory cells T1 and T2 and the two different arithmetic logic units ALU1 and ALU2.
2. Implementing TCAM addressing operations
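The requirement on the reference voltage can be illustrated with a small sketch: Vref has to lie above the highest level that represents 0 and below the lowest level that represents 1. The function name `vref_window` and all voltage values below are hypothetical placeholders chosen only for illustration; they are not taken from this document or from any simulation.

```python
def vref_window(levels: dict) -> tuple:
    """Return (lower bound, upper bound) between which Vref separates 0 from 1."""
    lowest_one = min(levels["strong1"], levels["weak1"])
    highest_zero = max(levels["strong0"], levels["weak0"])
    if highest_zero >= lowest_one:
        raise ValueError("no reference voltage separates the logic levels")
    return highest_zero, lowest_one


# Hypothetical output levels in volts (placeholders, not measured data).
example_levels = {"strong1": 1.2, "weak1": 0.8, "weak0": 0.4, "strong0": 0.0}
low, high = vref_window(example_levels)
vref = (low + high) / 2  # midpoint of the window, one possible choice
print(low, high, vref)
```

Choosing the midpoint of the window is only one option; any value strictly inside the window separates the four levels in this simplified picture.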
In the circuit structure provided by this embodiment, the signal lines can also be configured so that the circuit forms a ternary content addressable memory (TCAM). When the circuit is in the mode SD1 = VSS and SD2 = VSS, the combined T1, T2, ALU1, and ALU2 circuit forms a composite logic module that implements $Y_2 = (IN \cdot Q_1) \oplus Q_2$. A TCAM can therefore be formed by configuring the signal line IN. The data stored in the two SRAM 6T cells are labeled Q1 and Q2, as shown in FIG. 4. Q1 = 1 and Q2 = 0 represent TCAM state 1, Q1 = 1 and Q2 = 1 represent TCAM state 0, and Q1 = 0 and Q2 = 1 represent TCAM state X. IN denotes the search data. Y2 = 1 indicates that the search data (IN) matches the target data (the two-bit binary value formed by Q1 and Q2), and Y2 = 0 indicates a mismatch.
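The state encoding can be checked against the AND-then-XOR mode (SD1 = VSS, SD2 = VSS) with a minimal sketch. The dictionary `ENCODING` and the function `cell_match` are hypothetical names introduced for illustration.

```python
# TCAM state -> (Q1, Q2), following the encoding described above.
ENCODING = {"1": (1, 0), "0": (1, 1), "X": (0, 1)}


def cell_match(in_bit: int, state: str) -> int:
    """Y2 of one compute unit in the mode SD1 = VSS, SD2 = VSS."""
    q1, q2 = ENCODING[state]
    return (in_bit & q1) ^ q2  # AND in ALU1, then XOR in ALU2


for state in ("1", "0", "X"):
    print(state, [cell_match(in_bit, state) for in_bit in (0, 1)])
```

A stored 1 matches only IN = 1, a stored 0 matches only IN = 0, and a stored X matches either input.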
In practical application, a data search proceeds as follows: 1. The bit line is precharged to a high level. 2. The calculated values Y2 are all connected to the same bit line; if the bit line voltage remains high, the data match, and if the bit line discharges, the data do not match. 3. The final matching result is read out by a sense amplifier.
Taking a 4 × 4 array as an example, if the external input is 1001 and the internal storage is 1001 or 100X, the Y2 output is 1, indicating a match. When the internal storage is 1010 or 1110, the Y2 output is 0, indicating a mismatch. The operating logic of the data matching process is shown in fig. 7.
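The three search steps and the 4 × 4 example can be sketched behaviourally as follows. The function name `search_word` is hypothetical, and modelling the shared bit line as a value that any mismatching cell pulls low is a simplifying assumption about how the Y2 outputs combine.

```python
# TCAM state -> (Q1, Q2), following the encoding of this embodiment.
STATE_TO_NODES = {"1": (1, 0), "0": (1, 1), "X": (0, 1)}


def search_word(stored: str, search_bits: str) -> int:
    """Return 1 if the stored ternary word matches the search bits, else 0."""
    if len(stored) != len(search_bits):
        raise ValueError("stored word and search data must have the same length")
    bit_line = 1  # step 1: precharge the bit line to a high level
    for state, bit in zip(stored, search_bits):
        q1, q2 = STATE_TO_NODES[state]
        y2 = (int(bit) & q1) ^ q2  # per-cell result in the SD1 = SD2 = VSS mode
        if y2 == 0:
            bit_line = 0  # step 2: a mismatching cell discharges the bit line
    return bit_line  # step 3: value read out by the sense amplifier


for word in ("1001", "100X", "1010", "1110"):
    print(word, search_word(word, "1001"))
```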
It should be noted that the 6T-SRAM memory cell, the 64 × 64 memory array, and the like are examples given for explaining the present embodiment, and are not limited to the present application, and in other embodiments, other types of memory cells can be used to form other memory arrays of larger sizes based on the same technical idea, and a desired "calculation cell array" can be obtained.
Example 3
This embodiment provides a memory chip, which is an integrated circuit obtained by packaging the SRAM memory computing circuit with TCAM and logic operation functions provided in embodiment 2. The interface of the memory chip at least includes: a power supply interface, a ground line interface, a charging interface, a control signal interface, an input signal interface, a switching signal interface, a word line interface, and an output signal interface.
The power interface VDD is used for connecting to a power supply. The ground line interface VSS is used for grounding. The charging interface PRE is used to input a control signal that adjusts the charging state of each bit line. The control signal interface SD is used to input control signals that adjust the operating states of the respective arithmetic logic units. The input signal interface IN is used to input a corresponding input signal to each arithmetic logic unit ALU1. The switching signal interface SW is used to input a switching signal to the circuit; the switching signal adjusts the access states of the memory cells and the arithmetic logic units on the bit lines, so that the circuit switches between two working modes, normal read/write and in-memory operation. The word line interface WL is used to input a corresponding word line signal to each memory cell; the word line signal adjusts the access state of each memory cell on each bit line. The output signal interface Y is used to read the data stored in each memory cell or the logical operation result of each arithmetic logic unit.
Performance testing
To verify the effectiveness of the solution provided in this embodiment, the integrated circuit of embodiment 3 was designed in Cadence Virtuoso using an SMIC 55 nm process, and the performance of the circuit was tested comprehensively in simulation. The test contents and results are as follows:
1. Verifying the circuit through 5000 Monte Carlo simulations of the TCAM operation. Because the TCAM operation uses all logic operation modes of the circuit, the output of the 5000 Monte Carlo simulations is a useful reference for evaluating the performance and stability of the circuit.
In this embodiment, taking a 4 × 4 array as an example, the externally input target data IN is set to 1001. The output is 1 only when the internally stored data is 1001 or 100X, which indicates that the target data and the internally stored data match; otherwise, they do not match. Specifically: when the internally stored value is X (i.e., Q1, Q2 is 01), the result output through the sense amplifier SA is 1 regardless of whether the external input is 0 or 1. When the externally input data is 1, if the internally stored value is 1 (i.e., Q1, Q2 is 10) or X (i.e., Q1, Q2 is 01), this represents a match and the SA output is 1; if the internally stored value is 0 (i.e., Q1, Q2 is 11), this represents a mismatch and the SA output is 0. When the externally input data is 0, if the internally stored value is 0 (i.e., Q1, Q2 is 11) or X (i.e., Q1, Q2 is 01), this represents a match and the SA output is 1; if the internally stored value is 1 (i.e., Q1, Q2 is 10), this represents a mismatch and the SA output is 0.
During the test, when the internally stored data is 1001 and 100X, the Y2 outputs are as shown in fig. 8 and fig. 9, respectively. According to the results in fig. 8 and fig. 9, the circuit provided in this embodiment output the correct result in all 5000 runs, that is, the accuracy is 100% and no error occurred, which shows that the circuit has excellent logic operation capability and circuit stability.
2. In integrated circuit design, the performance range offered to designers is usually given in the form of "process corners" and typically applies only to digital circuits. The idea is to confine the speed variation of the NMOS and PMOS transistors to a rectangle defined by four corners: fast NFET and fast PFET, slow NFET and slow PFET, fast NFET and slow PFET, and slow NFET and fast PFET. The device models for each corner are extracted from wafers whose on-chip NMOS and PMOS test structures show different gate delays, and the corners are chosen so that an acceptable yield is achieved. Simulating the circuit under the various process corners and at extreme temperatures is the basis for determining the yield. Accordingly, ss, tt, and ff refer to the lower-left corner, the center, and the upper-right corner, respectively.
To test the performance of the designed circuit at different process corners, this embodiment also obtains, through different simulation schemes, the average power consumption of the four composite logic operations at each process corner. The power consumption test results are shown in fig. 10. Analysis of the data in fig. 10 shows that the energy consumption of the same logic operation fluctuates by no more than 5.07% across process corners, that is, the circuit adapts well to different process corners. Among the three process corners ss, tt, and ff, the average power consumption is lowest at the ss corner and highest at the ff corner; that is, in terms of power consumption, the circuit design provided by the present application performs best at the ss process corner.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A memory arithmetic circuit, characterized by: the storage operation circuit comprises two storage units T1 and T2 for storing data and two operation logic units ALU1 and ALU2; the ALU1 and the ALU2 comprise a control end, a first input end, a second input end and an output end;
the arithmetic logic unit ALU1 is an AND gate when the control end is connected to VSS, and is an exclusive NOR (XNOR) gate when the control end is connected to VDD; the arithmetic logic unit ALU2 is an exclusive OR (XOR) gate when the control end is connected to VSS, and is an OR gate when the control end is connected to VDD;
the first input end of the arithmetic logic unit ALU1 is connected to a storage node Q1 of the storage unit T1; the second input end of the arithmetic logic unit ALU1 is connected with an independent input signal line IN; the first input end of the arithmetic logic unit ALU2 is connected with the output end Y1 of the arithmetic logic unit ALU1; the second input end of the arithmetic logic unit ALU2 is connected to the storage node Q2 of the storage unit T2; the output end of the arithmetic logic unit ALU2 is Y2;
the output end Y1 of the ALU1 is used as the operation result output end when the ALU1 independently executes the AND/XNOR operation; the output end Y2 of the ALU2 is used as the operation result output end when the ALU2 independently executes the OR/XOR operation, or as the operation result output end when T1, T2, ALU1, and ALU2 jointly execute the four kinds of composite Boolean logic operations, or as the matching result output end when T1, T2, ALU1, and ALU2 jointly realize the TCAM addressing operation.
2. The memory arithmetic circuit according to claim 1, wherein the ALU1 circuit includes two PMOS transistors PM3, PM4, and two NMOS transistors NM5, NM6; wherein the source of PM3 is connected to the control signal SD1; the gate of PM3, the source of NM5, and the gate of NM6 are connected together and to the input signal line IN; the gate of PM4, the gate of NM5, and the source of NM6 are connected together and to a storage node Q1 of the storage unit T1; the drain of PM4, the drain of NM5, and the drain of NM6 are connected together and used as an output node Y1; the drain of PM3 is connected to the source of PM4.
3. The memory arithmetic circuit according to claim 1, wherein the circuit of the ALU2 includes two PMOS transistors PM7, PM8, and two NMOS transistors NM11, NM12; wherein the gate of PM7, the source of PM8, and the gate of NM11 are connected together and to the output node Y1 of ALU1; the gate of PM8, the source of PM7, and the gate of NM12 are connected together and to a storage node Q2 of the storage unit T2; the drain of PM8, the drain of PM7, and the drain of NM11 are connected together and used as an output node Y2 of the ALU2; the source of NM11 is connected to the drain of NM12; the source of NM12 is connected to the control signal SD2.
4. The memory arithmetic circuit of claim 1, wherein: the memory cells T1 and T2 adopt 6T memory cells comprising 6 transistors; the 6T storage unit comprises 2 PMOS tubes PM1 and PM2 and 4 NMOS tubes NM1, NM2, NM3 and NM4; PM1 and NM1 form an inverter structure, PM2 and NM2 form another inverter structure, and NM3 and NM4 are used as transmission tubes respectively; the source electrodes of the PM1 and the PM2 are connected with VDD, and the source electrodes of the NM1 and the NM2 are connected with VSS; the drain electrode of PM1, the drain electrode of NM1, the grid electrode of PM2 and the grid electrode of NM2 are connected as a storage node Q1 and connected with the drain electrode of NM3, the grid electrode of NM3 is connected with a word line WL, and the source electrode of NM3 is connected with a bit line BL; the drain of PM2, the drain of NM2, the gate of PM1, and the gate of NM1 are connected as a storage node QB1 and connected to the drain of NM4, the gate of NM4 is connected to the word line WL, and the source of NM4 is connected to the bit line BLB.
5. A memory-arithmetic circuit as claimed in any one of claims 1 to 4, characterized in that the memory-arithmetic circuit operates as follows:
the storage unit T1 and the arithmetic logic unit ALU1 are used as the basic circuit for realizing the AND operation and the XNOR operation in the storage arithmetic circuit, wherein IN and Q1 are the two operands and Y1 is the output result; ALU1 implements the AND operation when SD1 = VSS; ALU1 implements the XNOR operation when SD1 = VDD;
the storage unit T2 and the arithmetic logic unit ALU2 are used as the basic circuit for realizing the XOR operation and the OR operation in the storage arithmetic circuit, wherein Y1 and Q2 are the two operands and Y2 is the output result; ALU2 implements the XOR operation when SD2 = VSS; ALU2 implements the OR operation when SD2 = VDD;
the storage units T1 and T2 and the arithmetic logic units ALU1 and ALU2 are used as basic circuits for realizing four kinds of composite Boolean logic operation, wherein IN, Q1 and Q2 are three operands, and Y2 is an output result; the expressions for the four composite boolean logic operations are as follows:
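when SD1 = VSS and SD2 = VSS, $Y_2 = (IN \cdot Q_1) \oplus Q_2$; when SD1 = VSS and SD2 = VDD, $Y_2 = (IN \cdot Q_1) + Q_2$; when SD1 = VDD and SD2 = VSS, $Y_2 = (IN \odot Q_1) \oplus Q_2$; when SD1 = VDD and SD2 = VDD, $Y_2 = (IN \odot Q_1) + Q_2$; where $\odot$ denotes XNOR and $\oplus$ denotes XOR.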
6. The memory arithmetic circuit of any one of claims 1-4, wherein: the storage operation circuit realizes the TCAM addressing operation in the following way:
setting the control signals of the memory operation circuit including T1, T2, ALU1, and ALU2 to SD1 = VSS and SD2 = VSS; the two-bit binary number made up of the node data Q1 and Q2 stored in the two memory cells T1 and T2 is used to represent three state bits, wherein "10" represents TCAM state 1, "11" represents TCAM state 0, and "01" represents TCAM state x; the two-bit binary number represented by Q1 and Q2 is taken as the target data, and IN represents the search data; a Y2 output of "1" indicates that the target data matches the search data, and a Y2 output of "0" indicates that the target data does not match the search data.
7. An SRAM memory computation circuit having TCAM and logic operation functions, comprising:
an SRAM memory array consisting of $4N^2$ identical storage units arranged as a 2N × 2N array; each storage unit comprises 2 complementary storage nodes Q and QB;
a logic cell array including a plurality of arithmetic logic cells corresponding to the memory cells one to one; in the logic unit array, the operation logic units corresponding to the storage units in the odd-numbered columns are all operation logic units ALU1, and the operation logic units corresponding to the storage units in the even-numbered columns are all operation logic units ALU2;
a time sequence control circuit used for generating clock signals required by each functional module;
an input signal line connected to the arithmetic logic units ALU1 of each column, for inputting a corresponding input signal IN to each of the arithmetic logic units ALU1;
a bit line pair including 2N pairs of bit lines BL and BLB; each memory cell in each column and the corresponding arithmetic logic unit ALU1 or ALU2 are connected to the same group of bit lines BL and BLB;
the pre-charging circuit is used for performing pre-charging operation on bit lines BL and BLB connected with each column of memory cells in the SRAM memory array;
a word line WL for inputting a corresponding word line signal to each memory cell in the SRAM memory array;
the word line driving module is used for controlling the on or off of a word line WL connected with each memory cell in the memory array;
the row decoding module is used for decoding the input signal and controlling the word line driving module according to a decoding result;
the switching circuit is used for switching access states of the SRAM storage array and the logic unit array on the bit lines so as to adjust different working modes of the circuit; and
the column output circuit is connected with bit lines BL and BLB connected with each column of storage units in the SRAM storage array through a sense amplifier SA, and further outputs storage data of the storage units in any column or calculation results of the operational logic units;
the two memory units T1 and T2 and the corresponding ALU1 and ALU2 located in two adjacent columns in the same row in the SRAM memory array form the memory operation circuit as claimed in any one of claims 1 to 4, and can realize the complete function of the basic circuit.
8. The SRAM memory compute circuit with TCAM and logic operation functionality of claim 7, wherein: the number of the sense amplifiers is 4N; the BL or BLB of each column of storage units is used as one input of one sense amplifier, and the other input of the sense amplifier is a reference level Vref; when the level of BL or BLB is higher than the reference level Vref, the sense amplifier outputs a high level; otherwise, it outputs a low level.
9. A memory chip, comprising: the integrated circuit is packaged by the SRAM memory computing circuit with TCAM and logic operation function according to claim 7 or 8.
10. The memory chip of claim 9, wherein the interface of the memory chip comprises at least:
the power interface VDD is used for connecting a power supply;
a ground line interface VSS for grounding;
a charging interface PRE for inputting a control signal for adjusting the charging state of each bit line;
a control signal interface SD for inputting control signals for adjusting the operating states of the respective arithmetic logic units;
an input signal interface IN for inputting a corresponding input signal to each arithmetic logic unit ALU1;
a switching signal interface SW for inputting a switching signal to the circuit, the switching signal being used for adjusting the access states of the memory cells and the arithmetic logic units on the bit lines, so that the circuit can be switched between two working modes, normal read/write and in-memory operation;
a word line interface WL for inputting a corresponding word line signal to each memory cell, the word line signal being used for adjusting an access state of each memory cell on each bit line; and
and the output signal interface Y is used for reading the data stored in each storage unit or reading the logic operation result of each operation logic unit.
CN202211244850.7A 2022-10-12 2022-10-12 Memory operation circuit, memory calculation circuit and chip thereof Pending CN115588446A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211244850.7A CN115588446A (en) 2022-10-12 2022-10-12 Memory operation circuit, memory calculation circuit and chip thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211244850.7A CN115588446A (en) 2022-10-12 2022-10-12 Memory operation circuit, memory calculation circuit and chip thereof

Publications (1)

Publication Number Publication Date
CN115588446A true CN115588446A (en) 2023-01-10

Family

ID=84780356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211244850.7A Pending CN115588446A (en) 2022-10-12 2022-10-12 Memory operation circuit, memory calculation circuit and chip thereof

Country Status (1)

Country Link
CN (1) CN115588446A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235003A (en) * 2023-09-26 2023-12-15 海光信息技术(苏州)有限公司 Memory readout circuit, data operation method in memory and related equipment
CN117437944A (en) * 2023-12-20 2024-01-23 长鑫存储技术有限公司 Memory device
CN117437944B (en) * 2023-12-20 2024-03-08 长鑫存储技术有限公司 Memory device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination