CN118312468B

CN118312468B - In-memory operation circuit with signed multiplication and CIM chip

Info

Publication number: CN118312468B
Application number: CN202410735739.0A
Authority: CN
Inventors: 彭春雨; 汪婷; 王思民; 关立军; 蔺智挺; 陈军宁; 吴秀龙
Original assignee: Anhui University
Current assignee: Anhui University
Priority date: 2024-06-07
Filing date: 2024-06-07
Publication date: 2024-08-16
Anticipated expiration: 2044-06-07
Also published as: CN118312468A

Abstract

The present invention belongs to the technical field of integrated circuits, and specifically relates to an in-memory operation circuit with signed multiplication and a CIM chip thereof. The in-memory operation circuit includes at least one column of operation units, and the operation unit includes a weight storage part and a calculation part; the weight storage part adopts an SRAM unit with a double word line; the circuit connection relationship of the calculation part is: the drains of P1 and N3 are connected to the calculation bit line CBL; the gate of N3 is connected to the bit line BL, and the gate of P1 is connected to the bit line BLB; the source of N3 is connected to the drain of N4; the source of P1 is connected to the drain of P2; the gate of N4 is connected to the input word line INN; the gate of P2 is connected to the input word line INP; the source of N4 is connected to VSS; the source of P2 is connected to VDD; the capacitor C is connected between CBL and VSS; the scheme solves the problems of large area overhead and low operation efficiency that are common in various existing CIM circuits with signed multiplication and multiplication-accumulation operation functions.

Description

A signed multiplication in-memory operation circuit and CIM chip

Technical Field

The invention belongs to the technical field of integrated circuits, and in particular relates to an in-memory operation circuit with signed multiplication and a CIM chip using the circuit.

Background Art

With the rapid development and spread of artificial intelligence, convolutional neural networks (CNNs) and deep neural networks (DNNs) have become some of the most influential innovations in the field of computer vision. Neural networks such as CNNs and DNNs must perform a large number of multiplication and multiply-accumulate (MAC) operations when processing data. When such operations run on computers based on the von Neumann architecture, data must be moved frequently between the processor and the memory, which results in high energy consumption and latency; this problem is known as the von Neumann bottleneck or the memory wall. Demonstrations of DNN processors and accelerators based on the von Neumann architecture show that energy consumption and latency depend mainly on the transfer of data between the processor and the memory. Therefore, traditional von Neumann computers are not well suited to artificial-intelligence workloads such as neural networks.

To overcome the von Neumann bottleneck, researchers have proposed a memory-based computing-in-memory (CIM) architecture. This architecture uses the memory itself to perform logic operations, so data need not be moved between the memory and the processor. It can therefore greatly improve data processing efficiency and reduce operating power consumption.

Convolutional neural networks contain a large number of signed multiplication and multiply-accumulate operations. Existing CIM circuits that implement multi-bit signed multiplication or multiply-accumulate generally separate positive and negative weights during computation and perform the positive-weight and negative-weight multiplications in different SRAM cells. This design increases the area overhead of the circuit and reduces operation efficiency, which in turn significantly prolongs the inference time of the neural network. To solve the problems of large area overhead and low operation efficiency that are common in existing CIM circuits with signed multiplication and multiply-accumulate functions, the present invention provides an in-memory operation circuit with signed multiplication and a CIM chip using the circuit.

Summary of the Invention

The technical solution provided by the present invention is as follows:

An in-memory operation circuit with signed multiplication includes at least one column of operation units, and the operation unit of each column includes a weight storage part and a calculation part. The weight storage part uses an SRAM cell with dual word lines. The drains of the two pass transistors in the SRAM cell are connected to the two bit lines respectively, and the gates of the two pass transistors are connected to the two word lines respectively. The calculation part consists of two NMOS transistors N3 and N4, two PMOS transistors P1 and P2, and a capacitor C; the circuit connections are:

The drains of P1 and N3 are connected to the calculation bit line CBL; the gate of N3 is connected to the bit line BL, and the gate of P1 is connected to the bit line BLB; the source of N3 is connected to the drain of N4; the source of P1 is connected to the drain of P2; the gate of N4 is connected to the input word line INN; the gate of P2 is connected to the input word line INP; the source of N4 is connected to VSS; the source of P2 is connected to VDD; one end of the capacitor C is connected to the calculation bit line CBL, and the other end is connected to VSS.

The operation unit of each column implements multiplication between a signed 2-bit first operand and an unsigned second operand; the operation logic is:

The second operand is stored in the SRAM cell in advance, and the calculation bit line CBL is precharged to an intermediate potential; the first operand is input by encoding it in the level states of WLL, WLR, INN and INP; the product of the first operand and the second operand is reflected in the change of the bit line voltage on the calculation bit line CBL.

As a further improvement of the present invention, the SRAM cell is a 6T-SRAM cell, or another SRAM cell with dual word lines obtained by adding MOS transistors to the 6T-SRAM cell.

In the present invention, the 6T-SRAM cell includes two NMOS transistors N1 and N2 and two inverters INV0 and INV1. The circuit connections are as follows: the input of INV0, the output of INV1 and the source of N1 are connected together and serve as the storage node Q. The output of INV0, the input of INV1 and the source of N2 are connected together and serve as the storage node QB. The drains of N1 and N2 are connected to the bit lines BL and BLB respectively, and the gates of N1 and N2 are connected to the word lines WLL and WLR respectively.

As a further improvement of the present invention, during the multiplication operation, when the first operand is "+1", WLL, INN and INP are set low and WLR is set high. When the first operand is "-1", WLL, INN and INP are set high and WLR is set low. When the first operand is "0", WLL and INN are set low, and WLR and INP are set high.

As a further improvement of the present invention, during the multiplication operation, a rise in the bit line voltage on the calculation bit line CBL indicates that the product is "+1"; a fall indicates that the product is "-1"; an unchanged bit line voltage indicates that the product is "0".

As a further improvement of the present invention, the weight storage part contains multiple SRAM cells arranged in a column; all of the SRAM cells are connected to the same pair of bit lines BL and BLB; each SRAM cell is also connected to the independent word lines WLL and WLR of its row; the SRAM cells in the same column share a single calculation part.

As a further improvement of the present invention, in the calculation part, transistors N3 and N4 have the same width-to-length ratio, and transistors P1 and P2 have the same width-to-length ratio.

As a further improvement of the present invention, the weights of the second operands of the multiplications performed in different columns of operation units are distinguished by adjusting the capacitance of the capacitor C in the calculation part of each column. The multiple of the capacitor C in each column relative to the unit capacitance is the weight of the second operand in the operation.

As a further improvement of the present invention, the in-memory operation circuit with signed multiplication includes N columns of operation units, and the capacitance of the capacitor C in the successive columns is 1, 2, 4, 8, ..., 2^(N-1) times the unit capacitance. A transmission gate TG that connects the calculation bit lines CBL of the two columns is placed between every two adjacent columns of operation units.
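Written out as a formula, with $C_u$ denoting the unit capacitance and $k$ the column index (both symbols are introduced here for illustration and do not appear in the original text), the capacitor ratios stated above amount to:

```latex
C_k = 2^{\,k-1}\, C_u, \qquad k = 1, 2, \dots, N,
\qquad\text{so that}\qquad
\sum_{k=1}^{N} C_k = \left(2^{N} - 1\right) C_u .
```

With N = 4, for instance, the four columns carry 1, 2, 4 and 8 unit capacitances, 15 in total.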

Using the in-memory operation circuit with signed multiplication that includes multiple columns of operation units, multiplication of a 2-bit first operand by an N-bit second operand can be implemented. The procedure is as follows:

(1) Turn off the transmission gates between the operation columns so that the columns are isolated, and precharge the calculation bit line CBL of each column to the intermediate potential.

(2) Decompose the N-bit second operand bit by bit into N single-bit numbers, and pre-store each single-bit number in the weight storage part of the column with the corresponding weight.

(3) Input the encoded first operand synchronously to the SRAM cells of all selected columns through WLL, WLR, INN and INP, so that each column multiplies the first operand by one bit of the second operand.

(4) Turn on the transmission gates between the operation columns so that the calculation bit lines are connected. The product of the 2-bit first operand and the N-bit second operand is reflected in the change of the bit line voltage on the calculation bit line CBL: the direction of the change gives the sign of the product, and the magnitude of the change gives its numerical value.
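Steps (1) to (4) can be sketched numerically under idealized assumptions that the text does not state: every column's CBL is taken to settle fully at VDD, VSS (0 V) or VDD/2 before the transmission gates are turned on, the capacitors are ideal, and parasitic capacitance is ignored. The function name `shared_cbl_voltage` and its parameters are hypothetical.

```python
def shared_cbl_voltage(x, weight_bits, vdd=1.0):
    """Ideal CBL voltage after the transmission gates are turned on.

    x           -- first operand, one of -1, 0, +1
    weight_bits -- bits of the second operand, least significant first;
                   bit k sits in the column whose capacitor is 2**k units
    vdd         -- supply voltage in volts (VSS is taken as 0 V)
    """
    if x not in (-1, 0, 1):
        raise ValueError("first operand must be -1, 0 or +1")
    if not weight_bits or any(b not in (0, 1) for b in weight_bits):
        raise ValueError("weight_bits must be a non-empty list of 0/1")

    # Steps (1)-(3): each isolated column starts at VDD/2 and, in this
    # idealized model, ends at VDD (+1), 0 V (-1) or stays at VDD/2 (0).
    column_v = [vdd / 2 + x * b * vdd / 2 for b in weight_bits]

    # Binary-weighted capacitors: 1, 2, 4, ... unit capacitances.
    caps = [2 ** k for k in range(len(weight_bits))]

    # Step (4): charge sharing conserves the total charge of all columns.
    return sum(c * v for c, v in zip(caps, column_v)) / sum(caps)


if __name__ == "__main__":
    # (-1) x 0b0101 with four columns and VDD = 1 V
    print(shared_cbl_voltage(-1, [1, 0, 1, 0]))
```

In this model the deviation of the shared voltage from VDD/2 equals x · W · VDD / (2 · (2^N - 1)), where W is the integer value of the second operand, so the direction of the change follows the sign of x and its size scales with W. A real circuit would deviate from these numbers because of partial swings and parasitics.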

The present invention also includes a CIM chip that integrates the in-memory operation circuit with signed multiplication described above.

The technical solution provided by the present invention has the following beneficial effects:

Based on an SRAM cell with dual word lines and dual bit lines, the present invention designs an in-memory computing circuit with signed multiplication. The circuit stores a 1-bit weight in the SRAM cell and splits the 2-bit signed number into a 1-bit sign bit and a 1-bit unsigned number. The sign bit is represented by the high and low levels of the dual word lines WLL and WLR, and the 1-bit unsigned number is applied through the input word lines INN and INP of the added calculation part. Depending on the values of the signals that represent the weight and the signed number, the circuit turns on or off the charging and discharging paths from the calculation bit line CBL to the power supply and to ground, so that the change in the bit line voltage on CBL represents the final product.

The circuit performs multiplication between numbers of different signs within a single circuit unit, which saves circuit area, and its reliability is also relatively high.

The scheme can also use the size of the capacitor C attached to the calculation part to represent the bit significance of the weight in each column, and it reflects the result on the voltage of CBL, which ensures the accuracy of the result. With this design the circuit also supports multiplication of a 2-bit signed number by a multi-bit weight, giving it greater capability.

Brief Description of the Drawings

FIG. 1 is a circuit schematic of the in-memory operation circuit with signed multiplication provided in Embodiment 1 of the present invention.

FIG. 2 is a circuit diagram of the operation unit in each column of FIG. 1.

FIG. 3 is a circuit diagram of an operation unit whose weight storage part contains multiple SRAM cells, as provided in Embodiment 1 of the present invention.

FIG. 4 is a circuit diagram of the in-memory operation circuit with signed multiplication that includes multiple columns of operation units, as provided in Embodiment 1 of the present invention.

FIG. 5 is an architecture diagram of the CIM chip provided in Embodiment 2 of the present invention.

FIG. 6 is a signal diagram of the calculation bit line CBL during multiplication of a signed number by a single-bit weight in the test experiment.

FIG. 7 is a signal diagram of the calculation bit line CBL during multiplication of the signed number "11" by a 4-bit weight in the test experiment.

FIG. 8 is a signal diagram of the calculation bit line CBL during multiplication of the signed number "01" by a 4-bit weight in the test experiment.

Detailed Description

To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.

Embodiment 1

This embodiment provides an in-memory operation circuit with signed multiplication. As shown in FIG. 1, it includes at least one column of operation units, and the operation unit of each column includes a weight storage part and a calculation part. The weight storage part uses an SRAM cell with dual word lines. In the actual circuit of this embodiment, the SRAM cell may be the classic 6T-SRAM cell, or another SRAM cell with dual word lines obtained by adding MOS transistors to the 6T-SRAM cell, such as 8T-SRAM, 10T-SRAM or 12T-SRAM.

An SRAM cell with dual word lines as used in this embodiment includes at least a cross-coupled latch with two storage nodes Q and QB, and two pass transistors located on either side of the latch that connect it to the bit lines BL and BLB. In this embodiment, the sources of the two pass transistors are connected to the storage nodes Q and QB, their drains are connected to the two bit lines BL and BLB respectively, and their gates are connected to two independent word lines, denoted WLL and WLR.

For example, when the weight storage part of this embodiment uses a 6T-SRAM cell, it includes two NMOS transistors N1 and N2 and two inverters INV0 and INV1. As shown in FIG. 2, the circuit connections are: the input of INV0, the output of INV1 and the source of N1 are connected together and serve as the storage node Q. The output of INV0, the input of INV1 and the source of N2 are connected together and serve as the storage node QB. The drains of N1 and N2 are connected to the bit lines BL and BLB respectively, and the gates of N1 and N2 are connected to the word lines WLL and WLR respectively.

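As shown in FIG. 2, the calculation part of the operation unit in each column of this embodiment consists of two NMOS transistors N3 and N4, two PMOS transistors P1 and P2, and a capacitor C. The circuit connections are the same as those given in the summary: the drains of P1 and N3 are connected to the calculation bit line CBL; the gate of N3 is connected to BL and the gate of P1 to BLB; the source of N3 is connected to the drain of N4, and the source of P1 to the drain of P2; the gate of N4 is connected to INN and the gate of P2 to INP; the source of N4 is connected to VSS and the source of P2 to VDD; the capacitor C is connected between CBL and VSS.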

From the circuit diagram it can be seen that, in the calculation part of this embodiment, CBL is connected to the ground terminal VSS through N3 and N4, forming a discharging path, when the gate of N3 (the bit line BL, which carries the level of storage node Q once N1 is on) and the input word line INN are both high. CBL is connected to the supply terminal VDD through P1 and P2, forming a charging path, when the gate of P1 (the bit line BLB, which carries the level of storage node QB once N2 is on) and the input word line INP are both low, because P1 and P2 are PMOS transistors.
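The two paths can be captured in a switch-level sketch that treats each transistor as an ideal switch. It follows the device polarities and the worked (+1) × 1 case in this embodiment, where P1 and P2 turn on when their gates are low. The function name `cbl_paths` and the boolean signal representation are assumptions made for illustration.

```python
def cbl_paths(bl, blb, inn, inp):
    """Return (charging, discharging) for the two series stacks on CBL.

    Arguments are gate levels as booleans (True = high):
    bl drives N3, inn drives N4, blb drives P1, inp drives P2.
    """
    # NMOS stack N3-N4 to VSS conducts only when both gates are high.
    discharging = bool(bl) and bool(inn)
    # PMOS stack P1-P2 to VDD conducts only when both gates are low.
    charging = (not blb) and (not inp)
    return charging, discharging


if __name__ == "__main__":
    # BLB low and INP low: charging path on, discharging path off
    print(cbl_paths(bl=False, blb=False, inn=False, inp=False))
```

Because each path is a series stack of two transistors, holding INN low blocks discharging and holding INP high blocks charging regardless of the bit line levels.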

In particular, to keep the charging and discharging characteristics of the two paths consistent, transistors N3 and N4 in the calculation part of this embodiment have the same width-to-length ratio, and transistors P1 and P2 also have the same width-to-length ratio.

Based on the working principle of the calculation part described above, the operation unit of each column in the in-memory operation circuit of this embodiment can multiply a signed 2-bit first operand by an unsigned 1-bit second operand. In detail, the operation logic is as follows:

1. The second operand is stored in the SRAM cell in advance, and the calculation bit line CBL is precharged to an intermediate potential.

Taking the 6T-SRAM based circuit of FIG. 2 as an example, if the second operand of the multiplication is "0", the normal data storage function of the 6T-SRAM cell is used to set storage node Q low and QB high. Conversely, if the second operand is "1", storage node Q is set high and QB is set low.

In addition, before the operation, the calculation bit line CBL must be precharged to an intermediate potential. In this embodiment, "intermediate potential" means a potential midway between the supply voltage VDD and the ground terminal VSS. Specifically, this embodiment precharges CBL to VDD/2 in the initial stage.

2. The first operand is input by encoding it in the level states of WLL, WLR, INN and INP.

In this embodiment, the first operand is input by applying control signals with different level states to the word lines WLL and WLR of the weight storage part and to the input word lines INN and INP of the calculation part. Specifically, in keeping with the working principle of the circuit, the encoding rules for the first operand in this embodiment are as follows:

(1) When the first operand of the multiplication is "+1", WLL, INN and INP are set low and WLR is set high.

(2) When the first operand of the multiplication is "-1", WLL, INN and INP are set high and WLR is set low.

(3) When the first operand of the multiplication is "0", WLL and INN are set low, and WLR and INP are set high.
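The three encoding rules can be written down as a lookup table. This is only a sketch of the rules as listed; the names `FIRST_OPERAND_ENCODING` and `encode_first_operand`, and the use of 0/1 for low/high, are assumptions made for illustration.

```python
# Levels are 0 (low) or 1 (high); tuple order is (WLL, WLR, INN, INP).
FIRST_OPERAND_ENCODING = {
    +1: (0, 1, 0, 0),  # rule (1)
    -1: (1, 0, 1, 1),  # rule (2)
    0: (0, 1, 0, 1),   # rule (3)
}


def encode_first_operand(x):
    """Map a first operand in {-1, 0, +1} to named control-line levels."""
    if x not in FIRST_OPERAND_ENCODING:
        raise ValueError("first operand must be -1, 0 or +1")
    names = ("WLL", "WLR", "INN", "INP")
    return dict(zip(names, FIRST_OPERAND_ENCODING[x]))


if __name__ == "__main__":
    for operand in (+1, -1, 0):
        print(operand, encode_first_operand(operand))
```

Note that INN and INP carry the same level for "+1" and "-1" and opposite levels for "0", which is what disables both the charging and the discharging stack in the zero case.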

3. In the in-memory operation circuit with signed multiplication provided in this embodiment, once the first operand and the second operand have been input as described, their product is reflected in the change of the bit line voltage on the calculation bit line CBL.

Specifically, in the circuit of this embodiment, the level states of Q, QB, WLL, WLR, INN and INP determine the conduction state of each MOS transistor between CBL and the low-level VSS and between CBL and the high-level VDD. These states form a charging path or a discharging path between CBL and VDD or VSS, which raises or lowers the bit line voltage of CBL from the intermediate potential. The change in the bit line voltage on CBL represents the final product.

Specifically, while the circuit of this embodiment performs the multiplication, a rise in the bit line voltage on the calculation bit line CBL indicates that the product is "+1"; a fall indicates that the product is "-1"; an unchanged bit line voltage indicates that the product is "0".
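Reading the product back from the CBL voltage can be sketched as a comparison against the VDD/2 precharge level. The text does not say how the voltage is sensed, so the function `decode_product` and its `margin` parameter (a dead band around VDD/2, expressed as a fraction of VDD) are purely hypothetical.

```python
def decode_product(v_cbl, vdd, margin=0.05):
    """Classify a CBL voltage as product +1, -1 or 0.

    v_cbl  -- CBL voltage after the operation, in volts
    vdd    -- supply voltage in volts (VSS is taken as 0 V)
    margin -- assumed dead band around VDD/2, as a fraction of VDD
    """
    mid = vdd / 2
    band = margin * vdd
    if v_cbl > mid + band:
        return 1    # voltage rose above the precharge level
    if v_cbl < mid - band:
        return -1   # voltage fell below the precharge level
    return 0        # voltage stayed near the precharge level


if __name__ == "__main__":
    print(decode_product(0.9, 1.0), decode_product(0.1, 1.0), decode_product(0.5, 1.0))
```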

To make the principle and performance of the in-memory operation circuit with signed multiplication clearer, the operation of the circuit is described in detail below for the six combinations of first and second operands:

1. (+1) × 1

First, the second operand "1" is pre-stored in the 6T-SRAM cell and CBL is precharged to VDD/2. At this point, storage node Q of the 6T-SRAM cell is high and QB is low.

Then WLL is set low and WLR is set high. N1 stays off and N2 turns on, so the level at QB passes through N2 to the gate of P1, and P1 turns on. At the same time, INN and INP are both set low, so P2 also turns on. The charging path from CBL to VDD is now on, while the discharging path from CBL to VSS stays off because N3 and N4 are not conducting. The bit line voltage on CBL therefore rises gradually from VDD/2 toward VDD. The rise in the CBL bit line voltage indicates that the product is (+1).

This completes the operation: (+1) × 1 = (+1).

2. (-1) × 1

As in the previous case, the second operand "1" is pre-stored in the 6T-SRAM cell and CBL is precharged to VDD/2, so storage node Q is high and QB is low.

Then WLL is set high and WLR is set low. N1 turns on and N2 stays off, so the level at Q passes through N1 to the gate of N3, and N3 turns on. At the same time, INN and INP are both set high, so N4 also turns on. The discharging path from CBL to VSS is now on, while the charging path from CBL to VDD is off because P1 and P2 are not conducting. The bit line voltage on CBL therefore falls gradually from VDD/2 toward VSS. The fall in the CBL bit line voltage indicates that the product is (-1).

This completes the operation: (-1) × 1 = (-1).

3. (+1) × 0

First, the second operand "0" is pre-stored in the 6T-SRAM cell and CBL is precharged to VDD/2. At this point, storage node Q of the 6T-SRAM cell is low and QB is high.

Then WLL is set low and WLR is set high. N2 turns on and N1 stays off, so the level at QB passes through N2 to the gate of P1, and P1 turns off. At the same time, INN and INP are both set low, so N4 is also off. With N4 and P1 both off, the discharging path from CBL to VSS and the charging path from CBL to VDD are both off. The bit line voltage on CBL therefore stays at its current level, which indicates that the product is 0.

This completes the operation: (+1) × 0 = 0.

4. (-1) × 0

As in the previous case, the second operand "0" is pre-stored in the 6T-SRAM cell and CBL is precharged to VDD/2, so storage node Q is low and QB is high.

Then WLL is set high and WLR is set low. N1 turns on and N2 stays off, so the level at Q passes through N1 to the gate of N3, and N3 turns off. At the same time, INN and INP are both set high, so P2 is also off. With N3 and P2 both off, the discharging path from CBL to VSS and the charging path from CBL to VDD are both off. The bit line voltage on CBL therefore stays at its current level, which indicates that the product is 0.

This completes the operation: (-1) × 0 = 0.

5. 0 × 1

The second operand "1" is pre-stored in the 6T-SRAM cell and CBL is precharged to VDD/2.

Then WLL is set low and WLR is set low, so N1 and N2 are both off. At the same time, INN is set low and INP is set high. In this state, the discharging path from CBL to VSS and the charging path from CBL to VDD are both off. The bit line voltage on CBL therefore stays at its current level, which indicates that the product is 0.

This completes the operation: 0 × 1 = 0.

6. 0 × 0

This completes the operation: 0 × 0 = 0.

Summarizing the above, the truth table for the signed multiplication performed by the in-memory operation circuit of this embodiment is shown in Table 1 below:

Table 1: Truth table of the operation of the in-memory operation circuit with signed multiplication

From the truth table it can be seen that the in-memory operation circuit with signed multiplication provided in this embodiment, consisting of a weight storage part and a calculation part, fully implements multiplication of a signed 2-bit number by an unsigned 1-bit number.
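As a cross-check, the six cases walked through above can be enumerated with a small switch-level model built from the encoding rules and the connections of the calculation part. This is a sketch rather than a reproduction of Table 1: transistors are ideal switches, and the names `ENCODING`, `multiply` and `truth_table` are hypothetical.

```python
# (WLL, WLR, INN, INP) levels for each first operand, 0 = low, 1 = high.
ENCODING = {+1: (0, 1, 0, 0), -1: (1, 0, 1, 1), 0: (0, 1, 0, 1)}


def multiply(x, w):
    """Switch-level model of one operation unit; returns -1, 0 or +1.

    x -- signed first operand in {-1, 0, +1}
    w -- stored 1-bit second operand (0 or 1), so Q = w and QB = 1 - w
    """
    wll, wlr, inn, inp = ENCODING[x]
    q, qb = w, 1 - w
    # A bit line follows its storage node only while its pass transistor
    # is on. None marks an undriven line; in each encoding above the
    # stack on an undriven line is already blocked by INN or INP.
    bl = q if wll else None
    blb = qb if wlr else None
    discharging = bl == 1 and inn == 1   # N3 and N4 both on
    charging = blb == 0 and inp == 0     # P1 and P2 both on
    # Rising CBL voltage means +1, falling means -1, unchanged means 0.
    return int(charging) - int(discharging)


def truth_table():
    """List (first operand, second operand, product) for all six cases."""
    return [(x, w, multiply(x, w)) for x in (+1, -1, 0) for w in (1, 0)]


if __name__ == "__main__":
    for row in truth_table():
        print(row)
```

In this model every row satisfies product = x · w, which matches the outcome of the six cases described in the text.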

在本实施例进一步优化的方案中，如图3所示，每列运算单元的权重存储部分中包含按列排布的多个SRAM单元；各个SRAM单元连接在同一组位线BL和BLB上；各个SRAM单元还连接在对应行的独立的字线WLL和WLR上；同列的各个SRAM单元共用同一个计算部分的电路。In a further optimized solution of the present embodiment, as shown in FIG3 , the weight storage part of each column of operation units includes a plurality of SRAM cells arranged in columns; each SRAM cell is connected to the same group of bit lines BL and BLB; each SRAM cell is also connected to independent word lines WLL and WLR of the corresponding row; each SRAM cell in the same column shares the circuit of the same calculation part.

In the circuit design shown in FIG. 3, the SRAM cell in any row of a column can be combined with the calculation part below it to form a basic unit capable of performing a signed multiplication, which greatly reduces the area overhead of the circuit. In addition, because the SRAM cells in the same column share one calculation part, different second operands can be pre-stored in different rows during the pre-storage stage, and the rows can then be enabled one after another to complete different multiplications, which improves operation efficiency. Furthermore, when this design is applied to an array containing a large number of SRAM cells, the second operand of another operation task can be pre-stored in SRAM cells in other rows and columns while a given SRAM cell is taking part in a multiplication, which improves the efficiency of the circuit when processing large numbers of similar logical operation tasks.

In a further optimized solution of this embodiment, as shown in FIG. 4, the in-memory operation circuit with signed multiplication may also include multiple columns of operation units; let the number of columns be N, with N ≥ 2. In this case, by adjusting the capacitance of the capacitor C attached to the calculation part of each column, and by placing a transmission gate TG between each pair of adjacent columns to connect their calculation bit lines CBL, the bits of the second operand handled by the operation units in different columns can be given different weights.

The principle used to weight the bits of the second operand is as follows. When the capacitors C attached to the different columns are set to 8C, 4C, 2C and 1C respectively, the charges they hold are correspondingly 8Q, 4Q, 2Q and 1Q. The result of the operation in each column appears on that column's own calculation bit line CBL. Although the bit line voltage of each column has only three possible states (VDD, VDD/2 and VSS), once the CBLs of the columns carrying different capacitors are connected through the transmission gates, the capacitors share charge, so that the voltage on the connected CBL settles at one of a larger number of distinct levels. These levels can be used to represent the product of the 2-bit first operand and the N-bit second operand.
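The charge-sharing step can be written compactly. The following is a sketch under ideal assumptions (VSS = 0, no parasitic capacitance, lossless transmission gates); the symbols C_i, V_i, b_i, s and W are introduced here for illustration and do not appear in the original text.

```latex
% Column i (i = 0 is the lowest-order bit) carries capacitance C_i = 2^i C
% and settles at V_i before the transmission gates are turned on.
V_{\mathrm{share}} = \frac{\sum_{i=0}^{N-1} C_i V_i}{\sum_{i=0}^{N-1} C_i},
\qquad C_i = 2^{i} C .

% With first operand s \in \{+1, 0, -1\} and second-operand bit b_i \in \{0, 1\}:
V_i = \frac{V_{DD}}{2} + s\, b_i\, \frac{V_{DD}}{2}
\;\Longrightarrow\;
\Delta V = V_{\mathrm{share}} - \frac{V_{DD}}{2}
         = s\,\frac{V_{DD}}{2}\cdot\frac{W}{2^{N}-1},
\qquad W = \sum_{i=0}^{N-1} 2^{i} b_i .
```

Under these assumptions the change in CBL voltage is proportional to the signed product s·W, with a step of (VDD/2)/(2^N - 1) per unit of product.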

Based on the circuit scheme in FIG. 4, the in-memory operation circuit with signed multiplication of this embodiment, which contains multiple columns of operation units, multiplies a 2-bit first operand by an N-bit second operand as follows:

(1) Disconnect the transmission gates TG between the operation columns, and precharge the calculation bit line CBL of each column to the intermediate potential.

Take a 2-bit × 2-bit multiplication as an example. The operation requires two columns of operation units, one carrying a 1C capacitor and the other carrying a 2C capacitor. The operation unit carrying the 1C capacitor is the low-order operation column, and the operation unit carrying the 2C capacitor is the high-order operation column.

Suppose the operation is "+1×11", so the product is "+3". In the circuit, the bit line voltages on the CBLs of both operation units are VDD before charge sharing, and the CBL voltage after charge sharing is still VDD, so ΔV=VDD/2 relative to the precharge level VDD/2.

Suppose the operation is "+1×10", so the product is "+2". In the circuit, the bit line voltage on the CBL of the low-order operation unit is VDD/2 before charge sharing, and the bit line voltage on the CBL of the high-order operation unit is VDD before charge sharing. Since the low-order unit carries a 1C capacitor and the high-order unit carries a 2C capacitor, the CBL voltage after charge sharing is 5VDD/6, so ΔV=VDD/3.

Suppose the operation is "+1×01", so the product is "+1". In the circuit, the bit line voltage on the CBL of the low-order operation unit is VDD before charge sharing, and the bit line voltage on the CBL of the high-order operation unit is VDD/2 before charge sharing. Since the low-order unit carries a 1C capacitor and the high-order unit carries a 2C capacitor, the CBL voltage after charge sharing is 2VDD/3, so ΔV=VDD/6.

Suppose the operation is "+1×00", so the product is "0". In the circuit, the bit line voltages on the CBLs of both operation units are VDD/2 before charge sharing and remain VDD/2 after charge sharing, so ΔV=0.

It can be seen that as the product steps down through +3, +2, +1 and 0, the change ΔV of the CBL voltage after charge sharing steps down through VDD/2, VDD/3, VDD/6 and 0, decreasing by VDD/6 at each step.
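The four cases above can be reproduced with exact rational arithmetic. This is a minimal sketch assuming ideal charge sharing between a 1C and a 2C capacitor; voltages are expressed in units of VDD, and the names `CAPS` and `delta_v` are illustrative.

```python
from fractions import Fraction

VDD = Fraction(1)              # work in units of VDD
MID = VDD / 2                  # precharge level of every CBL
CAPS = {"low": 1, "high": 2}   # capacitor sizes in units of C


def delta_v(weight_bits):
    """Change of the shared CBL voltage (in units of VDD) for +1 x weight_bits.

    weight_bits is a two-character string such as "10" (high-order bit first).
    """
    high_bit, low_bit = (int(ch) for ch in weight_bits)
    # A column storing 1 charges to VDD; a column storing 0 stays at VDD/2.
    v_high = VDD if high_bit else MID
    v_low = VDD if low_bit else MID
    total_charge = CAPS["high"] * v_high + CAPS["low"] * v_low
    v_shared = total_charge / (CAPS["high"] + CAPS["low"])
    return v_shared - MID


if __name__ == "__main__":
    for bits in ("11", "10", "01", "00"):
        print(f"+1 x {bits}: delta V = {delta_v(bits)} VDD")
```

Using `Fraction` keeps the results as exact multiples of VDD/6, so the uniform step between adjacent products is visible without rounding noise.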

The rule can be summarized as follows: in the charge sharing mechanism of this embodiment, when the product can take 2^M different values, the operation units divide the available change of the CBL bit line voltage (from VDD/2 to VDD) into 2^M different levels, and a mapping is established between each level of ΔV and the digital value of the corresponding product.

The above description uses only the case in which the 2-bit signed number is positive. By the same principle, the same rule applies when the 2-bit signed number is negative. Likewise, the rule holds when more operation units are used to multiply by a second operand with more bits.

Therefore, once capacitors C of different sizes are attached to the operation units, the charge sharing mechanism of this embodiment causes the product of the 2-bit signed first operand and the M-bit unsigned second operand to appear as a change in the bit line voltage of the calculation bit line CBL. A successive approximation ADC quantizes the direction and the magnitude of the change in the CBL voltage after charge sharing, from which the digital value of each operation result can be obtained accurately.
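The quantization step can be illustrated with an idealized successive-approximation search. This is a sketch under stated assumptions, not the ADC of the embodiment: the text specifies only that a successive approximation ADC quantizes the direction and magnitude of the change. The function name `sar_quantize`, the half-step decision thresholds, and the step size (VDD/2)/(2^n - 1) derived from ideal binary-weighted charge sharing are all assumptions.

```python
def sar_quantize(v_cbl, vdd, n_bits):
    """Estimate the signed product from the shared CBL voltage.

    Assumes an ideal step of (vdd / 2) / (2**n_bits - 1) per unit of product.
    """
    mid = vdd / 2
    lsb = mid / (2 ** n_bits - 1)
    delta = v_cbl - mid
    sign = 1 if delta >= 0 else -1     # direction of the change gives the sign
    magnitude = abs(delta)

    code = 0
    # Try each bit from most to least significant, keeping it only if the
    # measured magnitude reaches the trial level minus half a step.
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)
        if magnitude >= (trial - 0.5) * lsb:
            code = trial
    return sign * code


if __name__ == "__main__":
    for volts in (0.45, 0.48125, 0.42132, 0.89887, 0.00199):
        print(f"{volts * 1000:.2f} mV -> {sar_quantize(volts, 0.9, 4)}")
```

The half-step offset makes each decision round to the nearest level, so small deviations from the ideal voltage do not change the decoded product.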

Example 2

On the basis of Example 1, this embodiment further provides a CIM chip in which the in-memory operation circuit with signed multiplication of Example 1 is integrated. As shown in FIG. 5, the CIM chip includes an N×N SRAM array, below which is a calculation module made up of N circuits, each identical to the calculation part of Example 1. The calculation bit lines CBL of the unit circuits in the calculation module are connected in series through N-1 transmission gates, and the capacitance values of the capacitors C attached to the unit circuits follow a stepped distribution of 8, 4, 2, 1. In addition, the CIM chip includes the various peripheral circuits needed for the SRAM array to perform data read, write and hold functions.

The CIM chip provided in this embodiment has the same data processing functions as an SRAM chip and can also multiply signed numbers by single-bit or multi-bit numbers. It is therefore suitable for the data processing tasks of neural networks such as CNNs and DNNs in artificial intelligence.

Performance Testing

To further verify the performance of the in-memory operation circuit with signed multiplication provided by the present invention, an experimental plan was drawn up and the function of the circuit shown in FIG. 5 was tested by simulation:

1. Multiplication of signed numbers and single-bit weights

This experiment first takes one 6T-SRAM cell of the circuit and its corresponding calculation part as the test object and multiplies a 2-bit signed number by a 1-bit weight, in order to verify the behaviour of the circuit when performing signed multiplication (+1×1 and -1×1). The precharge voltage of CBL before the calculation (that is, before 2 ns) is set to VDD/2.

The signal on the calculation bit line CBL during the experiment is shown in FIG. 6. The waveform in FIG. 6 shows that the circuit begins multiplying the 2-bit signed number by the 1-bit weight at 2 ns. When the 2-bit signed number is "11" (that is, -1), WLL is set high and WLR is set low; the sign bit indicates negative, the 1-bit weight is "1", and CBL discharges to VSS. When the 2-bit signed number is "01" (that is, +1), WLL is set low and WLR is set high; the sign bit indicates positive, the 1-bit weight is "1", and CBL charges to VDD.

2. Multiplication of a signed number (-1) and a 4-bit weight

This experiment takes four operation units of the circuit as the test object and multiplies a 2-bit signed number by a 4-bit weight. In the circuit, three transmission gates connect the CBLs of four 6T-SRAM-based operation units in the same row, and the capacitors C attached to the columns are 8C, 4C, 2C and 1C respectively. In this operation VDD is set to 900 mV, and CBL is precharged to 450 mV before 2 ns.

During the experiment, all calculations from "11" multiplied by "0000" through "11" multiplied by "1111" were executed in sequence. The circuit starts multiplying the 2-bit signed number by the 4-bit weight at 2 ns, and charge sharing starts at 2.2 ns. The resulting CBL waveforms are shown in FIG. 7.

The data in the figure show the following. With the 2-bit signed number set to "11" (WLL high, WLR low, sign bit negative), the CBL voltage after charge sharing for each 4-bit weight is:
"0000": 448.55 mV (the four CBLs each remain at 450 mV before charge sharing)
"0001": 421.32 mV
"0010": 391.05 mV
"0011": 358.97 mV
"0100": 331.07 mV
"0101": 300.72 mV
"0110": 268.75 mV
"0111": 243.42 mV
"1000": 209.92 mV
"1001": 181.73 mV
"1010": 150.36 mV
"1011": 119.17 mV
"1100": 91.92 mV
"1101": 62.03 mV
"1110": 33.25 mV
"1111": 1.99 mV

Analysis of the data shows that, in the multi-bit multiplication, the variation in the difference between adjacent results is within the allowable error range. The circuit has good linearity when discharging, and its operation results are highly reliable.
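The spacing between adjacent results can be checked directly from the quoted voltages. This is a minimal sketch; the ideal step of 30 mV is an assumption derived from ideal charge sharing with 8C, 4C, 2C and 1C capacitors at VDD = 900 mV, and the variable names are illustrative.

```python
# CBL voltages (mV) after charge sharing for "11" x "0000" ... "1111",
# copied from the simulation results quoted above.
DISCHARGE_MV = [
    448.55, 421.32, 391.05, 358.97, 331.07, 300.72, 268.75, 243.42,
    209.92, 181.73, 150.36, 119.17, 91.92, 62.03, 33.25, 1.99,
]

VDD_MV = 900.0
# Ideal step per unit of product, assuming lossless binary-weighted sharing.
IDEAL_STEP_MV = (VDD_MV / 2) / (8 + 4 + 2 + 1)

# Voltage drop between each pair of adjacent weights.
steps = [a - b for a, b in zip(DISCHARGE_MV, DISCHARGE_MV[1:])]
step_errors = [s - IDEAL_STEP_MV for s in steps]

if __name__ == "__main__":
    print(f"ideal step: {IDEAL_STEP_MV:.2f} mV")
    print(f"smallest step: {min(steps):.2f} mV, largest step: {max(steps):.2f} mV")
    print(f"largest step error: {max(abs(e) for e in step_errors):.2f} mV")
```

For the figures quoted above, every step is positive, so the transfer curve is monotonic, and the steps lie between roughly 25 mV and 34 mV against the assumed ideal of 30 mV.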

3. Multiplication of a signed number (+1) and a 4-bit weight

This experiment again takes four operation units of the circuit as the test object and multiplies a 2-bit signed number by a 4-bit weight. In the circuit, three transmission gates connect the CBLs of four 6T-SRAM-based operation units in the same row, and the capacitors C attached to the columns are 8C, 4C, 2C and 1C respectively. In this operation VDD is set to 900 mV, and CBL is precharged to 450 mV before 2 ns.

During the experiment, all calculations from "01" multiplied by "0000" through "01" multiplied by "1111" were executed in sequence. The circuit starts multiplying the 2-bit signed number by the 4-bit weight at 2 ns, and charge sharing starts at 2.2 ns. The resulting CBL waveforms are shown in FIG. 8.

The data in the figure show the following:

With the 2-bit signed number set to "01" (WLL low, WLR high, sign bit positive), the CBL voltage after charge sharing for each 4-bit weight is:
"0000": 450 mV (the four CBLs each remain at 450 mV before charge sharing)
"0001": 481.25 mV
"0010": 510.02 mV
"0011": 539.11 mV
"0100": 570.07 mV
"0101": 599.48 mV
"0110": 630.58 mV
"0111": 661.94 mV
"1000": 688.31 mV
"1001": 719.83 mV
"1010": 752.36 mV
"1011": 780.66 mV
"1100": 811.02 mV
"1101": 840.83 mV
"1110": 868.52 mV
"1111": 898.87 mV

In the multi-bit multiplication, the variation in the difference between adjacent results is within the allowable error range. The circuit also has good linearity when charging, and its operation results are highly reliable.
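One way to check the charging results is to decode each quoted voltage to the nearest ideal level and confirm that it returns the intended product. This is a sketch assuming ideal levels spaced 30 mV apart above the 450 mV precharge level (ideal charge sharing with 8C, 4C, 2C and 1C at VDD = 900 mV); the variable names are illustrative.

```python
# CBL voltages (mV) after charge sharing for "01" x "0000" ... "1111",
# copied from the simulation results quoted above.
CHARGE_MV = [
    450.0, 481.25, 510.02, 539.11, 570.07, 599.48, 630.58, 661.94,
    688.31, 719.83, 752.36, 780.66, 811.02, 840.83, 868.52, 898.87,
]

VDD_MV = 900.0
MID_MV = VDD_MV / 2                      # precharge level
LSB_MV = MID_MV / (8 + 4 + 2 + 1)        # assumed ideal step per unit of product

# Decode each voltage to the nearest ideal level.
decoded = [round((v - MID_MV) / LSB_MV) for v in CHARGE_MV]

# Largest distance between a simulated voltage and its ideal level.
worst_error_mv = max(
    abs(v - (MID_MV + k * LSB_MV)) for k, v in enumerate(CHARGE_MV)
)

if __name__ == "__main__":
    print("decoded products:", decoded)
    print(f"worst deviation from ideal level: {worst_error_mv:.2f} mV")
```

With the figures quoted above, all sixteen voltages decode to the intended products 0 through 15, and the largest deviation from an ideal level is about 2.4 mV, well inside the 15 mV half-step margin of this idealized decoder.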

The technical features of the above embodiments may be combined in any way. For brevity, not every possible combination of these technical features is described; however, any combination of these technical features that involves no contradiction should be considered to fall within the scope of this specification.

The above embodiments represent only several implementations of the present invention. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make a number of variations and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. The protection scope of this patent shall therefore be determined by the appended claims.

Claims

1. An in-memory arithmetic circuit with signed multiplication, characterized in that it comprises at least one column of arithmetic units, each column of arithmetic units comprising a weight storage part and a calculation part; wherein the weight storage part adopts an SRAM cell with double word lines; the drains of the two pass transistors in the SRAM cell are respectively connected to two bit lines, and the gates of the two pass transistors are respectively connected to two word lines WLL and WLR; the calculation part comprises two NMOS transistors N3 and N4, two PMOS transistors P1 and P2, and a capacitor C; the circuit connection relationship is as follows:

The drains of P1 and N3 are connected to the computation bit line CBL; the gate of N3 is connected to the bit line BL, and the gate of P1 is connected to the bit line BLB; the source of N3 is connected to the drain of N4; the source of P1 is connected to the drain of P2; the gate of N4 is connected to an input word line INN; the gate of P2 is connected to an input word line INP; the source of N4 is connected to VSS; the source of P2 is connected to VDD; one end of the capacitor C is connected to the computation bit line CBL, and the other end is connected to VSS;

The operation unit of each column is used for realizing multiplication operation between a signed 2bit first operand and an unsigned second operand; the operation logic for performing the multiplication operation is:

Pre-storing the second operand in the SRAM cell; precharging the computation bit line CBL to an intermediate potential; encoding the input first operand by the level states of WLL, WLR, INN and INP; the product of the first operand and the second operand is reflected in the change of the bit line voltage of the computation bit line CBL;

When the first operand in the multiplication operation is "+1", WLL, INN and INP are set to low level, and WLR is set to high level; when the first operand in the multiplication operation is "-1", WLL, INN and INP are set to high level, and WLR is set to low level; when the first operand in the multiplication operation is "0", WLL and INN are set low, and WLR and INP are set high.

2. The in-memory arithmetic circuit of signed multiplication of claim 1, wherein: the SRAM cell is a 6T-SRAM cell, or another SRAM cell with double word lines obtained by adding MOS transistors to the 6T-SRAM cell.

3. The in-memory arithmetic circuit of signed multiplication of claim 2, wherein: the 6T-SRAM cell comprises two NMOS transistors N1 and N2 and two inverters INV0 and INV1; the circuit connection relationship is as follows: the input of INV0 and the output of INV1 are connected to the source of N1 and serve as a storage node Q; the output of INV0 and the input of INV1 are connected to the source of N2 and serve as a storage node QB; the drains of N1 and N2 are connected to bit lines BL and BLB, respectively, and the gates of N1 and N2 are connected to word lines WLL and WLR, respectively.

4. The in-memory arithmetic circuit of signed multiplication of claim 3, wherein: during the multiplication operation, when the bit line voltage of the computation bit line CBL rises, the product is represented as "+1"; when the bit line voltage of the computation bit line CBL drops, the product is represented as "-1"; when the bit line voltage of the computation bit line CBL remains unchanged, the product is represented as "0".

5. The in-memory arithmetic circuit of signed multiplication of claim 1, wherein: the weight storage part comprises a plurality of SRAM cells arranged in a column; each SRAM cell is connected to the same pair of bit lines BL and BLB; each SRAM cell is also connected to the independent word lines WLL and WLR of the corresponding row; the SRAM cells in the same column share the same calculation part.

6. The in-memory arithmetic circuit of signed multiplication of claim 1, wherein: in the calculation section, the aspect ratio of the transistors N3 and N4 is the same; the aspect ratio of transistors P1 and P2 is the same.

7. The in-memory arithmetic circuit of signed multiplication of claim 1, wherein: the weight of a second operand of multiplication operation executed in operation units of different columns is distinguished by adjusting the capacitance of a capacitor C in each column of calculation parts; the multiple of the capacitance C in each column relative to the unit capacitance is the weight of the second operand in the operation process.

8. The in-memory arithmetic circuit of signed multiplication of claim 7, wherein: the circuit comprises N columns of operation units, wherein the capacitance values of the capacitors C in the columns are 1, 2, 4, 8, …, 2^(N-1) times the unit capacitance respectively; a transmission gate TG connecting the computation bit lines CBL of two adjacent columns of operation units is arranged between each pair of adjacent columns;

The strategy for implementing multiplication of a 2-bit first operand and an N-bit second operand based on multiple columns of operation units is as follows:

Disconnecting the transmission gates between the operation columns, and precharging the computation bit line CBL of each column to an intermediate potential;

Decomposing the N-bit second operand bit by bit into N single-bit numbers, and pre-storing each single-bit number in the weight storage part of the column with the corresponding weight;

Synchronously inputting the encoded first operand to the SRAM cells of all selected columns through WLL, WLR, INN and INP, so that each column completes the multiplication of the first operand and one bit of the second operand;

Closing (turning on) the transmission gates between the operation columns, whereupon the product of the 2-bit first operand and the N-bit second operand is reflected in the change of the bit line voltage of the computation bit line CBL; the direction of the change of the CBL bit line voltage reflects the sign of the product, and the magnitude of the change reflects the numerical value of the product.

9. A CIM chip, characterized in that: integrated with an in-memory arithmetic circuit of signed multiplication as claimed in any one of claims 1 to 8.