CN118138026A - Delay buffer unit, electronic device, delay buffer array and operation method thereof - Google Patents
Delay buffer unit, electronic device, delay buffer array and operation method thereof
- Publication number
- CN118138026A (application number CN202311842660.XA)
- Authority
- CN
- China
- Prior art keywords
- delay
- delay buffer
- control switch
- unit
- memristor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K17/00—Electronic switching or gating, i.e. not by contact-making and –breaking
- H03K17/28—Modifications for introducing a time delay before switching
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C13/00—Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00
- G11C13/0002—Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00 using resistive RAM [RRAM] elements
- G11C13/0021—Auxiliary circuits
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K17/00—Electronic switching or gating, i.e. not by contact-making and –breaking
- H03K17/28—Modifications for introducing a time delay before switching
- H03K17/284—Modifications for introducing a time delay before switching in field effect transistor switches
Landscapes
- Pulse Circuits (AREA)
Abstract
The present disclosure provides a delay buffer unit, a delay buffer array, an electronic device, and an operation method of the delay buffer array. The delay buffer unit comprises a first delay buffer subunit and a second delay buffer subunit connected in series, which are configured to have opposite effects on the edge slope of a pulse signal passing through the delay buffer unit. The delay buffer unit can be used in compute-in-memory circuits to implement matrix-vector multiplication. It reduces the hardware overhead of such circuits while ensuring that the delayed output signal still has falling and rising edges with reasonably ideal slopes.
Description
Technical Field
Embodiments of the present disclosure relate to a delay buffer unit, a delay buffer unit array, an electronic device, and an operation method of the delay buffer unit array.
Background
A memristor (e.g., a resistive random-access memory, a phase-change memory, or a conductive-bridge memory) is a non-volatile device whose conductance state can be adjusted by applying an external stimulus. As a two-terminal device with adjustable, non-volatile resistance, the memristor is widely used in compute-in-memory applications. According to Kirchhoff's current law and Ohm's law, an array of memristors can perform multiply-accumulate operations in parallel, with both storage and computation taking place in the devices of the array.
Because it avoids the memory wall problem, a compute-in-memory architecture can achieve high computing power and energy efficiency. Using its crossbar array structure, a compute-in-memory system based on non-volatile memory can reduce the complexity of matrix-vector multiplication from O(n^2) to O(1).
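The multiply-accumulate behavior described above can be illustrated with a small numerical model. This is a minimal sketch of an idealized crossbar, not a circuit from the disclosure: the function name `crossbar_mvm` and the conductance and voltage values are hypothetical, and wire resistance and device non-idealities are ignored.

```python
def crossbar_mvm(conductances, voltages):
    """Column currents of an idealized memristor crossbar (hypothetical model).

    conductances[i][j] is the conductance (siemens) of the device joining
    row i to column j; voltages[i] is the voltage (volts) applied to row i.
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    currents = []
    for j in range(n_cols):
        # Ohm's law gives each device current G * V; Kirchhoff's current
        # law sums the currents of all devices sharing column j.
        currents.append(sum(conductances[i][j] * voltages[i] for i in range(n_rows)))
    return currents


# Illustrative values only.
G = [[1e-6, 2e-6],
     [3e-6, 4e-6]]
V = [0.5, 1.0]
I = crossbar_mvm(G, V)
```

The software loop still performs n^2 multiplications. The O(1) figure refers to the physical array, where all column currents settle in a single read step.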
Summary
Some embodiments of the present disclosure provide a delay buffer unit comprising a first delay buffer subunit and a second delay buffer subunit connected in series. The first delay buffer subunit comprises a first-stage inverter, a second-stage inverter, and a first delay adjustment subunit. The input terminal of the first-stage inverter serves as the input terminal of the delay buffer unit, the input terminal of the second-stage inverter is connected to the output terminal of the first-stage inverter, and the first delay adjustment subunit is connected to a first terminal of the first-stage inverter, a first operating voltage terminal, and a second operating voltage terminal. The second delay buffer subunit comprises a third-stage inverter, a fourth-stage inverter, and a second delay adjustment subunit. The input terminal of the third-stage inverter is connected to the output terminal of the second-stage inverter, the input terminal of the fourth-stage inverter is connected to the output terminal of the third-stage inverter, the output terminal of the fourth-stage inverter serves as the output terminal of the delay buffer unit, and the second delay adjustment subunit is connected to a second terminal of the third-stage inverter, a third operating voltage terminal, and a fourth operating voltage terminal. The first delay adjustment subunit comprises a first memristor and a first control switch, and is configured to determine, according to the first control switch, whether to use the first memristor to adjust a first delay of the first delay buffer subunit. The second delay adjustment subunit comprises a second memristor and a second control switch, and is configured to determine, according to the second control switch, whether to use the second memristor to adjust a second delay of the second delay buffer subunit. The first delay buffer subunit and the second delay buffer subunit are configured to have opposite effects on the edge slope of a pulse signal passing through the delay buffer unit.
For example, in a delay buffer unit provided in some embodiments of the present disclosure, the voltages of the first operating voltage terminal and the second operating voltage terminal are lower than the voltages of the third operating voltage terminal and the fourth operating voltage terminal. At least one of the first and second operating voltage terminals serves as a discharge voltage terminal when the first delay buffer subunit operates, and at least one of the third and fourth operating voltage terminals serves as a charging voltage terminal when the second delay buffer subunit operates.
For example, in a delay buffer unit provided in some embodiments of the present disclosure, the type of the first control switch is different from the type of the second control switch.
For example, in a delay buffer unit provided in some embodiments of the present disclosure, the first control switch comprises a first pole, a second pole, and a first control pole. The first control pole receives a first control signal, and the first pole and the second pole of the first control switch are turned on or off according to the first control signal. The first pole of the first control switch is electrically connected to the first terminal of the first-stage inverter, and the second pole of the first control switch is electrically connected to the first operating voltage terminal. The first terminal of the first memristor is electrically connected to the first pole of the first control switch and to the first terminal of the first-stage inverter, and the second terminal of the first memristor is electrically connected to the second operating voltage terminal. The second control switch comprises a first pole, a second pole, and a second control pole. The second control pole receives a second control signal, and the first pole and the second pole of the second control switch are turned on or off according to the second control signal. The first pole of the second control switch is electrically connected to the second terminal of the third-stage inverter, and the second pole of the second control switch is electrically connected to the third operating voltage terminal. The first terminal of the second memristor is electrically connected to the first pole of the second control switch and to the second terminal of the third-stage inverter, and the second terminal of the second memristor is electrically connected to the fourth operating voltage terminal. The first control signal and the second control signal are inverted with respect to each other.
For example, in a delay buffer unit provided in some embodiments of the present disclosure, the first delay adjustment subunit comprises a first adjustment capacitor and is configured to use the first memristor and the first adjustment capacitor to adjust the delay of the first delay buffer subunit according to the first control signal. The second delay adjustment subunit comprises a second adjustment capacitor and is configured to use the second memristor and the second adjustment capacitor to adjust the delay of the second delay buffer subunit according to the second control signal.
For example, in a delay buffer unit provided in some embodiments of the present disclosure, the capacitance of the first adjustment capacitor is the same as the capacitance of the second adjustment capacitor.
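To give a feel for how a memristor and an adjustment capacitor could set a delay together, the sketch below uses a textbook first-order RC approximation. This model is an assumption made for illustration and is not stated in the disclosure; the function name `rc_delay` and the component values are hypothetical.

```python
import math


def rc_delay(resistance_ohm, capacitance_farad):
    """Time for a first-order RC node to cross half of its initial swing.

    Assumed model (not from the disclosure): the node discharges
    exponentially through the memristor, so the 50% crossing occurs
    at t = ln(2) * R * C.
    """
    return math.log(2) * resistance_ohm * capacitance_farad


# Hypothetical values: two memristor states and a 10 fF adjustment capacitor.
low_r_delay = rc_delay(50e3, 10e-15)
high_r_delay = rc_delay(100e3, 10e-15)
```

Under this assumption the delay scales linearly with the programmed resistance, which is what would let a stored resistance act as a weight in the time domain.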
Some embodiments of the present disclosure further provide a delay buffer array comprising a plurality of delay buffer units as described in any of the above embodiments, wherein the plurality of delay buffer units are arranged in an array having a plurality of rows, and the delay buffer units in each row are connected in series in sequence to form a delay chain.
For example, in a delay buffer array provided in some embodiments of the present disclosure, the delay buffer array further comprises: a time pulse input module configured to provide time pulse signals to the input terminals of the plurality of delay chains, respectively; a programming voltage generation module configured to program the memristors of a target delay buffer unit; an input loading module configured to provide a first control signal and a second control signal to each delay buffer unit as input signals; an output quantization module configured to quantize the outputs of the plurality of delay chains, respectively, to obtain digital output signals; a data storage module configured to store data used when the delay buffer array performs computation; and a mode control module configured to control the operation mode executed by the delay buffer array.
Some embodiments of the present disclosure further provide an electronic device comprising a delay buffer array as described in any of the above embodiments.
Some embodiments of the present disclosure further provide an operation method of a delay buffer array, applied to the delay buffer array described in any of the above embodiments. The operation method comprises: providing a time pulse signal at the input terminal of a delay chain selected for a computing operation in the delay buffer array; applying a first control signal and a second control signal, as an input data signal, to each delay buffer unit in the selected delay chain, wherein the input data signal controls the first control switch and the second control switch of the corresponding delay buffer unit to be turned on or off; and obtaining a target output signal at the output terminal of the selected delay chain.
For example, in an operation method of a delay buffer array provided in some embodiments of the present disclosure, the input data signal applied to each delay buffer unit in the same delay chain selected for the computing operation is one bit of multi-bit data.
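The computing operation can be pictured as accumulating delay along a chain, one input bit per unit. The sketch below is a behavioral model under stated assumptions, not the disclosed circuit: it assumes each unit contributes a fixed intrinsic delay, plus a memristor-dependent extra delay when its input bit is 1. The bit-to-switch polarity, the function name `chain_delay`, and all values (in picoseconds) are hypothetical.

```python
def chain_delay(bits, extra_delays_ps, intrinsic_delay_ps):
    """Total delay of one chain in a simplified behavioral model.

    bits[i] is the 1-bit input of unit i; extra_delays_ps[i] is the extra
    delay its memristor path adds when selected (assumed: bit == 1).
    """
    if len(bits) != len(extra_delays_ps):
        raise ValueError("one input bit is required per delay buffer unit")
    total_ps = 0
    for bit, extra_ps in zip(bits, extra_delays_ps):
        # Every unit adds its intrinsic delay; the extra delay is added
        # only when the input bit routes the signal through the memristor.
        total_ps += intrinsic_delay_ps + (extra_ps if bit else 0)
    return total_ps


# Hypothetical four-unit chain.
total = chain_delay([1, 0, 1, 1], [30, 50, 20, 40], intrinsic_delay_ps=10)
```

In this model the data-dependent part of the total delay is the dot product of the input bits and the per-unit extra delays, which is how a delay chain can stand in for one row of a matrix-vector multiplication.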
For example, in an operation method of a delay buffer array provided in some embodiments of the present disclosure, after the target output signal is obtained at the output terminal of the delay chain selected for the computing operation, the operation method further comprises: obtaining the delay between the time pulse signal and the target output signal; quantizing the delay to obtain a digital output signal; and subtracting verification data from the digital output signal to obtain a verification output of the delay buffer array.
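The three post-processing steps above (measure the delay, quantize it, subtract the verification data) can be sketched as follows. The floor quantizer, the function names, and the numeric values are assumptions for illustration; the disclosure does not specify the quantization scheme.

```python
def quantize(delay_ps, lsb_ps):
    """Floor-quantize a delay into an integer number of LSB steps (assumed scheme)."""
    return delay_ps // lsb_ps


def verification_output(t_pulse_ps, t_output_ps, verification_code, lsb_ps):
    """Quantized chain delay with the chain's verification data removed."""
    delay_ps = t_output_ps - t_pulse_ps       # delay between pulse and output
    digital_output = quantize(delay_ps, lsb_ps)
    return digital_output - verification_code  # subtract the verification data


# Hypothetical numbers: a 10 ps LSB and an intrinsic chain delay of 40 ps.
verification_code = quantize(40, 10)
result = verification_output(100, 230, verification_code, 10)
```

Subtracting the verification data removes the part of the delay that every pulse accumulates regardless of the input, leaving only the data-dependent contribution.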
Some embodiments of the present disclosure further provide another operation method of a delay buffer array, applied to the delay buffer array described in any of the above embodiments. The operation method comprises: selecting one delay buffer unit in the delay buffer array as a unit to be programmed; and either applying the first control signal to control the first control switch in the unit to be programmed and applying a first programming voltage across the first memristor in the unit to be programmed to change the resistance of the first memristor, or applying the second control signal to control the second control switch in the unit to be programmed and applying a second programming voltage across the second memristor in the unit to be programmed to change the resistance of the second memristor.
For example, in an operation method of a delay buffer array provided in some embodiments of the present disclosure, after the resistance of the first memristor or of the second memristor is changed, the operation method further comprises: providing a time pulse signal at the input terminal of the delay chain that contains the unit to be programmed; then either (a) turning off the first control switch and turning on the second control switch of the unit to be programmed, turning on the first and second control switches of every other delay buffer unit in that delay chain, and obtaining the output signal of the delay chain as a first programming output, or (b) turning on the first control switch and turning off the second control switch of the unit to be programmed, turning on the first and second control switches of every other delay buffer unit in that delay chain, and obtaining the output signal of the delay chain as a second programming output; and determining, according to the first programming output or the second programming output, whether the resistance of the first memristor or of the second memristor of the unit to be programmed meets the programming requirement.
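Programming followed by a delay readout is commonly organized as an iterative program-and-verify loop. The disclosure describes a single program step and a check. The repetition, the tolerance test, the callback interface, and the `ToyCell` device model below are all assumptions added for illustration.

```python
def program_and_verify(read_delay, apply_pulse, target_ps, tolerance_ps, max_pulses=50):
    """Apply programming pulses until the measured delay is within tolerance.

    read_delay() returns the measured chain delay; apply_pulse(increase)
    applies one programming pulse. Both are hypothetical callbacks.
    """
    for pulse_count in range(max_pulses + 1):
        measured_ps = read_delay()
        error_ps = measured_ps - target_ps
        if abs(error_ps) <= tolerance_ps:
            return pulse_count, measured_ps
        if pulse_count < max_pulses:
            # Raise the delay if it is too short, lower it otherwise.
            apply_pulse(error_ps < 0)
    raise RuntimeError("unit did not reach the target delay")


class ToyCell:
    """Toy device: each pulse shifts the delay by a fixed step (assumption)."""

    def __init__(self, delay_ps, step_ps):
        self.delay_ps = delay_ps
        self.step_ps = step_ps

    def read(self):
        return self.delay_ps

    def pulse(self, increase):
        self.delay_ps += self.step_ps if increase else -self.step_ps


cell = ToyCell(delay_ps=100, step_ps=8)
pulses, final_delay = program_and_verify(cell.read, cell.pulse, target_ps=130, tolerance_ps=5)
```

Real memristors respond nonlinearly and stochastically to programming pulses, so a practical loop would typically adapt pulse amplitude or width rather than rely on a fixed step.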
Some embodiments of the present disclosure further provide another operation method of a delay buffer array, applied to the delay buffer array described in any of the above embodiments. The operation method comprises: providing a time pulse signal at the input terminal of a delay chain selected for a verification operation in the delay buffer array; applying a first control signal and a second control signal, as input signals, to each delay buffer unit in the selected delay chain so that the first control switch and the second control switch of each delay buffer unit are turned on; obtaining an output signal at the output terminal of the selected delay chain as an intrinsic output; and quantizing the delay between the intrinsic output and the time pulse signal to obtain verification data for the selected delay chain.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings of the embodiments are briefly introduced below. The drawings described below relate only to some embodiments of the present disclosure and do not limit the present disclosure.
FIG. 1A is a schematic structural diagram of an exemplary delay buffer unit;
FIG. 1B is a schematic structural diagram of another exemplary delay buffer unit;
FIG. 1C is a schematic structural diagram of an exemplary delay chain;
FIG. 1D is a schematic structural diagram of an exemplary computing device;
FIG. 2 is a schematic structural diagram of a delay buffer unit provided by at least one embodiment of the present disclosure;
FIG. 3A is a schematic diagram of the working principle of a delay buffer unit provided by at least one embodiment of the present disclosure;
FIG. 3B is a schematic diagram of the working principle of a delay buffer unit provided by at least one embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a series structure of delay buffer units provided by at least one embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a delay buffer array provided by at least one embodiment of the present disclosure;
FIG. 6 is a flowchart of an operation method of a delay buffer array provided by at least one embodiment of the present disclosure;
FIG. 7 is a flowchart of another operation method of a delay buffer array provided by at least one embodiment of the present disclosure;
FIG. 8A is a schematic diagram of programming verification of a delay buffer unit provided by at least one embodiment of the present disclosure;
FIG. 8B is a schematic diagram of another programming verification of a delay buffer unit provided by at least one embodiment of the present disclosure; and
FIG. 9 is a flowchart of yet another operation method of a delay buffer array provided by at least one embodiment of the present disclosure.
Detailed Description
To help those skilled in the art better understand the technical solutions of the present disclosure, the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings. The specific embodiments and drawings described here serve only to explain the present disclosure and do not limit the disclosed embodiments. Where no conflict arises, the embodiments of the present disclosure and the features in those embodiments may be combined with one another.
For ease of description, the drawings show only the parts related to the embodiments of the present disclosure and omit unrelated parts. Each unit or module in the embodiments of the present disclosure may correspond to a single physical structure or may consist of multiple physical structures, and multiple units or modules may also be integrated into a single physical structure. Where no conflict arises, the functions and steps marked in the flowcharts and block diagrams of the embodiments may occur in an order different from that shown in the drawings.
The flowcharts and block diagrams of the embodiments of the present disclosure illustrate possible architectures, functions, and operations of systems, apparatuses, devices, and methods according to the embodiments. Each block in a flowchart or block diagram may represent a unit, module, program segment, or piece of code that contains executable instructions for implementing the specified function. Each block, or combination of blocks, in the block diagrams and flowcharts may be implemented by a hardware-based system that performs the specified function, or by a combination of hardware and computer instructions.
To keep the following description of the embodiments clear and concise, detailed descriptions of known functions and known components may be omitted. When any component of the embodiments appears in more than one drawing, it is denoted by the same or a similar reference numeral in each drawing.
The memristor is a new type of information processing device that merges storage and computation. It can perform computing operations in situ on stored data, eliminating the large overhead of data movement. In addition, memristors can operate directly in the analog domain (for example, performing multiplication based on Ohm's law and addition based on Kirchhoff's current law), so that a matrix-vector multiplication is completed in one step without the overhead of digital-to-analog conversion during the operation. In recent years, memristor-based compute-in-memory has made significant progress. However, because the power supply of terminal devices is limited, memristor-based compute-in-memory devices are required not only to compute with higher precision but also to consume less energy and achieve higher energy efficiency. To this end, many improvements have been made to the array structure and peripheral circuit design of memristor compute-in-memory systems.
For example, one improvement is a voltage-domain quantization scheme, which replaces current-mode readout, with its excessive static current overhead, by voltage-precharge readout. The output range of this scheme, however, is limited. Another improvement is a time-domain quantization method, which converts the output result into the time domain to enlarge the output range and thus distinguish different output states more simply and efficiently. The output of time-domain quantization, however, suffers from nonlinearity. Moreover, both voltage-domain and time-domain quantization schemes must handle the large current that flows when the memristor array is turned on in parallel, and this large current in the memristor array and its peripheral circuits causes high power consumption. Yet another improvement is a current-mode analog computing scheme based on a 2T2R array structure, which alleviates the IR drop on the wiring by reducing the accumulated output current and offers high parallelism and high computing power. To guarantee computing accuracy, however, this approach requires large input power and clamping-circuit power, and it also consumes considerable power when the array is large.
Because of the limitations of their computing mechanisms, the memristor compute-in-memory array and circuit designs described above cannot meet the demand for extremely low power consumption and high energy efficiency in parallel processing. The inventors of the present disclosure propose a computing array that uses a memristor delay structure as its basic computing unit and uses the accumulated delay of the units to implement compute-in-memory matrix-vector multiplication. This computing array avoids the need to handle large currents during computation and lowers the operating supply voltage, thereby achieving high energy efficiency.
FIG. 1A shows a schematic structural diagram of an exemplary delay buffer unit, and FIG. 1B shows a schematic structural diagram of another exemplary delay buffer unit.
For example, one scheme uses the delay buffer unit shown in FIG. 1A and FIG. 1B to control the delay between input and output. The propagation delay of the delay buffer unit can be changed by choosing whether or not to use the memristor, and also by changing the resistance of the memristor. The delay buffer unit can thus be controlled dynamically, and the amount of delay can be adjusted flexibly and efficiently according to actual needs.
In the example shown in FIG. 1A and FIG. 1B, the delay buffer unit 10 comprises a first-stage inverter P1, a second-stage inverter P2, and a delay adjustment subunit 11.
For example, the input terminal of the first-stage inverter P1 serves as the input terminal INT of the delay buffer unit 10. The input signal of the delay buffer unit is received at the input terminal of the first-stage inverter P1 and may be, for example, a rising-edge trigger signal (as shown in FIG. 1A) or a falling-edge trigger signal (as shown in FIG. 1B). For example, the first-stage inverter P1 comprises two transistors T1 and T2, where T1 is, for example, an NMOS transistor and T2 is, for example, a PMOS transistor. The gates of transistors T1 and T2 serve as the input terminal, that is, as the input terminal INT of the delay buffer unit 10 that receives the input signal. For example, when the input signal is at a high level, transistor T1 is turned on and transistor T2 is turned off; when the input signal is at a low level, transistor T1 is turned off and transistor T2 is turned on. The drains of transistors T1 and T2 are electrically connected to each other and serve as the output terminal of the first-stage inverter P1. The source of transistor T1 or T2 may be connected to a ground terminal or a power supply terminal, or may serve as the first terminal of the first-stage inverter P1.
For example, the input terminal of the second-stage inverter P2 is connected to the output terminal of the first-stage inverter P1, and the output terminal of the second-stage inverter P2 serves as the output terminal OUT of the delay buffer unit 10. The circuit structure of the second-stage inverter P2 is similar to that of the first-stage inverter P1.
For example, the output signal of the delay buffer unit 10 is output from the output terminal of the second-stage inverter P2. The output signal corresponds to the input signal and is delayed relative to it, and this delay consists of the propagation delays of the first-stage inverter P1 and the second-stage inverter P2. For example, as shown in FIG. 1A, when the input signal received at the input terminal INT of the delay buffer unit 10 is a rising-edge trigger signal, the output signal at the output terminal OUT has a delay t relative to the rising-edge trigger signal (at the output terminal OUT in FIG. 1A, the rising-edge trigger signal and the output signal are drawn as a gray line and a black line, respectively). Likewise, as shown in FIG. 1B, when the input signal received at the input terminal INT is a falling-edge trigger signal, the output signal at the output terminal OUT has a delay t relative to the falling-edge trigger signal (at the output terminal OUT in FIG. 1B, the falling-edge trigger signal and the output signal are drawn as a gray line and a black line, respectively).
For example, the delay adjustment subunit 11 is connected between the first terminal of the first-stage inverter P1 and the first operating voltage terminal 1, and includes a memristor (here a resistive random access memory (RRAM) is taken as an example). The delay adjustment subunit 11 is configured to control, according to the first input signal NWL, whether the memristor RRAM is used to adjust the propagation delay of the first-stage inverter P1, and thereby to adjust the delay difference (delay t) between the output signal and the input signal of the delay buffer unit 10.
For example, the delay adjustment subunit 11 further includes a control switch, which may be an N-type transistor (as shown in FIG. 1A) or a P-type transistor. The control switch (here an N-type transistor NM1 is taken as an example) includes a first electrode, a second electrode and a control electrode; for the first control switch NM1 these may be, for example, the source, drain and gate of the N-type transistor, respectively. The control electrode of the first control switch NM1 receives the first input signal NWL, and the first electrode and the second electrode of the first control switch NM1 are connected or disconnected according to the first input signal NWL. For example, the first electrode of the first control switch NM1 is electrically connected to the first terminal of the first-stage inverter P1, and the second electrode of the first control switch NM1 is electrically connected to the first operating voltage terminal 1. When the first control switch NM1 is turned on, the first-stage inverter P1 is connected to the first operating voltage terminal 1; when the first control switch NM1 is turned off, the first-stage inverter P1 is disconnected from the first operating voltage terminal 1.
For example, the memristor RRAM in the delay adjustment subunit 11 includes a first terminal and a second terminal. The first terminal of the memristor RRAM is electrically connected to the first electrode of the first control switch NM1 and to the first terminal of the first-stage inverter P1. The second terminal of the memristor RRAM and the second electrode of the first control switch NM1 may be connected to the same operating voltage terminal or to different operating voltage terminals. Here, being connected to different operating voltage terminals means that the two obtain their voltage signals from different operating voltage terminals.
For example, in one example, the second terminal of the memristor RRAM is electrically connected to the first operating voltage terminal 1; that is, the second terminal of the memristor RRAM and the second electrode of the first control switch NM1 are both connected to the first operating voltage terminal 1, which may be, for example, a power supply terminal or a ground terminal. In another example, the second terminal of the memristor RRAM is electrically connected to the second operating voltage terminal 2, the second electrode of the first control switch NM1 is electrically connected to the first operating voltage terminal 1, and the first operating voltage terminal 1 and the second operating voltage terminal 2 provide different voltage signals. In yet another example, the second terminal of the memristor RRAM is electrically connected to the first operating voltage terminal 1, the second electrode of the first control switch NM1 is electrically connected to the second operating voltage terminal 2, and the first operating voltage terminal 1 and the second operating voltage terminal 2 are different.
The delay adjustment subunit 11 is connected to the first terminal of the first-stage inverter P1, for example to a source terminal of the first-stage inverter P1. For example, as shown in FIG. 1A, the delay adjustment subunit 11 is connected between the NMOS transistor T1 of the first-stage inverter P1 and the first operating voltage terminal 1. In the delay adjustment subunit 11 shown in FIG. 1A, the first electrode (for example, the drain) of the first control switch NM1 is connected to the source of the NMOS transistor T1, and the first terminal of the memristor RRAM is also connected to the source of the NMOS transistor T1. For example, as shown in FIG. 1B, the delay adjustment subunit 11 is connected between the PMOS transistor T2 of the first-stage inverter P1 and the first operating voltage terminal 1. In the delay adjustment subunit 11 shown in FIG. 1B, the first electrode (for example, the source) of the first control switch NM1 is connected to the source of the PMOS transistor T2, and the first terminal of the memristor RRAM is also connected to the source of the PMOS transistor T2.
For example, the delay buffer unit 10 further includes a capacitor C1. The first electrode of the capacitor C1 is connected between the output terminal of the first-stage inverter P1 and the input terminal of the second-stage inverter P2, and the second electrode of the capacitor C1 is grounded. The capacitor C1 may be, for example, a purpose-built capacitor element or a parasitic capacitance.
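The way the memristor resistance and the capacitor C1 together set the delay can be illustrated with the standard first-order RC approximation. This is a textbook estimate and is not given in the source; the function name and the example values are hypothetical.

```python
import math


def rc_delay_50(resistance_ohm: float, capacitance_f: float) -> float:
    """Time for an ideal RC node to cross 50% of its swing.

    Assumption: the node at C1 charges or discharges through a single
    lumped resistance (for instance the memristor); transistor
    on-resistance and waveform shape are ignored.
    """
    return math.log(2) * resistance_ohm * capacitance_f


# Hypothetical values: a larger resistance gives a longer delay.
low_r_delay = rc_delay_50(10e3, 10e-15)
high_r_delay = rc_delay_50(100e3, 10e-15)
```

Under this simplification the delay scales linearly with the resistance, which is the property that lets a programmed resistance stand in for a weight.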
For example, in the delay buffer unit 10 shown in FIG. 1A, when the input signal received at the input terminal INT of the delay buffer unit 10 is a low-level signal, the NMOS transistor T1 of the first-stage inverter P1 is turned off and the PMOS transistor T2 is turned on, which breaks the conductive path between the delay adjustment subunit 11 and the first-stage inverter P1. In this case, the delay adjustment subunit 11 can be connected to the first operating voltage terminal 1 and the second operating voltage terminal 2, so that a first processing operation is performed on the memristor RRAM according to the signal provided by the second operating voltage terminal 2. For example, the first processing operation may be a set operation, a reset operation or an initialization operation that changes the resistance of the memristor.
A memristor (for example, a resistive random access memory) usually requires an additional initialization (forming) process; once initialization is complete, the resistance of the memristor can change with the applied voltage signal. Because a freshly fabricated memristor contains no conductive filament, the initialization operation is needed to form a conductive filament inside it. The initialization operation usually needs to be performed only once in the lifetime of the memristor.
For example, in the delay buffer unit 10 shown in FIG. 1A, after the conductive path between the delay adjustment subunit 11 and the first-stage inverter P1 is broken, an initialization voltage can be applied between the first operating voltage terminal 1 and the second operating voltage terminal 2 to initialize the memristor, so that a conductive filament forms inside it. For example, once the conductive filament has been formed by initialization, the memristor RRAM has a threshold voltage.
For example, in typical practice, when the amplitude of the input voltage applied between the first terminal and the second terminal of the memristor RRAM is less than the threshold voltage of the memristor RRAM, the resistance (or conductance) of the memristor RRAM does not change. In this case, a read voltage can be applied between the first operating voltage terminal and the second operating voltage terminal to read the present resistance of the memristor RRAM. Because the read voltage is less than the threshold voltage of the memristor RRAM, the present resistance can be read without changing it.
For example, when the amplitude of the input voltage applied between the first terminal and the second terminal of the memristor RRAM is greater than the threshold voltage of the memristor RRAM, the resistance (or conductance) of the memristor RRAM can be changed by a set voltage or a reset voltage applied between those terminals. For example, the set voltage is a positive voltage pulse and the reset voltage is a negative voltage pulse. Applying a set voltage to the memristor RRAM decreases its resistance, and applying a reset voltage increases its resistance. Applying a set voltage to the memristor is called a set operation, and applying a reset voltage is called a reset operation.
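The read, set and reset behaviour described above can be summarised in a toy behavioural model. This is only a sketch: the class name, the threshold, the resistance bounds and the multiplicative step per pulse are all hypothetical values chosen for illustration, and real devices behave far less ideally.

```python
from dataclasses import dataclass


@dataclass
class Memristor:
    """Toy threshold model of an already-initialized RRAM cell."""
    resistance: float = 100e3   # ohms, hypothetical starting value
    v_threshold: float = 1.0    # volts, hypothetical threshold voltage
    r_min: float = 10e3         # hypothetical lower resistance bound
    r_max: float = 1e6          # hypothetical upper resistance bound
    step: float = 2.0           # hypothetical change factor per pulse

    def apply(self, voltage: float) -> float:
        """Apply one voltage pulse and return the resistance afterwards."""
        # At or below the threshold the state is left unchanged (read).
        # The source only covers "less than"; equality is an assumption.
        if abs(voltage) <= self.v_threshold:
            return self.resistance
        if voltage > 0:
            # Set: positive pulse above threshold decreases resistance.
            self.resistance = max(self.r_min, self.resistance / self.step)
        else:
            # Reset: negative pulse above threshold increases resistance.
            self.resistance = min(self.r_max, self.resistance * self.step)
        return self.resistance


cell = Memristor()
after_read = cell.apply(0.2)     # below threshold: unchanged
after_set = cell.apply(1.5)      # set pulse: resistance halves
after_reset = cell.apply(-1.5)   # reset pulse: resistance doubles again
```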
FIG. 1C shows a schematic structural diagram of an exemplary delay chain. As shown in FIG. 1C, multiple delay buffer units 10 are cascaded to form one row of delay chain 20, and an input signal KEEP is received at the input terminal INT of the first delay buffer unit 10 of the delay chain 20. The input signal KEEP may be, for example, a rising-edge trigger signal or a falling-edge trigger signal. Such delay chains can be arranged side by side to provide a delay buffer array.
For ease of description, the matrix-vector multiplication is described below using the delay buffer unit 10 shown in FIG. 1A as an example. During computation, the input signal KEEP is a rising-edge trigger signal, and the first control switch NM1 and the memristor RRAM in the delay adjustment subunit 11 of the delay buffer unit 10 are both grounded.
For example, in the operation array, each delay buffer unit 10 acts as a computing unit that performs a multiplication, and the accumulated delay of the multiple delay buffer units 10 is output from the output terminal of the delay chain 20 as the result of the multiply-accumulate operation.
For example, before computation, the elements of the matrix G (or the weights of a neural network) are first mapped to delays of the delay buffer units 10; the magnitude of each delay can be obtained, for example, by adjusting the resistance of the memristor in the delay buffer unit 10. For example, one column of elements G11, G21, ..., Gm1 of the matrix G is mapped to the memristor-controlled delays W<0>, W<1>, ..., W<m-1> of m delay buffer units 10, respectively.
Then, during computation, the elements of the input vector V are mapped to the first input signals of the delay buffer units 10. For example, the input element V1 of the input vector V is used as the first input signals NWL<0>, NWL<1>, ..., NWL<m-1> of the cascaded delay buffer units 10; these first input signals can, for example, be input in parallel to improve computational efficiency. The first input signals NWL<0>, NWL<1>, ..., NWL<m-1> corresponding to the input element V1 control whether the memristor-controlled delays W<0>, W<1>, ..., W<m-1> of the delay buffer units 10 are inserted in series into the delay chain 20. The accumulated delay NWL<m-1:0>·W<m-1:0> at the output terminal of the delay chain 20 is the result of the vector inner product (multiply-accumulate) operation.
For example, for each delay buffer unit 10 in FIG. 1A or FIG. 1B, when the first input signal NWL is at a low level (corresponding to an input element V1 of 1), the memristor RRAM is connected between the first-stage inverter P1 and the first operating voltage terminal 1. The delay t of the delay buffer unit is then the delay as adjusted by the resistance of the memristor; that is, the memristor-controlled delay is inserted in series into the delay chain. When the first input signal NWL is at a high level (corresponding to an input element V1 of 0), the memristor RRAM is bypassed and is not connected between the first-stage inverter P1 and the first operating voltage terminal 1. The delay t of the delay buffer unit is then the intrinsic delay; that is, only the intrinsic delay is inserted in series into the delay chain.
Through the above computation, the delay chain implements a vector inner product (multiply-accumulate) operation between one item of input data (for example, 1 bit) and m weight elements.
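The accumulation described above can be sketched numerically. This is a simplified model under stated assumptions: each unit contributes a fixed intrinsic delay plus, when its input bit is 1, an extra memristor-controlled delay, and the chain output is the plain sum. The function names, the additive split into intrinsic and extra delay, and the picosecond values are hypothetical.

```python
def chain_delay(input_bits, weight_delays, intrinsic_delay):
    """Accumulated delay of one delay chain.

    input_bits[i] is 1 when unit i has its memristor in the path
    (first input signal low in the source) and 0 when it is bypassed.
    weight_delays[i] is the extra delay the memristor adds to unit i.
    """
    if len(input_bits) != len(weight_delays):
        raise ValueError("one input bit is needed per weight delay")
    return sum(intrinsic_delay + bit * w
               for bit, w in zip(input_bits, weight_delays))


def inner_product_from_delay(total_delay, num_units, intrinsic_delay):
    """Remove the fixed intrinsic contribution to recover sum(bit * w)."""
    return total_delay - num_units * intrinsic_delay


# Hypothetical values in picoseconds.
bits = [1, 0, 1]
weights = [30, 50, 20]
total = chain_delay(bits, weights, intrinsic_delay=10)
mac = inner_product_from_delay(total, len(bits), intrinsic_delay=10)
```

Here `mac` equals 30 + 20, the inner product of `bits` and `weights`, once the constant intrinsic part is subtracted.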
FIG. 1D shows a schematic structural diagram of an exemplary computing device. As shown in FIG. 1D, the computing device includes a delay computing array 100, which includes delay buffer units 10 arranged in 2M rows and N columns (only 4 rows and 2 columns are shown in the figure).
For example, the N delay buffer units 10 in each row are connected in series to form one row of delay chain 20, and every two adjacent rows of delay chains 20 form a delay processing combination 30. In each delay processing combination 30, the difference between the delays of the two delay buffer units 10 in the same column can correspond to one signed weight element. For example, by configuring the resistances of the memristors in the two delay buffer units 10, the difference between their delays can represent a positive, negative or zero weight element. That is, the two delay buffer units 10 in the same column of a delay processing combination 30 can act as one differential unit, and the delay of each differential unit can represent a positive, negative or zero weight element.
For example, the two rows of delay chains 20 in each delay processing combination 30 receive the same input signal KEEP, for example at the input terminal INT of the first delay buffer unit 10 in each of the two rows. The input signal KEEP can be used to control the operation mode of the delay computing array 100, for example whether the array is in a first operation mode or a second operation mode. When the input signal KEEP is held at a constant low level (or a constant high level), the delay computing array 100 is in the first operation mode; when the input signal KEEP is a rising-edge trigger signal (or a falling-edge trigger signal), the delay computing array 100 is in the second operation mode.
In the first operation mode, the input signal KEEP is, for example, at a low level, so that the delay adjustment subunit 11 in the delay buffer unit 10 is electrically disconnected from the first-stage inverter P1. The first processing operation described above, such as a set operation or a reset operation, can then be performed on the memristor RRAM. In the second operation mode, the input signal KEEP carries an edge signal, so that the delay adjustment subunit 11 in the delay buffer unit 10 is electrically connected to the first-stage inverter P1. A computation operation or a calibration read operation can then be performed on the delay computing array 100, for example by obtaining a delay computation result or a delay read result from the output terminal of the delay chain.
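The mode selection by KEEP can be written as a small decision function. This is an illustrative sketch only: representing KEEP as a list of sampled logic levels, and the names `Mode` and `operation_mode`, are assumptions made for the example.

```python
from enum import Enum


class Mode(Enum):
    FIRST = "memristor processing (set, reset, initialization)"
    SECOND = "computation or calibration read"


def operation_mode(keep_samples):
    """Classify the operation mode from sampled KEEP logic levels.

    A constant level (all 0 or all 1) selects the first mode; any
    transition (a rising or falling edge) selects the second mode.
    """
    has_edge = any(a != b for a, b in zip(keep_samples, keep_samples[1:]))
    return Mode.SECOND if has_edge else Mode.FIRST


held_low = operation_mode([0, 0, 0, 0])      # constant low level
rising_edge = operation_mode([0, 0, 1, 1])   # edge-triggered input
```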
For example, the output terminals of the two rows of delay chains 20 in each delay processing combination 30 output the accumulated delays of the delay buffer units 10 in the respective rows. For example, as shown in FIG. 1D, in the first delay processing combination 30, the first row of delay chain 20 outputs the accumulated delay t_DLP<0> of the N delay buffer units 10 in the first row from the output terminal DLP<0>, and the second row of delay chain 20 outputs the accumulated delay t_DLN<0> of the N delay buffer units 10 in the second row from the output terminal DLN<0>.
For example, in one example, when a computation operation is performed on the delay computing array 100, the difference between the accumulated delays of the two rows of delay chains in the delay processing combination 30, ΔT = t_DLP<0> − t_DLN<0>, can represent the vector inner product of the input data and the signed weight elements.
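The differential readout ΔT = t_DLP<0> − t_DLN<0> can be illustrated with the same simplified additive delay model. It is assumed here, for illustration only, that both rows receive the same input bits and share the same intrinsic delay per unit, so the intrinsic part cancels in the difference; the function name and the values are hypothetical.

```python
def differential_result(input_bits, pos_delays, neg_delays, intrinsic_delay):
    """Return (t_dlp, t_dln, delta_t) for one delay processing combination.

    pos_delays[i] and neg_delays[i] are the extra memristor delays of the
    two units in column i; their difference encodes a signed weight.
    """
    t_dlp = sum(intrinsic_delay + b * p for b, p in zip(input_bits, pos_delays))
    t_dln = sum(intrinsic_delay + b * n for b, n in zip(input_bits, neg_delays))
    return t_dlp, t_dln, t_dlp - t_dln


# Hypothetical delays in picoseconds; signed weights are +20, -20, +30.
t_dlp, t_dln, delta_t = differential_result(
    input_bits=[1, 1, 1],
    pos_delays=[30, 10, 40],
    neg_delays=[10, 30, 10],
    intrinsic_delay=10,
)
```

With these values `delta_t` is 20 − 20 + 30 = 30, independent of the intrinsic delay.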
For example, in each delay processing combination 30, the two delay buffer units 10 in the same column are respectively connected to the two word lines corresponding to that column. For example, as shown in FIG. 1D, the two adjacent delay buffer units 10 in the first column of the first delay processing combination 30 are connected to the word lines WLP<0> and WLN<0>, respectively.
For example, in each delay processing combination 30, the two delay buffer units 10 in the same column are both connected to the bit line corresponding to that column. For example, as shown in FIG. 1D, the two adjacent delay buffer units 10 in the first column of the first delay processing combination 30 are both connected to the bit line BL<0>.
For example, in each delay processing combination 30, the N delay buffer units 10 in the same row are connected to the source line of that row. For example, in the first delay processing combination 30, the N delay buffer units 10 in the first row are all connected to the source line SLP<0>, and the N delay buffer units 10 in the second row are all connected to the source line SLN<0>.
For example, for a delay buffer unit 10 in the delay computing array 100, the delay adjustment subunit 11 is connected to the first terminal (for example, a source terminal) of the first-stage inverter. The delay adjustment subunit 11 is also connected to the source line corresponding to its row and to the bit line and the word line corresponding to its column, and through the source line, bit line and word line it is connected to different operating voltage terminals, for example the first operating voltage terminal or the second operating voltage terminal.
For example, as shown in FIG. 1D, in the delay buffer unit 10 in the first row and first column, the control electrode of the control switch NM1 in the delay adjustment subunit 11 is connected to the word line WLP<0>, the first electrode of the control switch NM1 is connected to the first terminal of the first-stage inverter P1 and to the first terminal of the memristor RRAM, the second electrode of the control switch NM1 is connected to the source line SLP<0>, and the second terminal of the memristor is connected to the bit line BL<0>. The delay buffer units 10 in other rows and columns are connected to the word lines, bit lines and source lines in a similar way, which is not repeated here.
In the above delay buffer unit, when the control switch is turned off so that the memristor adjusts the delay applied to the rising edge (or falling edge) of the input signal KEEP, the slope of that edge is degraded. Therefore, the second-stage inverter P2 shown in FIG. 1A and FIG. 1B usually uses three parallel inverters to compensate for the lost slope, so that one delay buffer unit requires an overhead of 9T1R. In addition, the input signal KEEP generally includes both a rising edge and a falling edge. In a delay processing combination, one row of delay buffer units delays the rising edge of the input signal KEEP and the other row delays the falling edge, so a corresponding group of differential units in the two rows requires an overhead of 2×9T1R to represent the delay applied to the entire input signal KEEP.
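The 9T1R figure can be reproduced by counting devices. The breakdown below is an inference rather than something stated in the source: it assumes each CMOS inverter contributes two transistors (as with T1 and T2 in P1), the control switch contributes one, and the capacitor is not counted. The function name is hypothetical.

```python
def transistor_count(second_stage_inverters: int) -> int:
    """Transistors in one delay buffer unit under the stated assumptions."""
    first_stage = 2                              # P1: NMOS T1 + PMOS T2
    second_stage = 2 * second_stage_inverters    # P2: parallel inverters
    control_switch = 1                           # NM1
    return first_stage + second_stage + control_switch


MEMRISTORS_PER_UNIT = 1

unit_transistors = transistor_count(second_stage_inverters=3)   # 9T (1R)
differential_transistors = 2 * unit_transistors                 # 2 x 9T (1R)
```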
At least one embodiment of the present disclosure provides a delay buffer unit that includes a first delay buffer subunit and a second delay buffer subunit connected in series. The first delay buffer subunit includes a first-stage inverter, a second-stage inverter and a first delay adjustment subunit: the input terminal of the first-stage inverter serves as the input terminal of the delay buffer unit, the input terminal of the second-stage inverter is connected to the output terminal of the first-stage inverter, and the first delay adjustment subunit is connected to the first terminal of the first-stage inverter, a first operating voltage terminal and a second operating voltage terminal. The second delay buffer subunit includes a third-stage inverter, a fourth-stage inverter and a second delay adjustment subunit: the input terminal of the third-stage inverter is connected to the output terminal of the second-stage inverter, the input terminal of the fourth-stage inverter is connected to the output terminal of the third-stage inverter, the output terminal of the fourth-stage inverter serves as the output terminal of the delay buffer unit, and the second delay adjustment subunit is connected to the second terminal of the third-stage inverter, a third operating voltage terminal and a fourth operating voltage terminal. The first delay adjustment subunit includes a first memristor and a first control switch, and is configured to determine, according to the first control switch, whether the first memristor is used to adjust a first delay of the first delay buffer subunit. The second delay adjustment subunit includes a second memristor and a second control switch, and is configured to determine, according to the second control switch, whether the second memristor is used to adjust a second delay of the second delay buffer subunit. The first delay buffer subunit and the second delay buffer subunit are configured to have opposite effects on the edge slope of a pulse signal passing through the delay buffer unit.
For example, the first delay may be a delay applied to the rising edge of the input pulse signal, and the second delay may be a delay applied to the falling edge of the input pulse signal. With this two-stage series structure, both the rising edge and the falling edge of the input pulse signal can be delayed within a single row of delay buffer units, instead of being delayed separately in two rows of delay buffer units.
For example, the first delay buffer subunit degrades the rising-edge slope of the pulse signal passing through the delay buffer unit, and the second delay buffer subunit degrades the falling-edge slope of the pulse signal passing through the delay buffer unit.
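How the two series subunits act on the two edges of one pulse can be sketched as follows. This is a timing-only illustration that ignores slope effects; the `Pulse` type, the function name, the additive delay model and all values are hypothetical.

```python
from typing import NamedTuple


class Pulse(NamedTuple):
    rise_at: int   # time of the rising edge
    fall_at: int   # time of the falling edge


def delay_buffer_unit(pulse, use_first_memristor, use_second_memristor,
                      first_delay, second_delay, intrinsic_delay=0):
    """Pass a pulse through the first and second delay buffer subunits.

    Assumptions: the first subunit adds first_delay to the rising edge
    when its memristor is in use, the second subunit adds second_delay
    to the falling edge when its memristor is in use, and each subunit
    adds the same intrinsic delay to both edges.
    """
    rise = pulse.rise_at + 2 * intrinsic_delay
    fall = pulse.fall_at + 2 * intrinsic_delay
    if use_first_memristor:
        rise += first_delay
    if use_second_memristor:
        fall += second_delay
    return Pulse(rise, fall)


# Hypothetical times in picoseconds.
out = delay_buffer_unit(Pulse(0, 100), True, True,
                        first_delay=30, second_delay=20, intrinsic_delay=5)
```

In this model a single unit delays both edges of the same pulse, which is the behaviour the series structure is meant to provide within one row.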
FIG. 2 is a schematic structural diagram of a delay buffer unit provided by at least one embodiment of the present disclosure. When the second-stage inverter P2 in FIG. 1A and FIG. 1B is a single inverter, the first delay buffer subunit can be, for example, the delay buffer unit shown in FIG. 1A, and the delay buffer unit of FIG. 1B, with its first control switch NM1 replaced by a transistor of a different type from the first control switch NM1 in FIG. 1A, can serve as the second delay buffer subunit. In this case, the structure of the above embodiment of the present disclosure can be as shown in FIG. 2. The first delay buffer subunit NCELL includes a first-stage inverter PM1, a second-stage inverter PM2, a first adjustment capacitor C1, a first control switch NM0 and a first memristor R1. The second delay buffer subunit PCELL includes a third-stage inverter PM3, a fourth-stage inverter PM4, a second adjustment capacitor C2, a second control switch PM0 and a second memristor R2. The second-stage inverter PM2 and the fourth-stage inverter PM4 are each a single inverter. The second electrode of the first control switch NM0 is connected to the first operating voltage terminal K1, the second terminal of the first memristor R1 is connected to the second operating voltage terminal K2, the second electrode of the second control switch PM0 is connected to the third operating voltage terminal K3, and the second terminal of the second memristor R2 is connected to the fourth operating voltage terminal K4. The output of the first delay buffer subunit NCELL is connected to the input of the second delay buffer subunit PCELL.
For example, the voltages of the first operating voltage terminal K1 and the second operating voltage terminal K2 are lower than the voltages of the third operating voltage terminal K3 and the fourth operating voltage terminal K4. When the first delay buffer subunit NCELL operates, at least one of the first operating voltage terminal K1 and the second operating voltage terminal K2 serves as a discharge voltage terminal; for example, the two are connected to each other and jointly serve as the discharge voltage terminal. When the second delay buffer subunit PCELL operates, at least one of the third operating voltage terminal K3 and the fourth operating voltage terminal K4 serves as a charge voltage terminal; for example, the two are connected to each other and jointly serve as the charge voltage terminal. For example, the first operating voltage terminal K1 and the second operating voltage terminal K2 are grounded, and the third operating voltage terminal K3 and the fourth operating voltage terminal K4 are connected to the power supply voltage. In this case a memristor is present not only in the discharge path (NM0 and the adjacent R1 in FIG. 2) but also in the charge path (PM0 and the adjacent R2 in FIG. 2).
For example, the first control switch NM0 and the second control switch PM0 are of different types: the first control switch NM0 is an N-type transistor, and the second control switch PM0 is a P-type transistor.
For example, the first control switch NM0 includes a first electrode, a second electrode and a first control electrode. The first control electrode receives a first control signal NWL_N, which turns the path between the first electrode and the second electrode of the first control switch NM0 on or off. The first electrode of the first control switch NM0 is electrically connected to the first end of the first-stage inverter, and the second electrode of the first control switch NM0 is electrically connected to the first operating voltage terminal K1. The first end of the first memristor R1 is electrically connected to the first electrode of the first control switch NM0 and to the first end of the first-stage inverter, and the second end of the first memristor R1 is electrically connected to the second operating voltage terminal K2. The second control switch PM0 includes a first electrode, a second electrode and a second control electrode. The second control electrode receives a second control signal NWL_P, which turns the path between the first electrode and the second electrode of the second control switch PM0 on or off. The first electrode of the second control switch PM0 is electrically connected to the second end of the third-stage inverter, and the second electrode of the second control switch PM0 is electrically connected to the third operating voltage terminal K3. The first end of the second memristor R2 is electrically connected to the first electrode of the second control switch PM0 and to the second end of the third-stage inverter, and the second end of the second memristor R2 is electrically connected to the fourth operating voltage terminal K4. The first control signal NWL_N and the second control signal NWL_P are inverted relative to each other, so that the first control switch NM0 and the second control switch PM0 are turned on or off at the same time.
It should be noted that the transistors used in the embodiments of the present disclosure may be thin-film transistors, field-effect transistors (such as MOS field-effect transistors) or other switching devices with the same characteristics. The first electrode and the second electrode of a transistor used here may be the source and the drain, respectively, or the drain and the source. Because the source and the drain may be structurally symmetrical, they may be structurally indistinguishable. The embodiments of the present disclosure do not limit the type of transistor used.
For example, the first delay adjustment subunit includes the first adjustment capacitor C1 and is configured to adjust the delay of the first delay buffer subunit NCELL, according to the first control signal NWL_N, using the first memristor R1 and the first adjustment capacitor C1. The second delay adjustment subunit includes the second adjustment capacitor C2 and is configured to adjust the delay of the second delay buffer subunit PCELL, according to the second control signal NWL_P, using the second memristor R2 and the second adjustment capacitor C2. For example, the first adjustment capacitor may be the parasitic capacitance of the circuit structure itself or an external capacitor.
For example, the capacitance of the first adjustment capacitor C1 is equal to the capacitance of the second adjustment capacitor C2, so that when the first memristor R1 and the second memristor R2 have equal resistance, the delay that the first delay adjustment subunit adds to the rising edge of the input pulse signal equals the delay that the second delay adjustment subunit adds to the falling edge.
As described above, in the structure shown in FIG. 1D, one row of delay buffer units delays only the rising edge (or only the falling edge) of the input pulse signal, so two rows of delay buffer units are needed to form a differential structure for computing the delay that the input pulse signal accumulates while passing through the delay buffer units. In addition, to output an ideal rising edge (or falling edge), the second-stage inverter P2 of the delay buffer unit shown in FIG. 1A or FIG. 1B needs three inverters in series to compensate for the slope lost on that edge. In this case, one differential unit costs 2×9T1R.
FIG. 3A and FIG. 3B are schematic diagrams of the working principle of the delay buffer unit provided by at least one embodiment of the present disclosure. How the present disclosure reduces circuit overhead is explained below with reference to FIG. 3A and FIG. 3B. The delay buffer unit structure shown in FIG. 3A is the same as that shown in FIG. 2. It should be noted that, for example, in the calculation mode, the first operating voltage terminal K1 and the second operating voltage terminal K2 of the delay buffer unit shown in FIG. 2 are grounded, and the third operating voltage terminal K3 and the fourth operating voltage terminal K4 are connected to the power supply voltage. In the delay buffer unit structure shown in FIG. 3B, the positions of the first delay buffer subunit and the second delay buffer subunit are swapped relative to FIG. 2; the present application does not limit this.
As shown in FIG. 3A, when the delay buffer unit receives a rising edge, the N-type transistors of the first-stage inverter PM1 and the third-stage inverter PM3 and the P-type transistors of the second-stage inverter PM2 and the fourth-stage inverter PM4 are turned on (shown in black in the figure), while the P-type transistors of PM1 and PM3 and the N-type transistors of PM2 and PM4 are turned off (shown in gray in the figure). The output of the first delay buffer subunit NCELL is X. After the RC delay formed by the first adjustment capacitor C1 and the first memristor R1, the signal has passed through only the second-stage inverter PM2, so the rising edge of the output X loses some slope. However, the output X then passes through the two inverter stages of the second delay buffer subunit PCELL. Because the P-type transistor of the third-stage inverter PM3 is off, the second memristor R2 connected to PM3 is isolated and does not contribute an RC delay, so these two stages restore the degraded rising edge of X. The output of the whole two-stage structure therefore still has a rising edge with a relatively ideal slope.
As shown in FIG. 3B, when the delay buffer unit receives a falling edge, the N-type transistors of the first-stage inverter PM1 and the third-stage inverter PM3 and the P-type transistors of the second-stage inverter PM2 and the fourth-stage inverter PM4 are turned off (shown in gray in the figure), while the P-type transistors of PM1 and PM3 and the N-type transistors of PM2 and PM4 are turned on (shown in black in the figure). The output of the first delay buffer subunit PCELL is Y. After the RC delay formed by the second adjustment capacitor C2 and the second memristor R2, the signal has passed through only the second-stage inverter PM2, so the falling edge of the output Y loses some slope. However, the output Y then passes through the two inverter stages of the second delay buffer subunit NCELL. Because the N-type transistor of the third-stage inverter PM3 is off, the first memristor R1 connected to PM3 is isolated and does not contribute an RC delay, so these two stages restore the degraded falling edge of Y. The output of the two-stage structure therefore still has a falling edge with a relatively ideal slope.
When the input is 0 (that is, NWL_N=1 and NWL_P=0), the gate of the first control switch NM0 is at a high level and NM0 is turned on, and the gate of the second control switch PM0 is at a low level and PM0 is turned on. The first memristor R1 and the second memristor R2 are both bypassed, so the rising edge and the falling edge passing through the delay buffer unit each see only the intrinsic delay of the four inverter stages, and the input and output pulse widths are the same. When the input is 1 (that is, NWL_N=0 and NWL_P=1), the gate of NM0 is at a low level and NM0 is turned off, and the gate of PM0 is at a high level and PM0 is turned off. The rising-edge delay is then dominated by the RC delay of the first memristor R1 and the first adjustment capacitor C1, and the falling-edge delay is dominated by the RC delay of the second memristor R2 and the second adjustment capacitor C2. The output pulse width therefore differs from the input pulse width, and the size of the difference is determined by the resistance values of R1 and R2.
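The input-dependent behavior described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: each RC delay is approximated by the product R·C, the intrinsic delay of the four inverter stages is taken to be the same for both edges, and the class and method names are illustrative rather than taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class DelayBufferUnit:
    r1: float     # resistance of the first memristor R1 (NCELL), ohms
    r2: float     # resistance of the second memristor R2 (PCELL), ohms
    c1: float     # first adjustment capacitor C1, farads
    c2: float     # second adjustment capacitor C2, farads
    t_int: float  # assumed intrinsic delay of the four inverter stages, seconds

    def edge_delays(self, in_bit: int) -> Tuple[float, float]:
        """Return (rising-edge delay, falling-edge delay) for a 1-bit input."""
        if in_bit == 0:
            # NM0 and PM0 conduct: both memristors are bypassed.
            return self.t_int, self.t_int
        # NM0 and PM0 are off: R1*C1 slows the rising edge, R2*C2 the falling edge.
        return self.t_int + self.r1 * self.c1, self.t_int + self.r2 * self.c2

    def width_change(self, in_bit: int) -> float:
        """Output pulse width minus input pulse width."""
        rise, fall = self.edge_delays(in_bit)
        return fall - rise
```

In this model an input of 0 leaves the pulse width unchanged, while an input of 1 changes it by the difference between the two RC products.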
In the above embodiment, the two delay buffer subunits already form one differential unit, which costs only 2×5T1R. Moreover, both the rising edge and the falling edge pass through three inverter stages after the RC delay, whose intrinsic delay restores the edge slope. The hardware overhead is therefore reduced while the output delay signal still has falling and rising edges with relatively ideal slopes. Because fewer transistors are used, the parasitic capacitance involved in dynamic switching is smaller, so the unit structure can achieve lower dynamic switching power consumption.
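The 9T1R and 5T1R figures can be checked with simple counting. The sketch below assumes each inverter is a two-transistor CMOS inverter and each subunit has one control-switch transistor; the function name is illustrative.

```python
def transistor_count(inverter_stages: int, control_switches: int = 1) -> int:
    """Transistors in one subunit: two per CMOS inverter plus the control switch."""
    return 2 * inverter_stages + control_switches


# Earlier structure: first-stage inverter plus three inverters in series as the second stage.
prior_per_subunit = transistor_count(inverter_stages=4)      # 9T
# Structure of FIG. 2: two single inverters per subunit.
proposed_per_subunit = transistor_count(inverter_stages=2)   # 5T

# A differential unit uses two subunits in both cases.
saved_transistors = 2 * prior_per_subunit - 2 * proposed_per_subunit
```

Under these assumptions a differential unit drops from 18 transistors to 10, with two memristors in both cases.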
FIG. 4 is a schematic diagram of a series structure of delay buffer units provided by at least one embodiment of the present disclosure. The delay buffer unit provided by at least one embodiment of the present disclosure can realize the product of a 1-bit input and a signed weight, as explained below with reference to FIG. 4.
As shown in FIG. 4, m delay buffer units as described in any of the above embodiments are connected in series, where m is an integer greater than 1, and this series structure forms a delay chain. An input pulse W1 is applied at the input end of the delay chain, and each delay buffer unit receives its own control signal, from IN<0> for the first delay buffer unit to IN<m-1> for the last. An output pulse W2 can then be obtained at the output end of the delay chain. The delays produced by the individual delay buffer units are W<0> for the first unit through W<m-1> for the last.
The delay difference NW between the rising edge of the output pulse W2 and that of the input pulse W1 is dominated by the accumulated RC delays of all the first delay buffer subunits NCELL. The delay difference PW between the falling edge of the output pulse and that of the input pulse is dominated by the accumulated RC delays of all the second delay buffer subunits PCELL. This completes the inner product of the 1-bit input vector (IN<0>, IN<1>, …, IN<m-1>) and the signed weight vector (W<0>, W<1>, …, W<m-1>). The result is expressed by the difference between the output and input pulse widths (W2-W1), that is:

$$W_2 - W_1 = \sum_{i=0}^{m-1} \mathrm{IN}\langle i \rangle \cdot W\langle i \rangle$$
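As a numerical illustration of this inner product, the following sketch treats each weight W<i> as a signed pulse-width contribution in arbitrary time units and ignores intrinsic-delay mismatch. The function name and example values are hypothetical.

```python
from typing import Sequence


def pulse_width_difference(inputs: Sequence[int], weights: Sequence[float]) -> float:
    """Return W2 - W1 for one delay chain: the sum of IN<i> * W<i> over all units."""
    if len(inputs) != len(weights):
        raise ValueError("one input bit is needed per delay buffer unit")
    if any(bit not in (0, 1) for bit in inputs):
        raise ValueError("inputs must be 1-bit values")
    # Units with input 0 are bypassed and contribute nothing.
    return sum(bit * w for bit, w in zip(inputs, weights))
```

For example, with inputs (1, 0, 1) and weights (2.0, -5.0, -0.5), only the first and third units contribute.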
At least one embodiment of the present disclosure further provides a delay buffer array, which includes a plurality of delay buffer units as described in any of the above embodiments.
FIG. 5 is a schematic structural diagram of a delay buffer array provided by at least one embodiment of the present disclosure. As shown in FIG. 5, the delay buffer array includes a plurality of delay buffer units as described in any of the above embodiments. The delay buffer units are arranged in an array with multiple rows, and the delay buffer units in each row are connected in series to form one delay chain. For example, each delay buffer unit includes a first delay buffer subunit NCELL and a second delay buffer subunit PCELL connected in series.
The delay buffer array may further include a time pulse input module 101, a programming voltage generation module 102, an input loading module 103, an output quantization module 104, a data storage module 105, a mode control module 106, a column selection module 107, a row selection module 108 and a quantization control module 109. The time pulse input module 101 is configured to provide a time pulse signal to the input end of each of the delay chains. The programming voltage generation module 102 is configured to program the memristors of a target delay buffer unit. The input loading module 103 is configured to provide each delay buffer unit with a first control signal and a second control signal as input signals. The output quantization module 104 is configured to quantize the output of each delay chain to obtain digital output signals. The data storage module 105 is configured to store data used when the delay buffer array performs calculations.
The mode control module 106 is configured to control the working mode of the delay buffer array, such as a calculation mode, a programming mode and a verification mode. The column selection module 107 and the row selection module 108 are configured to determine the position of the selected delay buffer unit during weight mapping. The quantization control module 109 is configured to generate the timing signals that control the operation of the output quantization module 104. For example, the quantization control module may be a TDC (time-to-digital converter) array.
In practice, the rising-edge intrinsic delay of all the first delay buffer subunits NCELL may not exactly match the intrinsic delay of the second delay buffer subunits PCELL. In that case, even with an all-0 input, the width difference between the input pulse and the output pulse is not 0, which is a systematic error for the calculation. Before verification or calculation, a calculation with an all-0 input can be performed so that the output quantization module 104 extracts this systematic error as verification data, which is saved in the data storage module 105. The verification data is then subtracted from the quantized output of every subsequent verification or calculation, which eliminates the systematic error.
During weight mapping, the column selection module 107 and the row selection module 108 determine the position of the selected delay buffer unit, the output quantization module 104 performs verification, and the programming voltage generation module 102 programs the RRAM of the selected unit. During calculation, the input loading module 103 applies the input vector along the vertical direction of the array, the output vector of the matrix-vector multiplication is obtained along the horizontal direction, and the output quantization module 104 quantizes it into a digital output. It should be noted that both verification and calculation are preceded by a calibration phase PH0, whose purpose is to quantize and store the systematic error DOUT0 under an all-0 input. In the normal verification or calculation phase PH1 that follows, the output quantization module 104 quantizes the result into the digital output DOUT1. The digital outputs of the two phases are subtracted in the digital domain to obtain the output vector DOUT1-DOUT0, which reflects the true result.
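The two-phase scheme can be sketched at the array level as follows. The model assumes each row (delay chain) has a fixed offset caused by intrinsic-delay mismatch and that weights and offsets are already expressed in quantizer counts; all names and values are illustrative.

```python
from typing import List, Sequence


def row_outputs(weights: Sequence[Sequence[int]],
                inputs: Sequence[int],
                row_offsets: Sequence[int]) -> List[int]:
    """Quantized width change of each row: its offset plus the sum of IN<j> * W[i][j]."""
    return [
        offset + sum(bit * w for bit, w in zip(inputs, row))
        for row, offset in zip(weights, row_offsets)
    ]


def two_phase_mvm(weights: Sequence[Sequence[int]],
                  inputs: Sequence[int],
                  row_offsets: Sequence[int]) -> List[int]:
    """Return DOUT1 - DOUT0 for every row."""
    all_zero = [0] * len(inputs)
    dout0 = row_outputs(weights, all_zero, row_offsets)  # PH0: calibration, all-0 input
    dout1 = row_outputs(weights, inputs, row_offsets)    # PH1: normal calculation
    return [d1 - d0 for d1, d0 in zip(dout1, dout0)]
```

Because the same offset appears in both phases, the subtraction leaves only the matrix-vector product, whatever the offsets are.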
The working process of the above delay buffer array in the calculation, verification and programming modes is described below with reference to some embodiments.
FIG. 6 is a flowchart of an operation method for a delay buffer array provided by at least one embodiment of the present disclosure. The operation method is applied to the delay buffer array described in any of the above embodiments. As shown in FIG. 6, the operation method includes steps S201 to S203.
S201. Provide a time pulse signal at the input end of a delay chain selected for a calculation operation in the delay buffer array.
For example, the time pulse signal may be provided by the time pulse input module 101 shown in FIG. 5, and the delay chain corresponding to the delay buffer units used for the calculation operation may be selected by the column selection module 107 and the row selection module 108.
S202. Apply a first control signal and a second control signal, as an input data signal, to each delay buffer unit in the delay chain selected for the calculation operation, wherein the input data signal turns the first control switch and the second control switch of the corresponding delay buffer unit on or off.
For example, the first control signal and the second control signal may be provided by the input loading module 103 shown in FIG. 5.
S203. Obtain an object output signal at the output end of the delay chain selected for the calculation operation, wherein the object output signal is a time signal delayed relative to the input time pulse signal.
For example, the input data signal applied to each delay buffer unit in the same delay chain selected for the calculation operation is 1 bit of multi-bit data. If the delay chain contains m delay buffer units, the input data signal has m bits, and each bit is applied to its corresponding delay buffer unit. For example, if the delay chain has two delay buffer units and the input signal is 10, the first delay buffer unit receives 1 and the second delay buffer unit receives 0. For a single delay buffer unit, the input is either 0 or 1. When the input is 0, the corresponding first and second control signals turn on the first control switch and the second control switch, bypassing the memristors in the delay buffer unit. When the input is 1, the corresponding first and second control signals turn off the first control switch and the second control switch, so that the memristors in the delay buffer unit take part in the RC delay.
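The mapping from input bits to control signals can be written out directly. The sketch below follows the levels stated earlier for the unit of FIG. 2 (input 0 gives NWL_N=1 and NWL_P=0, input 1 gives NWL_N=0 and NWL_P=1); the function names and the bit-string input format are assumptions made for illustration.

```python
from typing import List, Tuple


def control_signals(in_bit: int) -> Tuple[int, int]:
    """Return (NWL_N, NWL_P) for one delay buffer unit."""
    if in_bit not in (0, 1):
        raise ValueError("input must be a single bit")
    nwl_n = 1 - in_bit  # high level turns the N-type switch NM0 on
    nwl_p = in_bit      # low level turns the P-type switch PM0 on
    return nwl_n, nwl_p


def load_inputs(bits: str) -> List[Tuple[int, int]]:
    """Map an m-bit input string to per-unit control signals, first unit first."""
    return [control_signals(int(b)) for b in bits]
```

With the two-unit example from the text, the input 10 turns both switches of the first unit off and both switches of the second unit on.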
For example, after the object output signal is obtained, it may also be quantized to convert the time signal into a digital signal, through the following steps:
S204. Obtain the delay between the time pulse signal and the object output signal.
S205. Quantize the delay to obtain a digital output signal.
S206. Subtract the verification data from the digital output signal to obtain the verification output of the delay buffer array.
Here, the verification data is the systematic error DOUT0 of the delay buffer array under an all-0 input, as described above. For example, the delay may be quantized by the output quantization module 104 shown in FIG. 5.
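Steps S204 to S206 can be sketched as a small post-processing routine. The quantizer here is an idealized time-to-digital conversion with a uniform step `lsb_s`, which is an assumption; the disclosure does not specify the quantizer's resolution or rounding behavior.

```python
def quantize_delay(delay_s: float, lsb_s: float) -> int:
    """S205: convert a delay in seconds to a digital code (idealized uniform quantizer)."""
    if lsb_s <= 0:
        raise ValueError("quantization step must be positive")
    return round(delay_s / lsb_s)


def verified_output(delay_s: float, lsb_s: float, dout0: int) -> int:
    """S204-S206: quantize the measured delay, then subtract the stored all-0 code DOUT0."""
    dout = quantize_delay(delay_s, lsb_s)  # S205
    return dout - dout0                    # S206
```

For instance, a measured delay of 1.23 ns with a 0.1 ns step gives code 12, and with a stored DOUT0 of 2 the corrected output is 10.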
FIG. 7 is a flowchart of another operation method for a delay buffer array provided by at least one embodiment of the present disclosure. The operation method is applied to the delay buffer array described in any of the above embodiments. As shown in FIG. 7, the operation method includes steps S301 and S302.
S301. Select one delay buffer unit in the delay buffer array as the unit to be programmed.
S302. Apply a first control signal to control the first control switch in the unit to be programmed, and apply a first programming voltage across the first memristor in the unit to be programmed to change the resistance of the first memristor; or apply a second control signal to control the second control switch in the unit to be programmed, and apply a second programming voltage across the second memristor in the unit to be programmed to change the resistance of the second memristor.
For example, the delay buffer unit to be programmed may be selected by the column selection module 107 and the row selection module 108 shown in FIG. 5. As described above, the memristor is programmed by inputting, for example, a constant low level through the time pulse input module 101 and performing the first processing operation on the memristor through the programming voltage generation module 102; the specific process is not repeated here.
It should be noted that the initialization operation of a memristor needs to be performed only once, whereas the programming operation may be performed multiple times to adjust (program) the resistance of the memristor until it meets the programming requirement. For example, in the conventional approach, after a set or reset operation, the memristor may be read to check whether its resistance has reached the expected value. If it has not, the set or reset operation may be applied again until the resistance reaches the expected value. Determining whether the resistance of the memristor has reached the expected value includes the following steps:
S303. Provide a time pulse signal at the input end of the delay chain corresponding to the unit to be programmed.
S304. Turn off the first control switch and turn on the second control switch of the unit to be programmed, turn on the first control switch and the second control switch of every other delay buffer unit in the delay chain corresponding to the unit to be programmed, and obtain the output signal of that delay chain as a first programming output; or turn on the first control switch and turn off the second control switch of the unit to be programmed, turn on the first control switch and the second control switch of every other delay buffer unit in the delay chain corresponding to the unit to be programmed, and obtain the output signal of that delay chain as a second programming output.
S305. Determine, according to the first programming output or the second programming output, whether the resistance of the first memristor or of the second memristor of the unit to be programmed meets the programming requirement.
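The program-and-check loop formed by steps S302 to S305 can be sketched as below. The device model is purely hypothetical: `ToyMemristor` changes its resistance by a fixed step per pulse, the measured delay is approximated by R·C, and the tolerance and pulse limit are illustrative parameters not given in the disclosure.

```python
class ToyMemristor:
    """Hypothetical device model: each pulse moves the resistance by a fixed step."""

    def __init__(self, resistance: float, step: float):
        self.resistance = resistance
        self.step = step

    def set_pulse(self) -> None:
        # A set operation lowers the resistance (floored at one step).
        self.resistance = max(self.resistance - self.step, self.step)

    def reset_pulse(self) -> None:
        # A reset operation raises the resistance.
        self.resistance += self.step


def program_and_verify(mem: ToyMemristor, capacitance: float, target_delay: float,
                       tolerance: float, max_pulses: int = 100) -> int:
    """Return the number of programming pulses applied before the delay check passed."""
    for pulses in range(max_pulses + 1):
        # S303-S304 (modeled): extra edge delay contributed by the selected subunit.
        measured = mem.resistance * capacitance
        error = measured - target_delay
        # S305: the pass criterion is a delay, not a conductance.
        if abs(error) <= tolerance:
            return pulses
        if pulses == max_pulses:
            break
        # S302: reprogram toward the target delay.
        if error > 0:
            mem.set_pulse()
        else:
            mem.reset_pulse()
    raise RuntimeError("target delay not reached within max_pulses")
```

Starting from 50 kΩ with 5 kΩ steps and a 1 pF capacitor, a 20 ns target delay is reached after six set pulses in this toy model.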
FIG. 8A and FIG. 8B are schematic diagrams of programming verification of a delay buffer unit provided by at least one embodiment of the present disclosure.
As shown in FIG. 8A, m delay buffer units as described in any of the above embodiments are connected in series to form a delay chain, and each delay buffer unit includes a first delay buffer subunit NCELL and a second delay buffer subunit PCELL connected in series. An input pulse W1 is applied at the input end of the delay chain, and an output pulse W2 is obtained at the output end. One delay buffer unit in the delay chain is selected as the unit to be programmed (the first one in the figure). The delays that the delay buffer units add to the input pulse are W<0> for the first unit through W<m-1> for the last. The intrinsic delay that all the delay buffer subunits in the delay chain add to the rising edge of the input pulse W1 is tr_int, and the intrinsic delay they add to the falling edge is tf_int. A control signal turns off the first control switch in the first delay buffer subunit NCELL of the unit to be programmed (the NCELL with input 1 in the figure). The first control switches in the remaining first delay buffer subunits NCELL and the second control switches in all the second delay buffer subunits PCELL are turned on (the NCELLs and PCELLs with input 0 in the figure). At this time, only the memristor in the first delay buffer subunit NCELL of the unit to be programmed takes part in the RC delay, and its contribution to the rising-edge delay of the input pulse W1 is the RC product $R_0^{+}C$.
As shown in FIG. 8B, a control signal turns off the second control switch in the second delay buffer subunit PCELL of the unit to be programmed of FIG. 8A. The first control switches in all the first delay buffer subunits NCELL and the second control switches in the remaining second delay buffer subunits PCELL are turned on. The rest of the configuration is the same as in FIG. 8A and is not repeated here. At this time, only the memristor in the second delay buffer subunit PCELL of the unit to be programmed takes part in the RC delay, and its contribution to the falling-edge delay of the input pulse W1 is the RC product $R_0^{-}C$. Whether the resistances of the two memristors in the unit to be programmed meet the requirement can therefore be judged from the values of $R_0^{+}C$ and $R_0^{-}C$.
The above is the process of programming and verifying a single unit before the calculation mode is executed, that is, the weight writing process. Programming verification of the weights is likewise performed in the time domain: the criterion for weight programming is delay rather than conductance in the conventional sense. When a positive weight is programmed, as shown in FIG. 8A, only the NCELL of the selected unit receives an input of 1, while the PCELL of that unit and the PCELLs and NCELLs of all other units receive an input of 0. As a result, only the NCELL of the selected unit contributes a rising-edge RC delay across the entire delay chain, so the weight of that NCELL can be defined and corrected according to the relative width of the input and output pulses. Programming a negative weight is similar: as shown in FIG. 8B, only the PCELL of the selected unit receives an input of 1, while the NCELL of that unit and the PCELLs and NCELLs of all other units receive an input of 0. As a result, only the PCELL of the selected unit contributes a falling-edge RC delay across the entire delay chain, so the weight of that PCELL can be defined and corrected according to the relative width of the input and output pulses.
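A time-domain program-and-verify loop of the kind described above might look like the following. It is only a sketch under stated assumptions: the `ToyMemristor` class, its fixed resistance step per programming pulse, the `program_verify` function, and the stopping tolerance are all hypothetical, and the delay is computed from the model instead of being measured from pulse widths as it would be in hardware.

```python
class ToyMemristor:
    """Hypothetical device model: each programming pulse moves R by a fixed step."""

    def __init__(self, resistance, step):
        self.resistance = resistance
        self.step = step

    def pulse(self, direction):
        # direction = +1 raises the resistance, -1 lowers it
        self.resistance += direction * self.step


def program_verify(device, cap, target_delay, tol, max_pulses=100):
    """Adjust `device` until its R*C delay is within `tol` of `target_delay`.

    Returns (success, pulses_used). The criterion is a delay, not a conductance.
    """
    for used in range(max_pulses + 1):
        error = device.resistance * cap - target_delay
        if abs(error) <= tol:
            return True, used
        if used == max_pulses:
            break
        device.pulse(-1 if error > 0 else +1)
    return False, max_pulses


cell = ToyMemristor(resistance=10.0, step=1.0)
ok, pulses = program_verify(cell, cap=1.0, target_delay=15.0, tol=0.5)
```

The loop verifies before every pulse, so a unit that already meets the target delay is left untouched.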
FIG. 9 is a flowchart of yet another method for operating a delay buffer array provided by at least one embodiment of the present disclosure. The operation method is applied to the delay buffer array described in any of the above embodiments. As shown in FIG. 9, the operation method includes steps S401 to S404.
S401, providing a time pulse signal at the input end of the delay chain in the delay buffer array that is selected for the verification operation;
S402, applying a first control signal and a second control signal as input signals to each delay buffer unit in the delay chain selected for the verification operation, so as to turn on the first control switch and the second control switch of each delay buffer unit;
S403, obtaining an output signal at the output end of the delay chain selected for the verification operation as an intrinsic output;
S404, quantizing the delay between the intrinsic output and the time pulse signal to obtain verification data for the delay chain selected for the verification operation.
For example, the time pulse signal can be provided by the time pulse input module 101 shown in FIG. 5. The delay chain corresponding to the delay buffer unit used for the verification operation is selected by the column selection module 107 and the row selection module 108. The input loading module 103 provides a first control signal and a second control signal that are all 0 to turn on all the first control switches and second control switches in the delay chain. The output quantization module 104 quantizes the delay between the intrinsic output and the input time pulse signal to obtain the systematic error DOUT0 as the verification data.
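Steps S401 to S404 can be sketched numerically as follows. The sketch assumes a uniform quantizer with a fixed least significant bit `lsb` and a single edge-to-edge delay; the disclosure does not specify the output quantization module at this level, so `quantize_delay`, `verify_chain`, and the numeric values are hypothetical.

```python
def quantize_delay(t_in, t_out, lsb):
    """S404: map the delay between two edges to an integer code."""
    return round((t_out - t_in) / lsb)


def verify_chain(intrinsic_delay, lsb, t_in=0.0):
    """S401 to S404 for one selected delay chain.

    With every control input at 0 all switches conduct, so the output edge
    lags the input edge only by the chain's intrinsic delay.
    """
    t_out = t_in + intrinsic_delay           # S401 to S403: intrinsic output
    return quantize_delay(t_in, t_out, lsb)  # S404: verification data DOUT0


dout0 = verify_chain(intrinsic_delay=2.75, lsb=0.25)
```

Because no memristor participates in this configuration, the resulting code reflects only the chain's own systematic delay.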
At least one embodiment of the present disclosure further provides an electronic device that includes the delay buffer array described in any of the above embodiments. The delay buffer array may, for example, have the structure shown in FIG. 5.
The electronic device may be, for example, a processor, or any product or component that includes the processor or is used in conjunction with it. As another example, the electronic device may be a server device or a terminal device, such as a mobile phone or a laptop computer.
The delay buffer unit, delay buffer array, electronic device, and array operation method provided by the embodiments of the present disclosure can, with a smaller unit area and fewer transistors, maintain calculation accuracy, output delay signals whose falling and rising edges have relatively ideal slopes, and reduce dynamic power consumption.
For example, in at least one embodiment, the forward computing paradigm of an array formed of the delay buffer units can support fully parallel matrix-vector multiplication within a row. As a further example, the verification data can be used to cancel systematic errors caused by non-ideal factors such as the manufacturing process.
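One way to picture how verification data could cancel a systematic error is the toy model below. It assumes that each delay chain's quantized output equals an ideal dot product plus a fixed per-chain offset, and that the correction is a plain subtraction of the code measured with all-zero inputs. Both assumptions, along with the names `chain_output` and `calibrated_mvm`, are illustrative and are not stated in this form by the disclosure.

```python
def chain_output(weights, inputs, offset):
    """Quantized output of one chain: ideal dot product plus a systematic offset."""
    return sum(w * x for w, x in zip(weights, inputs)) + offset


def calibrated_mvm(matrix, inputs, offsets):
    """Matrix-vector product with the per-chain offset removed."""
    zeros = [0] * len(inputs)
    # Verification pass: all-zero inputs expose only the offset of each chain.
    dout0 = [chain_output(row, zeros, off) for row, off in zip(matrix, offsets)]
    # Compute pass: raw codes still contain the same offsets.
    raw = [chain_output(row, inputs, off) for row, off in zip(matrix, offsets)]
    return [r - d for r, d in zip(raw, dout0)]


result = calibrated_mvm([[1, 2], [3, -1]], [1, 1], offsets=[5, 7])
```

The corrected result no longer depends on the offsets, which is the effect the verification data is meant to achieve.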
It is to be understood that the above embodiments are merely exemplary embodiments used to illustrate the principles of the present disclosure, and the present disclosure is not limited thereto. Those of ordinary skill in the art can make various modifications and improvements without departing from the spirit and substance of the present disclosure, and these modifications and improvements are also considered to fall within the scope of protection of the present disclosure.
In addition to the above exemplary description, the following points should be noted regarding the present disclosure:
(1) The drawings of the embodiments of the present disclosure show only the structures involved in those embodiments; for other structures, reference may be made to common designs.
(2) For the sake of clarity, in the drawings used to describe the embodiments of the present disclosure, the thickness of layers or regions is enlarged or reduced; that is, the drawings are not drawn to actual scale.
(3) Where no conflict arises, the embodiments of the present disclosure and the features of those embodiments may be combined with each other to obtain new embodiments.
The above describes only specific implementations of the present disclosure, and the scope of protection of the present disclosure is not limited thereto. The scope of protection of the present disclosure shall be determined by the scope of protection of the claims.
Claims (15)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311842660.XA CN118138026A (en) | 2023-12-28 | 2023-12-28 | Delay buffer unit, electronic device, delay buffer array and operation method thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118138026A (en) | 2024-06-04 |
Family
ID=91244709
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202311842660.XA CN118138026A (en) | Pending | 2023-12-28 | 2023-12-28 | Delay buffer unit, electronic device, delay buffer array and operation method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118138026A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||