WO2020088005A1 - Embedded quick adder apparatus based on memristor array underflow path and calculation method - Google Patents

Publication number
WO2020088005A1
Authority
WO
WIPO (PCT)
Prior art keywords
carry
memristor
mapping
calculation
adder
Prior art date
Application number
PCT/CN2019/097848
Other languages
French (fr)
Chinese (zh)
Inventor
景乃锋
李桃中
李彤
王琴
蒋剑飞
贺光辉
毛志刚
Original Assignee
上海交通大学
Application filed by 上海交通大学
Publication of WO2020088005A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting
    • G06F7/505 Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
    • G06F7/5052 Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination using carry completion detection, either over all stages or at sample stages only


Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Memory System (AREA)
  • Complex Calculations (AREA)
  • Logic Circuits (AREA)

Abstract

The present invention relates to an adder apparatus based on underflow paths in a memristor array. Compared with current memristor adders, which perform addition by iterating logic operations, the invention is characterized by: (1) separating the serial and parallel parts of the addition operation, namely serial carry and parallel summation; (2) for the serial carry, constructing the corresponding underflow paths, including a customized carry propagation underflow path, and using current propagation to emulate carry behavior, which greatly accelerates carry calculation; and (3) after the carry of each bit has been obtained, using existing memristor logic calculation methods together with the parallel structure of the crossbar array to complete the summation of all bits simultaneously.

Description

Embedded fast adder device and calculation method based on memristor array underflow paths
Technical field
The invention belongs to the field of non-volatile memory based on new materials and relates to in-memory computing technology.
Background art
At present there are three main approaches to performing addition on memristor memory arrays: Boolean logic, look-up tables (LUT), and programmable logic arrays (PLA).
The Boolean-logic approach is the simplest and most intuitive: the adder is assembled from the basic logic operations that the circuit supports, following the logic expression of addition. Typical implementations include IMPLY and MAGIC circuits. Its drawback is equally obvious: low computational efficiency. A 1-bit full adder takes 29 steps with an IMPLY circuit and 12 steps with a MAGIC circuit. Contemporary computing systems are typically 32 bits wide, so even ignoring the cost of moving carries, the computation alone takes 928 and 384 steps, respectively. Given that memristor writes are usually slow, such a large computational overhead is unacceptable.
The look-up table (LUT) approach borrows the design idea of FPGAs. It exploits the programmability of memristors: the results of a specific logic function are computed in advance, by means such as IMPLY or MAGIC, and stored in the memristor array. Because this precomputation is done offline, the approach is computationally efficient, and one calculation costs only one read operation. For example, if the look-up table stores a 1-bit full adder, a 32-bit addition takes 32 steps; if it stores the results of a 2-bit adder, only 16 steps are needed. However, this is not in-memory computing in the true sense. It merely uses the memristor memory array as a programmable arithmetic unit, and once the array is configured this way it can no longer serve as data memory for operands and results. The approach is also extremely costly in hardware: the look-up table of a 1-bit full adder alone occupies an 8×14 array, and the consumed array area grows nonlinearly with bit width.
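The step counts quoted for the Boolean-logic and look-up table approaches follow from simple arithmetic, sketched below in Python. The function names are illustrative and do not come from the patent. The sketch assumes, as the text does, that a W-bit addition is split into consecutive 1-bit (or k-bit) stages and that carry movement is ignored.

```python
# Illustrative step-count arithmetic; function names are hypothetical.

def bitwise_logic_steps(steps_per_full_adder: int, width: int) -> int:
    """Steps when a width-bit add is built from `width` sequential 1-bit full adders."""
    return steps_per_full_adder * width


def lut_steps(bits_per_lookup: int, width: int) -> int:
    """One read per lookup; each lookup resolves `bits_per_lookup` bits."""
    return -(-width // bits_per_lookup)  # ceiling division


WIDTH = 32
print(bitwise_logic_steps(29, WIDTH))  # IMPLY: 928
print(bitwise_logic_steps(12, WIDTH))  # MAGIC: 384
print(lut_steps(1, WIDTH))             # 1-bit full-adder LUT: 32
print(lut_steps(2, WIDTH))             # 2-bit adder LUT: 16
```

The LUT variant trades these shorter step counts for array area, which is the drawback the text goes on to describe.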
The programmable logic array (PLA) approach relies on the fact that any digital logic can be expressed in product-of-sums or sum-of-products form. A memristor array is used to build a customized minterm (maxterm) plane. Within this plane the structure of the minterms (maxterms) is fixed, but once these basic terms are formed, different digital logic functions can be realized by activating different rows or columns, which makes the array programmable. The drawbacks are the same as for the look-up table approach: it is not in-memory computing in the true sense, and it consumes considerable memristor array resources.
Summary of the invention
Memristive memory is a candidate technology for in-memory computing, which is used to address the memory wall problem. Targeting the structure of the memristive memory array, the present invention proposes an adder implementation that improves the efficiency of in-memory computing in memristive memory.
The logic expressions of a 1-bit full adder can be written as the following two equations:
$S = A \oplus B \oplus C_i$
$C_o = AB + AC_i + BC_i$
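As a quick sanity check, the two full-adder expressions can be verified against ordinary integer addition over all eight input combinations. This is a minimal sketch: it assumes the sum expression is the standard S = A XOR B XOR C_i (the original filing gives it only as an equation image), and the function name is illustrative.

```python
from itertools import product


def full_adder(a: int, b: int, c_in: int) -> tuple[int, int]:
    """Return (sum, carry_out) for single-bit inputs."""
    s = a ^ b ^ c_in                            # S = A xor B xor C_i (assumed standard form)
    c_out = (a & b) | (a & c_in) | (b & c_in)   # C_o = AB + AC_i + BC_i
    return s, c_out


# Exhaustive check: the two outputs together encode a + b + c_in in binary.
for a, b, c_in in product((0, 1), repeat=3):
    s, c_out = full_adder(a, b, c_in)
    assert a + b + c_in == s + 2 * c_out
```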
For a multi-bit adder, computing step by step through logic iteration is inefficient and forfeits the advantage that in-memory computing gains from reduced data movement. If, however, the carries of all bits can be obtained efficiently, the parallelism of the array structure can be used to compute the sums of all bits at once. Following this idea, Figure 1 shows the principle of memristor-based carry calculation. Note that a memristor represents a logic value by its resistance: the low-resistance state (LRS) represents logic 1 and the high-resistance state (HRS) represents logic 0. Figure 1(a) shows the three underflow paths that determine a carry:
1) Carry generation path, $R_G = A \cdot B$: when both addends are 1, this path is in the low-resistance state, the supply charges the capacitor through it, and a carry is formed.
2) Carry cancellation path, $R_D = \bar{A} \cdot \bar{B}$: when both addends are 0, this path is in the low-resistance state, the capacitor discharges through it, and no carry is formed.
3) Carry propagation path, $R_P = A \oplus B$: when the two addends differ, this path is in the low-resistance state, and the carry of the previous bit propagates through it to the current bit.
Figure 1(b) shows a schematic of memristor-based 4-bit carry calculation, in which the carry of every bit falls under one of the three cases above. Because the states of R_G, R_D and R_P can be obtained in advance through logic calculation, the circuit can quickly complete the carry calculation along the underflow path of each bit. The logic calculation can use existing memristor logic techniques such as IMPLY or MAGIC. Unlike R_G and R_D, which are connected in parallel, the R_P elements are connected in series to form the carry chain. A series connection is not directly compatible with the array structure, so the carry propagation path has to be customized on top of the array. The worst-case delay occurs when a carry propagates from the lowest bit along the whole R_P chain to the highest bit.
Although the circuit is still bound by the inherently serial nature of carry propagation, carry calculation is far faster than with logic iteration. The reason is that the invention propagates the carry in the analog domain, by current flow and capacitor charging, whereas logic iteration computes the carry of each bit step by step, clock cycle by clock cycle, according to the logic expression. Every such step requires a write operation, which is generally slow for memristive devices. Once the carries of all bits are available, the final result is computed from the sum expression. In this step the bits are independent of one another, so the sums of different bits can be computed in parallel within the array structure.
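The generate / cancel / propagate scheme can be illustrated with a purely logical model. The sketch below is not the patented circuit: it replaces the analog charging and discharging with a loop that applies the three cases bit by bit, then computes all sum bits independently. All names are illustrative, bits are listed least-significant first, and the propagate condition is taken to be "the two addends differ", as the text describes.

```python
def carry_paths(a_bits, b_bits):
    """Classify each bit position as 'G' (generate), 'D' (cancel) or 'P' (propagate)."""
    paths = []
    for a, b in zip(a_bits, b_bits):
        if a & b:
            paths.append("G")   # both addends 1: carry is formed
        elif not (a | b):
            paths.append("D")   # both addends 0: carry is cancelled
        else:
            paths.append("P")   # addends differ: previous carry passes through
    return paths


def resolve_carries(paths, c_in=0):
    """Return the carry out of each bit position, lowest bit first."""
    carries, carry = [], c_in
    for p in paths:
        if p == "G":
            carry = 1
        elif p == "D":
            carry = 0
        carries.append(carry)   # for 'P' the incoming carry is kept
    return carries


def add(a, b, width=4):
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    carries = resolve_carries(carry_paths(a_bits, b_bits))
    carry_in = [0] + carries[:-1]
    # Each sum bit depends only on its own inputs, so this step is parallelisable.
    sum_bits = [x ^ y ^ c for x, y, c in zip(a_bits, b_bits, carry_in)]
    return sum(bit << i for i, bit in enumerate(sum_bits)), carries[-1]


print(carry_paths([1, 1, 1, 0], [1, 0, 0, 0]))  # ['G', 'P', 'P', 'D']
print(add(0b0111, 0b0001))                      # (8, 0)
```

In the example, the carry generated at bit 0 passes through two propagate positions before being absorbed at bit 3, which is the serial part of the computation that the carry chain accelerates.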
In summary, the memristor-array-based adder comprises three main elements:
Carry underflow path mapping: the states of R_G, R_D and R_P are precomputed to determine how the carry of each bit is calculated;
Construction of the serial carry chain: because the array structure cannot itself form a carry propagation path, a carry propagation path controlled by R_P must be custom-built to handle the third case of carry calculation described above;
Sum calculation: after the carries of all bits have been calculated, the sums of all bits are computed in parallel by the corresponding logic.
This patent proposes an adder design based on a memristor memory array. The design was evaluated with HSPICE and NVSim, a simulation tool for emerging non-volatile memories, and its technical effect is shown in three respects: computing performance, area overhead and power overhead:
Computing performance: although the carry propagation delay still depends on the operand bit width, the invention forms carries in the analog domain, by letting current flow along the constructed paths, so the number of clock cycles consumed by carry calculation grows sublinearly with operand bit width. For a 32-bit adder, for example, the IMPLY and MAGIC implementations need 928 and 384 clock cycles respectively, whereas the invention needs only 13, an improvement of about 70 and 28 times, respectively;
Area overhead: area overhead is assessed in two ways. The first is the number of additional array cells occupied by intermediate data produced during addition. These cells must be reserved to buffer intermediate data and cannot store other data, so the larger their share, the lower the array utilization and the higher the cost of addition. Again for a 32-bit adder, IMPLY and MAGIC need 2 and 352 extra cells respectively, while this design needs 64: relative to the IMPLY design the array overhead increases by 31 times, and relative to the MAGIC design it decreases by 4.5 times. The second is the overhead of the carry chain and control circuit relative to a conventional memristive memory, which is about 12.4%;
Power overhead: the additional power overhead likewise comes from the control circuit and the added carry chain. Relative to the power consumed by the peripheral circuits during memory operations, these two parts add about 19.5%.
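The quoted improvement factors can be reproduced with a short calculation. The patent does not state how the ratios were formed; the figures appear consistent with a relative difference, (larger − smaller) / smaller, and the sketch below assumes that definition. The function name is illustrative.

```python
def relative_difference(larger: float, smaller: float) -> float:
    """(larger - smaller) / smaller; assumed definition, inferred from the quoted figures."""
    return (larger - smaller) / smaller


# Clock cycles for a 32-bit addition: IMPLY 928, MAGIC 384, this design 13.
print(round(relative_difference(928, 13), 1))  # 70.4
print(round(relative_difference(384, 13), 1))  # 28.5

# Extra array cells: IMPLY 2, MAGIC 352, this design 64.
print(relative_difference(64, 2))    # 31.0 (increase relative to IMPLY)
print(relative_difference(352, 64))  # 4.5  (decrease relative to MAGIC)
```

Under a plain ratio the speedups would instead be about 71 and 30 times, so the choice of definition matters when comparing against other reported results.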
Brief description of the drawings
Other features, objects and advantages of the invention will become more apparent from the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a schematic diagram of the system framework of an embodiment of the invention: (a) 1-bit; (b) 4-bit;
FIG. 2 is a schematic diagram of carry underflow path mapping in an embodiment of the invention: (a) voltage-divider read operation; (b) current-sensing read operation;
FIG. 3 is a schematic diagram of the carry chain of the carry propagation underflow path in an embodiment of the invention: (a) based on ReRAM; (b) based on CMOS.
Detailed description
The addition implementation based on a memristor memory array comprises three main parts: carry underflow path mapping, construction of the serial carry chain, and sum calculation. The embodiment is described in detail below.
Carry underflow path mapping: Figure 1 shows the principle of memristor-based addition, and the main task of carry underflow path mapping is to map that schematic onto a real array structure. Because the R_G of different bits are independent of one another, they can be mapped to the same row of the array and mapped simultaneously. R_D is mapped in the same way as R_G, onto another row of the array. For memory arrays that use a voltage-divider read operation, however, R_D mapping is unnecessary: such a read normally requires a load resistor at the bottom of the array to sense the resistance state of the selected cell, and in carry calculation this load resistor plays the same role as R_D. The R_D mapping step can therefore be omitted, further reducing computation delay and mapping overhead. R_P cannot be mapped directly into the array structure; a serial carry chain has to be constructed to implement its function. Once all mapping is complete, the carry calculation is activated by applying the operating voltage to the row holding R_G. If the array uses a voltage-divider read operation, the remaining rows of the array are simply turned off; if it uses another type of read operation, such as current sensing, the row holding R_D must be grounded and the remaining rows turned off. Figure 2 illustrates these two mapping schemes.
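Why the load resistor of a voltage-divider read can stand in for R_D can be seen from a simple divider model. The sketch below is an idealised illustration, not a circuit from the patent: the supply voltage, the LRS and HRS resistances and the load resistance are assumed values chosen only for the example, and the carry chain is ignored.

```python
# All component values below are assumptions for illustration only.
V_OP = 1.0      # operating voltage applied to the R_G row, in volts
R_LRS = 1e3     # low-resistance state (logic 1), in ohms
R_HRS = 1e6     # high-resistance state (logic 0), in ohms
R_LOAD = 30e3   # load resistor at the bottom of the bit line, in ohms


def bitline_voltage(r_cell: float, v_op: float = V_OP, r_load: float = R_LOAD) -> float:
    """Voltage across the load resistor when the cell and load form a divider."""
    return v_op * r_load / (r_cell + r_load)


# R_G in LRS (A = B = 1): the bit line is pulled close to the supply, i.e. a carry.
print(bitline_voltage(R_LRS))
# R_G in HRS: the load resistor pulls the bit line toward ground, the role
# that a low-resistance R_D cell would otherwise play.
print(bitline_voltage(R_HRS))
```

With a current-sensing read there is no such pull-down element on the bit line, which is consistent with the text's requirement to ground an explicit R_D row in that case.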
Construction of the serial carry chain: the purpose of the serial carry chain is to provide a dedicated current path for R_P, which cannot be mapped directly into the array structure, so that carries can propagate serially. In circuit implementation, however, building the carry propagation path directly from memristors (Figure 3a) raises two problems. First, serially connected memristors cannot be programmed in parallel, so the mapping time depends on the operand bit width, which greatly reduces adder efficiency. Second, the current drive is insufficient during carry calculation: the low-resistance state of a memristor is typically still on the order of kilo-ohms, so even if every memristor on the carry path is in the low-resistance state, the current may weaken along the path because of its high total resistance and fail to compute the carries of later bits. To solve both problems, the invention builds the carry chain from conventional MOS transistors, as shown in Figure 3b. To solve the serial-mapping problem, the row buffer temporarily stores the control signal of the carry propagation path (equation image PCTCN2019097848-appb-000003 in the original filing), and this signal directly switches the transistors of the carry chain on and off. Because the control signals are computed independently on each bit line, the whole calculation can proceed in parallel, and the time for writing memristors is saved as well. To solve the drive problem, the transistors are not connected directly between adjacent bit lines; instead, the lower-bit terminal of each transistor is connected to the output of the corresponding sense amplifier, whose strong drive capability pushes the carry forward. With this arrangement, carry propagation does not suffer from insufficient drive however long the carry chain is, because the carry of any bit is driven directly by the sense amplifier of the previous bit, which makes propagation independent of the chain length. A serial carry chain built this way yields the carry of every bit quickly once carry calculation is activated.
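The drive problem of a memristor-only carry chain can be made concrete with a back-of-the-envelope estimate. The sketch below treats the chain as n low-resistance cells in series with a sensing load and computes the available current; the component values are assumptions for illustration, and the model ignores capacitance, wire resistance and device nonlinearity.

```python
# Assumed values for illustration only; not taken from the patent.
V_SUPPLY = 1.0   # volts
R_LRS = 1e3      # low-resistance state of one memristor, in ohms
R_SENSE = 1e3    # resistance seen at the sensing end, in ohms


def chain_current(n_cells: int) -> float:
    """Current through n series LRS cells plus the sensing load, in amperes."""
    return V_SUPPLY / (n_cells * R_LRS + R_SENSE)


for n in (1, 4, 16, 32):
    print(n, chain_current(n))   # the current shrinks as the chain gets longer

# With a sense amplifier re-driving the carry at every bit, each stage only
# ever sees a single switching element, so the n = 1 case applies at any width.
```

This is why the text connects each transistor to the previous bit's sense amplifier output rather than chaining passive elements between adjacent bit lines.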
求和计算:在进位计算完成后可以得到所有位对应的进位值,此时所有比特位的求和计算都是独立的,因此可以根据其逻辑表达式并行执行完成最终运算。Sum calculation: After the carry calculation is completed, the carry value corresponding to all bits can be obtained. At this time, the sum calculation of all bits is independent, so the final operation can be performed in parallel according to its logical expression.
In the memristor-array adder scheme described above, every step that involves logic computation uses existing memristor-based logic techniques such as IMPLY and MAGIC. The adder design is therefore independent of the specific logic technique: as long as the logic operation can be executed within the array structure and can realize the particular logic functions the present invention requires, different logic schemes can be used to complete the mapping of the carry path and the final sum operation.

Claims (8)

  1. An adder design based on a memristor storage array, characterized by comprising:
    carry path mapping;
    construction of a serial carry chain;
    sum calculation.
  2. The adder design based on a memristor storage array according to claim 1, characterized in that the carry path mapping comprises:
    for the R_G mapping, completing the mapping in parallel on the memristor storage array;
    for the R_D mapping, completing the mapping in parallel on the memristor storage array.
  3. The adder design based on a memristor storage array according to claim 1, characterized in that the construction of the serial carry chain comprises:
    forming the carry propagation path from MOS transistors;
    using the array row buffer to compute and store the carry chain control signals in parallel;
    using sense amplifiers to increase the current drive.
  4. The adder design based on a memristor storage array according to claim 1, characterized in that the sum calculation comprises:
    using the result of the carry calculation to complete the sum operation of all bits in parallel.
  5. The adder design based on a memristor storage array according to claim 2, characterized in that, for the R_G mapping, completing the mapping in parallel on the memristor storage array comprises:
    when the carry calculation is started, applying the operation voltage to the row where R_G is located.
  6. The adder design based on a memristor storage array according to claim 2, characterized in that, for the R_D mapping, completing the mapping in parallel on the memristor storage array comprises:
    for a voltage-divider read operation, performing no R_D mapping;
    for other read schemes, grounding the row where R_D is located when the carry calculation is started.
  7. The adder design based on a memristor storage array according to claim 3, characterized in that using the array row buffer to compute and store the carry chain control signals in parallel comprises:
    using the row buffer to temporarily store the control signal R_P = A⊕B of the carry propagation path, and using this signal to directly control the turning on and off of the transistors in the carry chain.
  8. The adder design based on a memristor storage array according to claim 3, characterized in that using sense amplifiers to increase the current drive comprises:
    connecting the terminal of each transistor nearer the lower bit to the output of the sense amplifier, and the terminal nearer the higher bit to the bit line of the next bit.
PCT/CN2019/097848 2018-11-02 2019-07-26 Embedded quick adder apparatus based on memristor array underflow path and calculation method WO2020088005A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811299086.7 2018-11-02
CN201811299086.7A CN109521993B (en) 2018-11-02 2018-11-02 Quick adder calculation method based on memristor array undercurrent path

Publications (1)

Publication Number Publication Date
WO2020088005A1 true WO2020088005A1 (en) 2020-05-07

Family

ID=65774174

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/097848 WO2020088005A1 (en) 2018-11-02 2019-07-26 Embedded quick adder apparatus based on memristor array underflow path and calculation method

Country Status (2)

Country Link
CN (1) CN109521993B (en)
WO (1) WO2020088005A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109521993B (en) * 2018-11-02 2022-07-01 上海交通大学 Quick adder calculation method based on memristor array undercurrent path

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101140511A (en) * 2006-09-05 2008-03-12 硅谷数模半导体(北京)有限公司 Cascaded carry binary adder
CN105739944A (en) * 2016-03-21 2016-07-06 华中科技大学 Multi-system additive operation circuit based on memristors and operation method thereof
US9921808B1 (en) * 2017-06-02 2018-03-20 Board Of Regents, The University Of Texas System Memristor-based adders using memristors-as-drivers (MAD) gates
CN109521993A (en) * 2018-11-02 2019-03-26 上海交通大学 A kind of adder quick calculation method based on memristor array undercurrent path

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7606092B2 (en) * 2007-02-01 2009-10-20 Analog Devices, Inc. Testing for SRAM memory data retention
CN102882513B (en) * 2012-10-09 2015-04-15 北京大学 Full adder circuit and chip
US20150149517A1 (en) * 2013-11-25 2015-05-28 University Of The West Of England Logic device and method of performing a logical operation

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112885963A (en) * 2021-01-13 2021-06-01 西安交通大学 Memristor cross array
CN112885963B (en) * 2021-01-13 2022-12-09 西安交通大学 Memristor cross array
CN113489484A (en) * 2021-03-16 2021-10-08 上海交通大学 Full adder function implementation method based on resistive random access device
CN113489484B (en) * 2021-03-16 2024-01-09 上海交通大学 Full adder function implementation method based on resistive device
CN113553793A (en) * 2021-06-08 2021-10-26 南京理工大学 Method for improving memory logic calculation efficiency based on memristor

Also Published As

Publication number Publication date
CN109521993B (en) 2022-07-01
CN109521993A (en) 2019-03-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19877933

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19877933

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 27/09/2021)