WO2021004466A1 - Neuromorphic computing circuit based on a multi-bit parallel binary synapse array - Google Patents

Neuromorphic computing circuit based on a multi-bit parallel binary synapse array

Info

Publication number
WO2021004466A1
WO2021004466A1 · PCT/CN2020/100756 · CN2020100756W
Authority
WO
WIPO (PCT)
Prior art keywords
rram
array
input
bit parallel
bit
Prior art date
Application number
PCT/CN2020/100756
Other languages
English (en)
French (fr)
Inventor
黄科杰
张赛
沈海斌
Original Assignee
浙江大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江大学
Publication of WO2021004466A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation using electronic means
    • G06N3/065: Analogue means
    • H: ELECTRICITY
    • H03: ELECTRONIC CIRCUITRY
    • H03M: CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M1/00: Analogue/digital conversion; Digital/analogue conversion
    • H03M1/12: Analogue/digital converters
    • H03M1/34: Analogue value compared with reference values
    • H03M1/38: Analogue value compared with reference values sequentially only, e.g. successive approximation type
    • H03M1/46: Successive approximation type with digital/analogue converter for supplying reference values to converter
    • H03M1/466: Successive approximation type using switched capacitors
    • H03M1/468: Successive approximation type using switched capacitors in which the input S/H circuit is merged with the feedback DAC array

Definitions

  • the invention relates to the field of neuromorphic computing, in particular to a neuromorphic computing circuit based on a multi-bit parallel binary synapse array.
  • Neuromorphic computing can greatly improve the energy efficiency of artificial neural network computation: by mimicking the structure of the human brain, it integrates the storage unit and the computing unit, solving the transmission-bandwidth and transmission-energy bottlenecks of the traditional von Neumann architecture.
  • the emerging resistive non-volatile memory (RRAM, Resistive Random-Access Memory) is the best choice for realizing neuromorphic computing.
  • using the RRAM resistance, the weighted combination of the input signals can be converted into an output voltage, completing the basic artificial neural network operation of matrix multiply-and-accumulate (MAC, Multiplication-and-Accumulation) and thereby achieving ultra-low-power in-memory parallel computing.
  • most neuromorphic computing circuits proposed to date require high-precision digital-to-analog converters (DACs, Digital-to-Analog Converters) and analog-to-digital converters (ADCs, Analog-to-Digital Converters) as interface devices; these interfaces account for more than 80% of the overall energy consumption, which hinders application in edge computing devices.
  • the present invention proposes a neuromorphic computing circuit based on a multi-bit parallel binary neural network synapse array, which can realize a high-precision and high-performance deep neural network with low energy consumption.
  • the present invention proposes a novel neural network synapse array, which can perform a large number of parallel calculations of multiplication and accumulation.
  • a high-performance neuromorphic computing architecture is proposed, which can be configured into different deep neural networks to meet different application requirements.
  • the present invention provides the following solutions:
  • a neuromorphic computing circuit based on a multi-bit parallel binary synapse array, including: an axon module, a multi-bit parallel binary RRAM synapse array, a time-division multiplexer, multiple integrators, and one shared successive-approximation analog-to-digital converter;
  • the input signal from the upper layer of the neural network first enters the axon module.
  • the axon module includes two basic units: a timing scheduler and an adder; the timing scheduler arranges signal timing so that the input signal is fed sequentially, dendrite-first, into the multi-bit parallel binary RRAM synapse array;
  • the adder is used to expand the array size: when the configured network layer requires more inputs than a single RRAM array provides, the adder sums the computation results of multiple arrays to obtain the output of the network layer;
  • the basic unit of the multi-bit parallel binary RRAM synapse array is a 1-transistor-1-RRAM (1T1R) structure, in which the transistor controls the switching behavior: its source terminal is grounded, its drain terminal is connected to one end of the binary RRAM, and the other end of the RRAM is connected to the integrator circuit;
  • N binary RRAMs represent the distinct strength levels of a synapse in fixed-point form; the transistor gate is connected to the input signal line, the input of the neural network layer also takes the form of an N-bit fixed-point number, and each binary input bit directly serves as the control voltage Vc of the 1T1R structure;
  • the integrator includes an integrating operational amplifier and a switched capacitor circuit to convert the input signal and the MAC calculation result of the RRAM array weight into an analog integrated voltage;
  • the shared successive-approximation analog-to-digital converter quantizes the analog integrated voltage into N-bit digital output data;
  • the time-division multiplexer shares the SAR ADC and the integrators among all inputs of the network layer, maximizing hardware-resource utilization through timing scheduling.
  • the RRAM is modeled using experimental data of a nitrogen-doped aluminum oxide structure, and each RRAM has two resistances: a low resistance state and a high resistance state.
  • the shared successive-approximation analog-to-digital converter adopts a combined structure of a high-precision, high-power ADC and a low-precision, low-power ADC: the low-precision, low-power ADC quantizes the upper 4 bits of the result, and the high-precision, high-power ADC quantizes the lower 4 bits.
  • the present invention has the following technical effects:
  • the multi-bit parallel binary RRAM synapse array and neuromorphic computing circuit proposed by the present invention offer high precision and low power consumption compared with current systems, can be configured for most deep neural network applications, and are particularly suitable for deployment in edge computing devices with strict energy constraints.
  • Figure 1 is a structural diagram of traditional neuromorphic computing;
  • Figure 2 is a diagram of the high-performance neuromorphic computing architecture proposed by the present invention;
  • Figure 3 shows the multi-bit parallel binary RRAM synapse array proposed by the present invention;
  • Figure 4 is a structural diagram of the 1T1R cell;
  • Figure 5 is a schematic diagram of the integration principle of the computing circuit proposed by the present invention;
  • Figure 6 is a block diagram of the integration scheme proposed by the present invention;
  • Figure 7 is a structural diagram of the 8-bit shared SAR ADC proposed by the present invention.
  • the traditional neuromorphic computing circuit is shown in Figure 1: interface components such as DACs and ADCs introduce substantial power consumption, and because different input voltages serve as the RRAM read voltage, the RRAM resistance deviates significantly, so the accuracy of the computation results is low, which limits the range of application.
  • Figure 2 shows the neuromorphic computing architecture proposed by the present invention, including axon modules, multi-bit parallel binary RRAM synapse array, time division multiplexer, multiple integrators and a shared successive approximation analog-to-digital converter (SAR ADC, Successive Approximation Register Analog-to-Digital Converter).
  • the axon module includes two basic units: a timing scheduler and an adder.
  • the timing scheduler arranges signal timing so that the input signal is fed sequentially, dendrite-first, into the multi-bit parallel binary RRAM synapse array; the adder can be used to expand the array size: when the configured network layer requires more inputs than a single RRAM array provides, the axon module's adder sums the computation results of multiple arrays to obtain the output of the network layer.
  • the integrator includes an integrating operational amplifier and a switched capacitor circuit to convert the input signal and the MAC calculation result of the RRAM array weight into an analog integrated voltage. A detailed introduction will be given in the description of the integrating circuit below.
  • the analog integrated voltage is quantized into N-bit digital output data by the shared SAR ADC.
  • the time-division multiplexer shares the SAR ADC and the integrators among all inputs of the network layer, maximizing hardware-resource utilization through timing scheduling.
  • the multi-bit parallel binary RRAM synapse array proposed by the present invention is shown in FIG. 3.
  • the 1T1R structure of Figure 4 serves as the basic unit: the transistor (an NMOS) controls the switching behavior, its source terminal is grounded, its drain terminal is connected to one end of the binary RRAM, and the other end of the RRAM is connected to the integrator circuit; N binary RRAMs represent the distinct strength levels of a synapse in fixed-point form.
  • the gate is connected to the input signal line, and the input of the neural network layer also takes the form of an N-bit fixed-point number; each binary input bit directly serves as the control voltage Vc of the 1T1R cell, eliminating the input-interface DAC and greatly reducing energy consumption and area.
  • the invention models the RRAM using experimental data from a nitrogen-doped aluminum-oxide structure; each RRAM has two resistance states: a low-resistance state (about 10 MΩ) and a high-resistance state (about 1 GΩ to 10 GΩ). Through timing arrangement, the RRAM array conducts only during the integration phase and is off most of the time, greatly reducing the power consumption of the synapse array.
  • compared with a traditional SRAM array, the RRAM array provided by the present invention offers high density and single-shot readout, and can greatly reduce the power consumption and area of the synapse array.
  • the multiple binary RRAMs proposed by the present invention solve the problems of large nonlinear deviation and low quantization accuracy in a single multi-bit RRAM; using a fixed op-amp reference voltage as the RRAM read voltage significantly reduces the resistance deviation of the RRAM under different read voltages and improves weight-quantization accuracy.
  • the N binary RRAMs proposed in the present invention raise the weight-quantization precision to N bits, and the activation input of each layer of the RRAM array is also N bits, so the precision of the entire network can be raised to N bits; this overcomes the large performance loss of BNN array structures on deep neural networks such as AlexNet and achieves higher accuracy.
  • the integration principle of the computing circuit proposed by the present invention is shown in Fig. 5: each bit of the input data is fed sequentially into the integration circuit, and the principle of charge redistribution completes the integration scheme of Fig. 6.
  • the integration completes the multiply-and-accumulate process, after which the analog integrated voltage is quantized into N-bit digital form by the shared SAR ADC, which is convenient for signal transmission and storage.
  • the integration circuit proposed by the present invention uses the principle of charge redistribution to complete the weighting process of different weight bits and different input bits, has a simple structure, small errors and easy control, and can achieve higher integration accuracy and network accuracy.
  • the currently proposed mirror-current-source and dynamic-threshold schemes generally suffer from complex structure, large circuit errors, and high power consumption, so they can only be applied to small-scale neural networks.
  • the 8-bit shared SAR ADC structure proposed by the present invention is shown in FIG. 7, and the SAR ADC can be configured to N bits according to specific needs.
  • the capacitors used for temporary data storage and charge redistribution in the integration circuit also serve as the DAC capacitor array in the SAR ADC, reducing area through resource sharing.
  • the 8-bit shared SAR ADC proposed by the present invention combines a high-precision, high-power ADC with a low-precision, low-power ADC: the low-power ADC quantizes the upper 4 bits of the result and the high-precision ADC quantizes the lower 4 bits, achieving high precision while reducing energy consumption.
  • the dynamic comparator structure and free-running clock method can also be used to reduce the power consumption of the comparator, and the separate DAC capacitor method can be used to reduce the conversion power consumption of the capacitor array, which is beneficial for deployment in low-power designs.
  • Fig. 3 is the neural synapse array structure used in the present invention: N binary RRAMs emulate one synapse, so an N-bit fixed-point weight can be expressed as w = a_{n-1}a_{n-2}…a_0, and the dendrite output can further be expressed as in equation (1).
  • FIG. 5 and FIG. 6 show the specific integration principle and integration scheme of the computing circuit. Each integrator consists of an integrating op-amp, a capacitor C_n, a capacitor (C_f - C_n), and switches S1, S2, S3, and S4; the specific connections are shown in Figure 5. With 256 parallel inputs, each input datum is quantized to an N-bit fixed-point number and enters the integration circuit from the lowest bit to the highest; in other words, A_{0,0}, A_{1,0}, …, A_{p-1,0} are selected in turn as the axon inputs, serving as the control voltages of the 1T1R cells in the RRAM synapse array.
  • in the integration phase, switches S1 and S2 and the sampling switch S5 in the SAR ADC are closed, while switches S3 and S4 are opened to isolate the integrator's output voltage.
  • the resulting integrated voltage can be expressed as:
  • V_o is the current integration voltage of the integrator; V_o^- is the integrator's previous integration state; T is the fixed integration time; G_i is the conductance of the binarized weight (the conductances corresponding to the RRAM high- and low-resistance states are 1/R_H and 1/R_L, respectively); V_ref is the reference read voltage; and C_f is the total feedback capacitance.
  • when the 1-bit integration process is completed, switch S2 is opened to hold the integration voltage constant while the op-amp is shut down to minimize power consumption; switch S1 is then opened so that the power consumption of the RRAM array approaches zero. Switch S3 is then closed, and the equivalent analog voltage of the MAC computation is obtained by charge redistribution, while switch S4 completes the reset of the integration circuit. Once charge redistribution is finished, switch S4 is opened and S2 is closed to prepare for integrating the next input bit.
  • the weighting process of different weight bits and different input bits is completed at the same time.
  • different capacitors are used to achieve the weighting of different weight bits.
  • after the weight bits are weighted, the process-equivalent voltage V_s can be expressed as in equation (3).
  • Equation (3) can be regarded as a special case of equation (1) when the input is only 1 bit.
  • Equation (4) is equivalent to equation (1).
  • the sampling switch S5 in the SAR ADC is opened; upon completing the integration of all bits to obtain the output voltage V_out, the SAR ADC has also sampled V_out through the shared DAC array, and begins quantizing the analog integrated voltage into N-bit digital form for easy storage and transmission.
  • during the SAR ADC quantization phase, the gated clock is disabled, and switches S1, S2, S3, and S4 are opened to cut off the energy consumption of the integration circuit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Power Engineering (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Neurology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Analogue/Digital Conversion (AREA)

Abstract

A neuromorphic computing circuit based on a multi-bit parallel binary synapse array, comprising an axon module, a multi-bit parallel binary RRAM synapse array, a time-division multiplexer, multiple integrators, and one shared successive-approximation analog-to-digital converter. The axon module comprises two basic units, a timing scheduler and an adder: the timing scheduler arranges signal timing so that the input signal is fed sequentially, dendrite-first, into the multi-bit parallel binary RRAM synapse array; the adder is used to expand the array size, and when the configured network layer requires more inputs than a single RRAM array provides, the adder sums the computation results of multiple arrays to obtain the output of the network layer. Compared with current systems, the circuit offers high precision and low power consumption, can be configured for most deep neural network applications, and is particularly suitable for deployment in edge computing devices with strict energy constraints.

Description

Neuromorphic computing circuit based on a multi-bit parallel binary synapse array
This application claims priority to Chinese patent application No. 201910609991.6, filed with the China Patent Office on July 8, 2019 and entitled "一种基于多位并行二进制突触阵列的神经形态计算电路" (Neuromorphic computing circuit based on a multi-bit parallel binary synapse array), the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of neuromorphic computing, and in particular to a neuromorphic computing circuit based on a multi-bit parallel binary synapse array.
Background
In recent years, deep neural networks have developed rapidly in the field of artificial intelligence, achieving excellent results in image recognition, natural language processing, and other areas. Many state-of-the-art deep learning algorithms improve network performance by increasing network depth and parameter count, placing ever higher demands on hardware storage capacity, computing power, and energy efficiency. For example, AlphaGo consumes on the order of a megawatt to obtain sufficient computing power, whereas the human brain consumes only about 20 watts.
Neuromorphic computing can greatly improve the energy efficiency of artificial neural network computation: by mimicking the structure of the human brain, it integrates the storage unit and the computing unit, solving the transmission-bandwidth and transmission-energy bottlenecks of the traditional von Neumann architecture. Emerging resistive non-volatile memory (RRAM, Resistive Random-Access Memory) is the best choice for realizing neuromorphic computing: using the RRAM resistance, the weighted combination of the input signals can be converted into an output voltage, completing the basic artificial neural network operation of matrix multiply-and-accumulate (MAC, Multiplication-and-Accumulation) and thereby achieving ultra-low-power in-memory parallel computing.
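The in-memory MAC principle described above can be sketched numerically. The following model is illustrative only: the resistance states, read voltage, and array size are assumptions, not values taken from the patent.

```python
# Illustrative numerical model of an RRAM crossbar column computing a
# multiply-accumulate via Ohm's law: rows whose input bit is 1 conduct,
# and their currents sum on the column wire.
R_LOW, R_HIGH = 10e6, 1e9                  # ohms: low/high resistance states (assumed)
G_LOW, G_HIGH = 1.0 / R_LOW, 1.0 / R_HIGH  # corresponding conductances
V_READ = 0.3                               # volts: fixed read voltage (assumed)

weights = [1, 0, 1, 1]  # binary synaptic weights stored in one column
inputs = [1, 1, 0, 1]   # binary input bits gating the 1T1R cells

conductances = [G_LOW if w else G_HIGH for w in weights]

# Column current: each active row contributes I = V_READ * G.
i_column = sum(v * V_READ * g for v, g in zip(inputs, conductances))

# Because the high-resistance state is nearly an open circuit, the current
# approximates the digital dot product scaled by V_READ * G_LOW.
dot = sum(v * w for v, w in zip(inputs, weights))
print(i_column, dot * V_READ * G_LOW)
```

The small residual difference between the two printed values comes from the finite high-resistance state, which mirrors the nonideality the patent attributes to RRAM arrays.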
Most neuromorphic computing circuits proposed to date require high-precision digital-to-analog converters (DACs, Digital-to-Analog Converters) and analog-to-digital converters (ADCs, Analog-to-Digital Converters) as interface devices, so the interfaces account for more than 80% of total energy consumption, which hinders application in edge computing devices. Moreover, current neuromorphic computing solutions achieve only low weight- and activation-quantization precision; they can target only simple networks such as LeNet and suffer significant performance loss on larger deep neural networks such as AlexNet, greatly limiting their range of application. The present invention therefore proposes a neuromorphic computing circuit based on a multi-bit parallel binary neural-network synapse array, which can realize high-precision, high-performance deep neural networks at low energy consumption.
Summary of the Invention
In view of the defects of the prior art and the need for improvements in low power and high precision, the present invention proposes a novel neural-network synapse array capable of performing massively parallel multiply-and-accumulate computation, together with a high-efficiency neuromorphic computing architecture that can be configured into different deep neural networks to meet different application requirements.
To achieve the above objectives, the present invention provides the following solutions:
A neuromorphic computing circuit based on a multi-bit parallel binary synapse array, comprising: an axon module, a multi-bit parallel binary RRAM synapse array, a time-division multiplexer, multiple integrators, and one shared successive-approximation analog-to-digital converter;
The input signal from the previous layer of the neural network first enters the axon module, which comprises two basic units: a timing scheduler and an adder. The timing scheduler arranges signal timing so that the input signal is fed sequentially, dendrite-first, into the multi-bit parallel binary RRAM synapse array; the adder is used to expand the array size: when the configured network layer requires more inputs than a single RRAM array provides, the adder sums the computation results of multiple arrays to obtain the output of the network layer;
The basic unit of the multi-bit parallel binary RRAM synapse array is a 1-transistor-1-RRAM structure, in which the transistor controls the switching behavior: its source terminal is grounded, its drain terminal is connected to one end of the binary RRAM, and the other end of the RRAM is connected to the integrator circuit. In the array, N binary RRAMs represent the distinct strength levels of a synapse in fixed-point form; the transistor gate is connected to the input signal line, the input of the neural network layer also takes the form of an N-bit fixed-point number, and each binary input bit directly serves as the control voltage Vc of the 1-transistor-1-RRAM structure;
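The fixed-point weight encoding described above (N binary RRAM cells per synapse) amounts to a plain binary expansion; a minimal sketch, with illustrative N and bit values:

```python
# N binary RRAM cells hold the bits a_{n-1} ... a_0 of one synaptic weight.
N = 4
bits = [1, 0, 1, 1]  # a_{n-1} ... a_0, most significant bit first

# w = a_{n-1}*2^(N-1) + ... + a_1*2 + a_0
weight = sum(a << (N - 1 - j) for j, a in enumerate(bits))
print(weight)  # 11
```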
Each integrator comprises an integrating op-amp and a switched-capacitor circuit, converting the MAC result of the input signal and the RRAM array weights into an analog integrated voltage;
The shared successive-approximation analog-to-digital converter quantizes the analog integrated voltage into N-bit digital output data;
The time-division multiplexer shares the shared successive-approximation analog-to-digital converter and the integrators among all inputs of the network layer, maximizing hardware-resource utilization through timing scheduling.
Optionally, the RRAM is modeled using experimental data from a nitrogen-doped aluminum-oxide structure, and each RRAM has two resistance states: a low-resistance state and a high-resistance state.
Optionally, the shared successive-approximation analog-to-digital converter adopts a combined structure of a high-precision, high-power ADC and a low-precision, low-power ADC: the low-precision, low-power ADC quantizes the upper 4 bits of the result, and the high-precision, high-power ADC quantizes the lower 4 bits.
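The optional coarse/fine split described above can be sketched as follows; the full-scale voltage and the ideal-quantizer model are assumptions for illustration:

```python
# A low-power ADC resolves the upper 4 bits and a high-precision ADC the
# lower 4 bits of an 8-bit code.
V_FS = 1.0        # full-scale input voltage (assumed)
LSB = V_FS / 256  # 8-bit resolution

def quantize_8bit(v):
    code = min(255, max(0, int(v / LSB)))
    upper = code >> 4   # resolved by the low-precision, low-power ADC
    lower = code & 0xF  # resolved by the high-precision, high-power ADC
    return upper, lower, (upper << 4) | lower

print(quantize_8bit(0.4))  # (6, 6, 102)
```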
Compared with the prior art, the present invention has the following technical effects:
The multi-bit parallel binary RRAM synapse array and neuromorphic computing circuit proposed by the present invention offer high precision and low power consumption compared with current systems, can be configured for most deep neural network applications, and are particularly suitable for deployment in edge computing devices with strict energy constraints.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed by the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1 is a structural diagram of traditional neuromorphic computing;
Figure 2 is a diagram of the high-efficiency neuromorphic computing architecture proposed by the present invention;
Figure 3 shows the multi-bit parallel binary RRAM synapse array proposed by the present invention;
Figure 4 is a structural diagram of the 1T1R cell;
Figure 5 is a schematic diagram of the integration principle of the computing circuit proposed by the present invention;
Figure 6 is a block diagram of the integration scheme proposed by the present invention;
Figure 7 is a structural diagram of the 8-bit shared SAR ADC proposed by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the invention.
To make the above objectives, features, and advantages of the present invention clearer and easier to understand, the invention is described in further detail below with reference to the drawings and specific embodiments.
The traditional neuromorphic computing circuit is shown in Figure 1. Interface components such as DACs and ADCs introduce substantial power consumption, and because different input voltages serve as the RRAM read voltage, the RRAM resistance deviates significantly, so the accuracy of the computation results is low, which limits the range of application. Figure 2 shows the neuromorphic computing architecture proposed by the present invention, comprising an axon module, a multi-bit parallel binary RRAM synapse array, a time-division multiplexer, multiple integrators, and one shared successive-approximation analog-to-digital converter (SAR ADC, Successive Approximation Register Analog-to-Digital Converter). The input signal from the previous layer of the neural network first enters the axon module, which comprises two basic units: a timing scheduler and an adder. The timing scheduler arranges signal timing so that the input signal is fed sequentially, dendrite-first, into the multi-bit parallel binary RRAM synapse array; the adder can be used to expand the array size: when the configured network layer requires more inputs than a single RRAM array provides, the axon module's adder sums the computation results of multiple arrays to obtain the output of the network layer. Each integrator comprises an integrating op-amp and a switched-capacitor circuit, converting the MAC result of the input signal and the RRAM array weights into an analog integrated voltage; a detailed introduction is given in the description of the integration circuit below. Finally, the shared SAR ADC quantizes the analog integrated voltage into N-bit digital output data. The time-division multiplexer shares the SAR ADC and the integrators among all inputs of the network layer, maximizing hardware-resource utilization through timing scheduling.
The multi-bit parallel binary RRAM synapse array proposed by the present invention is shown in Figure 3. The 1-transistor-1-RRAM (1T1R, 1 Transistor 1 RRAM) structure of Figure 4 serves as the basic unit: the transistor (an NMOS) controls the switching behavior, its source terminal is grounded, its drain terminal is connected to one end of the binary RRAM, and the other end of the RRAM is connected to the integrator circuit; N binary RRAMs represent the distinct strength levels of a synapse in fixed-point form. The gate is connected to the input signal line, the input of the neural network layer also takes the form of an N-bit fixed-point number, and each binary input bit directly serves as the control voltage Vc of the 1T1R cell, eliminating the input-interface DAC and greatly reducing energy consumption and area. The present invention models the RRAM using experimental data from a nitrogen-doped aluminum-oxide structure; each RRAM has two resistance states: a low-resistance state (about 10 MΩ) and a high-resistance state (about 1 GΩ to 10 GΩ). Through timing arrangement, the RRAM array conducts only during the integration phase and is off most of the time, greatly reducing the power consumption of the synapse array.
Compared with a traditional SRAM array, the RRAM array proposed by the present invention offers high density and single-shot readout, greatly reducing the power consumption and area of the synapse array. Compared with traditional multi-bit RRAM and schemes that use different input voltages as the read voltage, the multiple binary RRAMs proposed here solve the problems of large nonlinear deviation and low quantization accuracy in a single multi-bit RRAM; meanwhile, using a fixed op-amp reference voltage as the RRAM read voltage significantly reduces the resistance deviation of the RRAM under different read voltages and improves weight-quantization accuracy. Compared with the traditional binarized neural network (BNN, Binarized Neural Networks) synapse-array approach, the N binary RRAMs proposed here raise the weight-quantization precision to N bits, and the activation input of each layer of the RRAM array is also N bits, so the precision of the entire network can be raised to N bits; this overcomes the large performance loss of BNN array structures on deep neural networks such as AlexNet and achieves higher accuracy.
The integration principle of the computing circuit proposed by the present invention is shown in Figure 5. Using 256-way parallel input computation and the dendrite-first strategy, each bit of the input data is fed sequentially into the integration circuit, and the principle of charge redistribution completes the integration scheme of Figure 6. The N-bit input signal and the N-bit RRAM array weight can be expressed in digital form as x = A_{n-1}A_{n-2}…A_0 and w = a_{n-1}a_{n-2}…a_0; Ohm's law and current integration complete the multiply-and-accumulate process, after which the analog integrated voltage is quantized into N-bit digital form by the shared SAR ADC, which is convenient for signal transmission and storage. The integration circuit proposed here uses charge redistribution to complete the weighting of different weight bits and different input bits; it is simple in structure, has small and easily controlled errors, and achieves high integration precision and network accuracy. By contrast, currently proposed schemes such as mirror current sources and dynamic thresholds generally suffer from complex structure, large circuit errors, and high power consumption, so they can only be applied to small-scale neural networks.
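The dendrite-first, bit-serial input scheme described above can be checked numerically: feeding one input bit-plane per cycle and recombining the per-bit dot products with powers of two reproduces the full MAC. Input and weight values are illustrative:

```python
# Apply one input bit-plane per integration cycle (LSB first), then
# recombine the per-bit dot products with powers of two.
N = 4
inputs = [5, 3, 12, 7]  # N-bit activations x_i
weights = [2, 9, 1, 4]  # synaptic weights w_i

def bit(x, k):
    return (x >> k) & 1  # k-th bit of x (k = 0 is the LSB)

# One cycle per input bit: dot product of the current bit-plane and weights.
per_bit = [sum(bit(x, k) * w for x, w in zip(inputs, weights)) for k in range(N)]

y = sum((1 << k) * s for k, s in enumerate(per_bit))
assert y == sum(x * w for x, w in zip(inputs, weights))
print(y)  # 77
```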
The 8-bit shared SAR ADC structure proposed by the present invention is shown in Figure 7; the SAR ADC can be configured to N bits as needed. The capacitors used for temporary data storage and charge redistribution in the integration circuit also serve as the DAC capacitor array of the SAR ADC, reducing area through resource sharing. The proposed 8-bit shared SAR ADC combines a high-precision, high-power ADC with a low-precision, low-power ADC: the low-power ADC quantizes the upper 4 bits of the result and the high-precision ADC quantizes the lower 4 bits, achieving high precision while reducing energy consumption. In addition, a dynamic comparator structure and a self-oscillating clock can be used to reduce comparator power, and a split-DAC capacitor method can reduce the switching power of the capacitor array, which benefits low-power designs.
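The conversion principle behind the shared SAR ADC can be sketched with a minimal successive-approximation loop; the ideal binary-weighted DAC model, 8-bit resolution, and reference voltage are assumptions for illustration:

```python
# Starting from the MSB, each trial bit is kept if the DAC output does not
# exceed the input voltage.
N_BITS = 8
V_REF = 1.0  # reference voltage (assumed)

def sar_convert(v_in):
    code = 0
    for k in reversed(range(N_BITS)):              # MSB first
        trial = code | (1 << k)
        if trial * V_REF / (1 << N_BITS) <= v_in:  # comparator decision
            code = trial
    return code

print(sar_convert(0.5))  # 128
```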
Figure 3 shows the neural synapse array structure adopted by the present invention: N binary RRAMs emulate one synapse, so an N-bit fixed-point weight can be expressed as w = a_{n-1}a_{n-2}…a_0, and the dendrite output can further be expressed as:
y = Σ_i x_i·w_i = 2^(n-1)·Σ_i a_{i,n-1}·x_i + … + 2^1·Σ_i a_{i,1}·x_i + 2^0·Σ_i a_{i,0}·x_i   (1)
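Equation (1) can be verified numerically: decomposing each N-bit weight into bit-planes and summing the bit-plane dot products weighted by powers of two reproduces the direct dot product. Example values are illustrative:

```python
# Expand each N-bit weight into its bits a_{i,j} and recombine bit-planes.
n = 4
x = [3, 1, 2, 5]   # inputs x_i
w = [6, 11, 4, 9]  # N-bit weights w_i

def a(i, j):
    return (w[i] >> j) & 1  # bit a_{i,j} of weight w_i

y_direct = sum(xi * wi for xi, wi in zip(x, w))

# y = 2^(n-1)*sum_i a_{i,n-1} x_i + ... + 2^0*sum_i a_{i,0} x_i
y_bits = sum((1 << j) * sum(a(i, j) * x[i] for i in range(len(x)))
             for j in range(n))

assert y_bits == y_direct
print(y_bits)  # 82
```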
Figures 5 and 6 show the specific integration principle and integration scheme of the computing circuit. Each integrator consists of an integrating op-amp, a capacitor C_n, a capacitor (C_f - C_n), and switches S1, S2, S3, and S4; the specific connections are shown in Figure 5. With 256 parallel inputs, each input datum is quantized to an N-bit fixed-point number and enters the integration circuit from the lowest bit to the highest; in other words, A_{0,0}, A_{1,0}, …, A_{p-1,0} are selected in turn as the axon-line inputs, serving as the control voltages of the 1T1R cells in the RRAM synapse array.
When the integration circuit is enabled, the gated clock is turned on, and switches S1, S2, S3, S4, and S5 control the integration and charge-redistribution processes.
In the integration phase, switches S1 and S2 and the sampling switch S5 in the SAR ADC are closed, while switches S3 and S4 are opened to isolate the integrator's output voltage; the resulting integration voltage can be expressed as:
V_o = V_o^- + (T·V_ref/C_f)·Σ_i G_i   (2)
(the sum runs over the rows whose current input bit is 1)
where V_o is the current integration voltage of the integrator, V_o^- is the integrator's previous integration state, T is the fixed integration time, G_i is the conductance of the binarized weight (the conductances corresponding to the RRAM high- and low-resistance states are 1/R_H and 1/R_L, respectively), V_ref is the reference read voltage, and C_f is the total feedback capacitance.
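The integration step can be modeled behaviorally from the variable definitions above; the component values below are assumptions for illustration, not values from the patent:

```python
# V_o = V_o^- + (T * V_ref / C_f) * sum of conductances of active rows.
T = 1e-6              # s, fixed integration time (assumed)
V_REF = 0.3           # V, reference read voltage (assumed)
C_F = 1e-12           # F, total feedback capacitance (assumed)
R_L, R_H = 10e6, 1e9  # ohms, RRAM low/high resistance states

def integrate(v_prev, input_bits, weight_bits):
    g = [1.0 / R_L if w else 1.0 / R_H for w in weight_bits]
    g_sum = sum(gi for b, gi in zip(input_bits, g) if b)  # active rows only
    return v_prev + (T * V_REF / C_F) * g_sum

v = integrate(0.0, [1, 0, 1], [1, 1, 0])
print(v)
```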
After the 1-bit integration process completes, switch S2 is opened to hold the integration voltage constant while the op-amp is shut down to minimize power consumption; switch S1 is then opened so that the power consumption of the RRAM array approaches zero. Switch S3 is then closed, and the equivalent analog voltage of the MAC computation is obtained by the charge-redistribution method, while switch S4 completes the reset of the integration circuit. Once the charge-redistribution process is complete, switch S4 is opened and S2 is closed, ready for the integration of the next input bit.
In the charge-redistribution phase, the weighting of different weight bits and different input bits is completed simultaneously. First, different capacitors realize the weighting of different weight bits; from largest to smallest the capacitors are C_{n-1}, C_{n-2}, …, C_0, related by C_{n-1} = 2^1·C_{n-2} = … = 2^{n-1}·C_0. After the weight bits are weighted, the process-equivalent voltage V_s can be expressed as:
V_s = (1/C_f)·Σ_{j=0}^{n-1} C_j·V_{o,j}   (3)
(V_{o,j} denotes the integration voltage obtained for weight bit j)
Equation (3) can be regarded as the special case of equation (1) in which the input has only 1 bit.
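The capacitor-based weighting of the weight bits can be sketched with charge conservation over a binary-weighted capacitor bank (C_j = 2^j·C_0); the unit capacitance, the example voltages, and the normalization by the total capacitance are modeling assumptions:

```python
# After redistribution onto the total capacitance, bit j contributes in
# proportion to its capacitor 2^j * C0.
C0 = 1e-15  # F, unit capacitance (assumed)
n = 4
caps = [(1 << j) * C0 for j in range(n)]  # C_0 ... C_{n-1}
v_bits = [0.10, 0.40, 0.20, 0.30]         # per-bit integration voltages (example)

c_total = sum(caps)
# Charge conservation: V_s = (sum_j C_j * V_j) / C_total
v_s = sum(c * v for c, v in zip(caps, v_bits)) / c_total

# Equivalent digital view: a binary-weighted average of the bit voltages.
expected = sum((1 << j) * v for j, v in enumerate(v_bits)) / (2 ** n - 1)
assert abs(v_s - expected) < 1e-12
print(v_s)
```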
The shared DAC capacitor array C_f (C_f = 2^n·C_0) in the SAR ADC is used to mirror the voltage V_s and complete the weighting of the different input bits. The output integration voltage V_out is initialized to 0; after each bit's integration completes, the shared voltage V_x^- of the previous input bits and the integration-equivalent voltage V_s of the current bit undergo charge sharing through C_f and C_{n-1}, C_{n-2}, …, C_0. Because the input data enter from the lowest bit to the highest, the bits are effectively divided by factors of 2^{n-1}, 2^{n-2}, …, 2^0 respectively, so the final integrated output voltage V_out can be expressed as:
V_out = Σ_{k=0}^{n-1} V_s^(k)/2^(n-1-k)   (4)
(V_s^(k) denotes the process-equivalent voltage of input bit k)
Equation (4) is equivalent to equation (1). Through the above integration and charge-redistribution processes, the multiply-and-accumulate of the N-bit fixed-point digital inputs and N-bit fixed-point weights is completed, yielding the output voltage in analog form.
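The equivalence stated above can be checked end-to-end with an abstract model of the bit-serial process: after each input bit's contribution, charge sharing halves the running output and adds the new term, so the final voltage equals the digital MAC up to a fixed scale factor. All electrical detail is abstracted away:

```python
# Input bits enter LSB first; each cycle performs halve-and-add charge
# sharing on the running output voltage.
n = 4
inputs = [5, 3, 12, 7]
weights = [2, 9, 1, 4]

per_bit = [sum(((x >> k) & 1) * w for x, w in zip(inputs, weights))
           for k in range(n)]

v_out = 0.0
for s in per_bit:              # LSB first
    v_out = (v_out + s) / 2.0  # halve-and-add charge sharing

# The final voltage is the digital MAC scaled by 1/2^n.
mac = sum(x * w for x, w in zip(inputs, weights))
assert abs(v_out - mac / 2 ** n) < 1e-12
print(v_out, mac)  # 4.8125 77
```

Earlier (lower) bits are halved more times than later (higher) bits, which is exactly how the charge-sharing sequence realizes the powers of two in equation (1).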
Finally, the sampling switch S5 in the SAR ADC is opened. Upon completing the integration of all bits to obtain the output voltage V_out, the SAR ADC has also sampled V_out through the shared DAC array, and begins quantizing the analog integrated voltage into N-bit digital form for storage and transmission. During the SAR ADC quantization phase, the gated clock is disabled and switches S1, S2, S3, and S4 are opened to cut off the energy consumption of the integration circuit.
The principles and implementations of the present invention have been described herein with specific examples; the description of the above embodiments is intended only to help understand the method and core idea of the invention. Meanwhile, those of ordinary skill in the art may make changes to the specific implementation and scope of application in accordance with the idea of the invention. In summary, the contents of this specification should not be construed as limiting the invention.

Claims (3)

  1. A neuromorphic computing circuit based on a multi-bit parallel binary synapse array, characterized by comprising: an axon module, a multi-bit parallel binary RRAM synapse array, a time-division multiplexer, multiple integrators, and one shared successive-approximation analog-to-digital converter;
    the input signal from the previous layer of the neural network first enters the axon module, which comprises two basic units: a timing scheduler and an adder; the timing scheduler arranges signal timing so that the input signal is fed sequentially, dendrite-first, into the multi-bit parallel binary RRAM synapse array; the adder is used to expand the array size: when the configured network layer requires more inputs than a single RRAM array provides, the adder sums the computation results of multiple arrays to obtain the output of the network layer;
    the basic unit of the multi-bit parallel binary RRAM synapse array is a 1-transistor-1-RRAM structure, in which the transistor controls the switching behavior: its source terminal is grounded, its drain terminal is connected to one end of the binary RRAM, and the other end of the RRAM is connected to the integrator circuit; in the array, N binary RRAMs represent the distinct strength levels of a synapse in fixed-point form; the transistor gate is connected to the input signal line, the input of the neural network layer also takes the form of an N-bit fixed-point number, and each binary input bit directly serves as the control voltage Vc of the 1-transistor-1-RRAM structure;
    each integrator comprises an integrating op-amp and a switched-capacitor circuit, converting the MAC result of the input signal and the RRAM array weights into an analog integrated voltage;
    the shared successive-approximation analog-to-digital converter quantizes the analog integrated voltage into N-bit digital output data;
    the time-division multiplexer shares the shared successive-approximation analog-to-digital converter and the integrators among all inputs of the network layer, maximizing hardware-resource utilization through timing scheduling.
  2. The neuromorphic computing circuit based on a multi-bit parallel binary synapse array of claim 1, characterized in that the RRAM is modeled using experimental data from a nitrogen-doped aluminum-oxide structure, and each RRAM has two resistance states: a low-resistance state and a high-resistance state.
  3. The neuromorphic computing circuit based on a multi-bit parallel binary synapse array of claim 1, characterized in that the shared successive-approximation analog-to-digital converter adopts a combined structure of a high-precision, high-power ADC and a low-precision, low-power ADC, in which the low-precision, low-power ADC quantizes the upper 4 bits of the result and the high-precision, high-power ADC quantizes the lower 4 bits.
PCT/CN2020/100756 2019-07-08 2020-07-08 Neuromorphic computing circuit based on a multi-bit parallel binary synapse array WO2021004466A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910609991.6 2019-07-08
CN201910609991.6A CN110378475B (zh) 2019-07-08 Neuromorphic computing circuit based on a multi-bit parallel binary synapse array

Publications (1)

Publication Number Publication Date
WO2021004466A1 (zh)

Family

ID=68252416

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/100756 WO2021004466A1 (zh) 2019-07-08 2020-07-08 Neuromorphic computing circuit based on a multi-bit parallel binary synapse array

Country Status (2)

Country Link
CN (1) CN110378475B (zh)
WO (1) WO2021004466A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115278130A (zh) Analog-to-digital conversion circuit and quantization method for an ultra-large-scale image sensor

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378475B (zh) Neuromorphic computing circuit based on a multi-bit parallel binary synapse array
US11663455B2 (en) * 2020-02-12 2023-05-30 Ememory Technology Inc. Resistive random-access memory cell and associated cell array structure
CN111325330B (zh) Synaptic symmetric spike-timing-dependent plasticity algorithm circuit and array structure thereof
CN112070204B (zh) RRAM-based neural network mapping method and accelerator
US11741353B2 (en) 2020-12-09 2023-08-29 International Business Machines Corporation Bias scheme for single-device synaptic element
CN113157034B (zh) High-linearity neuromorphic computing circuit realized with a passive voltage-regulation circuit
CN113222131B (zh) 1T1R-based synapse array circuit capable of realizing signed weight coefficients
WO2024015023A2 (en) * 2022-07-15 2024-01-18 Agency For Science, Technology And Research Neural processing core for a neural network and method of operating thereof
CN117831589B (zh) High-density RRAM-based analog-digital hybrid compute-in-memory array

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107273972A (zh) Neuromorphic system based on resistive switching devices and adaptive-spiking neurons, and implementation method
US20180068722A1 (en) * 2015-08-05 2018-03-08 University Of Rochester Resistive memory accelerator
CN108416432A (zh) Circuit and method of operating a circuit
CN108780492A (zh) Analog co-processor
CN110378475A (zh) Neuromorphic computing circuit based on a multi-bit parallel binary synapse array

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10217046B2 (en) * 2015-06-29 2019-02-26 International Business Machines Corporation Neuromorphic processing devices
CN107194462B (zh) Ternary neural network synapse array and neuromorphic computing network system using the same
US9887351B1 (en) * 2016-09-30 2018-02-06 International Business Machines Corporation Multivalent oxide cap for analog switching resistive memory
CN108830379B (zh) Neuromorphic processor based on parameter quantization sharing
CN109858620B (zh) Brain-inspired computing system


Also Published As

Publication number Publication date
CN110378475B (zh) 2021-08-06
CN110378475A (zh) 2019-10-25

Similar Documents

Publication Publication Date Title
WO2021004466A1 (zh) Neuromorphic computing circuit based on a multi-bit parallel binary synapse array
WO2020238889A1 (zh) Multiply-accumulate circuit based on Radix-4 encoding and differential weights
EP3989445A1 (en) Sub-unit, MAC array, bit-width reconfigurable hybrid analog-digital in-memory computing module
KR20210144417A (ko) Apparatus for performing in-memory processing and computing apparatus including the same
Kim et al. A digital neuromorphic VLSI architecture with memristor crossbar synaptic array for machine learning
Wang et al. Low power convolutional neural networks on a chip
US11531898B2 (en) Training of artificial neural networks
KR102653822B1 (ko) Mixed-signal computing system and method
US20200356848A1 (en) Computing circuitry
CN104300984B (zh) Analog-to-digital converter and analog-to-digital conversion method
CN111144558A (zh) Multi-bit convolution operation module based on time-variable current integration and charge sharing
KR20200103262A (ko) CIM device based on bit-line charge sharing and operating method thereof
Zhang et al. An in-memory-computing DNN achieving 700 TOPS/W and 6 TOPS/mm 2 in 130-nm CMOS
CN109977470B (zh) Circuit for sparse coding with a memristive Hopfield neural network and operation method thereof
KR20220004430A (ko) Apparatus for performing in-memory processing and computing apparatus including the same
Cao et al. NeuADC: Neural network-inspired RRAM-based synthesizable analog-to-digital conversion with reconfigurable quantization support
Liu et al. A 40-nm 202.3 nJ/classification neuromorphic architecture employing in-SRAM charge-domain compute
CN114758699A (zh) Data processing method, system, device, and medium
Xuan et al. High-efficiency data conversion interface for reconfigurable function-in-memory computing
Ogbogu et al. Energy-efficient ReRAM-based ML training via mixed pruning and reconfigurable ADC
CN107835023B (zh) Successive-approximation digital-to-analog converter
Lin et al. A reconfigurable in-SRAM computing architecture for DCNN applications
CN108777580B (zh) Method for controlling SAR ADC level switching using a hybrid capacitor-flipping technique
CN111478704A (zh) Low-power analog-to-digital converter
Jiang et al. Multicore Spiking Neuromorphic Chip in 180-nm with ReRAM Synapses and Digital Neurons

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20837535

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 20837535

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 16/09/2022)
