WO2021114859A1 - Method and device for implementing a Bayesian neural network using memristor intrinsic noise - Google Patents
Method and device for implementing a Bayesian neural network using memristor intrinsic noise
- Publication number
- WO2021114859A1 (PCT/CN2020/121244, CN2020121244W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- memristor
- bayesian
- weight distribution
- network
- neural network
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C13/00—Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00
- G11C13/0002—Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00 using resistive RAM [RRAM] elements
- G11C13/0004—Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00 using resistive RAM [RRAM] elements comprising amorphous/crystalline phase transition cells
Definitions
- This application belongs to the technical field of neural networks, and in particular relates to a method and a device for implementing a Bayesian neural network using the intrinsic noise of memristors.
- DNN: deep neural network
- BayNN: Bayesian neural network
- Bayesian neural networks are widely used in medical diagnosis, recommendation systems, few-shot learning, nonlinear dynamic system control, and attack detection.
- Unlike the fixed weights of standard DNNs, all weight values in a BayNN are represented by random variables. The mean and standard deviation of each probability distribution must be adjustable so that the network can be trained for different scenarios.
- Conventional BayNNs sample the weight distributions with the Markov Chain Monte-Carlo (MCMC) method.
- This application aims to solve, at least to some extent, one of the technical problems in the related art.
- To this end, one object of the present application is to propose a method for implementing a Bayesian neural network using the intrinsic noise of memristors, with low computational power consumption and high speed.
- Another object of the present application is to propose a device for implementing a Bayesian neural network using the intrinsic noise of memristors.
- To this end, an embodiment of the present application proposes a method for implementing a Bayesian neural network using the intrinsic noise of memristors, which includes:
- S1: obtaining a Bayesian network, and training the Bayesian network on a selected data set to obtain a weight distribution of the Bayesian network;
- S2: processing the weight distribution of the Bayesian network, calculating target conductance values from the processed weight distribution and the conductances of multiple memristors, and mapping the target conductance values onto the memristors.
- The method of the embodiments of the present application maps the weight distribution of the Bayesian neural network onto memristors and uses a memristor crossbar array to perform distribution sampling and matrix-vector multiplication. The intrinsic noise of the memristors during reading realizes the sampling of random variables, so the Bayesian neural network implemented on the crossbar array has low power consumption and high speed.
- In addition, the method for implementing a Bayesian neural network using memristor intrinsic noise may also have the following additional technical features:
- When prediction is performed through the Bayesian network, the input sequence is applied as READ voltage pulses to the bit lines of the mapped memristors, the output currents flowing from the source lines of the mapped memristors are collected, and the output currents are processed to obtain the prediction result.
- Computing the weight distribution of the Bayesian network includes:
- computing over the selected data set with a variational method to obtain the weight distribution of the Bayesian network.
- Processing the weight distribution of the Bayesian network includes:
- biasing and scaling the weight distribution so that it fits within the conductance window of the memristors.
- Calculating the target conductance values from the processed weight distribution and the conductances of multiple memristors includes: denoting the processed weight distribution as (μ, σ²), where μ is the mean and σ is the standard deviation, and the conductances of the N memristors as G_n^target (n = 1, 2, …, N); requiring sum(G_n^target) = μ (1) and sum[σ(G_n^target)²] = σ² (2); and solving equations (1) and (2) to obtain the target conductance values.
- When mapping the target conductance values onto the memristors, the method further includes: measuring the conductance of each memristor and verifying whether the difference between the measured conductance and the target conductance is below an error threshold; if so, the verification passes; otherwise, tuning SET/RESET pulses are applied until verification passes or the maximum number of verification attempts is reached.
- The Bayesian network includes but is not limited to fully connected and convolutional neural network structures, and each weight distribution of the Bayesian network is independent of the others.
- Another embodiment of the present application proposes a device for implementing a Bayesian neural network using the intrinsic noise of memristors, including:
- a training module configured to obtain a Bayesian network and train it on a selected data set to obtain the weight distribution of the Bayesian network;
- a mapping module configured to process the weight distribution of the Bayesian network, calculate target conductance values from the processed weight distribution and the conductances of multiple memristors, and map the target conductance values onto the memristors.
- The device of the embodiments of the present application maps the weight distribution of the Bayesian neural network onto memristors and uses a memristor crossbar array to perform distribution sampling and matrix-vector multiplication. The intrinsic noise of the memristors during reading realizes the sampling of random variables, so the Bayesian neural network implemented on the crossbar array has low power consumption and high speed.
- In addition, the device for implementing a Bayesian neural network using memristor intrinsic noise may also have the following additional technical features:
- Further, the device also includes:
- a prediction module configured to, when prediction is performed through the Bayesian network, apply the input sequence as READ voltage pulses to the bit lines of the mapped memristors, collect the output currents flowing from the source lines of the mapped memristors, and process the output currents to obtain the prediction result.
- The mapping module is specifically configured to: bias and scale the weight distribution; denote the processed weight distribution as (μ, σ²), where μ is the mean and σ is the standard deviation, and the conductances of the N memristors as G_n^target (n = 1, 2, …, N); require sum(G_n^target) = μ (1) and sum[σ(G_n^target)²] = σ² (2); and solve equations (1) and (2) to obtain the target conductance values.
- Figure 1 is a schematic diagram of the Bayesian network calculation process based on the MCMC sampling method
- Fig. 2 is a flowchart of a method for implementing Bayesian neural network by using intrinsic noise of a memristor according to an embodiment of the present application;
- Fig. 3 is an architecture diagram of a Bayesian neural network system based on a memristor according to an embodiment of the present application
- Fig. 4 is a process diagram of calculating and writing a target conductance value according to an embodiment of the present application
- FIG. 5 is a schematic diagram of mapping a Bayesian network to a memristor array according to an embodiment of the present application
- FIG. 6 is a schematic diagram showing that the total current output by multiple memristors during READ follows an approximately Gaussian distribution, according to an embodiment of the present application
- Fig. 7 is a schematic structural diagram of an apparatus for implementing a Bayesian neural network using intrinsic noise of a memristor according to an embodiment of the present application.
- Fig. 2 is a flowchart of a method for implementing a Bayesian neural network using intrinsic noise of a memristor according to an embodiment of the present application.
- The method for implementing a Bayesian neural network using memristor intrinsic noise includes the following steps:
- Step S1: obtain a Bayesian network, and train it on a selected data set to obtain the weight distribution of the Bayesian network.
- FIG. 3 is a diagram of the architecture of a Bayesian neural network system based on a memristor according to an embodiment of the present application.
- The structure of the Bayesian neural network includes but is not limited to fully connected and CNN structures, but the network weights are random variables.
- After training, the weights of a fully connected network or CNN are fixed values, whereas each weight of a Bayesian neural network is a distribution.
- Each weight in the Bayesian network is a distribution, such as a Gaussian or Laplace distribution.
- Each weight is distributed independently of the others.
- Further, the Bayesian network is trained offline: on a computer, the distributions of the weights in the Bayesian neural network are computed for the selected data set using a variational method.
- Step S2 processing the weight distribution of the Bayesian network, calculating according to the processed weight distribution and the conductance of the multiple memristors, to obtain the target conductance value, and map the target conductance value to the memristor.
- The weight distributions are biased and scaled until they fit within a suitable conductance window.
- When biasing and scaling, all weights of the same layer receive the same bias and scaling.
- In subsequent neural network prediction, the bias and scaling must be removed.
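As an illustration, the bias-and-scale step and its removal at prediction time can be sketched as follows. The conductance window, read voltage, and per-layer affine map here are all hypothetical, not values from this application:

```python
import numpy as np

def to_conductance(w_mean, g_min=2e-6, g_max=20e-6):
    """Affine-map a layer's weight means into an assumed memristor
    conductance window [g_min, g_max] (siemens). The same bias and scale
    are used for the whole layer; both are returned so they can be
    removed again after the read."""
    lo, hi = w_mean.min(), w_mean.max()
    scale = (g_max - g_min) / (hi - lo)
    bias = g_min - lo * scale
    return w_mean * scale + bias, bias, scale

def from_current(i_out, v_read, bias, scale, x_sum):
    """Undo bias/scale on the raw crossbar output current.
    i_out = v_read * sum_j g_j x_j with g_j = scale*w_j + bias, so the
    true dot product is (i_out/v_read - bias*x_sum)/scale."""
    return (i_out / v_read - bias * x_sum) / scale

w = np.array([-0.8, 0.1, 0.5, 1.2])
g, b, s = to_conductance(w)
x = np.array([1.0, 0.5, -0.3, 2.0])
i = 0.2 * (g @ x)                            # ideal noise-free read at 0.2 V
recovered = from_current(i, 0.2, b, s, x.sum())
print(recovered)                             # matches w @ x
```

The key point is that bias and scale must be tracked per layer, since removing them after the read requires the sum of the inputs as well as the scale factor.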
- The target conductance values are calculated from the processed weight distribution and the memristor conductances, and the calculated target conductance values are mapped onto the memristor array.
- When prediction is performed, the input sequence is applied as READ voltage pulses to the bit lines of the mapped memristors, the output currents flowing from the source lines are collected, and the output currents are processed to obtain the prediction result.
- The input sequence is applied as READ voltage pulses to the BL (bit line), and the output current flowing from the SL (source line) is then collected for further processing.
- When N memristors are read and N is fairly large, the total output current follows an approximately Gaussian distribution.
- The total output current over all voltage pulses is the result of multiplying the input vector by the matrix of sampled weight values.
- In a memristor crossbar array, such a single parallel read operation is therefore equivalent to the two operations of sampling and vector-matrix multiplication.
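A minimal numerical sketch of this "one read = one sample plus one VMM" idea. The conductance values, the 5% relative read noise, and the Gaussian noise model are all assumptions for illustration, not device data from this application:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical programmed crossbar: mean conductances G_mu and a per-device
# intrinsic read-noise standard deviation G_sigma (assumed ~5% of G_mu).
G_mu = rng.uniform(2e-6, 20e-6, size=(4, 3))      # 4 inputs x 3 outputs
G_sigma = 0.05 * G_mu

def crossbar_read(x, v_read=0.2):
    """One parallel READ: every device contributes its conductance plus
    intrinsic noise, so collecting the source-line currents performs a
    weight *sample* and the vector-matrix multiply in a single step."""
    G_sample = G_mu + G_sigma * rng.standard_normal(G_mu.shape)
    return v_read * (x @ G_sample)                # source-line currents

x = np.array([1.0, 0.3, -0.5, 0.8])
print(crossbar_read(x))   # differs on every call: each read is a fresh sample
```

Averaging many reads recovers the deterministic product `v_read * (x @ G_mu)`, while the spread of individual reads carries the weight uncertainty.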
- Further, the method also includes: measuring the conductance of the memristor and verifying whether the difference between the measured conductance and the target conductance is below an error threshold; if so, the verification passes; otherwise, tuning SET/RESET pulses are applied to the memristor until verification passes or the maximum number of verification attempts is reached.
- Specifically, to reduce the impact of fluctuation and nonlinearity when tuning the memristor conductance and to ensure effective updates, the device conductance G is first measured and its difference from the target conductance is checked against the error threshold ε. If the threshold is met, verification passes; otherwise, tuning SET/RESET pulses are applied to the corresponding memristors, and the measure-and-tune cycle repeats until verification passes or the maximum number of verification attempts is reached.
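A sketch of such a program-and-verify loop. The device model here is deliberately simplified and hypothetical: each tuning pulse is assumed to move the conductance roughly halfway toward the target, with random fluctuation standing in for the nonlinearity of real SET/RESET operations:

```python
import numpy as np

rng = np.random.default_rng(7)

def program_with_verify(g_target, eps=1e-7, max_tries=20):
    """Hypothetical write-verify loop: read back the conductance, compare
    against the error threshold eps, and apply another tuning SET/RESET
    pulse only if the target has not yet been reached."""
    g = 1e-6                                   # assumed starting conductance
    for attempt in range(1, max_tries + 1):
        if abs(g - g_target) < eps:            # measured error within eps
            return g, attempt                  # verification passed
        step = 0.5 * (g_target - g)            # SET if too low, RESET if too high
        g += step * (1 + 0.2 * rng.standard_normal())  # noisy, nonlinear update
        g = max(g, 0.0)
    return g, max_tries                        # gave up at max verify count

g, n = program_with_verify(8e-6)
print(f"reached {g:.2e} S after {n} verification attempts")
```

Because each pulse only partially and noisily moves the conductance, the closed verify loop is what guarantees the written value lands within ε of the target.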
- The weight distribution of the Bayesian neural network is mapped onto memristors; the memristor crossbar array performs distribution sampling and matrix-vector multiplication, the intrinsic read noise of the memristors realizes the sampling of random variables, and the Bayesian neural network implemented on the crossbar array has low power consumption and high speed.
- Fig. 7 is a schematic structural diagram of an apparatus for implementing a Bayesian neural network using intrinsic noise of a memristor according to an embodiment of the present application.
- The device for implementing a Bayesian neural network using the intrinsic noise of memristors includes: a training module 100 and a mapping module 200.
- The training module 100 is configured to obtain a Bayesian network and train it on a selected data set to obtain the weight distribution of the Bayesian network.
- The mapping module 200 is configured to process the weight distribution of the Bayesian network, calculate target conductance values from the processed weight distribution and the conductances of multiple memristors, and map the target conductance values onto the memristors.
- The device also includes a prediction module configured to, when prediction is performed through the Bayesian network, apply the input sequence as READ voltage pulses to the bit lines of the mapped memristors, collect the output currents flowing from the source lines, and process the output currents to obtain the prediction result.
- Computing the weight distribution of the Bayesian network includes: computing over the selected data set with a variational method to obtain the weight distribution of the Bayesian network.
- Processing the weight distribution of the Bayesian network includes: biasing and scaling the weight distribution so that it fits within the conductance window of the memristors.
- Calculating the target conductance values from the processed weight distribution and the conductances of multiple memristors includes: denoting the processed weight distribution as (μ, σ²), where μ is the mean and σ is the standard deviation, and the conductances of the N memristors as G_n^target (n = 1, 2, …, N); requiring sum(G_n^target) = μ (1) and sum[σ(G_n^target)²] = σ² (2); and solving equations (1) and (2).
- When mapping the target conductance values onto the memristors, the device further: measures the conductance of each memristor and applies tuning SET/RESET pulses until the difference from the target conductance falls below the error threshold or the maximum number of verification attempts is reached.
- The weight distribution of the Bayesian neural network is mapped onto memristors; the memristor crossbar array performs distribution sampling and matrix-vector multiplication, the intrinsic read noise of the memristors realizes the sampling of random variables, and the Bayesian neural network implemented on the crossbar array has low power consumption and high speed.
- The terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, features defined with "first" and "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality of" means at least two, such as two or three, unless specifically defined otherwise.
- The terms "mounted", "connected", "coupled", "fixed", and the like should be understood broadly; for example, a connection may be fixed, detachable, or integral; mechanical or electrical; direct or indirect through an intermediary; and may be internal communication between two elements or an interactive relationship between two elements, unless otherwise expressly limited.
- A first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or in indirect contact through an intermediary.
- A first feature being "over", "above", or "on top of" a second feature may mean that the first feature is directly or obliquely above the second feature, or simply that the first feature is at a higher level than the second feature.
- A first feature being "under", "below", or "beneath" a second feature may mean that the first feature is directly or obliquely below the second feature, or simply that the first feature is at a lower level than the second feature.
Abstract
The present application discloses a method and device for implementing a Bayesian neural network using memristor intrinsic noise. The method includes: obtaining a Bayesian network, and training it on a selected data set to obtain the weight distribution of the Bayesian network; processing the weight distribution, calculating target conductance values from the processed weight distribution and the conductances of multiple memristors, and mapping the target conductance values onto the memristors. The method implements the Bayesian neural network with a memristor crossbar array, achieving low power consumption, fast computation, and high energy efficiency.
Description
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 201911251361.2, filed by Tsinghua University on December 9, 2019, and entitled "Method and device for implementing a Bayesian neural network using memristor intrinsic noise".
This application belongs to the technical field of neural networks, and in particular relates to a method and a device for implementing a Bayesian neural network using the intrinsic noise of memristors.
In the field of artificial intelligence, deep neural networks (DNNs) have developed rapidly in recent years and have achieved remarkable results in image and vision computing, speech and language processing, information security, board-game playing, and other areas. However, ordinary DNNs are vulnerable to attacks: in image classification, for example, a small perturbation imperceptible to the human eye added to the input image can cause a DNN to produce a wrong yet overconfident classification, because DNNs cannot capture the uncertainty in their predictions and models. Such perturbed inputs, known as adversarial examples, are a major obstacle to using DNNs in safety-critical applications. Bayesian neural networks (BayNNs), on the other hand, can detect adversarial examples by evaluating the uncertainty of their predictions. Owing to this advantage, Bayesian neural networks are widely used in medical diagnosis, recommendation systems, few-shot learning, nonlinear dynamic system control, attack detection, and other fields. Unlike the fixed weights in standard DNNs, all weight values in a BayNN are represented by random variables. The mean and standard deviation of each probability distribution must be adjustable so that the network can be trained for different scenarios. However, BayNNs sample the weight distributions with the Markov Chain Monte-Carlo (MCMC) method.
Because the Markov Chain Monte-Carlo method requires a large amount of sampling computation, it depends heavily on computer speed, and BayNNs therefore incur a high computational cost on traditional hardware platforms. During neural network prediction, as shown in FIG. 1, the weights must first be sampled with MCMC to obtain a weight matrix Wsample, and the input X is then multiplied with Wsample in a vector-matrix multiplication (VMM). This usually brings a high computational cost and has become a major limitation on the application of Bayesian neural networks.
Summary of the Invention
The present application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one object of the present application is to propose a method for implementing a Bayesian neural network using memristor intrinsic noise, with low computational power consumption and high speed.
Another object of the present application is to propose a device for implementing a Bayesian neural network using memristor intrinsic noise.
To achieve the above objects, an embodiment of one aspect of the present application proposes a method for implementing a Bayesian neural network using memristor intrinsic noise, including:
S1: obtaining a Bayesian network, and training the Bayesian network on a selected data set to obtain the weight distribution of the Bayesian network;
S2: processing the weight distribution of the Bayesian network, calculating target conductance values from the processed weight distribution and the conductances of multiple memristors, and mapping the target conductance values onto the memristors.
According to the method of the embodiments of the present application, the weight distribution of the Bayesian neural network is mapped onto memristors; the memristor crossbar array performs distribution sampling and matrix-vector multiplication, the intrinsic read noise of the memristors realizes the sampling of random variables, and the Bayesian neural network implemented on the crossbar array has low power consumption and high speed.
In addition, the method for implementing a Bayesian neural network using memristor intrinsic noise according to the above embodiments of the present application may also have the following additional technical features:
Further, in one embodiment of the present application, after S2 the method also includes:
when prediction is performed through the Bayesian network, applying the input sequence as READ voltage pulses to the bit lines of the mapped memristors, collecting the output currents flowing from the source lines of the mapped memristors, and processing the output currents to obtain the prediction result.
Further, in one embodiment of the present application, computing the weight distribution of the Bayesian network includes:
computing over the selected data set with a variational method to obtain the weight distribution of the Bayesian network.
Further, in one embodiment of the present application, processing the weight distribution of the Bayesian network includes:
biasing and scaling the weight distribution so that it fits within the conductance window of the memristors.
Further, in one embodiment of the present application, calculating the target conductance values from the processed weight distribution and the conductances of multiple memristors includes:
denoting the processed weight distribution of the Bayesian network as (μ, σ²), where μ is the mean and σ is the standard deviation, and the conductances of the N memristors as G_n^target (n = 1, 2, …, N);
making the processed weight distribution and the memristor conductances satisfy equations (1) and (2):
sum(G_n^target) = μ    (1)
sum[σ(G_n^target)²] = σ²    (2)
solving equations (1) and (2) to obtain the target conductance values.
Further, in one embodiment of the present application, when mapping the target conductance values onto the memristors, the method also includes:
measuring the conductance of the memristor and verifying whether the difference between the measured conductance and the target conductance is below an error threshold; if so, the verification passes; otherwise, tuning SET/RESET pulses are applied to the memristor until verification passes or the maximum number of verification attempts is reached.
Further, in one embodiment of the present application, the Bayesian network includes but is not limited to fully connected and convolutional neural network structures, and each weight distribution of the Bayesian network is independent of the others.
To achieve the above objects, an embodiment of another aspect of the present application proposes a device for implementing a Bayesian neural network using memristor intrinsic noise, including:
a training module configured to obtain a Bayesian network and train it on a selected data set to obtain the weight distribution of the Bayesian network;
a mapping module configured to process the weight distribution of the Bayesian network, calculate target conductance values from the processed weight distribution and the conductances of multiple memristors, and map the target conductance values onto the memristors.
According to the device of the embodiments of the present application, the weight distribution of the Bayesian neural network is mapped onto memristors; the memristor crossbar array performs distribution sampling and matrix-vector multiplication, the intrinsic read noise of the memristors realizes the sampling of random variables, and the Bayesian neural network implemented on the crossbar array has low power consumption and high speed.
In addition, the device for implementing a Bayesian neural network using memristor intrinsic noise according to the above embodiments of the present application may also have the following additional technical features:
Further, in one embodiment of the present application, the device also includes:
a prediction module configured to, when prediction is performed through the Bayesian network, apply the input sequence as READ voltage pulses to the bit lines of the mapped memristors, collect the output currents flowing from the source lines, and process the output currents to obtain the prediction result.
Further, in one embodiment of the present application, the mapping module is specifically configured to:
bias and scale the weight distribution, the processed weight distribution of the Bayesian network being (μ, σ²), where μ is the mean and σ is the standard deviation, and the conductances of the N memristors being G_n^target (n = 1, 2, …, N);
make the processed weight distribution and the memristor conductances satisfy equations (1) and (2):
sum(G_n^target) = μ    (1)
sum[σ(G_n^target)²] = σ²    (2)
solve equations (1) and (2) to obtain the target conductance values.
Additional aspects and advantages of the present application will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the present application.
The above and/or additional aspects and advantages of the present application will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic diagram of the Bayesian network computation process based on the MCMC sampling method;
FIG. 2 is a flowchart of a method for implementing a Bayesian neural network using memristor intrinsic noise according to an embodiment of the present application;
FIG. 3 is an architecture diagram of a memristor-based Bayesian neural network system according to an embodiment of the present application;
FIG. 4 is a diagram of the target conductance calculation and write process according to an embodiment of the present application;
FIG. 5 is a schematic diagram of mapping a Bayesian network onto a memristor array according to an embodiment of the present application;
FIG. 6 is a schematic diagram showing that the total current output by multiple memristors during READ follows an approximately Gaussian distribution, according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a device for implementing a Bayesian neural network using memristor intrinsic noise according to an embodiment of the present application.
Embodiments of the present application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, where identical or similar reference numerals throughout denote identical or similar elements or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present application; they should not be construed as limiting it.
The method and device for implementing a Bayesian neural network using memristor intrinsic noise according to embodiments of the present application are described below with reference to the accompanying drawings.
First, the method for implementing a Bayesian neural network using memristor intrinsic noise according to embodiments of the present application will be described with reference to the drawings.
FIG. 2 is a flowchart of a method for implementing a Bayesian neural network using memristor intrinsic noise according to an embodiment of the present application.
As shown in FIG. 2, the method includes the following steps:
Step S1: obtain a Bayesian network, and train it on a selected data set to obtain the weight distribution of the Bayesian network.
FIG. 3 is an architecture diagram of a memristor-based Bayesian neural network system according to an embodiment of the present application. The structure of the Bayesian neural network includes but is not limited to fully connected and CNN structures, but the network weights are random variables. After training, the weights of a fully connected network or CNN are fixed values, whereas every weight of a Bayesian neural network is a distribution. As shown in FIG. 3, each weight in the Bayesian network is a distribution, such as a Gaussian or Laplace distribution.
In the Bayesian neural networks addressed by the embodiments of the present application, the weights are distributed independently of one another.
Further, the Bayesian network is trained offline: on a computer, the distributions of the weights in the Bayesian neural network are computed for the selected data set using a variational method.
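As a toy illustration of such offline variational training (a generic Bayes-by-backprop-style sketch, not the specific procedure of this application), a single Gaussian weight q(w) = N(μ, σ²) can be fitted to a linear-regression task with the reparameterization trick; the data, prior, learning rate, and noise level below are all assumed:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data set: y = 2x + small noise.
x = rng.normal(size=200)
y = 2.0 * x + 0.1 * rng.normal(size=200)

mu, rho = 0.0, -3.0          # sigma = log(1 + exp(rho)) keeps sigma positive
lr, noise_var = 0.005, 0.01

for _ in range(3000):
    eps = rng.standard_normal()
    sigma = np.log1p(np.exp(rho))
    w = mu + sigma * eps                         # reparameterized weight sample
    d_nll_dw = np.sum((w * x - y) * x) / noise_var
    d_kl_dmu = mu                                # KL(q || N(0,1)) gradients
    d_kl_dsigma = sigma - 1.0 / sigma
    d_sigma_drho = 1.0 / (1.0 + np.exp(-rho))    # d(softplus)/d(rho)
    mu -= lr * (d_nll_dw + d_kl_dmu) / len(x)
    rho -= lr * (d_nll_dw * eps + d_kl_dsigma) * d_sigma_drho / len(x)

print(mu, np.log1p(np.exp(rho)))   # mu approaches the true slope of 2
```

The trained (μ, σ) pair per weight is exactly the quantity that the following steps map onto memristor conductances.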
Different data sets are selected according to the purpose to be achieved, and the Bayesian network is trained on them to obtain its weight distribution.
Step S2: process the weight distribution of the Bayesian network, calculate target conductance values from the processed weight distribution and the conductances of multiple memristors, and map the target conductance values onto the memristors.
After offline training of the BayNN yields the weight distributions, each distribution is biased and scaled until it fits within a suitable conductance window; when biasing and scaling, all weights of the same layer receive the same bias and scaling. The bias and scaling must be removed in subsequent neural network prediction.
After the weight distributions are biased and scaled, the target conductance values are calculated from the processed distributions and the memristor conductance values, and the calculated target conductance values are mapped onto the memristor array.
As shown in FIG. 4, suppose a weight obtained after biasing and scaling is (μ, σ²), where μ is the mean and σ is the standard deviation, and the conductances of the N memristors are G_n^target (n = 1, 2, …, N). To realize the desired distributed conductance weight (μ, σ²) with N memristors on the array, the conductances G_n^target (n = 1, 2, …, N) must satisfy the equations sum(G_n^target) = μ and sum[σ(G_n^target)²] = σ². Solving this underdetermined system yields a set of target conductance values G_n^target (n = 1, 2, …, N), which are then written into the memristors.
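For a concrete special case (hypothetical, for illustration only), assume each device's read-noise standard deviation is proportional to its conductance, σ(G) = k·G, and use N = 2 devices; equations (1) and (2) then reduce to a quadratic that can be solved in closed form:

```python
import numpy as np

def split_weight(mu, sigma, k=0.05):
    """Solve sum(G_n) = mu and sum((k*G_n)**2) = sigma**2 for N = 2 devices,
    assuming (hypothetically) each device's read-noise std is k*G.
    G1 + G2 = mu and G1**2 + G2**2 = (sigma/k)**2 give a quadratic in G1."""
    q = (sigma / k) ** 2                 # required sum of G_n**2
    disc = 2 * q - mu ** 2               # discriminant of the quadratic
    if disc < 0:
        raise ValueError("sigma too small for this mu with N=2 devices")
    r = np.sqrt(disc) / 2
    g = np.array([mu / 2 + r, mu / 2 - r])
    # Note: a physically valid solution also needs both conductances >= 0.
    return g

mu, sigma = 10e-6, 0.4e-6
g = split_weight(mu, sigma)
assert np.isclose(g.sum(), mu)
assert np.isclose(np.sum((0.05 * g) ** 2), sigma ** 2)
print(g)
```

For N > 2 the system is underdetermined, which matches the text: any solution satisfying both sums (and non-negativity) is an acceptable set of target conductances.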
FIG. 5 shows the process of mapping weight distributions onto a memristor array, in which N memristors together serve as one weight between two layers of the network. The weight distributions in the Bayesian network are converted into conductance values and mapped onto the cross sequences of the memristor array.
In prediction with a conventional Bayesian neural network, all weights of a layer are first sampled to obtain a matrix of sampled weight values, and the input vector is then multiplied by this matrix, which consumes much power and is slow.
In the embodiments of the present application, when prediction is performed through the Bayesian network, the input sequence is applied as READ voltage pulses to the bit lines of the mapped memristors, the output currents flowing from the source lines of the mapped memristors are collected, and the output currents are processed to obtain the prediction result.
It can be understood that, as shown in FIG. 6, the input sequence is applied as READ voltage pulses to the BL (bit line), and the output current flowing from the SL (source line) is then collected for further processing. When N memristors are read and N is fairly large, the total output current follows an approximately Gaussian distribution. The total output current over all voltage pulses is the result of multiplying the input vector by the matrix of sampled weight values. In a memristor crossbar array, such a single parallel read operation is equivalent to the two operations of sampling and vector-matrix multiplication.
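The near-Gaussian shape of the summed read current (the behavior sketched in FIG. 6) can be illustrated numerically; the device count, the 3% relative noise level, and the 0.2 V read voltage below are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# N devices jointly representing one weight; the per-read fluctuation of
# each device is assumed to be ~3% of its conductance.
N = 16
g = rng.uniform(2e-6, 20e-6, size=N)

reads = []
for _ in range(50_000):
    noisy = g * (1 + 0.03 * rng.standard_normal(N))
    reads.append(0.2 * noisy.sum())          # total current at a 0.2 V READ
reads = np.array(reads)

# The summed current clusters around v*sum(g) with a near-Gaussian spread
# (a sum of independent fluctuations), which is what lets a single READ
# act as one sample of the stored weight.
print(reads.mean(), reads.std())
```

The mean of the reads recovers the programmed sum of conductances, while the spread scales as the root-sum-square of the individual device noise levels.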
Further, the embodiments of the present application also include: measuring the conductance of the memristor and verifying whether the difference between the measured conductance and the target conductance is below an error threshold; if so, the verification passes; otherwise, tuning SET/RESET pulses are applied to the memristor until verification passes or the maximum number of verification attempts is reached.
It can be understood that, when writing the target conductance values into the memristors, to reduce the impact of fluctuation and nonlinearity in conductance tuning and to ensure effective updates, the memristor conductance G is first measured and its difference from the target conductance is checked against the error threshold ε. If the threshold is met, verification passes; otherwise, tuning SET/RESET pulses are applied to the corresponding memristors, and the cycle repeats until verification passes or the maximum number of verification attempts is reached.
According to the method for implementing a Bayesian neural network using memristor intrinsic noise proposed in the embodiments of the present application, the weight distribution of the Bayesian neural network is mapped onto memristors; the memristor crossbar array performs distribution sampling and matrix-vector multiplication, the intrinsic read noise of the memristors realizes the sampling of random variables, and the Bayesian neural network implemented on the crossbar array has low power consumption and high speed.
Next, the device for implementing a Bayesian neural network using memristor intrinsic noise according to embodiments of the present application is described with reference to the drawings.
FIG. 7 is a schematic structural diagram of a device for implementing a Bayesian neural network using memristor intrinsic noise according to an embodiment of the present application.
As shown in FIG. 7, the device includes: a training module 100 and a mapping module 200.
The training module 100 is configured to obtain a Bayesian network and train it on a selected data set to obtain the weight distribution of the Bayesian network.
The mapping module 200 is configured to process the weight distribution of the Bayesian network, calculate target conductance values from the processed weight distribution and the conductances of multiple memristors, and map the target conductance values onto the memristors.
Further, in one embodiment of the present application, the device also includes a prediction module configured to, when prediction is performed through the Bayesian network, apply the input sequence as READ voltage pulses to the bit lines of the mapped memristors, collect the output currents flowing from the source lines, and process the output currents to obtain the prediction result.
Further, in one embodiment of the present application, computing the weight distribution of the Bayesian network includes:
computing over the selected data set with a variational method to obtain the weight distribution of the Bayesian network.
Further, in one embodiment of the present application, processing the weight distribution of the Bayesian network includes:
biasing and scaling the weight distribution so that it fits within the conductance window of the memristors.
Further, in one embodiment of the present application, calculating the target conductance values from the processed weight distribution and the conductances of multiple memristors includes:
denoting the processed weight distribution of the Bayesian network as (μ, σ²), where μ is the mean and σ is the standard deviation, and the conductances of the N memristors as G_n^target (n = 1, 2, …, N);
making the processed weight distribution and the memristor conductances satisfy equations (1) and (2):
sum(G_n^target) = μ    (1)
sum[σ(G_n^target)²] = σ²    (2)
solving equations (1) and (2) to obtain the target conductance values.
Further, in one embodiment of the present application, when mapping the target conductance values onto the memristors, the device is also configured to:
measure the conductance of the memristor and verify whether the difference between the measured conductance and the target conductance is below an error threshold; if so, verification passes; otherwise, apply tuning SET/RESET pulses to the memristor until verification passes or the maximum number of verification attempts is reached.
It should be noted that the foregoing explanation of the method embodiments also applies to the device of this embodiment and is not repeated here.
According to the device for implementing a Bayesian neural network using memristor intrinsic noise proposed in the embodiments of the present application, the weight distribution of the Bayesian neural network is mapped onto memristors; the memristor crossbar array performs distribution sampling and matrix-vector multiplication, the intrinsic read noise of the memristors realizes the sampling of random variables, and the Bayesian neural network implemented on the crossbar array has low power consumption and high speed.
In the description of the present application, it should be understood that terms indicating orientation or positional relationships, such as "center", "longitudinal", "transverse", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", and "circumferential", are based on the orientations or positional relationships shown in the drawings; they are used only to facilitate and simplify the description and do not indicate or imply that the referenced device or element must have a particular orientation or be constructed and operated in a particular orientation, and therefore cannot be construed as limiting the present application.
In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, features defined with "first" and "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality of" means at least two, such as two or three, unless specifically defined otherwise.
In the present application, unless otherwise expressly specified and limited, the terms "mounted", "connected", "coupled", "fixed", and the like should be understood broadly; for example, a connection may be fixed, detachable, or integral; mechanical or electrical; direct or indirect through an intermediary; and may be internal communication between two elements or an interactive relationship between two elements. For those of ordinary skill in the art, the specific meanings of the above terms in the present application can be understood according to the specific circumstances.
In the present application, unless otherwise expressly specified and limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or in indirect contact through an intermediary. Moreover, a first feature being "over", "above", or "on top of" a second feature may mean that the first feature is directly or obliquely above the second feature, or simply that the first feature is at a higher level than the second feature; a first feature being "under", "below", or "beneath" a second feature may mean that the first feature is directly or obliquely below the second feature, or simply that the first feature is at a lower level than the second feature.
In the description of this specification, references to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" mean that specific features, structures, materials, or characteristics described in connection with the embodiment or example are included in at least one embodiment or example of the present application. In this specification, schematic expressions of these terms do not necessarily refer to the same embodiment or example, and the described specific features, structures, materials, or characteristics may be combined in a suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine different embodiments or examples, and features of different embodiments or examples, described in this specification, provided they do not contradict each other.
Although embodiments of the present application have been shown and described above, it can be understood that they are exemplary and shall not be construed as limiting the present application; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.
Claims (10)
- A method for implementing a Bayesian neural network using memristor intrinsic noise, characterized by comprising the following steps: S1, obtaining a Bayesian network, and training the Bayesian network on a selected data set to obtain a weight distribution of the Bayesian network; S2, processing the weight distribution of the Bayesian network, calculating target conductance values from the processed weight distribution and the conductances of multiple memristors, and mapping the target conductance values onto the memristors.
- The method according to claim 1, characterized in that, after S2, the method further comprises: when prediction is performed through the Bayesian network, applying the input sequence as READ voltage pulses to the bit lines of the mapped memristors, collecting the output currents flowing from the source lines of the mapped memristors, and processing the output currents to obtain the prediction result.
- The method according to claim 1, characterized in that computing the weight distribution of the Bayesian network comprises: computing over the selected data set with a variational method to obtain the weight distribution of the Bayesian network.
- The method according to claim 1, characterized in that processing the weight distribution of the Bayesian network comprises: biasing and scaling the weight distribution so that it fits within the conductance window of the memristors.
- The method according to claim 1, characterized in that calculating the target conductance values from the processed weight distribution and the conductances of multiple memristors comprises: the processed weight distribution of the Bayesian network being (μ, σ²), where μ is the mean and σ is the standard deviation, and the conductances of the N memristors being G_n^target (n = 1, 2, …, N); making the processed weight distribution and the memristor conductances satisfy equations (1) and (2): sum(G_n^target) = μ (1); sum[σ(G_n^target)²] = σ² (2); and solving equations (1) and (2) to obtain the target conductance values.
- The method according to claim 1, characterized in that, when mapping the target conductance values onto the memristors, the method further comprises: measuring the conductance of the memristor and verifying whether the difference between the measured conductance and the target conductance is below an error threshold; if so, the verification passes; otherwise, tuning SET/RESET pulses are applied to the memristor until verification passes or the maximum number of verification attempts is reached.
- The method according to claim 1, characterized in that the Bayesian network includes but is not limited to fully connected and convolutional neural network structures, and each weight distribution of the Bayesian network is independent of the others.
- A device for implementing a Bayesian neural network using memristor intrinsic noise, characterized by comprising: a training module configured to obtain a Bayesian network and train it on a selected data set to obtain the weight distribution of the Bayesian network; a mapping module configured to process the weight distribution of the Bayesian network, calculate target conductance values from the processed weight distribution and the conductances of multiple memristors, and map the target conductance values onto the memristors.
- The device according to claim 8, characterized by further comprising: a prediction module configured to, when prediction is performed through the Bayesian network, apply the input sequence as READ voltage pulses to the bit lines of the mapped memristors, collect the output currents flowing from the source lines, and process the output currents to obtain the prediction result.
- The device according to claim 8, characterized in that the mapping module is specifically configured to: bias and scale the weight distribution, the processed weight distribution of the Bayesian network being (μ, σ²), where μ is the mean and σ is the standard deviation, and the conductances of the N memristors being G_n^target (n = 1, 2, …, N); make the processed weight distribution and the memristor conductances satisfy equations (1) and (2): sum(G_n^target) = μ (1); sum[σ(G_n^target)²] = σ² (2); and solve equations (1) and (2) to obtain the target conductance values.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911251361.2 | 2019-12-09 | ||
CN201911251361.2A CN110956256B (zh) | 2019-12-09 | 2019-12-09 | 利用忆阻器本征噪声实现贝叶斯神经网络的方法及装置 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021114859A1 true WO2021114859A1 (zh) | 2021-06-17 |
Family
ID=69980472
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/121244 WO2021114859A1 (zh) | 2019-12-09 | 2020-10-15 | 利用忆阻器本征噪声实现贝叶斯神经网络的方法及装置 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110956256B (zh) |
WO (1) | WO2021114859A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113610220A (zh) * | 2021-08-27 | 2021-11-05 | 中国人民解放军国防科技大学 | 神经网络模型的训练方法、应用方法及装置 |
WO2023217017A1 (zh) * | 2022-05-09 | 2023-11-16 | 清华大学 | 基于忆阻器阵列的贝叶斯神经网络的变分推理方法和装置 |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110956256B (zh) * | 2019-12-09 | 2022-05-17 | 清华大学 | 利用忆阻器本征噪声实现贝叶斯神经网络的方法及装置 |
CN111582473B (zh) * | 2020-04-23 | 2023-08-25 | 中科物栖(南京)科技有限公司 | 一种对抗样本的生成方法及装置 |
CN111553415B (zh) * | 2020-04-28 | 2022-11-15 | 宁波工程学院 | 一种基于忆阻器的esn神经网络图像分类处理方法 |
EP3907665A1 (en) * | 2020-05-06 | 2021-11-10 | Commissariat à l'énergie atomique et aux énergies alternatives | Bayesian neural network with resistive memory hardware accelerator and method for programming the same |
CN111681696B (zh) * | 2020-05-28 | 2022-07-08 | 中国科学院微电子研究所 | 基于非易失存储器的存储和数据处理方法、装置及设备 |
CN113191402B (zh) * | 2021-04-14 | 2022-05-20 | 华中科技大学 | 基于忆阻器的朴素贝叶斯分类器设计方法、系统及分类器 |
CN113505887B (zh) * | 2021-09-12 | 2022-01-04 | 浙江大学 | 一种针对忆阻器误差的忆阻器存储器神经网络训练方法 |
CN114781628A (zh) * | 2022-03-29 | 2022-07-22 | 清华大学 | 基于忆阻器噪声的数据增强方法、装置、电子设备及介质 |
CN114742218A (zh) * | 2022-05-09 | 2022-07-12 | 清华大学 | 基于忆阻器阵列的数据处理方法和数据处理装置 |
CN114819093A (zh) * | 2022-05-09 | 2022-07-29 | 清华大学 | 利用基于忆阻器阵列的环境模型的策略优化方法和装置 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180082177A1 (en) * | 2016-09-16 | 2018-03-22 | International Business Machines Corporation | Multi-memristive synapse with clock-arbitrated weight update |
CN109460817A (zh) * | 2018-09-11 | 2019-03-12 | 华中科技大学 | 一种基于非易失存储器的卷积神经网络片上学习系统 |
CN109902801A (zh) * | 2019-01-22 | 2019-06-18 | 华中科技大学 | 一种基于变分推理贝叶斯神经网络的洪水集合预报方法 |
CN110020718A (zh) * | 2019-03-14 | 2019-07-16 | 上海交通大学 | 基于变分推断的逐层神经网络剪枝方法和系统 |
CN110956256A (zh) * | 2019-12-09 | 2020-04-03 | 清华大学 | 利用忆阻器本征噪声实现贝叶斯神经网络的方法及装置 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150242745A1 (en) * | 2014-02-21 | 2015-08-27 | Qualcomm Incorporated | Event-based inference and learning for stochastic spiking bayesian networks |
US11403529B2 (en) * | 2018-04-05 | 2022-08-02 | Western Digital Technologies, Inc. | Noise injection training for memory-based learning |
CN109543827B (zh) * | 2018-12-02 | 2020-12-29 | 清华大学 | 生成式对抗网络装置及训练方法 |
CN109657787B (zh) * | 2018-12-19 | 2022-12-06 | 电子科技大学 | 一种二值忆阻器的神经网络芯片 |
CN110443168A (zh) * | 2019-07-23 | 2019-11-12 | 华中科技大学 | 一种基于忆阻器的神经网络人脸识别系统 |
- 2019-12-09: CN application CN201911251361.2A granted as patent CN110956256B (active)
- 2020-10-15: PCT application PCT/CN2020/121244 published as WO2021114859A1 (application filing)
Non-Patent Citations (1)
Title |
---|
LIN YUDENG; HU XIAOBO SHARON; QIAN HE; WU HUAQIANG; ZHANG QINGTIAN; TANG JIANSHI; GAO BIN; LI CHONGXUAN; YAO PENG; LIU ZHENGWU; ZH: "Bayesian Neural Network Realization by Exploiting Inherent Stochastic Characteristics of Analog RRAM", 2019 IEEE INTERNATIONAL ELECTRON DEVICES MEETING (IEDM), 7 December 2019 (2019-12-07), pages 1 - 4, XP033714530, DOI: 10.1109/IEDM19573.2019.8993616 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113610220A (zh) * | 2021-08-27 | 2021-11-05 | 中国人民解放军国防科技大学 | 神经网络模型的训练方法、应用方法及装置 |
CN113610220B (zh) * | 2021-08-27 | 2023-12-26 | 中国人民解放军国防科技大学 | 神经网络模型的训练方法、应用方法及装置 |
WO2023217017A1 (zh) * | 2022-05-09 | 2023-11-16 | 清华大学 | 基于忆阻器阵列的贝叶斯神经网络的变分推理方法和装置 |
Also Published As
Publication number | Publication date |
---|---|
CN110956256B (zh) | 2022-05-17 |
CN110956256A (zh) | 2020-04-03 |
Legal Events
- 121 — EP: the EPO has been informed by WIPO that EP was designated in this application (ref document number 20899633; country: EP; kind code: A1)
- NENP — Non-entry into the national phase (ref country code: DE)
- 122 — EP: PCT application non-entry into the European phase (ref document number 20899633; country: EP; kind code: A1)