WO2019127363A1 - Weight coding method for neural network, computing apparatus, and hardware system - Google Patents

Weight coding method for neural network, computing apparatus, and hardware system Download PDF

Info

Publication number
WO2019127363A1
WO2019127363A1 · PCT/CN2017/119821 · CN2017119821W
Authority
WO
WIPO (PCT)
Prior art keywords
weight
matrix
splicing
analog circuit
training
Prior art date
Application number
PCT/CN2017/119821
Other languages
French (fr)
Chinese (zh)
Inventor
张悠慧
季宇
张优扬
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学 filed Critical 清华大学
Priority to CN201780042640.0A priority Critical patent/CN109791626B/en
Priority to PCT/CN2017/119821 priority patent/WO2019127363A1/en
Publication of WO2019127363A1 publication Critical patent/WO2019127363A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Complex Calculations (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A non-splicing weight coding method for a neural network, comprising: a weight fixed-point conversion step of converting each matrix element of a weight matrix into a first number having a predetermined number of bits (S210); an error introduction step of introducing noise having a predetermined standard deviation into the first number to obtain a second number (S220); and a training step of training the weight matrix represented by the second numbers until convergence and then writing the training result, as the final weight matrix, into single analog circuit devices each representing one matrix element (S230), wherein a single matrix element is represented by a single analog circuit device rather than by a splicing of multiple analog circuit devices. The coding method can greatly reduce resource consumption without affecting accuracy, thereby saving resource overhead so that a large-scale neural network can be arranged with limited resources.

Description

Neural network weight coding method, computing device and hardware system

Technical field
The present invention relates generally to the field of neural network technology, and more particularly to a weight coding method, a computing device and a hardware system for a neural network.
Background
As Moore's law gradually loses effect and the progress of existing chip processes slows, attention has turned to new applications and new devices. In recent years, neural network (NN) computing has made breakthrough progress and achieved high accuracy in many fields such as image recognition, speech recognition and natural language processing. However, neural networks require massive computing resources; general-purpose processors can no longer meet the computational needs of deep learning, and designing dedicated chips has become an important development direction. At the same time, the emergence of the memristor provides an efficient solution for neural network chip design. Memristors offer high density, non-volatility, low power consumption, integration of storage and computation, and easy 3D stacking; in neural network computation their adjustable resistance can serve as a programmable weight, and their in-memory computing capability can be exploited to build high-speed multiply-accumulate units.
A neural network is composed of neurons, with a large number of neurons interconnected into a network. The connections between neurons can be viewed as directed edges with weights: the output of a neuron is weighted by the connections and passed to the neurons it connects to, and all the inputs received by a neuron are accumulated for further processing to produce that neuron's output. Neural networks are usually modeled by grouping neurons into layers and connecting the layers to one another. Figure 1 shows a chain-like neural network, in which each circle represents a neuron and each arrow represents a connection between neurons; every connection has a weight. The structure of an actual neural network is not limited to a chain-like topology.
The core computation of a neural network is matrix-vector multiplication. The output produced by a layer L_n containing n neurons can be represented by a vector V_n of length n. If this layer is fully connected to a layer L_m containing m neurons, the connection weights can be expressed as a matrix M_{n×m} with n rows and m columns, each matrix element representing the weight of one connection. The weighted input to L_m is then M_{n×m}·V_n, and this matrix-vector multiplication is the core computation of a neural network.
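As a purely illustrative sketch (NumPy, with assumed layer sizes and values), the weighted input to L_m can be computed as follows; the output vector V_n is placed on the left of the product so that the array shapes conform:

```python
import numpy as np

# Assumed sizes: layer L_n has n = 3 neurons, layer L_m has m = 2 neurons.
n, m = 3, 2
V_n = np.array([0.5, -1.0, 0.25])                    # output vector of layer L_n
M = np.arange(1.0, n * m + 1).reshape(n, m) / 10.0   # weight matrix, n rows x m columns

# Weighted input passed on to layer L_m (the core matrix-vector multiplication).
input_to_Lm = V_n @ M                                # shape (m,)
print(input_to_Lm)
```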
Because the amount of computation in matrix-vector multiplication is very large, performing many matrix multiplications on existing general-purpose processors takes a great deal of time, so accelerating matrix multiplication is the main design goal of neural network accelerator chips. A memristor crossbar array is well suited to this task. A set of input voltages V is applied to the rows; each voltage is multiplied by the memristor conductance G and the resulting currents are summed, and the output current is multiplied by the grounding resistance Rs to obtain the output voltage V'. The whole process is carried out in an analog circuit and has the advantages of high speed and small area.
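A minimal numerical sketch of this crossbar computation, assuming ideal (noise-free) devices; the conductances, input voltages and grounding resistance below are assumed values chosen only for illustration:

```python
import numpy as np

V = np.array([0.10, 0.15])             # input voltages applied to the rows
G = np.array([[2.0e-5, 3.5e-5],        # memristor conductances in siemens,
              [1.0e-5, 4.0e-5]])       # one column per output line
Rs = 1.0e3                             # grounding (sense) resistance in ohms, assumed

I_out = V @ G                          # column currents: sum of voltage x conductance
V_out = I_out * Rs                     # output voltages V' = I * Rs
print(I_out, V_out)
```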
However, memristor-based chip computation also suffers from low precision, large disturbances, high digital-to-analog/analog-to-digital conversion overhead and limited matrix size. Moreover, although a memristor array can perform matrix-vector multiplication efficiently, the multiplication is implemented in an analog circuit, so noise and disturbance are unavoidable; compared with the ideal neural network computation, the memristor results are therefore inaccurate.
Owing to process limitations of the memristor, representing a weight with a memristor carries a certain error. As shown in Figure 3, the weight distributions of different levels overlap. To avoid this overlap, existing methods generally splice several low-precision memristors to represent one high-precision weight; when each individual memristor has very low precision, the weight data can be regarded as accurate. Taking the representation of a 4-bit weight with two 2-bit memristors as an example, one 2-bit memristor represents the low 2 bits of the weight and the other represents the high 2 bits.
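To make the splicing scheme concrete, the following sketch splits an example 4-bit weight into the two 2-bit values that would be written to the high-order and low-order devices:

```python
w = 13                         # an example 4-bit weight, in the range 0..15
high = w >> 2                  # high 2 bits, written to the first 2-bit memristor (3 here)
low = w & 0b11                 # low 2 bits, written to the second 2-bit memristor (1 here)
assert w == 4 * high + low     # reconstruction rule: weight = 4*high + low
print(high, low)
```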
The existing ISAAC technique first trains a neural network with floating-point numbers and then 'writes' the weight data into the memristors. ISAAC uses four 2-bit memristor devices to represent one 8-bit weight, so that more resources can be used to improve the accuracy of the matrix operation.
ISAAC thus represents weights by splicing, which is relatively inefficient and consumes many resources: representing a single weight requires four memristor devices.
Similar to ISAAC, the existing PRIME technique also first trains a neural network with floating-point numbers, then uses two 3-bit-precision input voltages to represent one 6-bit input and two 4-bit memristor devices to represent one 8-bit weight, with the positive and negative weights held in two separate arrays.
PRIME represents weights by combining positive/negative arrays with high/low-bit splicing, which also requires a large amount of resources: representing a single weight again requires four memristor devices.
To implement a neural network on memristor devices, the weight read error must be overcome; this error is caused by device characteristics and current fabrication processes and is at present difficult to avoid. The prior-art techniques above splice several low-precision memristors, each of which can be regarded as 'error-free', to represent one high-precision weight, which requires a large number of devices and uses resources inefficiently.
Therefore, there is a need for a weight representation technique for memristor-based neural networks that solves the above problems.
Summary of the invention
The present invention has been made in view of the above circumstances.
According to one aspect of the present invention, a non-splicing weight training method for a neural network is provided, comprising: a weight fixed-point conversion step of converting each matrix element of a weight matrix into a first number having a predetermined number of bits; an error introduction step of introducing noise having a predetermined standard deviation into the first number to obtain a second number; and a training step of training the weight matrix represented by the second numbers until convergence to obtain a training result, wherein the training result serves as the final weight matrix, whose matrix elements are written one by one into single analog circuit devices each representing one matrix element, so that a single matrix element is represented by a single analog circuit device rather than by a splicing of multiple analog circuit devices.
According to the above non-splicing weight training method, in the weight fixed-point conversion step, the conversion to the first number may be performed through a linear or a logarithmic relationship.
According to the above non-splicing weight training method, the noise may be the read/write error of the analog circuit and may obey a normal distribution.
According to the above non-splicing weight training method, the analog circuit device may be a memristor, a capacitive comparator or a voltage comparator.
According to the above non-splicing weight training method, the first number may be a fixed-point number and the second number may be a floating-point number.
According to another aspect of the present invention, a non-splicing weight coding method for a neural network is provided, comprising the step of writing each matrix element of a weight matrix one by one into a single analog circuit device representing that matrix element, so that a single matrix element is represented by a single analog circuit device rather than by a splicing of multiple analog circuit devices, wherein the weight matrix is obtained by the non-splicing weight training method described above.
According to the above non-splicing weight coding method, before the writing step, the method may further comprise: a weight fixed-point conversion step of converting each matrix element of the weight matrix into a first number having a predetermined number of bits; an error introduction step of introducing noise having a predetermined standard deviation into the first number to obtain a second number; and a training step of training the weight matrix represented by the second numbers until convergence to obtain a training result.
According to another aspect of the present invention, a neural network chip is provided, having basic modules that perform matrix-vector multiplication in hardware through analog circuit devices, wherein each matrix element of the weight matrix is written one by one into a single analog circuit device representing that matrix element, so that a single matrix element of the weight matrix is represented by a single analog circuit device rather than by a splicing of multiple analog circuit devices.
According to the above neural network chip, the weight matrix may be obtained by the above non-splicing weight training method.
According to yet another aspect of the present invention, a computing device is provided, comprising a memory and a processor, the memory storing computer-executable instructions which, when executed by the processor, carry out the above non-splicing weight training method or the above non-splicing weight coding method.
According to yet another aspect of the present invention, a neural network system is provided, comprising the computing device described above and the neural network chip described above.
According to the present invention, an encoding method for a neural network is provided which can greatly reduce resource consumption without affecting accuracy, thereby saving resource overhead and allowing a very large neural network to be deployed with limited resources.
Brief description of the drawings
These and/or other aspects and advantages of the present invention will become clearer and easier to understand from the following detailed description of embodiments of the present invention taken in conjunction with the accompanying drawings, in which:
Figure 1 is a schematic diagram of a chain-like neural network.
Figure 2 is a schematic diagram of a memristor-based crossbar structure.
Figure 3 is a statistical distribution diagram of weights when eight weight levels are defined on one memristor.
Figure 4 is a schematic diagram of an application scenario of the neural network encoding technique according to the present invention.
Figure 5 is an overall flow chart of the encoding method according to the present invention.
Figure 6 compares experimental results of the existing high/low-bit splicing method and the encoding method according to the present invention.
Detailed description
In order that those skilled in the art may better understand the present invention, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
The present application provides a new encoding method (hereinafter the RLevel encoding method). Its essential difference from existing methods is that it does not require the weight values represented by a single device to be free of overlap; instead, this error is introduced into training. By training the weight matrix with the noise included, allowing it to converge, and finally writing the converged values into single devices, the noise immunity of the model is enhanced while the number of devices needed to represent the matrix elements is reduced, which lowers cost and resource consumption.
The technical principle and embodiments of the present application are analyzed in detail below with reference to the accompanying drawings.
Figure 3 shows a statistical distribution of weights when eight weight levels are defined on one memristor.
As shown in Figure 3, the error caused by the memristor device is approximately normally distributed. Assume that the device error obeys a normal distribution N(μ, σ^2); if a memristor conductance value is used to represent an n-bit value, then μ has 2^n possible values. Those skilled in the art will understand that, to simplify the calculation, the same standard deviation σ is used for the different conductance values μ.
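This device error model can be sketched as follows; the bit width and σ are assumed values chosen only for illustration:

```python
import numpy as np

n_bits = 3                                  # 2^3 = 8 conductance levels, as in Figure 3
sigma = 0.05                                # assumed per-level standard deviation
mu = np.arange(2 ** n_bits)                 # the 2^n possible target values of the level

# Each read of a device programmed to level mu returns mu + e, with e ~ N(0, sigma^2).
reads = mu + np.random.normal(0.0, sigma, size=mu.shape)
print(reads)
```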
Although the memristor is used as the example in the following detailed description, other circuit devices capable of implementing matrix-vector multiplication, such as capacitive or voltage comparators, may also be used.
According to the superposition property of the normal distribution, for statistically independent normal random variables X ~ N(μ_x, σ_x^2) and Y ~ N(μ_y, σ_y^2), their sum also follows a normal distribution: U = X + Y ~ N(μ_x + μ_y, σ_x^2 + σ_y^2).
Suppose, as in the prior art, that two devices are spliced to represent one high-precision weight. Let l and h denote the values of the low-order and high-order devices respectively, so that the weight is expressed as 2^n·h + l. The errors of the low-order and high-order devices are L ~ N(l, σ^2) and H ~ N(h, σ^2) respectively, so 2^n·H ~ N(2^n·h, 2^(2n)·σ^2). The range of the weight is 2^(2n) − 1, and the standard deviation of the weight error is
sqrt(2^(2n)·σ^2 + σ^2) = σ·sqrt(2^(2n) + 1).
Taking the ratio of the value range to the standard deviation as the measure of final precision, the precision of the splicing method is
(2^(2n) − 1) / (σ·sqrt(2^(2n) + 1)).
By comparison, in the present application a single device is used to represent the high-precision weight, and its precision is (2^n − 1)/σ.
From the above results it can be seen that the precision obtained with the high/low-bit splicing method and with a single device representing the weight is essentially the same.
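The two precision expressions can also be checked numerically; the sketch below compares the spliced and single-device figures for a few bit widths, taking σ = 1 level without loss of generality:

```python
import numpy as np

sigma = 1.0
for n in (2, 3, 4):
    spliced = (2 ** (2 * n) - 1) / (sigma * np.sqrt(2 ** (2 * n) + 1))
    single = (2 ** n - 1) / sigma
    print(n, round(spliced, 2), round(single, 2))
# For n = 2 this prints roughly 3.64 versus 3.0 -- the same order of precision.
```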
Figure 4 shows a schematic diagram of an application scenario of the neural network encoding technique according to the present invention. As shown in Figure 4, the general inventive concept of the present disclosure is that the network model 1200 employed by a neural network application 1100 is weight-encoded by the encoding method 1300 and the result is written into the memristor devices of a neural network chip 1400. This addresses the problem that weight representation in memristor-based neural networks requires a large number of devices, and ultimately saves a large amount of resources without a significant loss of accuracy.
I. Encoding method
Figure 5 shows the overall flow chart of the encoding method according to the present invention, which comprises the following steps:
1. Weight fixed-point conversion step S210: convert each matrix element of the weight matrix into a first number having a predetermined number of bits.
Depending on the hardware design (the precision of a single memristor device requires hardware support), in the forward network each weight value is converted into a fixed-point number of a given precision, yielding the fixed-point weights.
Here, to better illustrate the method of the present invention, the 2×2 weight matrix A of Table 1 below is used as an example.
Table 1. Initial weight matrix A
0.2641  0.8509
0.3296  0.6740
When 4 bits are used as the predetermined number of bits, each weight value is converted into a 4-bit fixed-point number. The maximum value in the matrix, 0.8509, corresponds to the maximum 4-bit value, i.e. 2^4 − 1 = 15, and the other values are scaled linearly accordingly to obtain the fixed-point weights, giving the fixed-point matrix of Table 2.
Table 2. Fixed-point matrix B
5.0000  15.0000
6.0000  12.0000
It should be noted that the fixed-point conversion above is performed linearly, but those skilled in the art will understand that the conversion may also be performed logarithmically or by other calculations rather than linearly.
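One possible implementation of the linear fixed-point conversion is sketched below; it reproduces Table 2 from Table 1 and is an illustration rather than a prescribed procedure:

```python
import numpy as np

A = np.array([[0.2641, 0.8509],
              [0.3296, 0.6740]])       # initial weight matrix A (Table 1)
bits = 4
scale = (2 ** bits - 1) / A.max()      # map the largest weight to 2^4 - 1 = 15

B = np.round(A * scale)                # fixed-point matrix B (Table 2)
print(B)                               # [[ 5. 15.] [ 6. 12.]]
```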
2. Error introduction step S220: introduce noise having a predetermined standard deviation into the first number to obtain a second number.
According to the characteristics of the memristor device, normally distributed noise with standard deviation σ is added during training, i.e. the weight becomes w = w + Noise with Noise ~ N(0, σ^2). Note that here the first number is taken to be a fixed-point number and the second number equals the first number plus noise, so the second number is a floating-point number. For example, the four fixed-point numbers 0, 1, 2 and 3 become, after adding noise, the four floating-point numbers −0.1, 1.02, 2.03 and 2.88. This setting is not restrictive, however; the first number may also be a floating-point number.
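In code, the error introduction step can be as simple as the following sketch (σ is an assumed value chosen to mimic the target device):

```python
import numpy as np

B = np.array([[5.0, 15.0],
              [6.0, 12.0]])            # fixed-point matrix B (Table 2), the first number
sigma = 0.05 * 15                      # assumed: about 5% of the 4-bit range
W_noisy = B + np.random.normal(0.0, sigma, size=B.shape)
print(W_noisy)                         # the second number: fixed-point value plus noise
```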
3. Training step S230: train the weight matrix represented by the second numbers until convergence, and then write the training result, as the final weight matrix, into the circuit devices used for the weight matrix computation.
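A minimal sketch of this training step, assuming a single linear layer, a mean-squared-error objective and toy data (the fixed-point conversion is omitted for brevity); fresh noise is drawn at every forward pass so that the learned weights become robust to the device error, and the converged weights are what would be written to the devices, one element per device:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))                 # toy inputs (assumed data)
Y = X @ rng.normal(size=(4, 2))              # toy targets from a hidden linear map
W = rng.normal(size=(4, 2))                  # weight matrix being trained
sigma, lr = 0.05, 0.1

for step in range(2000):
    W_noisy = W + rng.normal(0.0, sigma, size=W.shape)  # error introduction step
    pred = X @ W_noisy                                   # forward pass with noisy weights
    grad = X.T @ (pred - Y) / len(X)                     # MSE gradient w.r.t. the weights
    W -= lr * grad                                       # update the underlying weights

print(np.abs(X @ W - Y).mean())              # small residual error after convergence
```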
II. Theoretical verification
A concrete example is given below to show, from a theoretical standpoint, that for the same input the output obtained with the RLevel encoding method according to the present invention and the output obtained with the prior-art high/low-bit splicing method have closely matching precision.
If two 2-bit devices are used for splicing, the fixed-point matrix B (Table 2) is decomposed into a high-order matrix H (Table 3) and a low-order matrix L (Table 4):
Table 3. High-order matrix H
1.0000  3.0000
1.0000  3.0000
Table 4. Low-order matrix L
1.0000  3.0000
2.0000  0.0000
In the splicing scheme, the fixed-point matrix B equals the high-order matrix H times 4 plus the low-order matrix L, i.e. B = 4·H + L; for both the high-order and the low-order part, the maximum value corresponds to the 2-bit maximum, i.e. 3.
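This decomposition can be checked directly:

```python
import numpy as np

B = np.array([[5, 15], [6, 12]])       # fixed-point matrix B (Table 2)
H = B // 4                             # high 2 bits  -> high-order matrix H (Table 3)
L = B % 4                              # low 2 bits   -> low-order matrix L (Table 4)
assert (B == 4 * H + L).all()          # B = 4*H + L
print(H, L, sep="\n")
```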
Next, to better simulate the introduction of the actual error, the fixed-point matrix B is converted into conductance values in the range 4×10^-6 to 4×10^-5 S according to the RLevel method and the high/low-bit splicing method respectively, giving the RLevel conductance matrix RC, the high-order conductance matrix HC and the low-order conductance matrix LC of Table 5.
Note that the training process according to the present invention does not convert the matrix into conductance values; instead, normally distributed noise with standard deviation σ is added on top of the first number during training. The conversion is made here only to illustrate that the actual error arises from noise and disturbance during reading and writing of the memristor devices (or other circuit devices used), so the data analysis below is carried out on the conductance values as analog quantities.
Table 5. Conductance matrices
RLevel conductance matrix RC:
1.6000E-05  4.0000E-05
1.8400E-05  3.2800E-05
High-order conductance matrix HC:
1.6000E-05  4.0000E-05
1.6000E-05  4.0000E-05
Low-order conductance matrix LC:
1.6000E-05  4.0000E-05
2.8000E-05  4.0000E-06
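The level-to-conductance mapping used in this example is assumed to be linear, with level 0 mapped to 4×10^-6 S and the maximum level of each matrix mapped to 4×10^-5 S; under that assumption the matrices above are reproduced by the following sketch:

```python
import numpy as np

def to_conductance(M, g_min=4e-6, g_max=4e-5):
    # Assumed linear mapping: level 0 -> g_min, maximum level of M -> g_max.
    return g_min + (M / M.max()) * (g_max - g_min)

B = np.array([[5.0, 15.0], [6.0, 12.0]])
H = np.array([[1.0, 3.0], [1.0, 3.0]])
L = np.array([[1.0, 3.0], [2.0, 0.0]])

RC, HC, LC = to_conductance(B), to_conductance(H), to_conductance(L)
print(RC, HC, LC, sep="\n")
```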
Assume that the input voltages are:
0.10    0.15
[No noise]
If there is no noise, then for the above input voltages the outputs of the RLevel conductance matrix RC, the high-order conductance matrix HC and the low-order conductance matrix LC are respectively:
Table 6. Conductance matrix outputs
RLevel output RC_out    High-order output HC_out    Low-order output LC_out    Spliced output HLC_out
4.36000000E-06          4.0000E-06                  5.8000E-06                 2.18000000E-05
8.92000000E-06          1.0000E-05                  4.6000E-06                 4.46000000E-05
The spliced output above is computed as high-order output × 4 + low-order output.
If the results of Table 6, i.e. the RLevel output RC_out and the spliced output HLC_out, are converted to 8-bit fixed-point numbers for comparison, it can be seen that both are:
125.    255.
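The noise-free comparison can be reproduced as follows; the conversion to 8-bit fixed point is assumed to normalize each output vector by its largest entry, which matches the numbers quoted above:

```python
import numpy as np

V = np.array([0.10, 0.15])                                  # input voltages

RC = np.array([[1.60e-5, 4.00e-5], [1.84e-5, 3.28e-5]])     # Table 5 conductances
HC = np.array([[1.60e-5, 4.00e-5], [1.60e-5, 4.00e-5]])
LC = np.array([[1.60e-5, 4.00e-5], [2.80e-5, 4.00e-6]])

rc_out = V @ RC                                             # RLevel output
hlc_out = 4 * (V @ HC) + (V @ LC)                           # spliced output: high*4 + low

def to_8bit(x):
    return np.round(x / x.max() * 255)                      # assumed 8-bit normalization

print(to_8bit(rc_out), to_8bit(hlc_out))                    # both give [125. 255.]
```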
[Noise added]
If noise with mean 0 and standard deviation 0.05 × 4×10^-5 (i.e. about 5% of the conductance range) is added to the conductance matrices, the noise matrices of Table 7 are obtained.
Table 7. Noise matrices (the RLevel, high-order and low-order conductance matrices after the above noise is added; the numerical values appear as an image in the original document)
The input voltages are again assumed to be:
0.10    0.15
Then the outputs of the RLevel, high-order and low-order noise matrices are respectively:
Table 8. Noise matrix outputs
RLevel output RN_out    High-order output HN_out    Low-order output LN_out    Spliced output HLN_out
4.5550E-06              4.2578E-06                  6.3242E-06                 2.3355E-05
9.0081E-06              9.9704E-06                  4.1181E-06                 4.4000E-05
If the results of Table 8, i.e. the RLevel output RN_out and the spliced output HLN_out, are converted to 8-bit fixed-point numbers for comparison, the two are respectively:
RLevel output: 129.00  255.00
Spliced output: 135.00  255.00
The final results show that, with or without noise, the RLevel encoding method according to the present invention yields precision very close to that of the prior-art high/low-bit splicing method; the practicality and feasibility of the solution of the present invention are therefore verified from a theoretical standpoint.
III. Data verification
To verify the effectiveness of the encoding method of the present invention from the standpoint of experimental data, the applicant carried out a series of experiments.
Figure 6 compares experimental results of the existing high/low-bit splicing method and the RLevel encoding method according to the present invention.
In this experiment a convolutional neural network was used to classify the CIFAR10 data set. The data set contains 60,000 color images of 32×32 pixels, each belonging to one of 10 categories. As shown in Figure 6, the abscissa is the weight precision and the ordinate is the classification accuracy. There are two curves in the figure: the lower curve uses the RLevel method, representing 2-, 4-, 6- and 8-bit weights with a single device each, while the upper curve splices two devices of 1, 2, 3 and 4 bits respectively to represent 2, 4, 6 and 8 bits.
As shown in Figure 6, in this experiment the accuracy of the RLevel method is very close to that of the high/low-bit splicing method, yet only one device is used and no splicing of multiple devices is required (the encoding is non-splicing), so 50% of the resources can be saved. Thus, the weight coding method of the present invention provides substantially the same accuracy as the existing high/low-bit splicing without using it, which both solves the problem that computing the weight matrix of a neural network with analog circuits such as memristors requires a large number of circuit devices, and reduces cost and saves resources.
The embodiments of the present invention have been described above. The foregoing description is illustrative rather than exhaustive and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. Therefore, the scope of protection of the present invention shall be determined by the scope of the claims.

Claims (11)

  1. A non-splicing weight training method for a neural network, comprising:
    a weight fixed-point conversion step of converting each matrix element of a weight matrix into a first number having a predetermined number of bits;
    an error introduction step of introducing noise having a predetermined standard deviation into the first number to obtain a second number; and
    a training step of training the weight matrix represented by the second numbers until convergence to obtain a training result,
    wherein the training result serves as the final weight matrix, whose matrix elements are written one by one into single analog circuit devices each representing one matrix element, so that a single matrix element is represented by a single analog circuit device rather than by a splicing of multiple analog circuit devices.
  2. The non-splicing weight training method according to claim 1, wherein, in the weight fixed-point conversion step, the conversion to the first number is performed through a linear or a logarithmic relationship.
  3. The non-splicing weight training method according to claim 1, wherein the noise is the read/write error of the analog circuit and obeys a normal distribution.
  4. The non-splicing weight training method according to claim 1, wherein the analog circuit device is a memristor, a capacitive comparator or a voltage comparator.
  5. The non-splicing weight training method according to claim 1, wherein the first number is a fixed-point number and the second number is a floating-point number.
  6. A non-splicing weight coding method for a neural network, comprising the step of writing each matrix element of a weight matrix one by one into a single analog circuit device representing that matrix element, so that a single matrix element is represented by a single analog circuit device rather than by a splicing of multiple analog circuit devices,
    wherein the weight matrix is obtained by the non-splicing weight training method according to any one of claims 1 to 5.
  7. The non-splicing weight coding method according to claim 6, further comprising, before the writing step:
    a weight fixed-point conversion step of converting each matrix element of the weight matrix into a first number having a predetermined number of bits;
    an error introduction step of introducing noise having a predetermined standard deviation into the first number to obtain a second number; and
    a training step of training the weight matrix represented by the second numbers until convergence to obtain a training result.
  8. A neural network chip having basic modules that perform matrix-vector multiplication in hardware through analog circuit devices,
    wherein each matrix element of a weight matrix is written one by one into a single analog circuit device representing that matrix element, so that a single matrix element of the weight matrix is represented by a single analog circuit device rather than by a splicing of multiple analog circuit devices.
  9. The neural network chip according to claim 8, wherein the weight matrix is obtained by the non-splicing weight training method according to any one of claims 1 to 5.
  10. A computing device comprising a memory and a processor, the memory storing computer-executable instructions which, when executed by the processor, carry out the non-splicing weight training method according to any one of claims 1 to 5 or the non-splicing weight coding method according to any one of claims 6 to 7.
  11. A neural network system, comprising:
    the computing device according to claim 10; and
    the neural network chip according to any one of claims 8 to 9.
PCT/CN2017/119821 2017-12-29 2017-12-29 Weight coding method for neural network, computing apparatus, and hardware system WO2019127363A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201780042640.0A CN109791626B (en) 2017-12-29 2017-12-29 Neural network weight coding method, calculating device and hardware system
PCT/CN2017/119821 WO2019127363A1 (en) 2017-12-29 2017-12-29 Weight coding method for neural network, computing apparatus, and hardware system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/119821 WO2019127363A1 (en) 2017-12-29 2017-12-29 Weight coding method for neural network, computing apparatus, and hardware system

Publications (1)

Publication Number Publication Date
WO2019127363A1 true WO2019127363A1 (en) 2019-07-04

Family

ID=66495542

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/119821 WO2019127363A1 (en) 2017-12-29 2017-12-29 Weight coding method for neural network, computing apparatus, and hardware system

Country Status (2)

Country Link
CN (1) CN109791626B (en)
WO (1) WO2019127363A1 (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10678244B2 (en) 2017-03-23 2020-06-09 Tesla, Inc. Data synthesis for autonomous control systems
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11157441B2 (en) 2017-07-24 2021-10-26 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US10671349B2 (en) 2017-07-24 2020-06-02 Tesla, Inc. Accelerated mathematical engine
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11215999B2 (en) 2018-06-20 2022-01-04 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11361457B2 (en) 2018-07-20 2022-06-14 Tesla, Inc. Annotation cross-labeling for autonomous control systems
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
CA3115784A1 (en) 2018-10-11 2020-04-16 Matthew John COOPER Systems and methods for training machine models with augmented data
US11196678B2 (en) 2018-10-25 2021-12-07 Tesla, Inc. QOS manager for system on a chip communications
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US10997461B2 (en) 2019-02-01 2021-05-04 Tesla, Inc. Generating ground truth for machine learning from time series elements
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US10956755B2 (en) 2019-02-19 2021-03-23 Tesla, Inc. Estimating object properties using visual image data
CN110796241B (en) * 2019-11-01 2022-06-17 清华大学 Training method and training device of neural network based on memristor
CN111027619B (en) * 2019-12-09 2022-03-15 华中科技大学 Memristor array-based K-means classifier and classification method thereof
WO2021163866A1 (en) * 2020-02-18 2021-08-26 杭州知存智能科技有限公司 Neural network weight matrix adjustment method, writing control method, and related device
CN115481562B (en) * 2021-06-15 2023-05-16 中国科学院微电子研究所 Multi-parallelism optimization method and device, recognition method and electronic equipment
CN114282478B (en) * 2021-11-18 2023-11-17 南京大学 Method for correcting array dot product error of variable resistor device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105224986A (en) * 2015-09-29 2016-01-06 清华大学 Based on the deep neural network system of memory resistor
US20170061281A1 (en) * 2015-08-27 2017-03-02 International Business Machines Corporation Deep neural network training with native devices
CN106650922A (en) * 2016-09-29 2017-05-10 清华大学 Hardware neural network conversion method, computing device, compiling method and neural network software and hardware collaboration system
CN107085628A (en) * 2017-03-21 2017-08-22 东南大学 A kind of adjustable weights modular simulation method of cell neural network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10580401B2 (en) * 2015-01-27 2020-03-03 Google Llc Sub-matrix input for neural network layers
CN106796668B (en) * 2016-03-16 2019-06-14 香港应用科技研究院有限公司 Method and system for bit-depth reduction in artificial neural network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170061281A1 (en) * 2015-08-27 2017-03-02 International Business Machines Corporation Deep neural network training with native devices
CN105224986A (en) * 2015-09-29 2016-01-06 清华大学 Based on the deep neural network system of memory resistor
CN106650922A (en) * 2016-09-29 2017-05-10 清华大学 Hardware neural network conversion method, computing device, compiling method and neural network software and hardware collaboration system
CN107085628A (en) * 2017-03-21 2017-08-22 东南大学 A kind of adjustable weights modular simulation method of cell neural network

Also Published As

Publication number Publication date
CN109791626A (en) 2019-05-21
CN109791626B (en) 2022-12-27

Similar Documents

Publication Publication Date Title
WO2019127363A1 (en) Weight coding method for neural network, computing apparatus, and hardware system
US11748609B2 (en) On-chip training of memristor crossbar neuromorphic processing systems
US11907831B2 (en) Analog neuromorphic circuit implemented using resistive memories
US20220374688A1 (en) Training method of neural network based on memristor and training device thereof
US10346347B2 (en) Field-programmable crossbar array for reconfigurable computing
CN108009640B (en) Training device and training method of neural network based on memristor
Chen et al. Technology-design co-optimization of resistive cross-point array for accelerating learning algorithms on chip
WO2019127362A1 (en) Neural network model block compression method, training method, computing device and system
Kim et al. Input voltage mapping optimized for resistive memory-based deep neural network hardware
US20210049448A1 (en) Neural network and its information processing method, information processing system
US10643126B2 (en) Systems, methods and devices for data quantization
WO2021089009A1 (en) Data stream reconstruction method and reconstructable data stream processor
US20210209450A1 (en) Compressed weight distribution in networks of neural processors
CN108647184B (en) Method for realizing dynamic bit convolution multiplication
CN110119760B (en) Sequence classification method based on hierarchical multi-scale recurrent neural network
WO2023130725A1 (en) Hardware implementation method and apparatus for reservoir computing model based on random resistor array, and electronic device
CN111144027A (en) Approximation method based on BP neural network full characteristic curve function
CN117273109A (en) Quantum neuron-based hybrid neural network construction method and device
CN114897159B (en) Method for rapidly deducing electromagnetic signal incident angle based on neural network
CN113435581B (en) Data processing method, quantum computer, device and storage medium
CN113568845B (en) Memory address mapping method based on reinforcement learning
Shen et al. PRAP-PIM: A weight pattern reusing aware pruning method for ReRAM-based PIM DNN accelerators
TWI763975B (en) System and method for reducing computational complexity of artificial neural network
Li et al. Memory saving method for enhanced convolution of deep neural network
Guo et al. A multi-conductance states memristor-based cnn circuit using quantization method for digital recognition

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17936307

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17936307

Country of ref document: EP

Kind code of ref document: A1