WO2020052342A1 - A convolutional neural network on-chip learning system based on non-volatile memory - Google Patents

A convolutional neural network on-chip learning system based on non-volatile memory

Info

Publication number
WO2020052342A1
WO2020052342A1 (application PCT/CN2019/095680, CN2019095680W)
Authority
WO
WIPO (PCT)
Prior art keywords
module
neural network
convolutional neural
convolution
output
Prior art date
Application number
PCT/CN2019/095680
Other languages
English (en)
French (fr)
Inventor
缪向水 (MIAO Xiangshui)
李祎 (LI Yi)
潘文谦 (PAN Wenqian)
Original Assignee
华中科技大学 (Huazhong University of Science and Technology)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to US16/961,932 (granted as US11861489B2)
Publication of WO2020052342A1

Classifications

    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00 Computing arrangements based on biological models › G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation using electronic means
    • G06N3/065 Analogue means
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06F ELECTRIC DIGITAL DATA PROCESSING › G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions › G06F17/10 Complex mathematical operations › G06F17/15 Correlation function computation including computation of convolution operations › G06F17/153 Multidimensional correlation or convolution

Definitions

  • the invention relates to the technical field of artificial neural networks, and more particularly, to a non-volatile memory-based convolutional neural network on-chip learning system.
  • An artificial neural network is a mathematical model of an algorithm that imitates the structure of synaptic connections in the brain and the behavioral characteristics of animal neural networks to perform information processing. Among the many machine learning algorithms, neural networks are widely applicable and robust. Such a network processes information by adjusting the interconnections among a large number of internal nodes, with a complexity that depends on the system.
  • CNN: Convolutional Neural Network.
  • synapses are the most numerous processing elements in neural networks;
  • many synaptic devices have been reported, such as magnetic memories, phase change memories, and memristors.
  • the memristor's analog memory function is similar to a biological synapse. Its conductance can be continuously changed by applying a relatively large voltage bias, but it remains unchanged when a small or no bias is applied.
  • Memristors can be integrated through a crossbar structure, which can be used to achieve a synaptic density that is close to or greater than that of brain tissue.
  • Memristor-based CNNs can use neural circuits and auxiliary circuits that are easy to implement, while reducing energy consumption, performing calculations at higher speeds, and achieving parallelism physically.
  • the present invention aims to overcome the von Neumann bottleneck encountered when existing computers implement convolutional neural networks:
  • the separation of computation and storage is time-consuming and slow, and causes huge hardware costs;
  • off-chip learning realized on a computer can only reproduce pre-trained, fixed functions and cannot solve problems flexibly in real time.
  • the present invention provides a non-volatile memory-based convolutional neural network on-chip learning system, including: an input module, a convolutional neural network module, an output module, and a weight update module;
  • the input module converts an input signal into an input voltage pulse signal required by the convolutional neural network module and transmits the signal to the convolutional neural network module;
  • the convolutional neural network module calculates and converts the input voltage pulse signal corresponding to the input signal layer by layer to complete the on-chip learning to obtain the output signal.
  • the convolutional neural network module uses the conductance-modulation characteristic of the memristive device, whose conductance changes with applied pulses, to achieve the synaptic function,
  • the network convolution kernel value or synaptic weight value used in the on-chip learning process is stored in the memristive device;
  • the output module converts and sends the output signal generated by the convolutional neural network module to the weight update module;
  • the weight update module calculates an error signal and adjusts the conductance value of the memristive device according to the result of the output module, so as to update the network convolution kernel value or the synaptic weight value.
  • the input module converts an external input signal into the input voltage pulse signal required by the convolutional neural network; the pulse width or pulse amplitude of the input voltage pulse signal is proportional to the input signal, and the amplitude of the input voltage pulse signal should be less than the erase voltage of the memristor.
  • the convolutional neural network module uses a memristive device to simulate a network convolution kernel value and a synaptic weight value, and the resistance of the memristive device changes as an electrical signal is applied;
  • the convolutional neural network module includes a convolution layer circuit unit and a pooling layer circuit unit composed of a memristor array as a convolution kernel, and a fully connected layer circuit unit composed of a memristor array as a synapse;
  • the convolution layer circuit unit receives the input voltage pulse signal output by the input module; the input voltage pulse signal is calculated and converted layer by layer through the convolution layer circuit unit, the pooling layer circuit unit, and the fully connected layer circuit unit, and the calculation result is sent to the output module as the output signal.
  • the convolution layer circuit unit is composed of a convolution operation circuit composed of a memristive array and an activation function part;
  • the convolution operation circuit uses two rows of memristive arrays as one convolution kernel to realize positive and negative convolution kernel values.
  • the convolution kernel values correspond to memristive conductance values;
  • so that all convolution results can be obtained in one step, the convolution kernel value is mapped into a matrix that can be matrix-multiplied with the entire input:
  • the convolution kernel is expanded into two large sparse matrices K+ and K-, which are the positive and negative convolution kernel values corresponding to the neuron node;
  • the characteristic of the memristive device that positive and negative read voltage pulses can be applied is used to convert the input signal into two one-dimensional matrices, positive input X and negative input -X;
  • the convolution operation circuit performs a convolution operation on the input voltage pulse signal and the convolution kernel value stored in the memristor unit, and collects the current in the same column to obtain the convolution operation result.
  • the convolution operation process is y = f(X ⊛ K+ + (-X) ⊛ K- + b) = f(X ⊛ (K+ - K-) + b), where y is the result of the convolution operation, ⊛ is the symbol of the convolution operation, X is the front-end synaptic input voltage signal of the neuron node, K+ and K- are the positive and negative convolution kernel values corresponding to the neuron node, b is the bias term corresponding to the convolutional layer network, and f(·) is the activation function;
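As an illustrative sketch of this differential scheme (array sizes, conductance values, and the choice of ReLU are assumptions for illustration, not values from the disclosure): applying X to the K+ row and -X to the K- row and summing the column current yields X·(K+ - K-), so no separate subtraction circuit is needed.

```python
import numpy as np

def relu(v):
    """Activation applied to the summed column current."""
    return np.maximum(v, 0.0)

# Two crossbar rows per kernel: g_pos stores K+, g_neg stores K-
# (conductance values below are illustrative).
g_pos = np.array([1.0, 0.0, 0.5, 0.0])
g_neg = np.array([0.0, 0.5, 0.0, 0.25])

x = np.array([1.0, 2.0, 3.0, 4.0])   # input voltage pulse amplitudes
b = 0.5                              # bias term of the convolutional layer

# Column current: X applied to the K+ row, -X to the K- row,
# so the summed current equals X . (K+ - K-).
i_col = x @ g_pos + (-x) @ g_neg
y = relu(i_col + b)
```

The key property is that the subtraction happens "for free" in the summed current, by Kirchhoff's current law.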
  • the convolution layer circuit unit transmits a convolution operation result to a pooling layer circuit unit
  • the activation function activates the result of the convolution operation and obtains two opposite output values of y and -y, and simultaneously converts the two opposite output values of y and -y into a voltage pulse signal so as to be used as an input of a circuit unit of the pooling layer.
  • the pooling layer circuit unit is divided into an average pooling operation and a maximum pooling operation, and is composed of a pooling operation circuit composed of a memristive array and a voltage conversion subunit;
  • the network convolution kernel values stored by the memristive array of the pooling operation circuit remain unchanged during the training process; its circuit structure and convolution-kernel mapping distribution are the same as those of the convolution layer circuit unit, but the stored convolution kernel values differ;
  • the voltage conversion sub-unit converts the result of the pooling operation circuit into two opposite voltage pulse signals h and -h so as to be used as an input of a fully connected layer circuit unit.
  • the fully connected layer circuit unit implements a classification function, and is composed of a fully connected layer circuit composed of a memristive array and a softmax function part, and the fully connected layer circuit unit and the convolution layer circuit unit have different weight mapping methods;
  • the fully connected layer circuit is used to store and calculate the weight matrix, and only completes a series of multiplication and addition operations. There is no translation of the weight matrix.
  • Two memristive devices are used as one synapse to achieve positive and negative weight values. One end is connected to the circuit unit of the pooling layer, and the other end is connected to the softmax function.
  • h_k is the front-end synaptic input voltage pulse signal of the k-th neuron node; W+_lk and W-_lk are the positive and negative synaptic weight values of the k-th input of the l-th neuron node stored in the memristors, so the effective synaptic weight value is (W+_lk - W-_lk), which realizes both positive and negative synaptic weights; b_l is the bias term of the corresponding neuron node; m_l = Σ_k h_k (W+_lk - W-_lk) + b_l is the l-th element output through the fully connected layer circuit operation; Σ_l e^(m_l) is the exponential sum of all output signal elements; and z_l = e^(m_l) / Σ_l e^(m_l) is the corresponding probability output value of the signal m_l after passing through the softmax function;
  • the softmax function z_l = e^(m_l) / Σ_l e^(m_l) normalizes the output values of the fully connected layer into probability values; the result is then passed to the output module to obtain the output of the entire convolutional neural network and sent to the weight update module.
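A minimal sketch of the fully connected layer and softmax described above (the weights, inputs, and sizes are illustrative values, not from the disclosure): each output m_l is the differential weighted sum of the pooling-layer voltages plus a bias, and softmax turns the m_l into probabilities.

```python
import numpy as np

def softmax(m):
    """Normalize fully connected layer outputs into probabilities."""
    e = np.exp(m - m.max())      # subtract max for numerical stability
    return e / e.sum()

h = np.array([0.5, 1.0, 0.25])          # pooling-layer outputs h_k
w_pos = np.array([[1.0, 0.25, 0.5],     # W+_lk: one row per output neuron l
                  [0.25, 1.0, 0.5]])
w_neg = np.array([[0.25, 0.5, 0.0],     # W-_lk
                  [0.5, 0.25, 0.25]])
b = np.array([0.0, 0.25])               # bias of each output neuron

m = (w_pos - w_neg) @ h + b             # m_l from the differential synapses
z = softmax(m)                          # probability outputs z_l
```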
  • the weight update module includes a result comparison unit, a calculation unit, and a driving unit;
  • the result comparison unit is respectively connected to the output module and the calculation unit.
  • the result comparison unit compares the output result of the current convolutional neural network module with a preset ideal result and sends the comparison result to the calculation unit;
  • the calculation unit is respectively connected to the result comparison unit and the driving unit.
  • the calculation unit receives the error signal δ sent by the result comparison unit, calculates the adjustment amount of the network convolution kernel values or synaptic weight values, and sends the result to the driving unit;
  • the driving unit includes a pulse generator and a read-write circuit; it receives the adjustment amount of the convolution kernel or weight values sent by the calculation unit and adjusts the conductance of the memristive devices in the convolution layer circuit unit and the fully connected layer circuit unit, the pulse generator being used to generate the conductance modulation signals that adjust the memristive devices;
  • the read-write circuit completes the read and write operations on the network convolution kernel values or synaptic weight values of the memristive-device-based convolutional neural network module.
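The weight-update path above can be sketched as follows; the learning rate, the conductance step per programming pulse, and the pulse-count rounding are assumptions for illustration, not parameters of the disclosure.

```python
import numpy as np

LEARN_RATE = 0.5     # illustrative learning rate
G_STEP = 0.0625      # assumed conductance change per programming pulse

def weight_update_pulses(z, target, h):
    """Compare the network output with the preset ideal result,
    compute the output-layer error and gradient, and translate the
    weight adjustment into signed pulse counts for the pulse generator."""
    delta = z - target                          # error signal
    grad = np.outer(delta, h)                   # gradient w.r.t. FC weights
    dw = -LEARN_RATE * grad                     # desired weight adjustment
    pulses = np.round(dw / G_STEP).astype(int)  # programming pulses per device
    return delta, pulses

z = np.array([0.75, 0.25])       # network probability outputs
target = np.array([0.0, 1.0])    # preset ideal (one-hot) result
h = np.array([0.5, 1.0, 0.25])   # inputs to the fully connected layer

delta, pulses = weight_update_pulses(z, target, h)
```

A positive pulse count would drive the corresponding device conductance up, a negative count down; the read-write circuit would then verify the new conductance.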
  • the input module receives external information and converts it into a voltage pulse signal; the signal passes through the convolutional layer, pooling layer, and fully connected layer in the convolutional neural network module,
  • and the result of the layer-by-layer operation is passed to the output module and sent to the weight update module.
  • the weight update module calculates and adjusts the conductance value of the memristive device according to the result of the output module, and updates the network convolution kernel value or synaptic weight value.
  • the convolutional neural network module uses the multilevel conductance-modulation characteristic of the memristive device, whose conductance value changes with applied electrical signals, to simulate the continuous adjustment of convolution kernel values and synaptic weight values:
  • in the convolution layer and the pooling layer, two rows of the memristive array act as one convolution kernel to realize positive and negative convolution kernel values, and in the fully connected layer two memristive devices act as one synapse to realize positive and negative weight values.
  • the convolution operation is the most time-consuming calculation part in the convolutional neural network.
  • the present invention uses memristive devices to implement the convolutional neural network operations; their high parallelism can greatly increase the operation speed and density of the entire system.
  • Operation energy consumption is also greatly reduced, achieving the integration of information storage and computing along with on-chip learning of convolutional neural networks; this is expected to enable real-time, low-energy simulation of brain-scale neural networks and to overcome the disadvantages of the traditional von Neumann architecture for brain-like computing.
  • FIG. 1 is a schematic structural diagram of a non-volatile memory-based convolutional neural network on-chip learning system according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a matrix convolution operation principle provided by an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a memristive device unit according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a convolution layer circuit module composed of a memristive device array as a convolution kernel according to an embodiment of the present invention
  • FIG. 5 is a schematic diagram of the mapping formulas of the convolution kernel matrix and the input matrix according to an embodiment of the present invention:
  • FIG. 5(a) is a schematic diagram of how the convolution kernel matrix K is converted into the matrices K+ and K-;
  • FIG. 5(b) is a schematic diagram of how the input matrix X is converted into the two one-dimensional matrices X and -X;
  • FIG. 6 is a schematic structural diagram of a pooling layer circuit module composed of a memristive device array as a convolution kernel according to an embodiment of the present invention
  • FIG. 7 is a schematic structural diagram of a fully connected layer circuit module composed of a memristive device as a synapse according to an embodiment of the present invention
  • FIG. 8 is a schematic diagram of a weight matrix mapping formula provided by an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of a weight update module according to an embodiment of the present invention.
  • FIG. 10 is a schematic circuit diagram of a memristive array operation of a convolution layer circuit module in a weight update phase according to an embodiment of the present invention.
  • On-chip learning of convolutional neural networks can not only overcome the effects of device variability but is also more consistent with biological learning characteristics; weights can be modified according to the task to be performed, which gives good flexibility. It is therefore necessary to realize convolutional neural networks in hardware with integrated storage and computing and on-chip learning.
  • the invention provides a non-volatile memory-based convolutional neural network on-chip learning system, which includes: an input module, a convolutional neural network module, an output module, and a weight update module.
  • the on-chip learning of the convolutional neural network module uses the analog conductance-modulation characteristic of the memristive device,
  • whose conductance changes with the applied pulses, to achieve the synaptic function; the convolution kernel values or synaptic weight values are stored in the memristive device units.
  • the input module converts an input signal into the input voltage pulse signal required by the convolutional neural network and transmits the result to the convolutional neural network module; the convolutional neural network module calculates and converts the input voltage pulse signal corresponding to the input signal layer by layer and passes the result to the output module to obtain the output of the entire network; the output module is respectively connected to the convolutional neural network module and the weight update module, and converts
  • the output signal generated by the convolutional neural network module and sends it to the weight update module; the weight update module calculates and adjusts the conductance values of the memristive devices according to the result of the output module, updating the network convolution kernel values or synaptic weight values.
  • the input module converts an external input signal into a voltage signal required by the convolutional neural network, and the pulse width or pulse amplitude of the input signal and the voltage pulse signal follow a proportional relationship.
  • the convolutional neural network module uses a memristive device to simulate a convolution kernel value and a synaptic weight value, and the resistance of the memristive device changes with the application of an electrical signal.
  • the convolutional neural network module includes a convolution layer circuit module and a pooling layer circuit module composed of a memristive device array as a convolution kernel, and a fully connected layer circuit module composed of a memristive device as a synapse;
  • the convolution layer circuit module receives the input voltage pulse signal output by the input module; the input voltage pulse signal is calculated and converted layer by layer through the convolution layer circuit module, the pooling layer circuit module, and the fully connected layer circuit module, and the calculation result is sent to the output module.
  • the convolution layer circuit module, with the memristive device array serving as convolution kernels, is composed of a convolution operation circuit built from a memristive array and an activation function part. Because weight values in the biological nervous system can be positive or negative, the circuit uses two rows of the memristive array as one convolution kernel to realize positive and negative convolution kernel values. At the same time, in order to obtain all the convolution results in one step without a complex storage layer, when the initial convolution kernel values are mapped to memristive conductance values, the convolution kernel value is mapped into a matrix that can be matrix-multiplied with the entire input signal: the convolution kernel is expanded into two large sparse matrices K+ and K-.
  • the characteristic of the memristive device that positive and negative read voltage pulses can be applied is used to convert the input signal into two one-dimensional matrices, positive input X and negative input -X.
  • The convolution operation circuit performs a convolution operation on the input voltage pulse signal and the convolution kernel values stored in the memristor units, and collects the currents in the same column to obtain the convolution operation result.
  • the convolution operation process is y = f(X ⊛ K+ + (-X) ⊛ K- + b), where ⊛ is the symbol of the convolution operation, X is the front-end synaptic input voltage signal of the neuron node, and K+ and K- are the positive and negative convolution kernel values corresponding to the neuron node; the effective convolution kernel value is then (K+) - (K-), which realizes convolution kernel values of positive and negative sign; b is the bias term corresponding to the convolutional layer network, and f(·) is the activation function.
  • the output is then passed to the pooling layer module.
  • the activation function f (.) Mainly includes: sigmoid function, tanh function, ReLU function, ELU function, and PReLU function.
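The listed activation functions can be written in their common forms as below; the alpha defaults for ELU and PReLU are illustrative, not values specified in the text.

```python
import numpy as np

# Common forms of the listed activation functions (alpha defaults
# for ELU and PReLU are illustrative):
def sigmoid(x): return 1.0 / (1.0 + np.exp(-x))
def tanh(x):    return np.tanh(x)
def relu(x):    return np.maximum(x, 0.0)
def elu(x, alpha=1.0):    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))
def prelu(x, alpha=0.25): return np.where(x > 0, x, alpha * x)

x = np.array([-1.0, 0.0, 2.0])
outputs = {f.__name__: f(x) for f in (sigmoid, tanh, relu, elu, prelu)}
```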
  • the activation function activates the result of the convolution operation and obtains two opposite output values y and -y;
  • the two opposite output values y and -y are converted into voltage pulse signals for use as inputs to the pooling layer.
  • the pooling layer circuit module, with the memristive device array serving as convolution kernels, is mainly divided into an average pooling operation and a maximum pooling operation, and is composed of a pooling operation circuit built from a memristive array and a voltage conversion module.
  • the pooling operation is a simpler convolution operation.
  • the convolution kernel values stored in the memristive array remain unchanged during the training process.
  • the circuit structure and the mapping distribution of the convolution kernels are the same as in the convolution layer circuit module, but the stored convolution kernel values differ.
  • One end of the memristive devices in the same row is connected together to the output of the convolution layer circuit module, and the other end of the memristive devices in the same column is connected together to the voltage conversion module, which converts the result of the pooling operation circuit into the two opposite voltage pulse signals h and -h to be used as inputs of the fully connected layer circuit module.
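Average pooling as the "simpler convolution" described above can be sketched with a fixed kernel whose entries are all 1/size², matching fixed (untrained) conductances; the 2 × 2 window and non-overlapping stride are assumptions for illustration.

```python
import numpy as np

def average_pool(x, size=2):
    """Average pooling expressed as a convolution with a fixed kernel
    of entries 1/size^2 (non-overlapping stride assumed)."""
    k = np.full((size, size), 1.0 / size**2)   # fixed 'conductance' kernel
    h, w = x.shape[0] // size, x.shape[1] // size
    out = np.empty((h, w))
    for r in range(h):
        for c in range(w):
            win = x[r*size:(r+1)*size, c*size:(c+1)*size]
            out[r, c] = np.sum(win * k)        # multiply-accumulate per window
    return out

x = np.array([[1.0, 2.0, 3.0, 4.0],
              [5.0, 6.0, 7.0, 8.0],
              [1.0, 1.0, 2.0, 2.0],
              [1.0, 1.0, 2.0, 2.0]])
p = average_pool(x)
```

Max pooling would replace the multiply-accumulate with a comparison over the window, which is why it is not a pure crossbar operation.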
  • the fully connected layer circuit module composed of the memristor array as a synapse realizes the classification function, and is composed of a fully connected layer circuit composed of a memristive array and a softmax function part.
  • the neurons in the pooling layer are fully connected, so the fully connected layer circuit module has a different weight mapping method from the convolution layer circuit module.
  • the fully connected layer circuit is used to store and calculate the weight matrix and only completes a series of multiply-add operations; there is no translation of the weight matrix. Two memristive devices are used as one synapse to realize positive and negative weight values; one end of each memristive device is connected to the pooling layer circuit module and the other end to the softmax function part.
  • h_k is the front-end synaptic input voltage pulse signal of the k-th neuron node; W+_lk and W-_lk are the positive and negative synaptic weight values of the k-th input of the l-th neuron node stored by the memristors, so the effective synaptic weight value is (W+_lk - W-_lk), which realizes both positive and negative synaptic weights;
  • b_l is the bias term of the corresponding neuron node;
  • m_l = Σ_k h_k (W+_lk - W-_lk) + b_l represents the l-th element output through the fully connected layer circuit operation, and Σ_l e^(m_l) is the exponential sum of all output signal elements;
  • z_l = e^(m_l) / Σ_l e^(m_l) is the corresponding probability output value of the signal m_l after passing through the softmax function.
  • the softmax function z_l = e^(m_l) / Σ_l e^(m_l) normalizes the output values of the fully connected layer into probability values; the result is then passed to the output module to obtain the output of the entire network and sent to the weight update module.
  • the weight update module includes a result comparison module, a calculation module, and a driving module.
  • the result comparison module is respectively connected to the output module and the calculation module, and the result comparison module compares the output result of the current convolutional neural network module with an ideal result, and sends the comparison result to the calculation module;
  • the calculation module is respectively connected to the result comparison module and the driving module.
  • the calculation module accepts the error signal δ sent by the result comparison module and, according to the set neural network back-propagation algorithm, calculates the adjustment amount of the network convolution kernel values or synaptic weight values, sending the result to the driving module.
  • the driving unit includes a pulse generator and a read-write circuit; the driving unit receives the adjustment amount of the convolution kernel or weight values sent by the calculation unit, and
  • the conductance of the memristive devices in the convolution layer circuit unit and the fully connected layer circuit unit is adjusted accordingly.
  • the pulse generator is used to generate a conductance modulation signal that adjusts the memristive device.
  • the read-write circuit is used to complete the read-write operation of the convolution kernel value or the synaptic weight value of the convolutional neural network module based on the memristive device.
  • FIG. 1 is a schematic structural diagram of a non-volatile memory-based convolutional neural network on-chip learning system according to an embodiment of the present invention. As shown in Figure 1, the system includes: an input module, a convolutional neural network module, an output module, and a weight update module.
  • the input module converts external input signals into the voltage signals required by the convolutional neural network.
  • the pulse width or pulse amplitude of the voltage pulse signal is proportional to the input signal: the larger the value of the input signal, the wider (or larger) the pulse width (or pulse amplitude) of the corresponding voltage pulse signal; conversely, the narrower (or smaller) the corresponding voltage signal. The voltage signal is then passed to the convolutional neural network module.
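A sketch of this proportional pulse-width encoding; the erase voltage, read amplitude, and full-scale pulse width below are illustrative values, not from the disclosure.

```python
import numpy as np

V_ERASE = 1.2     # memristor erase voltage (illustrative)
V_READ = 0.8      # read amplitude, kept below the erase voltage
T_MAX = 100e-9    # pulse width for a full-scale input (assumed 100 ns)

def encode(signal, smax):
    """Map input values proportionally to pulse widths: larger inputs
    give wider pulses, while the fixed amplitude stays below V_ERASE
    so reading does not disturb the stored conductance."""
    widths = np.asarray(signal, dtype=float) / smax * T_MAX
    return [(V_READ, w) for w in widths]

pixels = [0, 128, 255]            # e.g. 8-bit image intensities
pulses = encode(pixels, smax=255)
```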
  • the convolutional neural network module transforms the input voltage pulse signal corresponding to the input signal layer by layer, and passes the result to the output module to obtain the output of the entire network.
  • the output module is respectively connected to the convolutional neural network module and the weight update module, and is used for converting and sending the output signal generated by the convolutional neural network module to the weight update module.
  • the weight update module calculates and adjusts the conductance value of the memristive device according to the result of the output module, so as to update the network convolution kernel value or synaptic weight value.
  • the convolution operation in the convolutional neural network is the most important and the most computationally intensive part.
  • the convolution operation has important applications in image recognition and digital signal processing.
  • the convolution operation starts from the upper left corner of the input matrix and opens an active window of the same size as the template (that is, the convolution kernel).
  • the convolution kernel is usually a square grid structure, and each square in the grid holds a weight value.
  • the convolution kernel is rotated by 180°.
  • the window matrix and the convolution kernel elements are multiplied and then summed, and the calculation result replaces the element at the center of the window; the active window is then moved one column to the right and the same operation is performed.
  • FIG. 2 illustrates the convolution calculation process of a 3 × 3 input matrix and a 2 × 2 convolution kernel to obtain a 2 × 2 output matrix.
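The window-sliding procedure above can be sketched in NumPy; the 3 × 3 input and 2 × 2 kernel below are illustrative values, and the 180° rotation of the kernel is applied before the elementwise multiply-accumulate.

```python
import numpy as np

def conv2d(x, k):
    """2-D convolution: rotate the kernel 180 degrees, then slide the
    active window over the input and sum the elementwise products."""
    k = np.flip(k)                      # 180-degree rotation of the kernel
    n = k.shape[0]
    out = np.empty((x.shape[0] - n + 1, x.shape[1] - n + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(x[r:r + n, c:c + n] * k)
    return out

x = np.arange(9.0).reshape(3, 3)        # 3x3 input matrix
k = np.array([[1.0, 0.0], [0.0, 1.0]])  # 2x2 convolution kernel
y = conv2d(x, k)                        # 2x2 output matrix
```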
  • FIG. 3 is a schematic diagram of a memristive device unit provided by an embodiment of the present invention.
  • the read and write speed, device density, and programming voltage of memristive devices are comparable to today's leading storage technologies, and energy consumption is quite low.
  • the memristive device's analog memory function is similar to biological synapses, and its conductance can be continuously changed by applying a relatively large voltage bias, but it remains unchanged when a small or no bias is applied.
  • the conductance gradient characteristic of the memristive device is used to simulate the change process of biological synaptic weights, that is, to simulate the function of adaptive learning of neural networks.
  • the memristive device can be a two-terminal or three-terminal device, or another common type, to which positive and negative read voltage pulses can be applied.
  • This feature avoids introducing additional subtraction circuits when realizing positive and negative weight values, which reduces the circuit scale to a certain extent.
  • FIG. 4 is a schematic structural diagram of a convolution layer circuit module composed of a memristive device array as a convolution kernel provided by the present invention, which is composed of a convolution operation circuit and an activation function part composed of a memristive array.
  • the figure shows a convolution operation circuit with an input signal matrix of i elements, an n×n convolution kernel, and an output matrix of j elements.
  • one end of the memristive devices in the same row is connected to the input module, and the other end of the memristive devices in the same column is connected to the activation function f(·). Because weight values in a biological nervous system can be positive or negative, the circuit uses two rows of the memristive array as one convolution kernel to realize positive and negative kernel values.
  • the convolution kernels are shared within the convolution layer: the same kernel repeatedly scans the input matrix until all of its elements have been covered by the kernel matrix, producing a series of convolution results.
  • so that all convolution results are obtained in one step without an intermediate storage layer, the convolution kernel values are mapped, in correspondence with the memristive conductance values, into a matrix that can be matrix-multiplied with the entire input signal.
  • the convolution kernel is expanded into two large sparse matrices K+ and K−.
  • correspondingly, since positive and negative read voltage pulses can be applied to the memristive devices, the input signal is converted into two one-dimensional matrices: a positive input X and a negative input −X. The required size of the memristive array is therefore (2×i+1)×j.
  • the convolution operation circuit performs a convolution operation on the input voltage pulse signal and the convolution kernel value stored in the memristor unit, and collects the current in the same column to obtain the convolution operation result.
  • the convolution operation proceeds as y = f(X ⊛ (K+) − X ⊛ (K−) + b), where y is the result of the convolution operation, ⊛ is the convolution operator, X is the front-end synaptic input voltage signal of the neuron node, and K+ and K− are the positive and negative convolution kernel values corresponding to the neuron node.
  • the effective kernel value is (K+) − (K−), which realizes convolution kernel values of both positive and negative sign.
  • b is the bias term of the convolutional layer network.
  • f(·) is the activation function.
  • X_i in FIG. 4 represents an input voltage signal, and X_b represents the input voltage signal of the bias term.
  • the output is then passed to the pooling layer module.
  • the activation function f(·) may be a sigmoid, tanh, ReLU, ELU, or PReLU function.
  • the activation function activates the result of the convolution operation and produces two opposite output values, y and −y, which are converted into voltage pulse signals to serve as inputs to the pooling layer.
  • in the following, a 2×2 convolution kernel matrix K and a 3×3 input signal matrix X are taken as an example to demonstrate how the convolution kernel is expanded into the large sparse matrices K+ and K− and how the input matrix is converted into the two one-dimensional matrices X (positive input) and −X (negative input).
  • Fig. 5 (a) shows how the convolution kernel matrix K based on the memristive array is converted into the matrices K + and K- using the proposed method.
  • the convolution kernel is first rotated by 180° and then converted into the two matrices.
  • the memristors corresponding to zero-valued matrix elements are left unformed and remain in the high-resistance state throughout learning; the memristive array can therefore readily represent a convolution kernel with both positive and negative values. Since the input signal matrix X has 9 elements, each of the kernel matrices K+ and K− must have 9 rows.
  • Figure 5(b) shows how the input matrix X is converted into the two one-dimensional matrices X and −X, which are multiplied by K+ and K−, respectively. Since K is 2×2 and X is 3×3, the output feature is 2×2; the kernel matrices must therefore have 4 columns, one column per output value.
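The mapping of FIGS. 5(a) and 5(b) can be reproduced numerically: the kernel is rotated 180°, scattered into a 9-row by 4-column sparse matrix, split into the non-negative parts K+ and K−, and driven by the positive and negative input rows. The kernel values used here are illustrative assumptions.

```python
import numpy as np

def expand_kernel(k, in_size):
    """Expand an n x n kernel into the (in_size^2) x (out^2) sparse matrix
    described above, split into K+ (positive part) and K- (negated negative part)."""
    k = np.flip(k)                                  # the 180-degree rotation
    n = k.shape[0]
    out = in_size - n + 1
    m = np.zeros((in_size * in_size, out * out))
    for p in range(out):                            # one column per output value
        for q in range(out):
            col = p * out + q
            for a in range(n):
                for b in range(n):
                    row = (p + a) * in_size + (q + b)
                    m[row, col] = k[a, b]
    k_pos = np.maximum(m, 0.0)                      # K+
    k_neg = np.maximum(-m, 0.0)                     # K-
    return k_pos, k_neg

x = np.arange(9, dtype=float)                       # flattened 3x3 input signal
k = np.array([[1.0, 0.0], [0.0, -1.0]])             # illustrative 2x2 kernel
kp, kn = expand_kernel(k, 3)
y = x @ kp + (-x) @ kn                              # positive and negative input rows
```

Because every entry of K+ and K− is non-negative, each can be stored directly as memristor conductances, which is the point of the two-matrix split.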
  • FIG. 6 is a schematic structural diagram of a pooling layer circuit module composed of a memristive device array as a convolution kernel according to an embodiment of the present invention, which is mainly divided into an average pooling operation and a maximum pooling operation.
  • the entire input matrix is divided into several non-overlapping blocks of the same size; within each block only the maximum or average value is kept, the other nodes are discarded, and the original planar structure is preserved to obtain the output.
  • the pooling operation very effectively reduces the size of the matrix and thus the number of parameters in the final fully connected layer; at the same time, the pooling layer both speeds up computation and helps prevent overfitting.
  • the pooling layer circuit module is connected to the convolution layer circuit module and the fully connected layer circuit module.
  • the pooling operation is a simpler convolution operation.
  • the convolution kernel values stored in the memristive array of the pooling circuit remain unchanged during training.
  • the circuit structure and the mapping of the convolution kernels are the same as in the convolution layer circuit module; only the stored kernel values differ.
  • one end of the memristive devices in the same row is connected to the output of the convolution layer circuit module, the other end of the memristive devices in the same column is connected to the voltage conversion module, and the output of the voltage conversion module is connected to the fully connected layer circuit module.
  • the currents on the same column are summed to realize the addition, and the result collected at the output of the voltage converter is the result of the pooling operation.
  • the voltage conversion module converts the result of the pooling operation circuit into two opposite voltage pulse signals, h and −h, which serve as the input of the fully connected layer circuit module.
  • in this embodiment a 2×2 pooling operation is used. Since the output matrix of the convolution layer circuit module has j elements and the pooling output matrix has k elements, the size of the memristive array of the pooling layer circuit module is (2×j+1)×k.
  • the result is passed to the fully connected layer circuit module, where h k in FIG. 6 represents the pooling operation result, and y j represents the output result of the convolution layer unit.
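A functional sketch of the non-overlapping 2×2 average pooling described above; the input values are illustrative assumptions.

```python
import numpy as np

def avg_pool(x, s=2):
    """Non-overlapping s x s average pooling; in the circuit this is a
    convolution whose kernel values are fixed during training."""
    m = x.shape[0] // s
    out = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            block = x[i * s:(i + 1) * s, j * s:(j + 1) * s]  # one block
            out[i, j] = block.mean()                          # keep the average
    return out

x = np.array([[1.0, 2.0, 5.0, 6.0],
              [3.0, 4.0, 7.0, 8.0],
              [4.0, 4.0, 0.0, 0.0],
              [4.0, 4.0, 0.0, 4.0]])
p = avg_pool(x)              # 4x4 input -> 2x2 output
```

Max pooling replaces `block.mean()` with `block.max()` over the same blocks.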
  • FIG. 7 is a schematic structural diagram of a fully connected layer circuit module composed of a memristive device as a synapse according to an embodiment of the present invention, which is connected to a pooling layer circuit module and an output module, respectively.
  • the fully connected layer circuit module maps the final output to a linearly separable space, i.e., it performs the classification function. It consists of a fully connected layer circuit composed of a memristive array and a softmax function part. Because the fully connected layer only performs a simple series of multiply-and-add operations, as in a perceptron network, and its neurons are fully connected to the neurons of the pooling layer, the fully connected layer circuit module uses a weight mapping method different from that of the convolution layer circuit module.
  • the fully connected layer circuit is used to store and compute the weight matrix, whereas the convolution operation circuit stores and computes a set of convolution kernel arrays.
  • two memristive devices are used as one synapse to realize positive and negative weight values.
  • one end of the memristive devices in the same row is connected to the output of the pooling layer circuit module, and the other end of the memristive devices in the same column is connected to the softmax function; collecting the current of the same column gives the output of the layer, m_l = Σ_k h_k · (W+_kl − W−_kl) + b_k.
  • h_k is the front-end synaptic input voltage pulse signal of the k-th neuron node, and W+_kl and W−_kl are the positive and negative synaptic weight values, stored in the memristors, of the k-th input of the l-th neuron node; the effective synaptic weight value is (W+_kl) − (W−_kl), which realizes positive and negative synaptic weight values.
  • b_k is the bias term corresponding to the k-th neuron node.
  • m_l denotes the l-th element output by the fully connected layer circuit, and z_l is the probability output value corresponding to the signal m_l after the softmax function.
  • the softmax function, z_l = e^(m_l) / Σ_l e^(m_l), normalizes the output values of the fully connected layer into probability values. Since the pooling layer output matrix has k elements and the final classification has l categories, the size of the memristive array of the fully connected layer circuit in this embodiment is (2×k+1)×l. The result is then passed to the output module to obtain the output of the entire network, and is also sent to the weight update module. h_b in FIG. 7 represents the input signal corresponding to the bias term. In the following, a 3×3 weight matrix is taken as an example to demonstrate how the weight matrix is mapped into the two matrices W+ and W−.
  • Figure 8 shows how a 3×3 weight matrix W based on the memristive array is converted into the two matrices W+ and W− using the proposed method; the memristors corresponding to zero-valued matrix elements of the two matrices are unformed and remain in the high-resistance state throughout learning, so the memristive array can readily represent weight values of both positive and negative sign.
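The differential-pair fully connected layer followed by softmax can be sketched as below; the layer sizes and random values are illustrative assumptions, and the max-subtraction is a standard numerical-stability trick, not part of the patent circuit.

```python
import numpy as np

def fc_softmax(h, w_pos, w_neg, b):
    """Fully connected layer with differential-pair weights followed by softmax.
    The effective weight is W+ - W-, matching the two-memristor-per-synapse scheme."""
    m = h @ (w_pos - w_neg) + b          # m_l = sum_k h_k * (W+_kl - W-_kl) + b
    e = np.exp(m - m.max())              # subtract the max for numerical stability
    return e / e.sum()                   # z_l: probabilities summing to 1

rng = np.random.default_rng(0)
h = rng.random(4)                        # hypothetical pooling-layer outputs
w_pos = rng.random((4, 3))               # hypothetical 4-input, 3-class layer
w_neg = rng.random((4, 3))
z = fc_softmax(h, w_pos, w_neg, np.zeros(3))
```

The column with the largest z value is the predicted class, since softmax preserves the ordering of the m values.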
  • FIG. 9 is a schematic diagram of a weight update module according to an embodiment of the present invention, which is respectively connected to an output module and a convolutional neural network module, and includes a result comparison unit, a calculation unit, and a driving unit.
  • the result comparison unit is respectively connected to the output unit and the calculation unit.
  • the result comparison unit compares the output result of the current convolutional neural network module with the ideal result and sends the comparison result ⁇ to the calculation unit.
  • the calculation unit is connected to the result comparison unit and the driving unit, respectively.
  • the calculation unit receives the error signal δ sent by the result comparison unit, calculates the adjustment amount Δ of the network convolution kernel values or weight values according to the chosen neural network back-propagation algorithm, and sends the result to the driving unit; the driving unit includes a pulse generator and a read-write circuit.
  • the driving unit receives the adjustment amount of the convolution kernel value or the weight value sent by the calculation unit, and adjusts the conductance of the memristive device of the convolution layer circuit unit and the fully connected layer circuit unit.
  • the pulse generator is used to generate a conductance modulation signal that adjusts the memristive device.
  • the read-write circuit is used to complete the read-write operation of the convolution kernel value or connection weight of the convolutional neural network module based on the memristive device.
  • in the following, the operation of the convolution layer circuit module is taken as an example to demonstrate how the memristive array serves as convolution kernels and how its kernel values are updated during learning.
  • FIG. 10 is a schematic circuit diagram of the operation of the memristive array of the convolution layer circuit module in the weight update phase according to an embodiment of the present invention.
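In the update scheme this section describes, the programming pulses have fixed amplitude and their count is proportional to the adjustment amount Δ. A minimal sketch, assuming a hypothetical per-pulse conductance step `G_STEP`:

```python
G_STEP = 1e-6      # assumed conductance change per programming pulse (siemens)

def program_pulses(delta_w, g_step=G_STEP):
    """Map a kernel/weight adjustment onto a train of fixed-amplitude pulses:
    the pulse count is proportional to |delta_w| and the sign sets the polarity."""
    n = int(round(abs(delta_w) / g_step))   # number of identical pulses
    polarity = 1 if delta_w >= 0 else -1    # potentiating or depressing pulses
    return n, polarity

n_pulses, polarity = program_pulses(3.2e-6)
```

Applying this column by column, with the selected column line grounded, mirrors the array-update procedure sketched in FIG. 10.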


Abstract

The present invention discloses a convolutional neural network on-chip learning system based on non-volatile memory, comprising an input module, a convolutional neural network module, an output module, and a weight update module. On-chip learning of the convolutional neural network module realizes the synapse function by using the characteristic that the conductance of a memristor changes as pulses are applied, with the convolution kernel values or synaptic weight values stored in the memristive units. The input module converts an input signal into the voltage signal required by the convolutional neural network module; the convolutional neural network module computes and transforms the input voltage signal layer by layer and passes the result to the output module to obtain the network output; the weight update module adjusts the conductance values of the memristors in the convolutional neural network module according to the result of the output module, realizing the update of the network convolution kernel values or synaptic weight values. The present invention aims to realize on-chip learning of convolutional neural networks and online processing of data; based on the high parallelism of the network, it meets the demands of high speed, low power consumption, and low hardware cost.

Description

Convolutional neural network on-chip learning system based on non-volatile memory [Technical Field]
The present invention relates to the technical field of artificial neural networks, and more specifically, to a convolutional neural network on-chip learning system based on non-volatile memory.
[Background Art]
An artificial neural network is a structure modeled on the synaptic connections of the brain: a mathematical model that imitates the behavioral characteristics of animal neural networks and performs distributed, parallel information processing. Among the many machine learning algorithms, neural networks are widely applicable and robust. Such a network relies on the complexity of the system and processes information by adjusting the interconnections among a large number of internal nodes.
As one of the most important algorithms in deep learning, convolutional neural networks have great advantages in large-scale image recognition. A convolutional neural network (CNN) architecture composed of multiple convolutional layers and pooling layers can extract useful feature information without a large amount of manual data input, which gives it high accuracy in various pattern recognition applications; implementing it in hardware is therefore very promising work. Much existing work is based on CPUs and GPUs, which leads to high energy consumption, the so-called von Neumann bottleneck, so a new type of memory device that can emulate the human brain and simultaneously store and process information is urgently needed. A CNN exploits the spatial structure of the input image and is better suited to visual tasks than other network structures such as fully connected neural networks. So far, the biggest challenge is integrating CNNs into embedded systems. For a hardware implementation, synapses are the most numerous processing elements in a neural network; many synaptic devices have been reported, such as magnetic memories, phase-change memories, and memristors. Among these, the memristor is a very promising candidate because it is non-volatile, easy to integrate, low-power, and capable of multi-bit storage. The analog memory function of a memristor resembles a biological synapse: its conductance can be changed continuously by applying a relatively large voltage bias, but remains unchanged under a small or zero bias. Memristors can be integrated in a crossbar structure, with which a synaptic density close to or greater than that of brain tissue can be achieved. A memristor-based CNN can use easily realizable neural and auxiliary circuits while reducing energy consumption, computing at higher speed, and achieving physical parallelism.
[Summary of the Invention]
In view of the defects of the prior art, the object of the present invention is to solve the von Neumann bottleneck encountered when convolutional neural networks are implemented on conventional computers, where the separation of computation and storage is time-consuming and slow and leads to enormous hardware costs, and where off-chip learning implemented on a computer can only realize specific pre-trained functions and cannot solve problems flexibly in real time.
To achieve the above object, the present invention provides a convolutional neural network on-chip learning system based on non-volatile memory, comprising an input module, a convolutional neural network module, an output module, and a weight update module.
The input module converts an input signal into the input voltage pulse signal required by the convolutional neural network module and passes it to the convolutional neural network module.
The convolutional neural network module computes and transforms, layer by layer, the input voltage pulse signal corresponding to the input signal to complete on-chip learning and obtain an output signal; it realizes the synapse function by using the conductance modulation characteristic of memristive devices, whose conductance changes as pulses are applied, and the network convolution kernel values or synaptic weight values used during on-chip learning are stored in the memristive devices.
The output module converts the output signal generated by the convolutional neural network module and sends it to the weight update module.
The weight update module calculates an error signal according to the result of the output module and adjusts the conductance values of the memristive devices, realizing the update of the network convolution kernel values or synaptic weight values.
Optionally, the input module converts the external input signal into the input voltage pulse signal required by the convolutional neural network; the pulse width or pulse amplitude of the input voltage pulse signal is directly proportional to the input signal, and the input voltage pulse signal should be smaller than the write voltage of the memristor.
Optionally, the convolutional neural network module uses memristive devices, whose resistance changes as electrical signals are applied, to emulate the network convolution kernel values and synaptic weight values.
The convolutional neural network module comprises a convolution layer circuit unit and a pooling layer circuit unit in which memristor arrays serve as convolution kernels, and a fully connected layer circuit unit in which a memristor array serves as synapses.
The convolution layer circuit unit receives the input voltage pulse signal output by the input module; the signal is computed and transformed layer by layer through the convolution layer circuit unit, the pooling layer circuit unit, and the fully connected layer circuit unit, and the result is sent to the output module as the output signal.
Optionally, the convolution layer circuit unit consists of a convolution operation circuit composed of a memristive array and an activation function part.
The convolution operation circuit uses two rows of the memristive array as one convolution kernel to realize positive and negative kernel values. With the initial kernel values corresponding to memristive conductance values, the kernel values are mapped into a matrix that can be matrix-multiplied with the entire input signal; the convolution kernel is expanded into two large sparse matrices K+ and K−, the positive and negative kernel values corresponding to the neuron nodes, and correspondingly the input signal is converted, using the capability of memristive devices to accept positive and negative read voltage pulses, into two one-dimensional matrices: a positive input X and a negative input −X.
The convolution operation circuit convolves the input voltage pulse signal with the kernel values stored in the memristor units, and the convolution result is obtained by collecting the current of each column; the convolution operation is
y = f(X ⊛ (K+) − X ⊛ (K−) + b)
where y is the result of the convolution operation, ⊛ is the convolution operator, X is the front-end synaptic input voltage signal of the neuron node, K+ and K− are the positive and negative convolution kernel values corresponding to the neuron node, b is the bias term of the convolutional layer network, and f(·) is the activation function.
The convolution layer circuit unit passes the convolution result to the pooling layer circuit unit.
The activation function activates the convolution result and produces two opposite output values, y and −y, which are converted into voltage pulse signals to serve as the input of the pooling layer circuit unit.
Optionally, the pooling layer circuit unit, which performs either an average pooling operation or a maximum pooling operation, consists of a pooling operation circuit composed of a memristive array and a voltage conversion subunit.
The network convolution kernel values stored in the memristive array of the pooling operation circuit remain unchanged during training; the circuit structure and kernel mapping are the same as in the convolution layer circuit unit, only the stored kernel values differ.
The voltage conversion subunit converts the result of the pooling operation circuit into two opposite voltage pulse signals, h and −h, which serve as the input of the fully connected layer circuit unit.
Optionally, the fully connected layer circuit unit realizes the classification function and consists of a fully connected layer circuit composed of a memristive array and a softmax function part; the weight mapping method of the fully connected layer circuit unit differs from that of the convolution layer circuit unit.
The fully connected layer circuit is used to store and compute the weight matrix; it only performs a series of multiply-and-add operations, with no translation of the weight matrix. Two memristive devices are used as one synapse to realize positive and negative weight values; one end of the memristive devices is connected to the pooling layer circuit unit and the other end to the softmax function. Collecting the current of each column gives the output of the layer,
m_l = Σ_k h_k · (W+_kl − W−_kl) + b_k
where h_k is the front-end synaptic input voltage pulse signal of the k-th neuron node, and W+_kl and W−_kl are the positive and negative synaptic weight values, stored in the memristors, of the k-th input of the l-th neuron node; the effective synaptic weight value is (W+_kl) − (W−_kl), which realizes positive and negative synaptic weight values. b_k is the bias term corresponding to the k-th neuron node, m_l denotes the l-th element output by the fully connected layer circuit, Σ_l e^(m_l) is the exponential sum over all output signal elements, and z_l is the probability output value corresponding to the signal m_l after the softmax function.
The softmax function implements
z_l = e^(m_l) / Σ_l e^(m_l)
that is, it normalizes the values output by the fully connected layer into probability values; the result is then passed to the output module to obtain the output of the entire convolutional neural network, and the result is sent to the weight update module.
Optionally, the weight update module includes a result comparison unit, a calculation unit, and a driving unit.
The result comparison unit is connected to the output module and the calculation unit, respectively; it compares the output result of the current convolutional neural network module with a preset ideal result and sends the comparison result to the calculation unit.
The calculation unit is connected to the result comparison unit and the driving unit, respectively; it receives the error signal δ sent by the result comparison unit, calculates the adjustment amount of the network convolution kernel values or weight values according to the chosen neural network back-propagation algorithm, and sends the result to the driving unit. The driving unit includes a pulse generator and a read-write circuit; it receives the adjustment amount of the kernel values or weight values sent by the calculation unit and adjusts the conductance of the memristive devices of the convolution layer circuit unit and the fully connected layer circuit unit. The pulse generator generates the conductance modulation signals that adjust the memristive devices, and the read-write circuit performs the read and write operations on the network convolution kernel values or synaptic weight values of the memristive-device-based convolutional neural network module.
In general, compared with the prior art, the above technical solutions conceived by the present invention have the following beneficial effects:
In the convolutional neural network on-chip learning system based on non-volatile memory provided by the present invention, the input module receives external information and converts it into voltage pulse signals; the signals are processed layer by layer through the convolution layer, pooling layer, and fully connected layer of the convolutional neural network module and then passed to the output module, which sends the result to the weight update module; the weight update module calculates and adjusts the conductance values of the memristive devices according to the result of the output module, realizing the update of the network convolution kernel values or synaptic weight values. The convolutional neural network module uses the multilevel conductance modulation characteristic of memristive devices, whose conductance changes as electrical signals are applied, to emulate the continuous adjustment of kernel values and synaptic weight values; the convolution and pooling layers use two rows of the memristive array as one convolution kernel to realize positive and negative kernels, and the fully connected layer uses two memristive devices as one synapse to realize positive and negative weight values. The convolution operation is the most time-consuming part of a convolutional neural network; by implementing the network with memristive devices and exploiting their high parallelism, the operating speed and density of the whole system are greatly improved and the operating energy consumption is greatly reduced, the fusion of information storage and computation is realized, and on-chip learning of the convolutional neural network is achieved. This is expected to enable real-time, low-energy emulation of brain-scale neural networks and to overcome the shortcomings of brain-computing structures based on the traditional von Neumann architecture.
[Brief Description of the Drawings]
FIG. 1 is a schematic structural diagram of a convolutional neural network on-chip learning system based on non-volatile memory according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the principle of the matrix convolution operation according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a memristive device unit according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a convolution layer circuit module in which a memristive device array serves as convolution kernels according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the mapping formulas for the convolution kernel matrix and the input matrix according to an embodiment of the present invention; FIG. 5(a) shows how the kernel matrix K is converted into the matrices K+ and K−, and FIG. 5(b) shows how the input matrix X is converted into the two one-dimensional matrices X and −X;
FIG. 6 is a schematic structural diagram of a pooling layer circuit module in which a memristive device array serves as convolution kernels according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a fully connected layer circuit module in which memristive devices serve as synapses according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of the weight matrix mapping formula according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a weight update module according to an embodiment of the present invention;
FIG. 10 is a schematic circuit diagram of the operation of the memristive array of the convolution layer circuit module in the weight update phase according to an embodiment of the present invention.
[Detailed Description of the Embodiments]
In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention, not to limit it. In addition, the technical features involved in the embodiments of the present invention described below can be combined with one another as long as they do not conflict.
On-chip learning of a convolutional neural network can not only overcome the influence of device variability; it also conforms better to biological learning characteristics and allows the weights to be modified according to the task to be performed, providing good flexibility. It is therefore necessary to realize the hardware implementation of convolutional neural networks, the fusion of storage and computation, and on-chip learning.
The present invention provides a convolutional neural network on-chip learning system based on non-volatile memory, comprising an input module, a convolutional neural network module, an output module, and a weight update module; on-chip learning of the convolutional neural network module realizes the synapse function by using the analog conductance modulation characteristic of memristive devices, whose conductance changes as pulses are applied, and the convolution kernel values or synaptic weight values are stored in the memristive device units.
The input module converts the input signal into the input voltage pulse signal required by the convolutional neural network and passes the result to the convolutional neural network module; the convolutional neural network module computes and transforms the corresponding input voltage pulse signal layer by layer and passes the result to the output module to obtain the output of the entire network; the output module is connected to the convolutional neural network module and to the weight update module, and converts the output signal generated by the convolutional neural network module and sends it to the weight update module; the weight update module calculates and adjusts the conductance values of the memristive devices according to the result of the output module, realizing the update of the network convolution kernel values or synaptic weight values.
Optionally, the input module converts the external input signal into the voltage signal required by the convolutional neural network; the pulse width or pulse amplitude of the voltage pulse signal is directly proportional to the input signal. The larger the input signal value, the wider (larger) the pulse width (or amplitude) of the corresponding voltage pulse signal; conversely, the smaller the value, the narrower (smaller) the pulse, and the voltage should remain below the write voltage of the memristor.
Optionally, the convolutional neural network module uses memristive devices, whose resistance changes as electrical signals are applied, to emulate the convolution kernel values and synaptic weight values. The convolutional neural network module comprises a convolution layer circuit module and a pooling layer circuit module in which memristive device arrays serve as convolution kernels, and a fully connected layer circuit module in which memristive devices serve as synapses; the convolution layer circuit module receives the input voltage pulse signal output by the input module, the signal is computed and transformed layer by layer through the convolution layer circuit module, the pooling layer circuit module, and the fully connected layer circuit module, and the result is sent to the output module.
Optionally, the convolution layer circuit module, in which a memristive device array serves as the convolution kernels, consists of a convolution operation circuit composed of a memristive array and an activation function part. Because weight values in a biological nervous system can be positive or negative, the circuit uses two rows of the memristive array as one convolution kernel to realize positive and negative kernel values. Meanwhile, so that all convolution results are obtained in one step without a complex intermediate storage layer, with the initial kernel values corresponding to memristive conductance values, the kernel values are mapped into a matrix that can be matrix-multiplied with the entire input signal: the kernel is expanded into two large sparse matrices K+ and K−, and correspondingly the input signal is converted, using the capability of the memristive devices to accept positive and negative read voltage pulses, into the two one-dimensional matrices X (positive input) and −X (negative input). The convolution operation circuit convolves the input voltage pulse signal with the kernel values stored in the memristor units, and the convolution result is obtained by collecting the current of each column. The convolution operation is
y = f(X ⊛ (K+) − X ⊛ (K−) + b)
where ⊛ is the convolution operator, X is the front-end synaptic input voltage signal of the neuron node, and K+ and K− are the positive and negative convolution kernel values corresponding to the neuron node; the effective kernel value is (K+) − (K−), which realizes kernel values of both positive and negative sign; b is the bias term of the convolutional layer network and f(·) is the activation function. The output result is then passed to the pooling layer module. The activation function f(·) may be a sigmoid, tanh, ReLU, ELU, or PReLU function; it activates the convolution result and produces the two opposite output values y and −y, which are converted into voltage pulse signals to serve as the input of the pooling layer.
Optionally, the pooling layer circuit module, in which a memristive device array serves as the convolution kernels, performs either an average pooling operation or a maximum pooling operation and consists of a pooling operation circuit composed of a memristive array and a voltage conversion module. The pooling operation is a simpler convolution operation whose kernel values, stored in the memristive array, remain unchanged during training; the circuit structure and kernel mapping are the same as in the convolution layer circuit module, only the stored kernel values differ. One end of the memristive devices in the same row is connected to the output of the convolution layer circuit module, and the other end of the memristive devices in the same column is connected to the voltage conversion module, which converts the result of the pooling operation circuit into the two opposite voltage pulse signals h and −h to serve as the input of the fully connected layer circuit module.
Optionally, the fully connected layer circuit module, in which a memristor array serves as synapses, realizes the classification function and consists of a fully connected layer circuit composed of a memristive array and a softmax function part. Because the neurons of the fully connected layer are fully connected to the neurons of the pooling layer, the weight mapping method of the fully connected layer circuit module differs from that of the convolution layer circuit module; the fully connected layer circuit is used to store and compute the weight matrix and only performs a series of multiply-and-add operations, with no translation of the weight matrix. Two memristive devices are used as one synapse to realize positive and negative weight values; one end of the memristive devices is connected to the pooling layer circuit module and the other end to the softmax function, and collecting the current of each column gives the output of the layer,
m_l = Σ_k h_k · (W+_kl − W−_kl) + b_k
where h_k is the front-end synaptic input voltage pulse signal of the k-th neuron node, and W+_kl and W−_kl are the positive and negative synaptic weight values, stored in the memristors, of the k-th input of the l-th neuron node; the effective synaptic weight value is (W+_kl) − (W−_kl), which realizes positive and negative synaptic weight values. b_k is the bias term corresponding to the k-th neuron node, m_l denotes the l-th element output by the fully connected layer circuit, Σ_l e^(m_l) is the exponential sum over all output signal elements, and z_l is the probability output value corresponding to the signal m_l after the softmax function. The softmax function implements
z_l = e^(m_l) / Σ_l e^(m_l)
that is, it normalizes the values output by the fully connected layer into probability values; the result is then passed to the output module to obtain the output of the entire network, and the result is sent to the weight update module.
Optionally, the weight update module includes a result comparison module, a calculation module, and a driving module. The result comparison module is connected to the output module and the calculation module, respectively; it compares the output result of the current convolutional neural network module with the ideal result and sends the comparison result to the calculation module. The calculation module is connected to the result comparison module and the driving circuit, respectively; it receives the error signal δ sent by the result comparison module, calculates the adjustment amount of the network convolution kernel values or weight values according to the chosen neural network back-propagation algorithm, and sends the result to the driving unit. The driving unit includes a pulse generator and a read-write circuit; it receives the adjustment amount of the kernel values or weight values sent by the calculation unit and adjusts the conductance of the memristive devices of the convolution layer circuit unit and the fully connected layer circuit unit. The pulse generator generates the conductance modulation signals that adjust the memristive devices, and the read-write circuit performs the read and write operations on the convolution kernel values or synaptic weight values of the memristive-device-based convolutional neural network module.
FIG. 1 is a schematic structural diagram of the convolutional neural network on-chip learning system based on non-volatile memory according to an embodiment of the present invention. As shown in FIG. 1, the system includes an input module, a convolutional neural network module, an output module, and a weight update module.
The input module converts the external input signal into the voltage signal required by the convolutional neural network; the pulse width or pulse amplitude of the voltage pulse signal is directly proportional to the input signal (the larger the input signal value, the wider or larger the pulse; the smaller the value, the narrower or smaller the pulse), and the voltage signal is passed to the convolutional neural network module.
The convolutional neural network module computes and transforms the input voltage pulse signal corresponding to the input signal layer by layer and passes the result to the output module to obtain the output of the entire network.
The output module is connected to the convolutional neural network module and to the weight update module, and converts the output signal generated by the convolutional neural network module and sends it to the weight update module.
The weight update module calculates and adjusts the conductance values of the memristive devices according to the result of the output module, realizing the update of the network convolution kernel values or synaptic weight values.
It should be noted that the convolution operation is the most important and most computation-intensive part of a convolutional neural network; as a generalized integral concept, convolution has important applications in image recognition and digital signal processing. The convolution operation starts from the upper left corner of the input matrix and opens an active window of the same size as the template (i.e., the convolution kernel); the kernel is usually a square grid structure, and each square in the grid holds a weight value. First the kernel is rotated by 180°; the window matrix and the kernel elements are multiplied element by element and summed, and the result replaces the element at the center of the window. The active window then moves one column to the right and the same operation is performed, and so on from left to right and top to bottom, until the whole matrix has been covered by the kernel, yielding the new convolved matrix. When the input matrix is m×m and the kernel matrix is n×n, the corresponding output matrix is (m−n+1)×(m−n+1). FIG. 2 demonstrates the convolution calculation in which a 3×3 input matrix and a 2×2 kernel yield a 2×2 output matrix.
FIG. 3 is a schematic diagram of a memristive device unit according to an embodiment of the present invention. As a non-volatile device, the memristive device is comparable to today's leading storage technologies in read-write speed, device density, and programming voltage, with considerably lower energy consumption. Its analog memory function resembles a biological synapse: its conductance can be changed continuously by applying a relatively large voltage bias, but remains unchanged under a small or zero bias. Different conductance values of the memristive device are used to distinguish different storage states, and the gradual conductance change under pulsing is used to emulate the change of biological synaptic weights, that is, the adaptive learning function of neural networks. The memristive device may be a two-terminal or three-terminal device or another common type. Moreover, positive and negative read voltage pulses can be applied; this avoids introducing an additional subtraction circuit when realizing positive and negative weight values, which reduces the circuit scale to a certain extent.
FIG. 4 is a schematic structural diagram of the convolution layer circuit module provided by the present invention, in which a memristive device array serves as the convolution kernels; it consists of a convolution operation circuit composed of a memristive array and an activation function part. The figure shows a convolution operation circuit with an input signal matrix of i elements, an n×n convolution kernel, and an output matrix of j elements. One end of the memristive devices in the same row is connected to the input module, and the other end of the memristive devices in the same column is connected to the activation function f(·). Because weight values in a biological nervous system can be positive or negative, the circuit uses two rows of the memristive array as one convolution kernel to realize positive and negative kernel values. The kernels are shared within the convolution layer: the same kernel repeatedly scans the input matrix until all of its elements have been covered by the kernel matrix, yielding a series of convolution results. So that all convolution results are obtained in one step without a complex intermediate storage layer, with the initial kernel values corresponding to memristive conductance values, the kernel values are mapped into a matrix that can be matrix-multiplied with the entire input signal: the kernel is expanded into the two large sparse matrices K+ and K−, and correspondingly the input signal is converted, using the capability of the memristive devices to accept positive and negative read voltage pulses, into the two one-dimensional matrices X (positive input) and −X (negative input). The required size of the memristive array is therefore (2×i+1)×j. The convolution operation circuit convolves the input voltage pulse signal with the kernel values stored in the memristor units, and the convolution result is obtained by collecting the current of each column. The convolution operation is
y = f(X ⊛ (K+) − X ⊛ (K−) + b)
where y is the result of the convolution operation, ⊛ is the convolution operator, X is the front-end synaptic input voltage signal of the neuron node, and K+ and K− are the positive and negative convolution kernel values corresponding to the neuron node; the effective kernel value is (K+) − (K−), which realizes kernel values of both positive and negative sign; b is the bias term of the convolutional layer network and f(·) is the activation function. In FIG. 4, X_i represents an input voltage signal and X_b represents the input voltage signal of the bias term.
The output result is then passed to the pooling layer module. The activation function f(·) may be a sigmoid, tanh, ReLU, ELU, or PReLU function; it activates the convolution result and produces the two opposite output values y and −y, which are converted into voltage pulse signals to serve as the input of the pooling layer. In the following, a 2×2 kernel matrix K and a 3×3 input signal matrix X are taken as an embodiment to demonstrate how the kernel is expanded into the large sparse matrices K+ and K− and how the input matrix is converted into the two one-dimensional matrices X and −X.
FIG. 5(a) shows how, using the proposed method, the memristive-array-based kernel matrix K is converted into the matrices K+ and K−. The kernel is first rotated by 180° and then converted into the two matrices; the memristors corresponding to zero-valued matrix elements are left unformed and remain in the high-resistance state throughout learning, so the memristive array can readily represent a kernel with both positive and negative values. Since the input signal matrix X has 9 elements, each of the kernel matrices K+ and K− must have 9 rows.
FIG. 5(b) shows how the input matrix X is converted into the two one-dimensional matrices X and −X, which are multiplied by K+ and K−, respectively. Since K is 2×2 and X is 3×3, the output feature is 2×2; the kernel matrices must therefore have 4 columns, one column per output value.
FIG. 6 is a schematic structural diagram of the pooling layer circuit module according to an embodiment of the present invention, in which a memristive device array serves as the convolution kernels; it performs either an average pooling operation or a maximum pooling operation. The entire input matrix is divided into several non-overlapping blocks of the same size; within each block only the maximum or average value is kept, the other nodes are discarded, and the original planar structure is preserved to obtain the output. Pooling very effectively reduces the matrix size and hence the number of parameters in the final fully connected layer; at the same time, the pooling layer both speeds up computation and helps prevent overfitting. The pooling layer circuit module is connected to the convolution layer circuit module and the fully connected layer circuit module. The pooling operation is a simpler convolution operation whose kernel values, stored in the memristive array, remain unchanged during training; the circuit structure and kernel mapping are the same as in the convolution layer circuit module, only the stored kernel values differ. One end of the memristive devices in the same row is connected to the output of the convolution layer circuit module, the other end of the memristive devices in the same column is connected to the voltage conversion module, and the output of the voltage conversion module is connected to the fully connected layer circuit module. The currents on the same column are summed to realize the addition, and the result collected at the output of the voltage converter is the result of the pooling operation. The voltage conversion module converts this result into the two opposite voltage pulse signals h and −h to serve as the input of the fully connected layer circuit module. In this embodiment a 2×2 pooling operation is used; since the output matrix of the convolution layer circuit module has j elements and the pooling output matrix has k elements, the size of the memristive array of the pooling layer circuit module is (2×j+1)×k. After the pooling operation the result is passed to the fully connected layer circuit module; in FIG. 6, h_k represents the pooling result and y_j represents the output of the convolution layer unit.
FIG. 7 is a schematic structural diagram of the fully connected layer circuit module according to an embodiment of the present invention, in which memristive devices serve as synapses; it is connected to the pooling layer circuit module and the output module, respectively. The fully connected layer circuit module maps the final output to a linearly separable space, i.e., it realizes the classification function, and consists of a fully connected layer circuit composed of a memristive array and a softmax function part. Because the fully connected layer performs a simple series of multiply-and-add operations, as in a perceptron network, and its neurons are fully connected to the neurons of the pooling layer, the weight mapping method of the fully connected layer circuit module differs from that of the convolution layer circuit module: the fully connected layer circuit is used to store and compute the weight matrix, whereas the convolution operation circuit stores and computes a set of convolution kernel arrays. Likewise, two memristive devices are used as one synapse to realize positive and negative weight values; one end of the memristive devices in the same row is connected to the output of the pooling layer circuit module, the other end of the memristive devices in the same column is connected to the softmax function, and collecting the current of each column gives the output of the layer,
m_l = Σ_k h_k · (W+_kl − W−_kl) + b_k
where h_k is the front-end synaptic input voltage pulse signal of the k-th neuron node, and W+_kl and W−_kl are the positive and negative synaptic weight values, stored in the memristors, of the k-th input of the l-th neuron node; the effective synaptic weight value is (W+_kl) − (W−_kl), which realizes positive and negative synaptic weight values. b_k is the bias term corresponding to the k-th neuron node, m_l denotes the l-th element output by the fully connected layer circuit, Σ_l e^(m_l) is the exponential sum over all output signal elements, and z_l is the probability output value corresponding to the signal m_l after the softmax function. The softmax function implements
z_l = e^(m_l) / Σ_l e^(m_l)
that is, it normalizes the values output by the fully connected layer into probability values. Since the pooling layer output matrix has k elements, if there are l final classification categories, the size of the memristive array of the fully connected layer circuit in this embodiment is (2×k+1)×l. The result is then passed to the output module to obtain the output of the entire network, and the result is sent to the weight update module. In FIG. 7, h_b represents the input signal corresponding to the bias term. In the following, a 3×3 weight matrix is taken as an embodiment to demonstrate how the weight matrix is mapped into the two matrices W+ and W−.
FIG. 8 shows how, using the proposed method, the memristive-array-based 3×3 weight matrix W is converted into the two one-dimensional matrices W+ and W−; the memristors corresponding to zero-valued matrix elements of the two matrices are unformed and remain in the high-resistance state throughout learning, so the memristive array can readily represent weight values of both positive and negative sign.
FIG. 9 is a schematic diagram of the weight update module according to an embodiment of the present invention, which is connected to the output module and the convolutional neural network module, respectively, and includes a result comparison unit, a calculation unit, and a driving unit. The result comparison unit is connected to the output unit and the calculation unit; it compares the output result of the current convolutional neural network module with the ideal result and sends the comparison result δ to the calculation unit. The calculation unit is connected to the result comparison unit and the driving unit; it receives the error signal δ sent by the result comparison unit, calculates the adjustment amount Δ of the network convolution kernel values or weight values according to the chosen neural network back-propagation algorithm, and sends the result to the driving unit. The driving unit includes a pulse generator and a read-write circuit; it receives the adjustment amount of the kernel values or weight values sent by the calculation unit and adjusts the conductance of the memristive devices of the convolution layer circuit unit and the fully connected layer circuit unit. The pulse generator generates the conductance modulation signals that adjust the memristive devices, and the read-write circuit performs the read and write operations on the convolution kernel values or connection weights of the memristive-device-based convolutional neural network module. In the following, the operation of the convolution layer circuit module is taken as an embodiment to demonstrate how the kernel values of the memristive array serving as convolution kernels are updated during learning.
FIG. 10 is a schematic circuit diagram of the operation of the memristive array of the convolution layer circuit module in the weight update phase according to an embodiment of the present invention. First a column of the array is selected for conductance adjustment: the column line of that column is grounded, and different conductance-adjustment voltage pulse signals are applied to the row lines. The applied voltage pulses have a fixed amplitude, and the number of pulses is proportional to the adjustment amount Δ of the kernel value or synaptic weight value, thereby realizing the update of the kernel values or synaptic weight values. The same procedure is applied to the other columns in turn to adjust the memristor conductances of the entire array. The weight update in the fully connected layer circuit module is similar.
Those skilled in the art will readily understand that the above description is only preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (7)

  1. A convolutional neural network on-chip learning system based on non-volatile memory, characterized by comprising: an input module, a convolutional neural network module, an output module, and a weight update module;
    the input module converts an input signal into the input voltage pulse signal required by the convolutional neural network module and passes it to the convolutional neural network module;
    the convolutional neural network module computes and transforms, layer by layer, the input voltage pulse signal corresponding to the input signal to complete on-chip learning and obtain an output signal; it realizes the synapse function by using the conductance modulation characteristic of memristive devices, whose conductance changes as pulses are applied, and the network convolution kernel values or synaptic weight values used during on-chip learning are stored in the memristive devices;
    the output module converts the output signal generated by the convolutional neural network module and sends it to the weight update module;
    the weight update module calculates an error signal according to the result of the output module and adjusts the conductance values of the memristive devices, realizing the update of the network convolution kernel values or synaptic weight values.
  2. The convolutional neural network on-chip learning system based on non-volatile memory according to claim 1, characterized in that the input module converts the external input signal into the input voltage pulse signal required by the convolutional neural network; the pulse width or pulse amplitude of the input voltage pulse signal is directly proportional to the input signal, and the input voltage pulse signal should be smaller than the write voltage of the memristor.
  3. The convolutional neural network on-chip learning system based on non-volatile memory according to claim 1, characterized in that the convolutional neural network module uses memristive devices, whose resistance changes as electrical signals are applied, to emulate the network convolution kernel values and synaptic weight values;
    the convolutional neural network module comprises: a convolution layer circuit unit and a pooling layer circuit unit in which memristor arrays serve as convolution kernels, and a fully connected layer circuit unit in which a memristor array serves as synapses;
    the convolution layer circuit unit receives the input voltage pulse signal output by the input module; the input voltage pulse signal is computed and transformed layer by layer through the convolution layer circuit unit, the pooling layer circuit unit, and the fully connected layer circuit unit, and the result is sent to the output module as the output signal.
  4. The convolutional neural network on-chip learning system based on non-volatile memory according to claim 3, characterized in that the convolution layer circuit unit consists of a convolution operation circuit composed of a memristive array and an activation function part;
    the convolution operation circuit uses two rows of the memristive array as one convolution kernel to realize positive and negative kernel values; with the initial kernel values corresponding to memristive conductance values, the kernel values are mapped into a matrix that can be matrix-multiplied with the entire input signal, the convolution kernel is expanded into two large sparse matrices K+ and K−, the positive and negative kernel values corresponding to the neuron nodes, and correspondingly the input signal is converted, using the capability of memristive devices to accept positive and negative read voltage pulses, into two one-dimensional matrices: a positive input X and a negative input −X;
    the convolution operation circuit convolves the input voltage pulse signal with the kernel values stored in the memristor units, and the convolution result is obtained by collecting the current of each column; the convolution operation is
    y = f(X ⊛ (K+) − X ⊛ (K−) + b)
    where y is the result of the convolution operation, ⊛ is the convolution operator, X is the front-end synaptic input voltage signal of the neuron node, K+ and K− are the positive and negative convolution kernel values corresponding to the neuron node, b is the bias term of the convolutional layer network, and f(·) is the activation function;
    the convolution layer circuit unit passes the convolution result to the pooling layer circuit unit;
    the activation function activates the convolution result and produces two opposite output values, y and −y, which are converted into voltage pulse signals to serve as the input of the pooling layer circuit unit.
  5. The convolutional neural network on-chip learning system based on non-volatile memory according to claim 3, characterized in that the pooling layer circuit unit, which performs either an average pooling operation or a maximum pooling operation, consists of a pooling operation circuit composed of a memristive array and a voltage conversion subunit;
    the network convolution kernel values stored in the memristive array of the pooling operation circuit remain unchanged during training; the circuit structure and kernel mapping are the same as in the convolution layer circuit unit, only the stored kernel values differ;
    the voltage conversion subunit converts the result of the pooling operation circuit into two opposite voltage pulse signals, h and −h, which serve as the input of the fully connected layer circuit unit.
  6. The convolutional neural network on-chip learning system based on non-volatile memory according to claim 4, characterized in that the fully connected layer circuit unit realizes the classification function and consists of a fully connected layer circuit composed of a memristive array and a softmax function part; the weight mapping method of the fully connected layer circuit unit differs from that of the convolution layer circuit unit;
    the fully connected layer circuit is used to store and compute the weight matrix; it only performs a series of multiply-and-add operations, with no translation of the weight matrix; two memristive devices are used as one synapse to realize positive and negative weight values; one end of the memristive devices is connected to the pooling layer circuit unit and the other end to the softmax function, and collecting the current of each column gives the output of the layer,
    m_l = Σ_k h_k · (W+_kl − W−_kl) + b_k
    where h_k is the front-end synaptic input voltage pulse signal of the k-th neuron node, and W+_kl and W−_kl are the positive and negative synaptic weight values, stored in the memristors, of the k-th input of the l-th neuron node; the effective synaptic weight value is (W+_kl) − (W−_kl), which realizes positive and negative synaptic weight values; b_k is the bias term corresponding to the k-th neuron node, m_l denotes the l-th element output by the fully connected layer circuit, Σ_l e^(m_l) is the exponential sum over all output signal elements, and z_l is the probability output value corresponding to the signal m_l after the softmax function;
    the softmax function implements
    z_l = e^(m_l) / Σ_l e^(m_l)
    that is, it normalizes the values output by the fully connected layer into probability values; the result is then passed to the output module to obtain the output of the entire convolutional neural network, and the result is sent to the weight update module.
  7. The convolutional neural network on-chip learning system based on non-volatile memory according to claim 1, characterized in that the weight update module includes a result comparison unit, a calculation unit, and a driving unit;
    the result comparison unit is connected to the output module and the calculation unit, respectively; it compares the output result of the current convolutional neural network module with a preset ideal result and sends the comparison result to the calculation unit;
    the calculation unit is connected to the result comparison unit and the driving unit, respectively; it receives the error signal δ sent by the result comparison unit, calculates the adjustment amount of the network convolution kernel values or weight values according to the chosen neural network back-propagation algorithm, and sends the result to the driving unit; the driving unit includes a pulse generator and a read-write circuit; it receives the adjustment amount of the kernel values or weight values sent by the calculation unit and adjusts the conductance of the memristive devices of the convolution layer circuit unit and the fully connected layer circuit unit; the pulse generator generates the conductance modulation signals that adjust the memristive devices; the read-write circuit performs the read and write operations on the network convolution kernel values or synaptic weight values of the memristive-device-based convolutional neural network module.
PCT/CN2019/095680 2018-09-11 2019-07-12 Convolutional neural network on-chip learning system based on non-volatile memory WO2020052342A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/961,932 US11861489B2 (en) 2018-09-11 2019-07-12 Convolutional neural network on-chip learning system based on non-volatile memory

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811058657.8 2018-09-11
CN201811058657.8A CN109460817B (zh) 2018-09-11 2018-09-11 Convolutional neural network on-chip learning system based on non-volatile memory

Publications (1)

Publication Number Publication Date
WO2020052342A1 true WO2020052342A1 (zh) 2020-03-19

Family

ID=65606589

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/095680 WO2020052342A1 (zh) 2018-09-11 2019-07-12 Convolutional neural network on-chip learning system based on non-volatile memory

Country Status (3)

Country Link
US (1) US11861489B2 (zh)
CN (1) CN109460817B (zh)
WO (1) WO2020052342A1 (zh)


Families Citing this family (56)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6708146B2 (ja) * 2017-03-03 2020-06-10 株式会社デンソー ニューラルネットワーク回路
JP6805984B2 (ja) * 2017-07-06 2020-12-23 株式会社デンソー 畳み込みニューラルネットワーク
US11715287B2 (en) 2017-11-18 2023-08-01 Neuralmagic Inc. Systems and methods for exchange of data in distributed training of machine learning algorithms
US10832133B2 (en) 2018-05-31 2020-11-10 Neuralmagic Inc. System and method of executing neural networks
US11449363B2 (en) 2018-05-31 2022-09-20 Neuralmagic Inc. Systems and methods for improved neural network execution
CN109460817B (zh) * 2018-09-11 2021-08-03 华中科技大学 一种基于非易失存储器的卷积神经网络片上学习系统
WO2020072274A1 (en) 2018-10-01 2020-04-09 Neuralmagic Inc. Systems and methods for neural network pruning with accuracy preservation
US11544559B2 (en) 2019-01-08 2023-01-03 Neuralmagic Inc. System and method for executing convolution in a neural network
CN110378193B (zh) * 2019-05-06 2022-09-06 南京邮电大学 基于忆阻器神经网络的羊绒羊毛识别方法
US20200356847A1 (en) * 2019-05-07 2020-11-12 Hrl Laboratories, Llc Transistorless all-memristor neuromorphic circuits for in-memory computing
CN110188865B (zh) * 2019-05-21 2022-04-26 深圳市商汤科技有限公司 信息处理方法及装置、电子设备和存储介质
US11436478B2 (en) * 2019-05-22 2022-09-06 Ememory Technology Inc. Control circuit for multiply accumulate circuit of neural network system
CN112308107A (zh) * 2019-07-25 2021-02-02 智力芯片有限责任公司 可重构和时间编码卷积尖峰神经网络中基于事件的特征分类
US11195095B2 (en) * 2019-08-08 2021-12-07 Neuralmagic Inc. System and method of accelerating execution of a neural network
CN110619905A (zh) * 2019-08-09 2019-12-27 上海集成电路研发中心有限公司 一种基于rram忆阻器单元的集合模块及其形成方法
CN110543933B (zh) * 2019-08-12 2022-10-21 北京大学 基于flash存算阵列的脉冲型卷积神经网络
CN112396171A (zh) * 2019-08-15 2021-02-23 杭州智芯科微电子科技有限公司 人工智能计算芯片、信号处理系统
CN110659733A (zh) * 2019-09-20 2020-01-07 上海新储集成电路有限公司 一种加速神经网络模型预测过程的处理器系统
US11681903B2 (en) * 2019-10-31 2023-06-20 Micron Technology, Inc. Spike detection in memristor crossbar array implementations of spiking neural networks
CN110796241B (zh) * 2019-11-01 2022-06-17 清华大学 基于忆阻器的神经网络的训练方法及其训练装置
CN110807519B (zh) * 2019-11-07 2023-01-17 清华大学 基于忆阻器的神经网络的并行加速方法及处理器、装置
CN112825153A (zh) * 2019-11-20 2021-05-21 华为技术有限公司 神经网络系统中数据处理的方法、神经网络系统
CN113033759A (zh) * 2019-12-09 2021-06-25 南京惟心光电系统有限公司 脉冲卷积神经网络算法、集成电路、运算装置及存储介质
CN110956256B (zh) * 2019-12-09 2022-05-17 清华大学 利用忆阻器本征噪声实现贝叶斯神经网络的方法及装置
CN111428857A (zh) * 2020-02-28 2020-07-17 上海集成电路研发中心有限公司 一种基于忆阻器的卷积运算装置及方法
CN111144558B (zh) * 2020-04-03 2020-08-18 深圳市九天睿芯科技有限公司 基于时间可变的电流积分和电荷共享的多位卷积运算模组
US11562240B2 (en) 2020-05-27 2023-01-24 International Business Machines Corporation Efficient tile mapping for row-by-row convolutional neural network mapping for analog artificial intelligence network inference
US11514326B2 (en) * 2020-06-18 2022-11-29 International Business Machines Corporation Drift regularization to counteract variation in drift coefficients for analog accelerators
CN111967586B (zh) * 2020-07-15 2023-04-07 Peking University Chip and computing method for in-memory computing of spiking neural networks
CN111950720A (zh) * 2020-08-26 2020-11-17 Nanjing University Novel brain-inspired vision system
US11537890B2 (en) * 2020-09-09 2022-12-27 Microsoft Technology Licensing, Llc Compressing weights for distributed neural networks
US11556757B1 (en) 2020-12-10 2023-01-17 Neuralmagic Ltd. System and method of executing deep tensor columns in neural networks
CN112598122B (zh) * 2020-12-23 2023-09-05 North China University of Technology Convolutional neural network accelerator based on resistive random access memory
CN112686373B (zh) * 2020-12-31 2022-11-01 Shanghai Jiao Tong University Memristor-based online-training reinforcement learning method
CN112966814B (zh) * 2021-03-17 2023-05-05 Shanghai Xinhai Brain-Inspired Intelligence Technology Co., Ltd. Information processing method for a fused spiking neural network, and fused spiking neural network
CN113011574B (zh) * 2021-03-22 2022-11-04 Xi'an Jiaotong University Convolutional neural network system, memristor array, and convolutional neural network
CN113076827B (zh) * 2021-03-22 2022-06-17 Huazhong University of Science and Technology Intelligent sensor-signal processing system
CN113077046B (zh) * 2021-03-30 2022-12-30 Southwest University Parallel multi-operator convolution operator based on a forgetting memristor bridge
CN113222113B (zh) * 2021-04-19 2023-10-31 Northwest University Signal generation method and apparatus based on an inverse-scaling convolutional layer
CN113325650B (zh) * 2021-05-28 2023-02-28 Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co., Ltd. Optical circuit, optical signal processing method and apparatus, and readable storage medium
CN113466338B (zh) * 2021-07-19 2024-02-20 Metrology and Testing Center, China Academy of Engineering Physics Neural network-based defect identification system and method for plastic-packaged electronic components
CN113592084B (zh) * 2021-07-23 2022-11-11 Southeast University On-chip photonic neural network based on inversely optimized superstructure convolution kernels
CN113657580B (zh) * 2021-08-17 2023-11-21 Chongqing University of Posts and Telecommunications Photonic convolutional neural network accelerator based on microring resonators and non-volatile phase-change materials
CN113762480B (zh) * 2021-09-10 2024-03-19 Huazhong University of Science and Technology Time-series processing accelerator based on a one-dimensional convolutional neural network
CN113792010A (zh) * 2021-09-22 2021-12-14 Tsinghua University Compute-in-memory chip and data processing method
CN113824483A (zh) * 2021-09-22 2021-12-21 Tsinghua University Electronic apparatus and method for beamforming
CN113989862B (zh) * 2021-10-12 2024-05-14 Tianjin University Texture recognition platform based on an embedded system
US11960982B1 (en) 2021-10-21 2024-04-16 Neuralmagic, Inc. System and method of determining and executing deep tensor columns in neural networks
KR102595529B1 (ko) * 2021-11-04 2023-10-27 Seoul National University R&DB Foundation Temporal kernel device, temporal kernel computing system, and operating methods thereof
CN114330694A (zh) * 2021-12-31 2022-04-12 Shanghai Integrated Circuit Equipment and Materials Industry Innovation Center Co., Ltd. Circuit for implementing convolution operations and method thereof
CN114463161B (zh) * 2022-04-12 2022-09-13 Zhejiang Lab Method and apparatus for processing continuous images with a memristor-based neural network
CN114597232B (zh) * 2022-05-10 2022-07-19 Huazhong University of Science and Technology Method for fabricating a crossbar device implementing matrix multiply-accumulate operations with negative weights
CN115238857B (zh) * 2022-06-15 2023-05-05 Beijing Ronghe Weilai Technology Co., Ltd. Spike-signal-based neural network and spike signal processing method
CN115378814A (zh) * 2022-08-22 2022-11-22 Institute of Microelectronics, Chinese Academy of Sciences Reservoir computing network optimization method and related apparatus
CN117031307A (zh) * 2023-08-17 2023-11-10 Guangzhou Automobile Group Co., Ltd. Battery life prediction method and apparatus, and storage medium
CN116863490B (zh) * 2023-09-04 2023-12-12 Zhejiang Lab Digit recognition method and hardware accelerator for FeFET memory arrays

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180018559A1 (en) * 2016-07-14 2018-01-18 University Of Dayton Analog neuromorphic circuits for dot-product operation implementing resistive memories
CN107742153A (zh) * 2017-10-20 2018-02-27 Huazhong University of Science and Technology Memristor-based neuron circuit with homeostatic plasticity
CN108182471A (zh) * 2018-01-24 2018-06-19 Shanghai Yuexin Electronic Technology Co., Ltd. Convolutional neural network inference accelerator and method
CN109460817A (zh) * 2018-09-11 2019-03-12 Huazhong University of Science and Technology Convolutional neural network on-chip learning system based on non-volatile memory

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR3025344B1 (fr) * 2014-08-28 2017-11-24 Commissariat Energie Atomique Convolutional neural network
US9646243B1 (en) * 2016-09-12 2017-05-09 International Business Machines Corporation Convolutional neural networks using resistive processing unit array
CN106650922B (zh) * 2016-09-29 2019-05-03 Tsinghua University Hardware neural network conversion method, computing apparatus, and software-hardware cooperation system
US9852790B1 (en) * 2016-10-26 2017-12-26 International Business Machines Corporation Circuit methodology for highly linear and symmetric resistive processing unit
US20210019609A1 (en) * 2017-04-27 2021-01-21 The Regents Of The University Of California Mixed signal neuromorphic computing with nonvolatile memory devices
CN107133668A (zh) * 2017-04-28 2017-09-05 Peking University Memristive neural network training method based on a fuzzy Boltzmann machine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hu, Fei et al.: "Circuit Design of Convolutional Neural Network Based on Memristor Crossbar Arrays", Journal of Computer Research and Development, 31 May 2018 (2018-05-31), pages 1098-1104 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582462A (zh) * 2020-05-21 2020-08-25 National University of Defense Technology In-situ weight update method, apparatus, terminal device, and readable storage medium
CN111582461A (zh) * 2020-05-21 2020-08-25 National University of Defense Technology Neural network training method, apparatus, terminal device, and readable storage medium
CN111582461B (zh) * 2020-05-21 2023-04-14 National University of Defense Technology Neural network training method, apparatus, terminal device, and readable storage medium
CN112115665B (zh) * 2020-09-14 2023-11-07 Shanghai Integrated Circuit R&D Center Co., Ltd. Compute-in-memory storage array and convolution operation method thereof
CN112115665A (zh) * 2020-09-14 2020-12-22 Shanghai Integrated Circuit R&D Center Co., Ltd. Compute-in-memory storage array and convolution operation method thereof
CN112101549B (zh) * 2020-09-22 2024-05-10 Tsinghua University Training method and apparatus for a neural network based on memristor arrays
CN112101549A (zh) * 2020-09-22 2020-12-18 Tsinghua University Training method and apparatus for a neural network based on memristor arrays
CN112819036B (zh) * 2021-01-12 2024-03-19 Huazhong University of Science and Technology Spherical data classification apparatus based on a memristor array and operation method thereof
CN112819036A (zh) * 2021-01-12 2021-05-18 Huazhong University of Science and Technology Spherical data classification apparatus based on a memristor array and operation method thereof
CN113191492A (zh) * 2021-04-14 2021-07-30 Huazhong University of Science and Technology Synapse training architecture
CN113191492B (zh) * 2021-04-14 2022-09-27 Huazhong University of Science and Technology Synapse training apparatus
CN113159293A (zh) * 2021-04-27 2021-07-23 Tsinghua University Neural network pruning apparatus and method for a compute-in-memory fusion architecture
CN113159293B (zh) * 2021-04-27 2022-05-06 Tsinghua University Neural network pruning apparatus and method for a compute-in-memory fusion architecture
CN113469348A (zh) * 2021-06-21 2021-10-01 Anhui University Neuromorphic circuit for multiple generalization and differentiation in associative memory
CN113469348B (zh) * 2021-06-21 2024-02-20 Anhui University Neuromorphic circuit for multiple generalization and differentiation in associative memory
CN113642723A (zh) * 2021-07-29 2021-11-12 Anhui University GRU neural network circuit implementing in-situ and ex-situ training
CN113642723B (zh) * 2021-07-29 2024-05-31 Anhui University GRU neural network circuit implementing in-situ and ex-situ training
CN114399037A (zh) * 2022-03-24 2022-04-26 Zhejiang Lab Simulation method and apparatus for a memristor-based convolutional neural network accelerator core
CN115719087A (zh) * 2022-09-08 2023-02-28 Tsinghua University Long short-term memory neural network circuit and control method

Also Published As

Publication number Publication date
CN109460817B (zh) 2021-08-03
CN109460817A (zh) 2019-03-12
US11861489B2 (en) 2024-01-02
US20200342301A1 (en) 2020-10-29

Similar Documents

Publication Publication Date Title
WO2020052342A1 (zh) Convolutional neural network on-chip learning system based on non-volatile memory
CN108805270B (zh) Memory-based convolutional neural network system
Yakopcic et al. Memristor crossbar deep network implementation based on a convolutional neural network
US10740671B2 (en) Convolutional neural networks using resistive processing unit array
Yu et al. An overview of neuromorphic computing for artificial intelligence enabled hardware-based hopfield neural network
JP7266330B2 (ja) Memristor memory neural network training method robust to memristor errors
Unnikrishnan et al. Alopex: A correlation-based learning algorithm for feedforward and recurrent neural networks
WO2021098821A1 (zh) Data processing method in a neural network system, and neural network system
US20200117986A1 (en) Efficient processing of convolutional neural network layers using analog-memory-based hardware
US11087204B2 (en) Resistive processing unit with multiple weight readers
Hasan et al. On-chip training of memristor based deep neural networks
Dong et al. Convolutional neural networks based on RRAM devices for image recognition and online learning tasks
CN110852429B (zh) 1T1R-based convolutional neural network circuit and operation method thereof
Zhang et al. Memristive quantized neural networks: A novel approach to accelerate deep learning on-chip
US20210374546A1 (en) Row-by-row convolutional neural network mapping for analog artificial intelligence network training
Ravichandran et al. Artificial neural networks based on memristive devices
US20210192327A1 (en) Apparatus and method for neural network computation
Huang et al. Memristor neural network design
Sun et al. Low-consumption neuromorphic memristor architecture based on convolutional neural networks
KR20210143614A (ko) 뉴럴 네트워크를 구현하는 뉴로모픽 장치 및 그 동작 방법
Werbos Supervised learning: Can it escape its local minimum?
Zhang et al. Memristive fuzzy deep learning systems
Sun et al. Quaternary synapses network for memristor-based spiking convolutional neural networks
US11868893B2 (en) Efficient tile mapping for row-by-row convolutional neural network mapping for analog artificial intelligence network inference
Zhang et al. Automatic learning rate adaption for memristive deep learning systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19859568

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19859568

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 15/09/2021)
