Disclosure of Invention
In view of the above, the present invention provides a method for training a neural network based on a storage array, which overcomes the problem that the weight modification values of the storage array are random and thereby obtains a satisfactory training result.
To achieve this object, the invention adopts the following technical solution:
a training method of a neural network based on storage arrays trains the neural network using a plurality of storage arrays, wherein each storage array is used for a matrix operation between layers of the neural network, each storage array is composed of storage units comprising nonvolatile memories, and the data stored in the storage arrays characterize the connection weights between the layers; the training method comprises the following steps:
performing multiple sample trainings until the output error converges;
wherein, the parameter modification of the connection weight of each storage array in each sample training comprises the following steps:
discretizing the input data propagated in the forward direction of the storage array according to a preset mapping relation between a first continuous interval and a first discrete value to obtain an input discrete value;
discretizing the error data backward-propagated through the storage array according to a preset mapping relation between a second continuous interval and a second discrete value to obtain an error discrete value, wherein at least one of the first continuous interval and the second continuous interval comprises at least three sub-intervals;
determining a modification condition of the connection weight from the input discrete value and the error discrete value, in accordance with the weight variation being proportional to the negative of the product of the forward-propagated input data and the backward-propagated error data, wherein the modification condition is a preset erase-operation bias, a preset write-operation bias, or a preset no-operation bias;
and biasing the corresponding nonvolatile memories according to the modification conditions.
Optionally, the first continuous interval includes a zero-value interval and at least one positive-value interval, the first discrete value increases as the interval value of the first continuous interval increases, and its sign is the sign of the corresponding interval; the second continuous interval comprises at least one positive-value interval, a zero-value interval and at least one negative-value interval, and the second discrete value increases as the interval value of the second continuous interval increases, its sign being the sign of the corresponding interval.
Optionally, the first continuous interval further comprises at least one negative value interval.
Optionally, determining the modification condition of the connection weight from the input discrete value and the error discrete value includes:
determining the modification condition of the connection weight from the negative of the product of the input discrete value and the error discrete value.
Optionally, the erase-operation bias and the write-operation bias each comprise a plurality of levels, a higher level corresponding to a larger modification amplitude; in this case,
determining the modification condition of the connection weight from the negative of the product of the input discrete value and the error discrete value includes:
determining the type of the modification condition of the connection weight from the sign of the negative of the product of the input discrete value and the error discrete value;
selecting the level of that type of modification condition according to the absolute value of the product of the input discrete value and the error discrete value, a larger absolute value corresponding to a higher level.
Optionally, the different levels of erase operation bias correspond to different erase operation voltage pulse values and/or different erase operation voltage pulse durations and/or different numbers of erase operation voltage pulses; the different levels of the write operation bias correspond to different write operation voltage pulse values and/or different write operation voltage pulse durations and/or different numbers of write operation voltage pulses.
Optionally, each modification condition is such that the resulting conductance change is less than ten percent of the total conductance variation range of the nonvolatile memory.
Optionally, biasing the corresponding non-volatile memory according to the modification condition includes:
biasing, according to the modification conditions, the nonvolatile memories in the storage array that have the same modification condition at the same time.
Optionally, in each of the memory arrays, a first source/drain of each of the nonvolatile memories in the first direction is electrically connected to the first electrical connection line, a second source/drain of each of the nonvolatile memories in the second direction is electrically connected to the second electrical connection line, and a gate of each of the nonvolatile memories in the first direction or the second direction is electrically connected to the third electrical connection line;
the first electrical connection is used for loading an input signal in forward propagation, and the second electrical connection is used for outputting an output signal in the forward propagation; the second electrical connection is used for loading an input signal in the reverse propagation, and the first electrical connection is used for outputting an output signal in the reverse propagation.
Optionally, the memory unit further includes an MOS device, a first source/drain of the nonvolatile memory is electrically connected to a second source/drain of the MOS device, a first source/drain of the MOS device is electrically connected to the first electrical connection line, and a gate of each MOS device in the first direction or the second direction is electrically connected to a fourth electrical connection line.
Optionally, the memory cell further includes MOS devices sharing a channel with the nonvolatile memory, and a gate of each of the MOS devices in the first direction or the second direction is electrically connected to the fourth electrical connection line.
In the method for training a neural network based on storage arrays provided by the embodiments of the invention, when the parameters of the connection weights of a storage array are modified, the input data propagated forward through the storage array and the error data propagated backward through it are each discretized, so that an input discrete value and an error discrete value are obtained, and the modification condition of the connection weight is determined from these discrete values. In this method, the connection weight is adjusted according to a preset modification condition rather than toward a specific weight-modification target value, so the modification amplitude is allowed to be random. This bridges the gap between the actual modification requirements of the neural-network algorithm and the device characteristics of the memory: through many rounds of training with such random-magnitude modifications, the output error converges and a satisfactory training result is obtained.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may, however, be practiced in ways other than those specifically described, as will be readily apparent to those of ordinary skill in the art, without departing from the spirit of the present invention; the present invention is therefore not limited to the specific embodiments disclosed below.
Next, the present invention will be described in detail with reference to the drawings. For convenience of illustration, the cross-sectional views showing the device structure are not drawn to a uniform scale and are partially enlarged; the drawings are examples only and should not limit the scope of the present invention. In addition, actual fabrication involves the three dimensions of length, width and depth.
As described in the background art, a solution has been proposed for performing the matrix operations of a neural network using a memory array, in which the memory array is composed of nonvolatile memories; owing to the memory characteristics of the nonvolatile memory, the parameters of the connection weights can be represented by the data stored in the memories, thereby implementing the matrix operations between layers. However, in the course of training the neural network, the memories are erased and written to modify the connection weights, and, due to the characteristics of the memory devices themselves, the modification values obtained under a given target connection-weight modification value are often randomly distributed, making it difficult to obtain a satisfactory training result. The present application therefore provides a neural-network training method based on a storage array that achieves a training result conforming to the neural-network algorithm.
In order to better understand the technical solution and technical effects of the invention, the neural-network algorithm, its training process and its basic calculations are described first. Referring to fig. 1, a schematic diagram of the neural-network training process is shown. In each training pass, a set of input data representing a training sample is input to the neural-network algorithm for calculation, a calculation result is output, and the calculation result is compared with the answer label. If the output error after comparison has not converged, the output error is fed back to the neural-network algorithm, the error value of each layer is obtained, and the weight values between the layers of the neural-network algorithm are modified according to these error values. The training process is repeated until the output error converges, that is, until the calculation result is close to the answer label. In each training pass, the process from input to output is called forward propagation, and the process feeding the output error back through the neural network is called backward propagation.
Referring to fig. 2, the neural-network algorithm is described using a three-layer neural network as an example. This example includes an input layer, a hidden layer and an output layer, whose numbers of nodes are m, n and k, respectively, and the activation function of the nodes is θ. In forward propagation, for two adjacent layers, the input data of each node of the current layer is the value obtained by the weighted summation of the output vector of the previous layer with the connection weights of the current layer; this weighted summation is the matrix-operation process in forward propagation. The input data are then activated by the nodes of the current layer to obtain the output vector of the current layer, and the output result is obtained through layer-by-layer calculation. As shown in fig. 2, the layer preceding the hidden layer is the input layer, whose output vector is the input vector (x_1, x_2, ..., x_m). The input h_j^in of the hidden layer is the weighted sum of the input vector x_i with the connection weights w_ij to the hidden layer, i.e. h_j^in = Σ_{i=1..m} w_ij × x_i, with j from 1 to n; this weighted summation is a matrix operation in forward propagation. After the input h_j^in is activated by each node of the hidden layer, the output vector h_j = θ(h_j^in) is obtained. The input data y_q^in of the output layer is the weighted sum of the hidden-layer output vector h_j with the connection weights w'_jq of the output layer, i.e. y_q^in = Σ_{j=1..n} w'_jq × h_j, with q from 1 to k; this weighted summation is another matrix operation in forward propagation. After the input data y_q^in are activated by each node of the output layer, the output result y_q = θ(y_q^in) is obtained, which completes the forward propagation. After the output result y_q is compared with the answer label, the output errors are e_1, e_2, ..., e_k. If the output error does not converge, it propagates backward from the output layer to the other layers; backward propagation differs from forward propagation only in the propagation direction and the input data. In backward propagation the input vector is the error (e_1, e_2, ..., e_k), which propagates backward from the output layer toward the hidden layer. The error e_q, after backward transmission through the output-layer node, yields the error δ_q; the errors δ_q, after weighted summation with the connection weights w'_jq between the output layer and the hidden layer and transmission through the hidden-layer nodes, yield the hidden-layer errors δ_j. Errors are transmitted to each node through this layer-by-layer operation, so that the connection weights can be modified according to the error of each node; the specific calculation method for the weight modification can be determined according to the gradient descent method.
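As an illustration only, not part of the claimed method, the forward pass, error comparison, backward pass and gradient-descent weight modification of such a three-layer network can be sketched in Python; the layer sizes, random weights and sigmoid activation below are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 4, 3, 2                          # nodes in input, hidden, output layer
W1 = rng.normal(size=(m, n)) * 0.5         # weights from input layer to hidden layer
W2 = rng.normal(size=(n, k)) * 0.5         # weights from hidden layer to output layer
theta = lambda z: 1.0 / (1.0 + np.exp(-z))  # activation function of the nodes

x = rng.normal(size=m)                     # input vector (x_1 ... x_m)
label = np.array([1.0, 0.0])               # answer label

# Forward propagation: weighted summation (matrix operation), then activation.
h = theta(x @ W1)                          # hidden-layer output vector
y = theta(h @ W2)                          # output result (y_1 ... y_k)

# Compare with the answer label to obtain the output error.
e = y - label                              # output errors e_1 ... e_k

# Backward propagation: errors through the output nodes, then weighted sum with W2.
delta_out = e * y * (1.0 - y)              # error after the output-layer nodes
delta_hid = (delta_out @ W2.T) * h * (1.0 - h)  # error at the hidden-layer nodes

# Gradient-descent modification of both weight matrices.
lr = 0.1
W2 -= lr * np.outer(h, delta_out)
W1 -= lr * np.outer(x, delta_hid)
```

One training pass as above is repeated over many samples until the output error converges.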
To facilitate understanding of forward propagation, backward propagation and the matrix operations in the neural network, fig. 3 shows a schematic diagram of the forward and backward propagation of the neural network in fig. 2, wherein circles represent the nodes of the layers and a matrix composed of connection weights lies between the layers. A matrix of connection weights thus exists between every two adjacent layers and is used for the weighted summation of the input vectors in forward propagation and for the weighted summation of the error vectors in backward propagation. As shown in fig. 3, this example includes a first weight matrix from the input layer to the hidden layer and a second weight matrix from the hidden layer to the output layer; each weight matrix is used both for the matrix operation on the forward-propagated input data and for the matrix operation on the backward-propagated error data. Taking the second weight matrix as an example, in forward propagation the input of the matrix operation is the output from the hidden-layer nodes, which, after the weighted summation of the matrix operation, is transmitted to the nodes of the output layer; in backward propagation the input of the matrix operation is the output from the output-layer nodes, which, after the weighted summation of the matrix operation, is transmitted to the nodes of the hidden layer. That is, for the same weight matrix, during sample training a weighted summation is performed both on the forward-propagated input data and on the backward-propagated error data: the input data are the output data of the front-layer nodes, and the error data are the error data of the rear-layer nodes.
The above matrix operations may be implemented by memory arrays composed of memory cells that include a nonvolatile memory, one memory array implementing the matrix operation between each pair of adjacent layers, as shown in fig. 4. The nonvolatile memory in each memory cell is used to store the connection weight W_ij; it may be a single memory composed of one nonvolatile memory or a composite memory composed of a plurality of nonvolatile memories, and the connection weight W_ij is equivalent to a conductance value or a combination of conductance values of the memory. Each memory array is then used for the matrix operation between adjacent layers, which includes the weighted summation of the forward-propagated input vector and the weighted summation of the backward-propagated error vector; for convenience of description both weighted summations are referred to as matrix operations. One end of the matrix is used to load the forward-propagated data signal X_i, and the other end of the matrix is used to load the backward-propagated error data δ_j. It is understood that, besides the matrix operations, training the neural-network algorithm involves other operations, such as the activation operation, which may be implemented by other devices; the implementation of those devices and their connection to the memory array are not limited by the present invention.
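The two weighted summations performed by such a memory array can be emulated numerically, with each stored conductance G_ij standing in for the connection weight W_ij; all values below are illustrative, and ideal summation of the currents on the shared lines is assumed:

```python
import numpy as np

# Conductance matrix of the memory array: G[i, j] represents W_ij.
G = np.array([[1.0, 2.0],
              [0.5, 1.5],
              [2.0, 0.5]])             # 3 row lines (inputs) x 2 column lines (outputs)

X = np.array([0.2, 0.4, 0.1])          # forward-propagated data signals on the row lines
delta = np.array([0.3, -0.1])          # back-propagated error signals on the column lines

# Forward matrix operation: column current I_j = sum_i G_ij * X_i.
I_forward = X @ G
# Backward matrix operation: row current I_i = sum_j G_ij * delta_j.
I_backward = G @ delta
```

The same conductances thus serve both propagation directions, which is why one array per layer pair suffices.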
Based on such a memory array, when training the neural network, if the output error has not converged after a sample training, the connection weights need to be modified: the data stored in the memory devices of the memory array are erased and written so that the conductance values of the memories change, thereby modifying the connection weights. Under a specific target connection-weight modification value, however, it is difficult to modify the stored data accurately to the required conductance value, and hence difficult to achieve a training result conforming to the neural-network algorithm.
To this end, based on the storage arrays, the present invention provides a method of training a neural network. The neural network is trained using a plurality of storage arrays, each storage array being used for a matrix operation between layers of the neural network, each storage array being formed of storage units that include a nonvolatile memory, and the data stored in the storage arrays characterizing the connection weights between the layers. The training method includes:
performing multiple sample trainings until the output error converges;
wherein, the parameter modification of the connection weight of each storage array in each sample training comprises the following steps:
discretizing the input data propagated in the forward direction of the storage array according to a preset mapping relation between a first continuous interval and a first discrete value to obtain an input discrete value;
discretizing the error data backward-propagated through the storage array according to a preset mapping relation between a second continuous interval and a second discrete value to obtain an error discrete value, wherein at least one of the first continuous interval and the second continuous interval comprises at least three sub-intervals;
determining the modification condition of the connection weight from the input discrete value and the error discrete value, in accordance with the weight variation being proportional to the negative of the product of the forward-propagated input data and the backward-propagated error data, wherein the modification condition is a preset erase-operation bias, a preset write-operation bias, or a preset no-operation bias;
and biasing the corresponding nonvolatile memories according to the modification conditions.
In this training method, when the parameters of the connection weights of each storage array are modified, the input data propagated forward through the storage array and the error data propagated backward through it are each discretized, so that an input discrete value and an error discrete value are obtained, and the modification condition of the connection weight is determined from these discrete values. In this method, the connection weight is adjusted according to a preset modification condition rather than toward a specific weight-modification target value, so the modification amplitude is allowed to be random. This bridges the gap between the actual modification requirements of the neural-network algorithm and the device characteristics of the memory: through many rounds of training with such random-magnitude modifications, the output error converges and a satisfactory training result is obtained.
In the training process of the neural-network algorithm, multiple sample trainings are needed until the output error converges. Each sample training includes steps such as sample input, forward propagation, outputting the result, obtaining the output error by comparison, backward propagation, and connection-weight modification. In the present application, the matrix operations in forward and backward propagation are performed in the storage arrays, and the connection-weight modification is likewise realized by adjusting the data stored in the corresponding memories of the storage arrays; the other operations in forward and backward propagation and the comparison that yields the output error are not limited and may be realized with suitable devices and in suitable ways as needed.
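A minimal sketch of this sample-training loop follows; the helper functions `forward`, `backward` and `modify_weights`, the convergence tolerance, and the toy single-weight usage are hypothetical placeholders standing in for the storage-array operations, not the patent's circuitry:

```python
def train(samples, labels, weights, forward, backward, modify_weights,
          tol=1e-3, max_epochs=1000):
    """Repeat sample training until the output error converges."""
    for _ in range(max_epochs):
        worst = 0.0
        for x, t in zip(samples, labels):
            y = forward(x, weights)                  # forward propagation
            err = [yi - ti for yi, ti in zip(y, t)]  # compare with the answer label
            worst = max(worst, max(abs(e) for e in err))
            node_err = backward(err, weights)        # backward propagation
            modify_weights(weights, x, node_err)     # connection-weight modification
        if worst < tol:                              # output error has converged
            return weights
    return weights

# Toy single-weight usage: direction-only, fixed-step modification with a dead zone,
# mimicking the preset erase / write / no-operation conditions.
def modify(w, x, err, step=0.01, dead=1e-3):
    s = x[0] * err[0]
    if s > dead:
        w[0] -= step          # erase-like: decrease the weight
    elif s < -dead:
        w[0] += step          # write-like: increase the weight
    # otherwise: no operation

w = train([[1.0]], [[2.0]], [0.0],
          forward=lambda x, w: [w[0] * x[0]],
          backward=lambda e, w: e,
          modify_weights=modify)
```

Even though each modification has a fixed small amplitude rather than a computed target value, the weight settles at the value that makes the output error converge.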
It is understood that the neural network may include a plurality of storage arrays, each used for the matrix operation between adjacent layers, and that the method of modifying the parameters of the connection weights of each storage array in each sample training is the same. This modification is described in detail below in conjunction with specific embodiments.
Referring to fig. 5, in step S01, the input data propagated in the forward direction of the storage array is discretized according to a mapping relationship between a preset first continuous interval and a first discrete value to obtain an input discrete value.
In step S02, the error data backward-propagated through the storage array are discretized according to a mapping relation between a preset second continuous interval and a second discrete value to obtain an error discrete value, where at least one of the first continuous interval and the second continuous interval includes at least three sub-intervals.
For the same storage array, the matrix operations between two adjacent layers are performed, including the matrix operation in forward propagation and the matrix operation in backward propagation, as shown in fig. 3 and fig. 4. For convenience of description and understanding, if one of the two adjacent layers is called the current layer and the other the next layer, then, for the storage array, the input data in forward propagation are the output data of the current-layer nodes, and the error data in backward transmission are the error data output by the next-layer nodes. In one example, referring to fig. 4, for the memory array performing the matrix operation between the hidden layer and the output layer, the output data of the hidden-layer nodes are the input data in forward propagation, denoted X_i for convenience of description, and the error data output by the output-layer nodes are the error data in backward propagation, denoted δ_j. For this memory array, the input data in forward propagation are the output data X_i from the current-layer nodes, and the error data in backward propagation are the error data δ_j from the next-layer nodes.
In this step, the input data propagated forward through the storage array and the error data propagated backward through it are first discretized. The discretization is performed according to the preset mapping relations between the continuous intervals and the discrete values, and at least one of the first continuous interval and the second continuous interval comprises at least three sub-intervals; the number of sub-intervals, their division and the sizes of the corresponding discrete values can be determined according to specific needs, so that the modification condition of the connection weight can be determined from the input discrete value and the error discrete value obtained after discretization.
In the discretization, the input data X_i propagated forward through the storage array are discretized according to the preset mapping relation to obtain the input discrete value x_i, and the error data δ_j propagated backward through the storage array are discretized to obtain the error discrete value ε_j. The specific input and error data are thus converted into a small set of determined values, from which the modification condition for the weight W_ij of each memory in the memory array can be determined.
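The interval-to-discrete-value mapping of steps S01 and S02 can be sketched as a simple lookup; the interval boundaries and discrete values below are illustrative assumptions, not values prescribed by the patent:

```python
def discretize(value, intervals):
    """Map a continuous value to its preset discrete value.

    `intervals` is a list of (lower, upper, discrete_value) tuples covering
    the continuous range; the zero-value interval surrounds zero.
    """
    for lo, hi, d in intervals:
        if lo <= value < hi:
            return d
    raise ValueError("value outside the preset continuous intervals")

inf = float("inf")
# Illustrative first mapping (negative, zero and positive intervals -> x_i):
first_intervals = [(-inf, -0.1, -1), (-0.1, 0.1, 0), (0.1, inf, 1)]
# Illustrative second mapping for the error data -> epsilon_j:
second_intervals = [(-inf, -0.05, -1), (-0.05, 0.05, 0), (0.05, inf, 1)]

x_i = discretize(0.7, first_intervals)
eps_j = discretize(-0.3, second_intervals)
```

Here the discrete value carries the sign of its interval and grows with the interval value, matching the preferred embodiments described above.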
In step S03, a modification condition of the connection weight is determined in accordance with the weight variation being proportional to the negative of the product of the forward-propagated input data and the backward-propagated error data, wherein the modification condition is a preset erase-operation bias, a preset write-operation bias, or a preset no-operation bias.
In each training, the weight modification ΔW_ij can be considered proportional to the negative of the product of the forward-propagated input data X_i and the back-propagated error data δ_j, formulated as ΔW_ij ∝ −(X_i × δ_j). According to this relation, the modification condition of the connection weight W_ij can be determined from the discretized input value x_i and error value ε_j; the modification condition can be a preset erase-operation bias, a preset write-operation bias, or a preset no-operation bias. Under these modification conditions, the existing connection weight is increased, decreased, or left unmodified; the preset modification condition is a fixed condition, not a condition corresponding to the value that actually needs to be modified.
Since the input discrete value x_i and the error discrete value ε_j correspond to different value intervals, the intervals from which the actual values come can be known from the specific x_i and ε_j; and because the weight variation is proportional to the negative of the product of the actual input and error data, whether the modification condition of the weight is an erase operation, a write operation or no operation can be determined from the relation between the discrete values and their value intervals. For example, for an input discrete value x_i from one interval and an error discrete value ε_j from another interval, if the corresponding connection weight W_ij is not to be modified, the modification condition is the no-operation bias; for x_i and ε_j from other intervals, if the modification direction of the corresponding connection weight W_ij is an increase, the modification condition is the preset write-operation bias.
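Following the relation ΔW_ij ∝ −(X_i × δ_j), the choice among the three preset biases can be sketched as follows (the condition names are illustrative):

```python
ERASE = "erase-operation bias"   # decrease the connection weight
WRITE = "write-operation bias"   # increase the connection weight
NONE = "no-operation bias"       # leave the connection weight unchanged

def modification_condition(x_i, eps_j):
    """Select the preset bias from the sign of -(x_i * eps_j)."""
    s = -(x_i * eps_j)
    if s < 0:
        return ERASE   # weight-modification direction is a decrease
    if s > 0:
        return WRITE   # weight-modification direction is an increase
    return NONE        # product near zero: no modification
```

Only the direction is extracted from the discrete values; the actual amplitude applied is whatever the preset bias produces on the device.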
In some preferred embodiments, the first continuous interval may include a zero-value interval and a positive-value interval, or may further include a negative-value interval in addition, each interval corresponding to a first discrete value; the second continuous interval may include a negative-value interval, a zero-value interval and a positive-value interval, each corresponding to a second discrete value. The zero-value interval is the interval whose value range lies near zero, and the number of negative-value and positive-value intervals may each be one or more. The first and second discrete values may be set as needed, so that the combination of the first discrete value and the second discrete value identifies the intervals from which the input data and the error data come, and hence whether the modification condition for the corresponding weight data is an increase, a decrease, or no change.
Preferably, the modification condition is such that the resulting change is less than ten percent of the total conductance variation range of the nonvolatile memory, that is, the modification amplitude is sufficiently small. In each weight modification, the rewritten amplitude is not the precise connection-weight modification value corresponding to that training but a specific, sufficiently small value; this is equivalent to guaranteeing only the modification direction in each training while letting the modification amplitude be random, so that after a sufficient number of trainings the output error gradually converges and a satisfactory training result is obtained.
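That direction-only modifications of small, device-random magnitude still settle near the desired value can be checked with a toy simulation; the conductance range, the step statistics and the no-operation window below are illustrative assumptions, with each modification bounded by ten percent of the total conductance range:

```python
import random

random.seed(1)
g_min, g_max = 0.0, 1.0              # total conductance variation range
max_step = 0.1 * (g_max - g_min)     # each modification < 10% of the range

g, target = 0.2, 0.73                # current conductance and the ideal value
for _ in range(500):
    error = g - target
    if abs(error) < 0.02:            # inside the no-operation window
        continue
    step = random.uniform(0.2, 1.0) * max_step   # random device response
    g += step if error < 0 else -step            # only the direction is controlled
    g = min(max(g, g_min), g_max)                # clamp to the physical range
```

Although each step has a random amplitude, repeatedly stepping in the correct direction keeps the conductance within a narrow band around the target.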
In a more preferred embodiment, according to different needs, the first continuous interval may include a zero-value interval and at least one positive-value interval, or may further include at least one negative-value interval in addition; the first discrete value increases as the interval value of the first continuous interval increases, and its sign is the sign of the corresponding interval. The second continuous interval includes at least one positive-value interval, a zero-value interval and at least one negative-value interval; the second discrete value increases as the interval value of the second continuous interval increases, and its sign is the sign of the corresponding interval. Here the zero-value interval is the interval whose value range lies near zero; values in a negative-value interval are negative and its sign is negative, values in a positive-value interval are positive, the sign of the zero-value interval may be taken as positive or negative, and each discrete value has the same sign as its corresponding interval.
According to specific needs, the number and division of the sub-intervals in the first continuous interval and the second continuous interval may be the same or different, and the first and second discrete values corresponding to the sub-intervals may be the same or different. In this arrangement, the sign and value of a discrete value represent the interval of the underlying value, and the modification direction and modification magnitude of the weight can be determined directly from the sign and the absolute value of the product of the input discrete value x_i and the error discrete value ε_j. In a specific application, when −x_i × ε_j < 0, the modification direction of the corresponding connection weight W_ij is considered to be a decrease, and the modification condition is determined to be the preset erase-operation bias; when −x_i × ε_j > 0, the modification direction of the corresponding connection weight W_ij is an increase, and the modification condition is determined to be the preset write-operation bias; when −x_i × ε_j is close to 0, the corresponding connection weight W_ij is considered not to be modified, and the modification condition is determined to be the preset no-operation bias.
In a specific application, the erase-operation bias and the write-operation bias each comprise one or more levels. When there is only one level, the erase-operation bias and the write-operation bias correspond to one erase-operation voltage and one write-operation voltage, respectively. When there are a plurality of levels, a higher level corresponds to a larger modification amplitude: the negative of the product of the input discrete value and the error discrete value, −x_i × ε_j, determines the type of the modification condition, that is, whether the modification is a write operation, an erase operation or no operation, and the level is further selected according to the absolute value of the product x_i × ε_j, a larger absolute value corresponding to a higher level.
Specifically, different levels of the erase operation bias may correspond to different erase operation voltage pulse values and/or different pulse durations and/or different numbers of pulses; different levels of the write operation bias may correspond to different write operation voltage pulse values and/or different pulse durations and/or different numbers of pulses. A larger voltage pulse value, a longer pulse duration or a greater number of pulses corresponds to a larger modification amplitude.
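As an illustrative sketch of the leveled bias described above (the voltage values, durations and pulse counts below are assumptions for illustration, not values from the text), a bias level can bundle the pulse parameters just listed, with a larger |x_i×ε_j| selecting a higher level:

```python
from dataclasses import dataclass

@dataclass
class BiasLevel:
    voltage: float      # pulse amplitude in volts (illustrative value)
    duration_us: float  # pulse duration in microseconds (illustrative value)
    pulses: int         # number of pulses (illustrative value)

# Two assumed levels; a real design may use any number of levels.
LEVELS = [BiasLevel(10.0, 1.0, 1), BiasLevel(12.0, 1.0, 2)]

def select_level(x_i, eps_j):
    """Higher |x_i * eps_j| selects a higher level, clamped to the top level.

    Assumes a nonzero product (a zero product means no operation at all).
    """
    return LEVELS[min(abs(x_i * eps_j) - 1, len(LEVELS) - 1)]
```

For discrete values in {-1, 1} the product magnitude is 1 and the lowest level is chosen; for values such as ±2 the magnitude grows and the selection clamps at the highest configured level.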
S04: biasing the corresponding nonvolatile memory according to the modification condition.
After the modification condition is determined, the stored data in the corresponding nonvolatile memory is biased. If the input data corresponds to a row of the array and the error data corresponds to a column of the array, then the nonvolatile memory corresponding to the input discrete value x_i and the error discrete value ε_j is the memory at row i, column j.
The modification of a connection weight is realized by changing the stored data in the nonvolatile memory: applying the corresponding erase or write voltage to the memory increases or decreases its conductance value, thereby increasing or decreasing the connection weight characterized by that conductance; applying a non-operation voltage leaves the conductance value, and hence the connection weight it characterizes, unchanged.
In a specific adjustment, the memories in the array can be modified one by one or in parallel. In parallel modification, all nonvolatile memories in the storage array that share the same modification condition are biased simultaneously, so the modification of all connection weight parameters in the whole array can be completed in just a few steps. This realizes parallel modification and improves the computational efficiency of the neural network.
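The parallel scheme above can be sketched as a grouping step (a minimal sketch, assumed rather than taken from the text): cells are grouped by their modification condition, and each group is then biased in one simultaneous step, so the array is updated in at most as many steps as there are distinct conditions.

```python
def group_by_condition(conditions):
    """conditions: 2-D list of per-cell labels ('write', 'erase', 'none').

    Returns {label: [(row, col), ...]} for the cells that need biasing;
    'none' cells are skipped since their weights stay unchanged.
    """
    groups = {}
    for i, row in enumerate(conditions):
        for j, label in enumerate(row):
            if label != "none":
                groups.setdefault(label, []).append((i, j))
    return groups
```

Each returned group corresponds to one simultaneous bias operation over all listed cell positions.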
This completes the parameter modification of the connection weights of each storage array in one sample training; the training is repeated over multiple samples until the output error converges.
To facilitate understanding of the technical solution and technical effects of the present application, a specific example is described below. In this example, the mapping relationship between the first continuous interval and the first discrete value is shown in Table 1 below, and the mapping relationship between the second continuous interval and the second discrete value is shown in Table 2 below. The first continuous interval and the second continuous interval each comprise three continuous intervals (one positive interval, one zero interval and one negative interval), corresponding to the discrete values 1, 0 and -1, respectively.
Table 1

First continuous interval      | First discrete value
Positive interval (0.18, 1)    | 1
Zero interval [-0.18, 0.18]    | 0
Negative interval (-1, -0.18)  | -1
Table 2

Second continuous interval    | Second discrete value
Positive interval (0.1, 1)    | 1
Zero interval [-0.1, 0.1]     | 0
Negative interval (-1, -0.1)  | -1
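The interval-to-discrete-value mappings of Tables 1 and 2 can be sketched as a single thresholding function (a minimal sketch; only the thresholds differ between the two tables):

```python
def discretize(value, threshold):
    """Map a value in [-1, 1] to a discrete value:
    (threshold, 1) -> 1, [-threshold, threshold] -> 0, (-1, -threshold) -> -1.
    """
    if value > threshold:
        return 1
    if value < -threshold:
        return -1
    return 0

# Table 1 (forward-propagated input data) uses threshold 0.18;
# Table 2 (back-propagated error data) uses threshold 0.1.
```

For example, an input of 0.5 discretizes to 1 under Table 1, while an error of -0.05 discretizes to 0 under Table 2.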
After discretizing the input data and the error data according to the mappings of Tables 1 and 2, the product -x_i×ε_j may fall into several cases. According to the weight modification basis ΔW_ij ∝ -X_i×δ_j, the modification of the corresponding W_ij can be determined for each case, as shown in Table 3 below.
Table 3

Case | Input discrete value | Error discrete value | Weight modification basis | Modification condition        | Conductance change
1    | x_i = 1              | ε_j = 1              | ΔW_ij ∝ -X_i×δ_j < 0      | Write operation voltage 10 V  | Decreases
2    | x_i = -1             | ε_j = -1             | ΔW_ij ∝ -X_i×δ_j < 0      | Write operation voltage 10 V  | Decreases
3    | x_i = 1              | ε_j = -1             | ΔW_ij ∝ -X_i×δ_j > 0      | Erase operation voltage -10 V | Increases
4    | x_i = -1             | ε_j = 1              | ΔW_ij ∝ -X_i×δ_j > 0      | Erase operation voltage -10 V | Increases
5    | x_i = 0              | ε_j = 1 or -1        | 0                         | Non-operation voltage         | Unchanged
6    | x_i = 1 or -1        | ε_j = 0              | 0                         | Non-operation voltage         | Unchanged
7    | x_i = 0              | ε_j = 0              | 0                         | Non-operation voltage         | Unchanged
In this particular example, the input discrete value x_i and the error discrete value ε_j can occur in several combinations, and each combination corresponds to one type of modification condition, i.e. an erase operation bias, a write operation bias or a non-operation bias; the bias condition may comprise the voltage pulse value, the pulse duration and/or the number of pulses of the bias. Since the input discrete value x_i and the error discrete value ε_j correspond to the intervals in which the input data X_i and the error data δ_j fall, the corresponding modification condition can be determined from x_i and ε_j. Of course, in this example the sign and magnitude of x_i and ε_j represent the numerical change of their intervals, so the modification condition of the connection weight can be determined directly from the inverse of the product of the input discrete value and the error discrete value. Under the write operation bias, the erase operation bias and the non-operation bias, the conductance value, i.e. the connection weight, becomes smaller, becomes larger, or remains unchanged, respectively. In this example, the write operation and the erase operation each have a single level.
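The case analysis of Table 3 can be sketched as a small helper (the returned strings are illustrative labels, assuming the single-level ±10 V biases of this example):

```python
def modification_condition(x_i, eps_j):
    """Reproduce Table 3: map discrete (x_i, eps_j) in {-1, 0, 1} to a bias.

    Follows the example's convention that a write pulse (+10 V) makes the
    conductance (weight) smaller and an erase pulse (-10 V) makes it larger.
    """
    p = -x_i * eps_j
    if p < 0:
        return "write +10 V"   # conductance (weight) decreases
    if p > 0:
        return "erase -10 V"   # conductance (weight) increases
    return "no operation"      # conductance (weight) unchanged
```

Enumerating all nine (x_i, ε_j) combinations with this helper yields exactly the seven cases of Table 3.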
In another example, when the input data and/or the error data are discretized into more discrete values, such as -2, -1, 0, 1, 2, the larger discrete values correspond to larger data intervals. In this case, different write operation voltages and erase operation voltages can be set, with different voltages corresponding to different bias voltage values and/or bias durations, so that modifications of different amplitudes can be realized. Taking the write operation as an example, see Table 4 below.
Table 4

Case | Input discrete value | Error discrete value | Weight modification basis | Modification condition       | Conductance change
1    | x_i = 1 or 2         | ε_j = 1              | ΔW_ij ∝ -X_i×δ_j < 0      | Write operation voltage 10 V | Decreases
2    | x_i = -1 or -2       | ε_j = -1             | ΔW_ij ∝ -X_i×δ_j < 0      | Write operation voltage 10 V | Decreases
3    | x_i = 1              | ε_j = 1 or 2         | ΔW_ij ∝ -X_i×δ_j < 0      | Write operation voltage 10 V | Decreases
4    | x_i = -1             | ε_j = -1 or -2       | ΔW_ij ∝ -X_i×δ_j < 0      | Write operation voltage 10 V | Decreases
5    | x_i = 2              | ε_j = 2              | ΔW_ij ∝ -X_i×δ_j < 0      | Write operation voltage 12 V | Decreases
6    | x_i = -2             | ε_j = -2             | ΔW_ij ∝ -X_i×δ_j < 0      | Write operation voltage 12 V | Decreases
It can be seen that in this example the conductance value becomes smaller under every modification condition, i.e. the type of modification condition is a write operation, whose bias has two levels of 10 V and 12 V: when the absolute value of the product x_i×ε_j is at its maximum, the write operation bias is determined to be 12 V, and in the other cases it is 10 V. In other examples, more levels may be set. For instance, for Table 4 above the write operation bias may be given three levels of 10 V, 12 V and 15 V, with the product absolute values |x_i×ε_j| of 1, 2 and 4 corresponding to write operation biases of 10 V, 12 V and 15 V, respectively. This is merely an example; other examples may use other level types and level determination manners.
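The three-level variant just described can be sketched as a simple lookup, assuming the stated correspondence of |x_i×ε_j| values 1, 2 and 4 to write voltages of 10 V, 12 V and 15 V:

```python
# Three-level write bias from the example: |x_i * eps_j| -> write voltage.
WRITE_LEVELS = {1: 10.0, 2: 12.0, 4: 15.0}

def write_voltage(x_i, eps_j):
    """Select the write-operation voltage for a weight-decreasing update.

    Assumes discrete values in {-2, -1, 1, 2}, so the product magnitude
    is always one of 1, 2 or 4.
    """
    return WRITE_LEVELS[abs(x_i * eps_j)]
```

A larger product magnitude thus selects a larger-amplitude write pulse, giving a larger weight modification.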
In a specific application, when the weight is modified, the memory at the corresponding row i and column j is biased according to the determined modification condition, thereby realizing the weight modification in one sample training. For example, under a 10 V write operation, the two control terminals of the memory at row i, column j can be set to voltages of 5 V and -5 V respectively, so as to make the conductance value of the memory smaller.
For the above-mentioned memory array, different structures may be provided according to specific designs, and in the embodiment of the present invention, as shown in fig. 6, the structure includes:
a plurality of memory cells 100 arranged in an array, each memory cell 100 including a nonvolatile memory 101;
in the memory array, a first source-drain electrode DS1 of each nonvolatile memory in a first direction X is electrically connected with a first electric connecting line AL, a second source-drain electrode DS2 of each nonvolatile memory in a second direction Y is electrically connected with a second electric connecting line BL, and a gate G of each nonvolatile memory in the first direction X or the second direction Y is electrically connected with a third electric connecting line CL;
the first electrical connection line AL is used for loading an input signal in forward propagation, and the second electrical connection line BL is used for outputting an output signal in the forward propagation; the second electrical connection line BL is used for loading an input signal in the reverse propagation, and the first electrical connection line AL is used for outputting an output signal in the reverse propagation.
In the embodiment of the present invention, the first direction X and the second direction Y are the two directions of the array arrangement, which is usually in rows and columns. In a specific implementation, an appropriate arrangement may be adopted as needed: as shown in fig. 6, the array may be arranged in aligned rows and columns, or in staggered rows and columns, i.e. each memory cell in one row is located between two memory cells of the previous row. In a specific embodiment, the first direction X is the row direction and the second direction Y is the column direction, or alternatively the first direction X is the column direction and the second direction Y is the row direction; the row direction refers to the direction along each row, and the column direction to the direction along each column.
In the illustrated embodiment, only the memory cells in the first row and the first column of the memory array are shown; the memory cells in the other positions are not illustrated but are actually present.
In the embodiment of the present invention, the first source-drain terminal DS1 and the second source-drain terminal DS2 are the source or drain terminals of a memory or MOS device: when DS1 is the source terminal, DS2 is the drain terminal, and correspondingly, when DS1 is the drain terminal, DS2 is the source terminal. Each memory cell comprises at least one nonvolatile memory 101. The nonvolatile memory 101 has the characteristic of retaining its data on power-down, and the memory array is designed around this characteristic for the matrix calculations of the neural network. The nonvolatile memory 101 may be a memristor, a phase change memory, a ferroelectric memory, a spin magnetic-moment coupling memory, a floating gate field effect transistor, a SONOS (Silicon-Oxide-Nitride-Oxide-Silicon) field effect device, or the like. Further, each memory cell may also include a Metal-Oxide-Semiconductor Field-Effect Transistor (MOSFET).
In each memory cell, the MOS device is used to assist in controlling the state of the nonvolatile memory, the gate G2 of the MOS device being controlled separately from the gate G1 of the memory. In some embodiments, referring to figs. 7 and 8, each memory cell 200 in the memory array includes a nonvolatile memory 101 and a MOS device 102 connected in series with it, i.e. the first source-drain terminal DS1 of the MOS device 102 is electrically connected to the second source-drain terminal DS2 of the nonvolatile memory 101. In specific implementations, this electrical connection may be direct or indirect; for example, the MOS device may share a source-drain region with the nonvolatile memory, or be connected in series through an interconnect line or a doped region. In these embodiments, the first source-drain terminal DS1 of the memory 101 is electrically connected to the electrical connection line BL, and the other source-drain terminal DS2 is connected to the other electrical connection line AL through the MOS device 102. The gate G1 of the nonvolatile memory 101 is connected to the third electrical connection line CL along the first direction X or the second direction Y, and the gate G2 of the MOS device 102 is connected to the fourth electrical connection line DL along the first direction X or the second direction Y; preferably, the directions of the third electrical connection line CL and the fourth electrical connection line DL are orthogonal to each other.
In other embodiments, referring to fig. 9, each memory cell 300 in the memory array includes a nonvolatile memory 101 and a MOS device 103 that shares a channel with it, so that the source-drain terminal DS1 of the MOS device 103 is also the source-drain terminal DS2 of the nonvolatile memory 101. The gate G1 of the nonvolatile memory 101 is connected to the third electrical connection line CL along the first direction X or the second direction Y, and the gate G2 of the MOS device 103 is connected to the fourth electrical connection line DL along the first direction X or the second direction Y; preferably, the directions of the third electrical connection line CL and the fourth electrical connection line DL are orthogonal to each other. The memory cells may be arranged as shown in fig. 3, only the device connections within the memory cell being different.
In the memory array of the embodiment of the invention, one source/drain terminal DS1 of each nonvolatile memory in one direction is electrically connected with one electrical connection line BL, the other source/drain terminal DS2 of each nonvolatile memory in the other direction is electrically connected with the other electrical connection line AL, and the gate G of the nonvolatile memory can be connected with the electrical connection line in the row or column direction as required.
Based on the storage array, a neural network is realized by further arranging other devices and connecting the storage arrays; the other devices, such as amplifiers and integrators, further process the output signals of the storage arrays to realize the other operations in forward and backward propagation. This part is described here only to facilitate understanding of the technical solution as a whole, and the present invention is not particularly limited in this respect.
Based on the memory array, when a weight value is modified, a write voltage or an erase voltage may be applied to the memory through the first electrical connection line AL, the second electrical connection line BL, the third electrical connection line CL and the fourth electrical connection line DL: if the connection weight data needs to be increased, a preset erase operation voltage may be applied to the memory so that it performs an erase operation, and if the connection weight data needs to be decreased, a preset write operation voltage may be applied to the memory so that it performs a write operation.
The foregoing is only a preferred embodiment of the present invention, and although the invention has been disclosed in terms of preferred embodiments, they are not intended to limit it. Those skilled in the art can, without departing from the scope of the technical solution of the present invention, make numerous possible variations and modifications to it, or amend it into equivalent embodiments, using the methods and technical contents disclosed above. Therefore, any simple modification, equivalent change or amendment made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution, still falls within the protection scope of the technical solution of the present invention.