CN112396176B - Hardware neural network batch normalization system - Google Patents


Info

Publication number
CN112396176B
CN112396176B
Authority
CN
China
Prior art keywords
synapse
neural network
batch normalization
circuit
weight
Prior art date
Legal status: Active
Application number
CN202011251999.9A
Other languages
Chinese (zh)
Other versions
CN112396176A
Inventor
李祎 (Li Yi)
秦一凡 (Qin Yifan)
缪向水 (Miao Xiangshui)
Current Assignee
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202011251999.9A
Publication of CN112396176A
Application granted
Publication of CN112396176B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses a hardware neural network batch normalization system comprising C cascaded layers of neural network circuits, in which the output control circuit of the p-th layer neural network circuit is connected to the weight region input coding circuit of the (p+1)-th layer; p = 1, 2, …, C - 1. The p-th layer neural network circuit comprises a weight region input coding circuit, a batch normalization region input coding circuit, a weight region synapse unit, a batch normalization region synapse unit, an activation layer circuit and an output control circuit. The batch normalization formula is derived and simplified by exploiting the characteristics of the neural network activation function: the batch normalization region synapse unit stores the batch normalization parameter information of the neural network, and the normalization process corresponds to summing, row by row, the output of the weight region synapse unit with that parameter information. The originally complex hardware function is thereby adapted to an in-memory computing architecture, the circuit complexity of realizing the batch normalization hardware function is greatly reduced, and higher network accuracy can be achieved with lower circuit area consumption.

Description

Hardware neural network batch normalization system
Technical Field
The invention belongs to the technical field of artificial neural networks, and particularly relates to a hardware neural network batch normalization system.
Background
In the big data era, artificial intelligence and deep learning are increasingly applied in daily life, but they are limited by the traditional von Neumann architecture, in which memory and processor are separated, so existing neural network hardware implementation and acceleration systems face an increasingly serious "memory wall" problem. Neural network in-memory computing hardware systems based on mature and emerging memories feature high parallelism, low latency, low power consumption, and no sharp boundary between storage and computation; they are expected to break through the von Neumann bottleneck of traditional computer architectures and hold great potential and significance in the current era.
As application scenarios expand and task difficulty grows, neural network algorithms develop toward greater complexity and depth, placing higher demands on convergence speed, inference time, and the accuracy of the whole network. The batch normalization algorithm is increasingly emphasized and adopted as part of neural network optimization: it normalizes the shifting data distribution of an intermediate layer into one whose mean and variance are more suitable for convergence, and the normalized activation values are fed into the activation function to produce a better layer output distribution. Batch normalization significantly increases convergence speed during training, accelerates inference, and its regularization effect improves network accuracy. This is key to implementing neural networks in hardware; in particular, for low-precision neural networks targeting edge intelligent devices, the batch normalization algorithm can markedly improve network performance and efficiency.
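The transform described above can be sketched in a few lines of NumPy. This is a generic illustration of the algorithm, not the circuit mapping of the invention; the function and variable names are ours:

```python
import numpy as np

def batch_norm(s, gamma, beta, eps=1e-5):
    """Normalize a batch of pre-activations s (shape: batch x neurons)
    to zero mean and unit variance, then scale by gamma and shift by
    beta (learnable, per neuron)."""
    mu = s.mean(axis=0)
    var = s.var(axis=0)
    s_hat = (s - mu) / np.sqrt(var + eps)
    return gamma * s_hat + beta

rng = np.random.default_rng(0)
s = rng.normal(loc=3.0, scale=2.0, size=(32, 4))  # skewed pre-activations
y = batch_norm(s, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0), y.std(axis=0))  # re-centred near 0, rescaled near 1
```

The per-column mean returns to (numerically) zero and the standard deviation to one, which is the "more suitable distribution" the passage refers to.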
The batch normalization parameters are continuously learned, adjusted, and optimized during neural network training; after training is completed, the different batch normalization parameter values of each neuron are fixed and memorized as constants, yet they must still be adjustable for different application scenarios and tasks. The batch normalization hardware circuit therefore needs to offer both variability and fixity. At present, hardware implementations of the batch normalization algorithm fall into two main categories. The first omits batch normalization at the cost of neural network accuracy; the resulting accuracy loss is especially pronounced in application scenarios with complex tasks and high accuracy requirements. The second builds the batch normalization circuit from traditional Complementary Metal-Oxide-Semiconductor (CMOS) transistors, whose stored parameters cannot be changed after fabrication; because the batch normalization parameters differ for each neuron, as many batch normalization CMOS circuits as there are neurons must be built, together with additional control circuits to supply the different parameters, and this large number of CMOS circuits and control circuits consumes considerable area and power. A new batch normalization system is therefore urgently needed.
Disclosure of Invention
In view of the above defects or improvement requirements of the prior art, the present invention provides a hardware neural network batch normalization system, which aims to solve the technical problem that the prior art cannot realize higher network precision with lower circuit area consumption.
In order to achieve the above object, the present invention provides a hardware neural network batch normalization system, which includes C cascaded layers of neural network circuits, C being a positive integer; the output control circuit of the p-th layer neural network circuit is connected to the weight region input coding circuit in the (p+1)-th layer neural network circuit; p = 1, 2, …, C - 1;
the p-th layer neural network circuit comprises a weight region input coding circuit, a batch normalization region input coding circuit, a weight region synapse unit, a batch normalization region synapse unit, a first activation layer circuit and an output control circuit; here, the weight region synapse unit and the batch normalization region synapse unit are both arrays of electronic synapse devices, the two arrays have the same number of rows, and corresponding rows are connected to each other; the output end of the weight region input coding circuit is connected to each column of the weight region synapse unit; the output end of the batch normalization region input coding circuit is connected to each column of the batch normalization region synapse unit; each row of the batch normalization region synapse unit is connected to the first activation layer circuit, which comprises a plurality of first operational amplifiers; the batch normalization region synapse unit is divided by rows into groups of synapse blocks, each group consisting of two adjacent rows of electronic synapse devices, and each group is connected to one first operational amplifier as a differential pair; the output ends of the first activation layer circuit are connected to the output control circuit;
the weight area input coding circuit is used for coding input information of a system or output information X of a previous layer of neural network circuit to obtain a corresponding pulse signal, and the corresponding pulse signal is input into a weight area synapse unit;
the batch normalization region input coding circuit is used for inputting a logic level '1' pulse into the batch normalization region synapse unit, and the input time is synchronous with the time when the weight region input coding circuit inputs a pulse signal into the weight region synapse unit;
the weight area synapse unit is used for storing synapse weight information W of the neural network, realizing matrix vector multiplication WX under the action of pulse signals input by the weight area input coding circuit, and outputting the matrix vector multiplication WX according to rows;
the batch normalization region synapse unit is used for storing neural network batch normalization parameter information K, and under the action of logic level '1' pulse input by the batch normalization region input coding circuit, the output of the weight region synapse unit and the neural network batch normalization parameter information K are summed according to rows and then output to the first activation layer circuit;
the first activation layer circuit uses the first operational amplifiers to compare the output results of the rows in each synapse block connected to the corresponding operational amplifier, obtains the mapping results, inputs them into the output control circuit for integration, and feeds the result into the (p+1)-th layer neural network circuit;
the C-th layer neural network circuit comprises a weight region input coding circuit, a weight region synapse unit, a second activation layer circuit and an output control circuit; here, the output end of the weight region input coding circuit is connected to each column of the weight region synapse unit; each row of the weight region synapse unit is connected to the second activation layer circuit, which comprises a plurality of second operational amplifiers; the weight region synapse unit is divided by rows into groups of synapse blocks, each group consisting of two adjacent rows of electronic synapse devices, and each group is connected to one second operational amplifier as a differential pair; the output ends of the second activation layer circuit are connected to the output control circuit;
and the second activation layer circuit uses the second operational amplifiers to subtract the output results of the rows in each synapse block connected to the corresponding operational amplifier, and the output control circuit integrates the subtraction results to obtain the final result.
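Behaviourally, one pass through a p-th-layer circuit as described above reduces to a matrix-vector product plus a per-row constant, followed by a sign decision at the differential op-amps. A minimal sketch, assuming ideal devices; the values and names are hypothetical:

```python
import numpy as np

def layer_forward(W, k, x):
    """Behavioural model of one layer: the weight region array computes
    W @ x, the batch normalization region array (driven by all-'1' pulses)
    adds its per-row constants k, and the differential op-amps of the
    activation layer implement the sign function."""
    return np.sign(W @ x + k)

W = np.array([[0.5, -1.0],
              [2.0,  0.25]])      # effective (differential) weights
k = np.array([0.6, -1.0])        # per-neuron batch normalization constants
x = np.array([1.0, 1.0])         # encoded input pulses
print(layer_forward(W, k, x))    # one sign output per neuron
```

The key point the passage makes is visible here: batch normalization costs no extra compute stage, only the added constant k inside the same row summation.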
Further preferably, in each layer of neural network circuit, the synaptic units in the weight area and the synaptic units in the batch normalization area are different in size.
Further preferably, the neural network synapse weight information stored in the weight region synapse unit is the M x N matrix W = [w_ij]; at this time, the weight region synapse unit has size 2M x N, and w_ij is the difference between the electronic synapse device in row 2i-1, column j and the electronic synapse device in row 2i, column j, i = 1, 2, …, M, j = 1, 2, …, N.
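Under this 2M x N differential layout, the effective weight matrix can be recovered from the physical array by pairwise row subtraction. A sketch with a hypothetical conductance matrix G:

```python
import numpy as np

def effective_weights(G):
    """G: 2M x N array of device values. Each effective weight w_ij is the
    difference between the devices in rows 2i-1 and 2i (1-indexed) of
    column j, i.e. odd rows minus even rows."""
    return G[0::2, :] - G[1::2, :]

G = np.array([[0.9, 0.2],
              [0.1, 0.7],   # rows 1-2 encode neuron 1
              [0.3, 0.6],
              [0.3, 0.1]])  # rows 3-4 encode neuron 2
print(effective_weights(G))
```

The differential pair is what lets a physically non-negative device quantity represent a signed weight.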
Further preferably, the neural network batch normalization parameter information stored in the batch normalization region synapse unit is the M x L matrix K = [k_rs]; at this time, the batch normalization region synapse unit has size 2M x L, the effective constant of row r is

k_r = k_r1 + k_r2 + … + k_rL,

and k_rs is the difference between the electronic synapse device in row 2r-1, column s and the electronic synapse device in row 2r, column s, r = 1, 2, …, M, s = 1, 2, …, L; L is determined by the normalization parameter information and the precision of the electronic synapse devices.
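Because every active column of the batch normalization region is driven by the same logic '1' pulse, the constant contributed to row r is simply the row sum of the differential values k_rs, and spare margin columns driven by logic '0' contribute nothing. A sketch; names and values are ours:

```python
import numpy as np

def bn_constants(K, col_enable):
    """K: M x L differential values of the batch normalization region
    array; col_enable: length-L vector of 0/1 input pulses (unused margin
    columns receive 0). Returns the per-neuron constants k_r."""
    return K @ col_enable

K = np.array([[ 0.4,  0.2, 0.0],
              [-0.3, -0.2, 0.0]])          # third column is unused margin
print(bn_constants(K, np.array([1.0, 1.0, 0.0])))
```

Splitting each constant over several columns is what allows a margin L larger than strictly necessary, so the same array can be reprogrammed for different tasks.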
Further preferably, the output control circuit comprises a plurality of neurons; in the p-th layer neural network circuit, each neuron is fully connected to the first operational amplifiers; in the C-th layer neural network circuit, each neuron is fully connected to the second operational amplifiers.
Further preferably, in the hardware neural network batch normalization system, in the training process, the synapse units in the weight region and the synapse units in the batch normalization region update the synapse weights synchronously; and after the training is finished, keeping the synaptic weights in the synaptic units in the weight area and the synaptic units in the batch normalization area unchanged.
Further preferably, the electronic synapse device comprises a two-terminal electronic synapse device or a multi-terminal electronic synapse device;
the two-terminal electronic synapse device comprises a resistive random access memory, a phase change random access memory, a magnetic random access memory, a ferroelectric random access memory or a novel two-dimensional material device;
a multi-terminal electronic synapse device comprises a floating gate transistor or a synapse transistor.
Further preferably, when the electronic synapse device is a two-terminal electronic synapse device, a predetermined voltage, current, or external magnetic field is applied across the two terminals to change the resistance state, crystallization state, magnetization state, or electric polarization state of the device, thereby adjusting the synapse weight.
Further preferably, when the electronic synapse device is a multi-terminal electronic synapse device, a gate, a source, and a drain of the multi-terminal electronic synapse device are used as input ports of synapses, and a channel resistance state between the source and the drain of the multi-terminal electronic synapse device is used as a synapse weight; and adjusting the synapse weight by controlling the voltage of each electrode of the multi-terminal electronic synapse device.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
1. The invention provides a hardware neural network batch normalization system in which the batch normalization formula is derived and simplified by exploiting the characteristics of the neural network activation function. The batch normalization region synapse unit stores the batch normalization parameter information of the neural network, the normalization process corresponds to summing, row by row, the output of the weight region synapse unit with that parameter information, and the synapse weight information and the batch normalization parameter information are updated synchronously during training. The originally complex hardware function is thereby adapted to an in-memory computing architecture, the circuit complexity of realizing the batch normalization hardware function is greatly reduced, and higher network accuracy can be achieved with lower circuit area consumption.
2. Compared with the traditional differential batch normalization design, the hardware neural network batch normalization system provided by the invention realizes the batch normalization operation without adding peripheral circuit complexity, improves neural network recognition accuracy, and accelerates network convergence; compared with the traditional CMOS batch normalization design, it greatly simplifies the peripheral circuitry, improving energy efficiency and reducing hardware area.
3. In the hardware neural network batch normalization system provided by the invention, the synapse units are realized with electronic synapse devices, so they can modify and store weights and parameters as the neural network requires. The weight region synapse unit memorizes the weight information of its layer and receives the input pattern information or the signals from the previous layer; the batch normalization region synapse unit memorizes the batch normalization information of its layer and synchronously receives logic '1' electric pulses; together, the synapse arrays realize the multiply-accumulate and batch normalization operations of the neural network in one step.
Drawings
FIG. 1 is a schematic diagram of a hardware neural network batch normalization system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a neural network flow for performing 0-9 handwritten digit pattern recognition tasks according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a hardware neural network batch normalization system corresponding to a neural network for executing a 0-9 handwritten digit pattern recognition task according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
In order to achieve the above object, as shown in FIG. 1, the present invention provides a hardware neural network batch normalization system, in particular one suitable for networks whose activation function is the sign function, comprising C cascaded layers of neural network circuits, C being a positive integer; the output control circuit of the p-th layer neural network circuit is connected to the weight region input coding circuit in the (p+1)-th layer neural network circuit; p = 1, 2, …, C - 1. In each layer of neural network circuit, the scale of the weight region synapse unit differs from that of the batch normalization region synapse unit: the weight region parameter scale is determined by the neural network parameters, the weight region synapse unit scale follows from it combined with the precision of the electronic synapse devices, and in general a differential pair of electronic synapses represents one weight parameter. The batch normalization region parameter scale is determined by the number of neurons in the layer; in the general case the parameters correspond one-to-one to the neurons, although corresponding optimizations exist, such as adjusting the parameters by the number of feature maps in a convolutional neural network, and the array scale follows from the precision of the electronic synapse devices plus a corresponding margin. It should be noted that the scales of the weight region synapse unit and the batch normalization region synapse unit refer to their effective scales, that is, the scale of the electronic synapse devices actually used for the neural network computation.
The p-th layer neural network circuit comprises a weight region input coding circuit, a batch normalization region input coding circuit, a weight region synapse unit, a batch normalization region synapse unit, a first activation layer circuit and an output control circuit. Here, the weight region synapse unit and the batch normalization region synapse unit are both arrays of electronic synapse devices (so that they can modify and store weights and parameters according to the requirements of the neural network, i.e. they are plastic); the two arrays have the same number of rows, and corresponding rows are connected to each other. The output end of the weight region input coding circuit is connected to each column of the weight region synapse unit; the output end of the batch normalization region input coding circuit is connected to each column of the batch normalization region synapse unit. Each row of the batch normalization region synapse unit is connected to the first activation layer circuit, which comprises a plurality of first operational amplifiers; the batch normalization region synapse unit is divided by rows into groups of synapse blocks, each group consisting of two adjacent rows of electronic synapse devices, and each group is connected to one first operational amplifier as a differential pair; the output ends of the first activation layer circuit are connected to the output control circuit.
The weight region input coding circuit encodes the input information of the system, or the output information X = (x_1, x_2, …, x_N)^T of the previous layer neural network circuit, into corresponding pulse signals, which are input into the weight region synapse unit. The batch normalization region input coding circuit inputs logic level '1' pulses into the batch normalization region synapse unit, synchronized with the pulses from the weight region input coding circuit. The weight region synapse unit stores the neural network synapse weight information, the M x N matrix W = [w_ij], and, under the pulse signals from the weight region input coding circuit, realizes the matrix-vector multiplication WX and passes the result row by row into the batch normalization region synapse unit. Specifically, the weight region synapse unit has size 2M x N, where w_ij is the difference between the electronic synapse device in row 2i-1, column j and the electronic synapse device in row 2i, column j, i = 1, 2, …, M, j = 1, 2, …, N; the size of the weight region synapse unit can be adjusted according to the values and scale of the neural network synapse weights. Batch normalization concentrates the input data of each layer into a distribution more suitable for convergence, which makes network training faster. The batch normalization region synapse unit stores the neural network batch normalization parameter information, the M x L matrix K = [k_rs], and, under the logic level '1' pulses from the batch normalization region input coding circuit, sums the output of the weight region synapse unit with K by rows and outputs the result to the activation layer circuit. Specifically, the batch normalization region synapse unit has size 2M x L, and the effective constant of row r is

k_r = k_r1 + k_r2 + … + k_rL,

where k_rs is the difference between the electronic synapse device in row 2r-1, column s and the electronic synapse device in row 2r, column s, r = 1, 2, …, M, s = 1, 2, …, L; L is determined by the normalization parameter information and the precision of the electronic synapse devices. In general, L is set with a certain margin so as to be compatible with different tasks, and the input signals of the extra electronic synapses are set to logic level '0'. The scale of the batch normalization region synapse unit can be adjusted according to the values and scale of the neural network batch normalization parameters. The first activation layer circuit uses the first operational amplifiers to compare the output results of the rows in each connected synapse block, obtains the mapping results, inputs them into the output control circuit for integration, and feeds the result into the (p+1)-th layer neural network circuit.
The C-th layer neural network circuit comprises a weight region input coding circuit, a weight region synapse unit, a second activation layer circuit and an output control circuit. Here, the output end of the weight region input coding circuit is connected to each column of the weight region synapse unit; each row of the weight region synapse unit is connected to the second activation layer circuit, which comprises a plurality of second operational amplifiers; the weight region synapse unit is divided by rows into groups of synapse blocks, each group consisting of two adjacent rows of electronic synapse devices, and each group is connected to one second operational amplifier as a differential pair; the output ends of the second activation layer circuit are connected to the output control circuit. The second activation layer circuit uses the second operational amplifiers to subtract the output results of the rows in each connected synapse block, and the output control circuit integrates the subtraction results to obtain the final result.
The output control circuit includes a plurality of neurons; in the p-th layer neural network circuit, each neuron is fully connected to the first operational amplifiers; in the C-th layer neural network circuit, each neuron is fully connected to the second operational amplifiers.
Wherein the electronic synapse device comprises a two-terminal electronic synapse device or a multi-terminal electronic synapse device. Specifically, in this embodiment, the two-terminal electronic synapse device may be a resistive random access memory (e.g., based on HfOx, TaOx, TiOx, AlOx, ZrOx, CuOx, SiNx, SiOx, GeSe, GeTe, AgInSbTe, Ag2S, or Ag2Se), a phase change random access memory (e.g., GeTe, SbTe, GeSb, GeSbTe, BiTe, etc.), a magnetic random access memory (e.g., NiFe, CoFe, CoFeB, La1-xSrxMnO, Nd-Pb-Mn-O, La-Ba-Mn-O, or La-Ca-Mn-O), a ferroelectric random access memory (e.g., BaTiO3, PbTiO3, SrTiO3, SrRuO3, SrxBa1-xNb2O6, Pb(Zr1-xTix)O3, PbNb2O6, Ba2NaNb5O15, Ta2O5, or related perovskite and tungsten-bronze ferroelectrics), or a novel two-dimensional material device (e.g., based on graphene, MoS2, or related two-dimensional materials). The multi-terminal electronic synapse device may be a floating gate transistor (e.g., NOR Flash, NAND Flash), a synaptic transistor, or the like. Specifically, when the electronic synapse device is a two-terminal electronic synapse device, a predetermined voltage, current, or external magnetic field is applied across the two terminals to change the resistance state, crystallization state, magnetization state, or electric polarization state of the device, thereby adjusting the synapse weight. When the electronic synapse device is a multi-terminal electronic synapse device, the gate of the device serves as the input port of the synapse, and the channel resistance state between the source and the drain serves as the synapse weight.
It should be noted that in a few multi-terminal electronic synapse devices, the source and drain serve as the input ports of the synapse, and the channel resistance state between the source and drain serves as the synapse weight; the synapse weight is adjusted by controlling the input of each terminal of the multi-terminal electronic synapse device.
Further, the neural networks to which the batch normalization system applies include on-line training acceleration systems and off-line inference acceleration systems. An on-line training acceleration system updates the network parameters once per training step and updates the hardware parameters synchronously: at initialization, the synaptic weights in the weight region synapse unit are randomly distributed while those in the batch normalization region synapse unit are preset according to the algorithm, and after each training step the states of the synapse devices are updated according to the results. An off-line inference acceleration system trains the network parameters to their ideal values on a high-performance computer; once determined, the parameters are written into the hardware acceleration system, and the hardware performs the inference acceleration function. Further, during training of the hardware neural network batch normalization system, the weight region synapse unit and the batch normalization region synapse unit update their synaptic weights synchronously; after training, the synaptic weights in both units are kept unchanged.
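The synchronized update of the two regions in the on-line mode might be sketched as follows. The patent does not specify the learning rule, so a simple perceptron-style update stands in here purely as an illustration; all names and values are ours:

```python
import numpy as np

def train_step(W, k, x, target, lr=0.1):
    """One on-line training step: the weight region values W and the batch
    normalization region constants k are written together; after training,
    both stay frozen and only inference runs."""
    y = np.sign(W @ x + k)           # forward pass through the layer
    err = target - y                 # per-neuron output error
    W = W + lr * np.outer(err, x)    # weight region update
    k = k + lr * err                 # synchronized BN region update
    return W, k

W = np.array([[0.5, -0.2]])
k = np.array([0.0])
x = np.array([1.0, 1.0])
W, k = train_step(W, k, x, target=np.array([-1.0]))
print(W, k)
```

The point mirrored from the text is structural, not the rule itself: both arrays receive writes in the same step, and both are treated as read-only once training ends.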
Note that the output of the weight-region synapse unit is denoted S = WX and is called the preliminary activation; following the procedure of the batch normalization algorithm, the q-th row preliminary activation s_q in S is normalized as:

$$\hat{s}_q = \gamma_q \cdot \frac{s_q - \mu}{\sqrt{\sigma^2 + \xi_q}} + \beta_q, \qquad q = 1, 2, \ldots, Q \tag{1}$$

wherein Q is the batch size in the batch normalization algorithm, μ is the mean of the preliminary activations of the batch, and σ is their standard deviation; γ_q, ξ_q and β_q are scaling factors and offsets, all learnable constants of the neural network. Applying the activation function f gives:
$$a_q = f(\hat{s}_q) = f\big(m_q (s_q + k_q)\big) \tag{2}$$
wherein

$$m_q = \frac{\gamma_q}{\sqrt{\sigma^2 + \xi_q}}, \qquad k_q = \frac{\beta_q \sqrt{\sigma^2 + \xi_q}}{\gamma_q} - \mu$$
further, the hardware neural network batch normalization system can determine sigma and mu as constants to accelerate hardware inference in the training process, and corresponding m isqAnd kqAlso constant, the above equation can be further expressed as:
Figure BDA0002771881450000111
From the above, the batch-normalization-region synapse units can be used to store the constant k_q; by training the value of k_q, the purpose of training the parameters γ_q, ξ_q and β_q is achieved. During training of the hardware neural network batch normalization system, the weight-region synapse units and the batch-normalization-region synapse units update their synapse weights synchronously; after training is finished, the synapse weights in both regions are kept unchanged.
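The algebraic folding above can be checked numerically. The sketch below (plain Python; the concrete γ, ξ, β, μ, σ values are arbitrary illustrations, chosen with γ > 0 so that m_q > 0) verifies that batch normalization followed by a sign activation agrees with the form using the single stored constant k_q:

```python
import math

def bn_then_sign(s, gamma, xi, beta, mu, sigma):
    # Direct form: normalize the preliminary activation s, then apply sign().
    y = gamma * (s - mu) / math.sqrt(sigma**2 + xi) + beta
    return 1 if y >= 0 else -1

def folded_sign(s, gamma, xi, beta, mu, sigma):
    # Folded form: m = gamma / sqrt(sigma^2 + xi),
    #              k = beta * sqrt(sigma^2 + xi) / gamma - mu,
    # so that m * (s + k) equals the normalized value exactly.
    root = math.sqrt(sigma**2 + xi)
    m = gamma / root
    k = beta * root / gamma - mu
    return 1 if m * (s + k) >= 0 else -1

# The two forms agree for any preliminary activation s.
for s in [-5.0, -0.3, 0.0, 0.7, 4.2]:
    assert bn_then_sign(s, 1.5, 1e-5, 0.2, 0.1, 2.0) == \
           folded_sign(s, 1.5, 1e-5, 0.2, 0.1, 2.0)
```

Only k_q needs to be stored in the batch normalization region: when m_q > 0, the positive factor m_q cannot change the sign of the argument.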
Further, when the bias constant b_q of the neural network is considered, the corresponding derivation becomes:

$$a_q = f\big(m_q(w_q X + k_q')\big)$$

wherein

$$k_q' = k_q + b_q$$
From the above, the bias constant b_q of the neural network can be absorbed into the constant k_q and trained together with it. Further, the number of columns of the batch-normalization-region synapse units is determined by the value range of the neural network batch normalization parameter information K stored in those units and by the minimum precision the device can express; a larger margin is typically left so that the array can adapt to different recognition tasks.
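As a rough sizing sketch (illustrative only; it assumes binary devices with a unit step, and the margin factor is a hypothetical choice, not a value from the invention), the minimum column count follows from the range of K divided by the device's minimum expressible precision:

```python
import math

def bn_region_columns(k_min, k_max, device_precision, margin_factor=1.8):
    # Minimum synapses per row so that any K in [k_min, k_max] is expressible
    # at the device's minimum precision, plus headroom for other tasks.
    minimum = math.ceil((k_max - k_min) / device_precision)
    return minimum, math.ceil(minimum * margin_factor)

# Values from the embodiment below: K in [-52.3, 17.1] with unit-step binary
# devices requires at least 70 synapses per row; the 128-column hardware
# design leaves margin beyond that minimum.
minimum, padded = bn_region_columns(-52.3, 17.1, 1.0)
print(minimum)  # 70
```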
The hardware neural network batch normalization system provided by the invention can flexibly adjust the scale and range of the electronic synapse devices according to the characteristics of the neural network; for different neural networks that use the sign function as the activation function, parameters are flexibly selected with the completion accuracy of the brain-like cognitive task as the performance index.
In order to further explain the hardware neural network batch normalization system provided by the invention, the following details are provided with embodiments:
In this embodiment, a neural network batch normalization hardware offline inference acceleration system is obtained by combining a binary neural network, taking a pattern recognition task for handwritten digits 0-9 as an example. A flow diagram of the neural network is shown in fig. 2. In this embodiment, the number of neurons in the input layer is 784, matching the pixel size (28 × 28) of a handwritten digit; the number of neurons in the hidden layer is 1024; and the number of neurons in the output layer is 10. The input information is a binary grayscale image, and the activation function is:
$$f(x) = \mathrm{sign}(x) = \begin{cases} +1, & x \ge 0 \\ -1, & x < 0 \end{cases}$$
Correspondingly, the structure of the hardware neural network batch normalization system corresponding to the neural network executing the pattern recognition task for handwritten digits 0-9 is shown in fig. 3 (two neural network layers in total). Applying formula (3), and noting that in this example m_q > 0, we obtain:

$$a_q = \mathrm{sign}(w_q X + k_q)$$
A binary electronic synapse device is adopted in the binary network hardware design. In this embodiment, the value range of K is [-52.3, 17.1], so each row of the batch normalization region needs at least 70 electronic synapses to express the K values. To meet the precision requirements of the K constants stored by the batch-normalization-region electronic synapses and to leave margin for other future tasks, the batch-normalization-region synapse unit of the first-layer neural network circuit is designed in hardware as 2048 × 128, and the fractional part of K is preserved; the weight-region synapse units of the first-layer and second-layer neural network circuits are 2048 × 784 and 20 × 1024 respectively. The trained network parameters are written into the weight-region synapse unit of the first-layer neural network circuit. During inference acceleration, the weight-region input coding circuit of the first-layer neural network circuit inputs coded electrical pulse information into the weight-region synapse unit according to the pattern input information, while the batch-normalization-region input coding circuit of the same layer synchronously inputs a logic "1" pulse into the batch-normalization-region synapse unit. The two columns of current representing positive and negative values flow into the corresponding first operational amplifier, which executes the activation function; the 1024 first operational amplifiers transmit their results to the output control circuit of the first-layer neural network circuit, which stores them and synchronously outputs them to the weight-region input coding circuit of the second-layer neural network circuit. The second operational amplifiers in the second-layer neural network circuit execute a subtraction function and generate real-valued outputs, and the results of the 10 second operational amplifiers are transmitted off-chip through the output control circuit of the second-layer neural network circuit to read out the neural network pattern recognition result. The binary neural network achieves a recognition accuracy of more than 97% under this scheme, an improvement of 7% over a traditional binary neural network that does not adopt batch normalization.
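The dataflow of the two-layer system just described can be sketched in plain Python. This is an illustrative behavioral model only: toy dimensions stand in for the 784-1024-10 network, and the weights and K values are random rather than trained.

```python
import random

def sign(x):
    # Activation executed by the first operational amplifiers.
    return 1 if x >= 0 else -1

def hidden_layer(W, K, x):
    # Weight-region crossbar computes each row dot-product w_q . x; the
    # batch-normalization region, driven by a logic "1" pulse, adds the
    # stored constant K_q; the op-amp then applies the sign activation.
    return [sign(sum(w * v for w, v in zip(row, x)) + k)
            for row, k in zip(W, K)]

def output_layer(W, h):
    # Second layer: differential pairs are subtracted to give real-valued
    # scores, read out through the output control circuit.
    return [sum(w * v for w, v in zip(row, h)) for row in W]

random.seed(0)
n_in, n_hid, n_out = 8, 6, 3
W1 = [[random.choice([-1, 1]) for _ in range(n_in)] for _ in range(n_hid)]
K1 = [random.uniform(-3.0, 3.0) for _ in range(n_hid)]
W2 = [[random.choice([-1, 1]) for _ in range(n_hid)] for _ in range(n_out)]
x = [random.choice([0, 1]) for _ in range(n_in)]   # binary grayscale input

h = hidden_layer(W1, K1, x)          # entries are +1 or -1
scores = output_layer(W2, h)
prediction = max(range(n_out), key=lambda i: scores[i])
```

In the hardware, these three functions correspond to the weight-region and batch-normalization-region synapse units, the first operational amplifiers, and the second operational amplifiers, respectively.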
The invention provides a hardware neural network batch normalization system, aiming to develop a neural network system with application value and advantages. The invention discloses a hardware design of a batch normalization system oriented to a specific neural network and based on electronic synapse devices. Compared with synapse function circuit modules under traditional CMOS technology, the electronic synapse device, serving as a key synapse function module in a neural network, has outstanding advantages such as low power consumption, high density, and compatibility with CMOS technology; it shows great potential in accelerating neural network processing and breaking the von Neumann bottleneck, and is developing rapidly. The batch normalization algorithm likewise has great potential for accelerating neural network convergence and improving neural network accuracy. The batch normalization hardware system based on electronic synapse devices disclosed by the invention can advance the path toward hardware implementation of neural networks, and provide new inspiration and new approaches for novel computing architectures in an era of rapidly developing artificial intelligence.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (9)

1. A hardware neural network batch normalization system, characterized by comprising cascaded C layers of neural network circuits, C being a positive integer; the output control circuit of the p-th layer neural network circuit is connected with the weight-region input coding circuit in the (p+1)-th layer neural network circuit; p = 1, 2, …, C−1;
the p-layer neural network circuit comprises a weight region input coding circuit, a batch normalization region input coding circuit, a weight region synapse unit, a batch normalization region synapse unit, a first activation layer circuit and an output control circuit; at this time, the weight area synapse units and the batch normalization area synapse units are both arrays formed by electronic synapse devices, the number of rows of the weight area synapse units and the number of rows of normalization area synapse units are the same, and the rows are respectively connected; the output end of the weight region input coding circuit is connected with each column of the weight region synapse units; the output end of the batch normalization region input coding circuit is connected with each column of the synapse units in the batch normalization region; each row of the synapse units in the batch normalization region is connected with the first activation layer circuit, the first activation layer circuit comprises a plurality of first operational amplifiers, the synapse units in the batch normalization region are divided into a plurality of groups of synaptic blocks according to the rows, each group of synaptic blocks is composed of two adjacent rows of electronic synapse devices, and each group of synaptic blocks is connected with one first operational amplifier in a differential pair mode; the output ends of the first active layer circuits are connected with the output control circuit;
the weight area input coding circuit is used for coding input information of the system or output information X of a previous layer of neural network circuit to obtain a corresponding pulse signal, and the corresponding pulse signal is input into the weight area synapse unit;
the batch normalization region input coding circuit is used for inputting a logic level 1 pulse into the batch normalization region synapse units, and the input time is synchronous with the time when the weight region input coding circuit inputs a pulse signal into the weight region synapse units;
the weight area synapse unit is used for storing synapse weight information W of the neural network, realizing matrix vector multiplication WX under the action of pulse signals input by the weight area input coding circuit, and outputting the matrix vector multiplication WX according to rows;
the batch normalization region synapse unit is used for storing neural network batch normalization parameter information K, and under the action of a logic level '1' pulse input by the batch normalization region input coding circuit, the output of the weight region synapse unit and the neural network batch normalization parameter information K are summed in a row and then output to the first activation layer circuit;
the first activation layer circuit is used for comparing, by means of the first operational amplifiers, the output results of the rows in the synaptic block connected with each first operational amplifier to obtain mapping results, inputting the mapping results into the output control circuit for integration, and inputting the results into the (p+1)-th layer neural network circuit;
the layer C neural network circuit comprises the weight region input coding circuit, the weight region synapse unit, a second activation layer circuit and the output control circuit; at this time, the output end of the weight region input coding circuit is connected with each column of the weight region synapse units; each row of the synapse units in the weight area is connected with the second activation layer circuit, the second activation layer circuit comprises a plurality of second operational amplifiers, the synapse units in the weight area are divided into a plurality of groups of synaptic blocks according to the rows, each group of synaptic blocks is composed of two adjacent rows of electronic synapse devices, and each group of synaptic blocks is connected with one second operational amplifier in a differential pair mode; the output ends of the second active layer circuits are connected with the output control circuit;
and the second activation layer circuit is used for subtracting, by means of the second operational amplifiers, the output results of the rows in the synaptic block connected with each second operational amplifier, the subtraction results being integrated by the output control circuit to obtain the final result.
2. The hardware neural network batch normalization system of claim 1, wherein the synaptic units in the weight area and the synaptic units in the batch normalization area are different in scale in each layer of neural network circuit.
3. The hardware neural network batch normalization system of claim 1, wherein the neural network synaptic weight information stored by the synaptic weight unit in the weight area is
$$W = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1N} \\ w_{21} & w_{22} & \cdots & w_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ w_{M1} & w_{M2} & \cdots & w_{MN} \end{pmatrix}$$
at this time, the size of the weight-region synapse unit is 2M × N, and w_ij is the difference between the electronic synapse device in row 2i−1, column j and the electronic synapse device in row 2i, column j; i = 1, 2, …, M, j = 1, 2, …, N.
4. The hardware neural network batch normalization system of claim 1, wherein the neural network batch normalization parameter information stored in the synapse units of the batch normalization region is
$$K = (K_1, K_2, \ldots, K_M)^{\mathrm{T}}$$
at this time, the size of the batch-normalization-region synapse unit is 2M × L, with

$$K_r = \sum_{s=1}^{L} k_{rs}$$

wherein k_rs is the difference between the electronic synapse device in row 2r−1, column s and the electronic synapse device in row 2r, column s; r = 1, 2, …, M, s = 1, 2, …, L; L is determined by the normalization parameter information and the electronic synapse device precision.
5. The hardware neural network batch normalization system of claim 1, wherein the output control circuit comprises a plurality of neurons; in the p-th layer neural network circuit, all the neurons are connected with all the first operational amplifiers; in the layer C neural network circuit, each neuron is fully connected to each second operational amplifier.
6. The hardware neural network batch normalization system of claim 1, wherein the weight region synapse units update synapse weights synchronously with the batch normalization region synapse units during training; and after the training is finished, keeping the synaptic weights in the weight area synaptic units and the batch of normalization area synaptic units unchanged.
7. The hardware neural network batch normalization system of any one of claims 1-6, wherein the electronic synapse devices comprise two-terminal electronic synapse devices or multi-terminal electronic synapse devices;
the two-terminal electronic synapse device comprises a resistive random access memory, a phase change random access memory, a magnetic random access memory, a ferroelectric random access memory or a novel two-dimensional material device;
the multi-terminal electronic synapse device comprises a floating gate transistor or a synapse transistor.
8. The batch normalization system of hardware neural networks of claim 7, wherein when the electronic synapse device is a two-terminal electronic synapse device, the resistance state, the crystallization state, the magnetization state, and the electrical polarization state of the device are changed by applying a predetermined voltage or current or an applied magnetic field across the two-terminal electronic synapse device, thereby implementing the adjustment of the synaptic weights.
9. The hardware neural network batch normalization system of claim 7, wherein the electronic synapse device is a multi-terminal electronic synapse device, a gate or a source and a drain of the multi-terminal electronic synapse device are used as input ports of synapses, and a channel resistance state between the source and the drain of the multi-terminal electronic synapse device is used as a synapse weight; and adjusting the synapse weight by controlling the voltages of all the electrodes of the multi-terminal electronic synapse device.
CN202011251999.9A 2020-11-11 2020-11-11 Hardware neural network batch normalization system Active CN112396176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011251999.9A CN112396176B (en) 2020-11-11 2020-11-11 Hardware neural network batch normalization system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011251999.9A CN112396176B (en) 2020-11-11 2020-11-11 Hardware neural network batch normalization system

Publications (2)

Publication Number Publication Date
CN112396176A CN112396176A (en) 2021-02-23
CN112396176B true CN112396176B (en) 2022-05-20

Family

ID=74600590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011251999.9A Active CN112396176B (en) 2020-11-11 2020-11-11 Hardware neural network batch normalization system

Country Status (1)

Country Link
CN (1) CN112396176B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108985447A (en) * 2018-06-15 2018-12-11 华中科技大学 A kind of hardware pulse nerve network system
WO2020122994A1 (en) * 2018-12-14 2020-06-18 Western Digital Technologies, Inc. Hardware accelerated discretized neural network
CN111340194A (en) * 2020-03-02 2020-06-26 中国科学技术大学 Pulse convolution neural network neural morphology hardware and image identification method thereof
CN111582451A (en) * 2020-05-08 2020-08-25 中国科学技术大学 Image recognition interlayer parallel pipeline type binary convolution neural network array architecture
CN111630527A (en) * 2017-11-14 2020-09-04 技术研发基金会有限公司 Analog-to-digital converter using memory in neural network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11403528B2 (en) * 2018-05-31 2022-08-02 Kneron (Taiwan) Co., Ltd. Self-tuning incremental model compression solution in deep neural network with guaranteed accuracy performance

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111630527A (en) * 2017-11-14 2020-09-04 技术研发基金会有限公司 Analog-to-digital converter using memory in neural network
CN108985447A (en) * 2018-06-15 2018-12-11 华中科技大学 A kind of hardware pulse nerve network system
WO2020122994A1 (en) * 2018-12-14 2020-06-18 Western Digital Technologies, Inc. Hardware accelerated discretized neural network
CN111340194A (en) * 2020-03-02 2020-06-26 中国科学技术大学 Pulse convolution neural network neural morphology hardware and image identification method thereof
CN111582451A (en) * 2020-05-08 2020-08-25 中国科学技术大学 Image recognition interlayer parallel pipeline type binary convolution neural network array architecture

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BNReLU: Combine Batch Normalization and Rectified Linear Unit to Reduce Hardware Overhead; Jiexian Ge et al.; 2019 IEEE 13th International Conference on ASIC (ASICON); 2020-02-06; full text *
Research on Neural Network Applications Based on Memristors (基于忆阻器的神经网络应用研究); Chen Jia (陈佳); Micro-Nano Electronics and Intelligent Manufacturing (《微纳电子与智能制造》); 2019-12-31; Vol. 1, No. 4; full text *

Also Published As

Publication number Publication date
CN112396176A (en) 2021-02-23

Similar Documents

Publication Publication Date Title
Haensch et al. The next generation of deep learning hardware: Analog computing
US11842770B2 (en) Circuit methodology for highly linear and symmetric resistive processing unit
Tsai et al. Recent progress in analog memory-based accelerators for deep learning
Demin et al. Necessary conditions for STDP-based pattern recognition learning in a memristive spiking neural network
US11361216B2 (en) Neural network circuits having non-volatile synapse arrays
CN111433792B (en) Counter-based resistance processing unit of programmable resettable artificial neural network
Kaneko et al. Ferroelectric artificial synapses for recognition of a multishaded image
Fumarola et al. Accelerating machine learning with non-volatile memory: Exploring device and circuit tradeoffs
US11157810B2 (en) Resistive processing unit architecture with separate weight update and inference circuitry
Musisi-Nkambwe et al. The viability of analog-based accelerators for neuromorphic computing: A survey
US11488001B2 (en) Neuromorphic devices using layers of ion reservoirs and ion conductivity electrolyte
Bennett et al. Contrasting advantages of learning with random weights and backpropagation in non-volatile memory neural networks
US11121259B2 (en) Metal-oxide-based neuromorphic device
US11586899B2 (en) Neuromorphic device with oxygen scavenging gate
CN114791796A (en) Multi-input computing unit based on split gate flash memory transistor and computing method thereof
Zhu et al. CMOS-compatible neuromorphic devices for neuromorphic perception and computing: a review
Thunder et al. Ultra low power 3D-embedded convolutional neural network cube based on α-IGZO nanosheet and bi-layer resistive memory
CN112396176B (en) Hardware neural network batch normalization system
Wei et al. Emerging Memory-Based Chip Development for Neuromorphic Computing: Status, Challenges, and Perspectives
Zhang et al. Long short-term memory with two-compartment spiking neuron
CN112734022B (en) Four-character memristor neural network circuit with recognition and sequencing functions
Lin et al. Resistive memory-based zero-shot liquid state machine for multimodal event data learning
Wei et al. Neuromorphic Computing Systems with emerging devices
Guo et al. A multi-conductance states memristor-based cnn circuit using quantization method for digital recognition
Bianchi et al. Combining accuracy and plasticity in convolutional neural networks based on resistive memory arrays for autonomous learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant