CN115906936A - Neural network training and reasoning method, device, terminal and storage medium - Google Patents


Info

Publication number
CN115906936A
Authority
CN
China
Prior art keywords: network, neural network, layer, training, random
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211356224.7A
Other languages
Chinese (zh)
Inventor
王伟
李阳
姜文峰
汪令飞
耿玓
刘明
Current Assignee
Peng Cheng Laboratory
Original Assignee
Peng Cheng Laboratory
Priority date
Filing date
Publication date
Application filed by Peng Cheng Laboratory filed Critical Peng Cheng Laboratory
Priority to CN202211356224.7A
Priority to PCT/CN2022/133546 (WO2024092896A1)
Publication of CN115906936A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G06N3/0442: Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent


Abstract

The invention discloses a neural network training and reasoning method, a device, a terminal and a storage medium, comprising the following steps: mapping the forward-propagated network node information, performing Bernoulli process sampling on the mapped value, and using the resulting random binary value as the input of the next network layer; performing Bernoulli process sampling on the derivative of the activation function to obtain a randomly binarized derivative of the activation function; applying sign processing to the back-propagated error of the next network layer, and computing the error information of the current layer from the signed value and the randomly binarized derivative; training the current layer according to its error information and the random binary output generated by the previous layer; and performing inference on the neural network according to the random binary values propagated layer by layer. By adopting a neural network that uses randomly binarized signals for forward propagation and signed errors for back propagation, the invention reduces computing resources while improving recognition accuracy.

Description

Neural network training and reasoning method, device, terminal and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a neural network training and reasoning method, a device, a terminal and a storage medium.
Background
The technical progress of Artificial Neural Networks (hereinafter, neural networks) has been an important driving force of technological development in recent years. Neural networks are widely applied to the processing of information such as images, sounds and text.
A neural network usually comprises multiple layers of interconnected nonlinear network nodes, and the connection strength between nodes is called a weight. The information to be processed is fed in at the input nodes and propagated layer by layer through the network until it reaches the output layer; this process is called forward propagation of information. Forward propagation is the network's processing of the input information, also called the reasoning (inference) process. Through a specific algorithm and procedure, the neural network can adjust the weights between connected nodes so that the reasoning result becomes as accurate as possible; this process is called training or learning.
Error Back Propagation and the Gradient Descent algorithm are important technical inventions for realizing neural network training. Neural network training based on error back propagation and gradient descent comprises the following 4 steps:
1) Inputting sample data from the training set into the neural network for forward propagation of information, obtaining the state information of each node, and obtaining the final output result;
2) Comparing the output result with the label information of the sample data to obtain the output Error;
3) Taking the output error as input at the output end of the network and propagating it backward from the last layer of the neural network to the first layer;
4) Calculating the gradient information of the network's output error with respect to the connection weights in the network using the forward-propagated information and the back-propagated error, and adjusting the connection weights according to the gradient descent algorithm. The inference process of a neural network involves only the first step described above, i.e. the forward propagation of information.
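As a concrete illustration of these four steps, the following minimal sketch (hypothetical layer size and data, NumPy only) trains a single sigmoid layer on one labelled sample with error back propagation and gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: one sigmoid layer, 3 inputs, 2 outputs, one sample.
W = rng.normal(size=(2, 3))
x = np.array([0.5, -0.2, 0.1])
target = np.array([1.0, 0.0])

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

lr = 0.5
for _ in range(300):
    y = sigmoid(W @ x)                # step 1: forward propagation
    error = y - target                # step 2: compare output with the label
    delta = error * y * (1.0 - y)     # step 3: back-propagate through the activation
    W -= lr * np.outer(delta, x)      # step 4: gradient-descent weight update

loss = float(np.sum((sigmoid(W @ x) - target) ** 2))
```

After a few hundred steps the squared output error shrinks toward zero, which is all that "training" means in this background description.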
In traditional neural network training and reasoning, both the forward-propagated information and the back-propagated errors must be described with high-precision numerical values. However, storing and processing high-precision values in a computer is expensive, which places high demands on computing power and energy consumption during neural network training. These computing-power and energy-consumption problems have become a bottleneck for the further wide application of neural networks. In addition, when neural network acceleration is realized with a memristor array, information and errors described by high-precision values require complex peripheral circuits, increasing the cost and power consumption of hardware-accelerated neural network operation.
To solve or relieve the computing-power and energy-consumption bottlenecks in neural network training and reasoning, many technical methods have been invented, mainly including Neural Network Quantization and Neural Network Binarization. Neural network quantization reduces the computing-power demand of the reasoning process to a certain extent, but lowers the recognition accuracy of the network; and in the training process of a binarized network, the back-propagated error is still described by high-precision values, so the problem of reduced recognition accuracy during accelerated training remains.
Thus, the prior art has yet to be improved.
Disclosure of Invention
The invention aims to solve the technical problem that existing neural network training and reasoning methods lose recognition accuracy under computing-power bottleneck conditions.
The technical scheme adopted by the invention for solving the technical problem is as follows:
in a first aspect, the present invention provides a neural network training and reasoning method, including:
mapping the network node information which is transmitted in the forward direction according to the activation function, sampling in the Bernoulli process according to the mapped numerical value to obtain a random binary numerical value generated by the network of the current layer, and taking the obtained random binary numerical value as the input of the network of the next layer;
carrying out Bernoulli process sampling on the derivative of the activation function to obtain the derivative of the activation function after random binarization;
performing symbolization processing on the counter-propagating error of the next layer network, and calculating the error information of the layer network according to the symbolized value and the derivative of the activation function after random binarization;
training the network according to the error information of the network and the random binary output generated by the network of the previous layer;
and reasoning the neural network according to the random binary numerical values propagated layer by layer.
In one implementation, before the mapping of the forward-propagated network node information according to the activation function, the method includes:
acquiring output information of each node connected to the network of the current layer in the previous layer network, and acquiring the input of all nodes connected to the network of the current layer; the input of the node of the network of the current layer is the output of the network of the previous layer after random binarization processing;
and multiplying the inputs of all the nodes connected with the network of the current layer by the corresponding weights, and summing all the obtained products to obtain the input information of the nodes of the network of the current layer.
In one implementation, the mapping processing of the network node information propagated in the forward direction according to the activation function, and bernoulli process sampling according to the mapped value to obtain a random binary value generated by the network of the current layer, where the obtained random binary value is used as an input of the network of the next layer, includes:
mapping the input information of the local network according to the activation function, and mapping the input information of the local network into a numerical value between 0 and 1;
sampling in the Bernoulli process by taking the mapped numerical value as probability to obtain the output result of the corresponding random binary network node; wherein, the output result is a random binary numerical value generated by the local network;
and taking the obtained output result of the random binarization network node as the input of the next layer network.
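The steps above (weighted-sum input, squashing into (0, 1), Bernoulli sampling) can be sketched as follows; the logistic squashing function and the layer sizes are illustrative assumptions, not part of the claims:

```python
import numpy as np

rng = np.random.default_rng(42)

def logistic(u, a=1.0):
    """Squashing function: maps any input into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a * u))

def forward_binarize(x_prev_b, W, rng):
    """One layer of randomly binarized forward propagation.

    x_prev_b: binary (0/1) output of the previous layer
    W:        weights connecting the previous layer to this layer
    """
    y = W @ x_prev_b                                  # weighted-sum node input
    z = logistic(y)                                   # map into a value between 0 and 1
    x_b = (rng.random(z.shape) < z).astype(np.int8)   # Bernoulli process sampling
    return x_b, z                                     # x_b feeds the next layer

# Hypothetical sizes: previous layer of 8 binary nodes, current layer of 4 nodes.
x_prev = rng.integers(0, 2, size=8)
W = rng.normal(size=(4, 8))
x_b, z = forward_binarize(x_prev, W, rng)
```

Each entry of `x_b` is 0 or 1, with probability of being 1 equal to the corresponding squashed value `z`.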
In one implementation, the activation function is a squashing function, including: a Logistic function, an error function, a clipped rectified linear unit function, and a symmetric clipped rectified linear unit function.
In one implementation, the performing bernoulli process sampling on the derivative of the activation function to obtain a derivative of the activation function after random binarization includes:
obtaining a derivative of the activation function;
carrying out Bernoulli process sampling by taking the obtained derivative as probability to obtain an output result of the corresponding random binarization network node;
and taking the obtained output result of the random binary network node as an error information calculation value in the back propagation process.
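A minimal sketch of this derivative binarization, assuming a logistic activation with scale constant a (an assumption of this sketch; for a ≤ 4 the derivative already lies in [0, 1] and can be used directly as a probability):

```python
import numpy as np

rng = np.random.default_rng(1)

def binarize_derivative(y, a=1.0, rng=rng):
    """Sample the activation derivative as a Bernoulli probability.

    For the logistic activation the derivative a*f(y)*(1 - f(y)) peaks at
    a/4, so for a <= 4 no rescaling is needed before sampling.
    """
    f = 1.0 / (1.0 + np.exp(-a * y))
    dfdy = a * f * (1.0 - f)
    return (rng.random(dfdy.shape) < dfdy).astype(np.int8)

y = np.linspace(-3.0, 3.0, 7)      # hypothetical node inputs
deriv_b = binarize_derivative(y)   # each entry is 0 or 1
```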
In one implementation, the sampling by the bernoulli process with the obtained derivative as a probability to obtain the output result of the corresponding random binarization network node includes:
the magnitude of the derivative of the activation function is scaled or approximated.
In one implementation, the performing a symbolization process on the counter-propagating error of the next layer network, and calculating the error information of the current layer network according to the symbolized value and the derivative of the randomly binarized activation function includes:
performing symbolization processing on the counter-propagating error of the next layer of network to obtain a symbolized error;
multiplying the obtained symbolized error by the derivative of the randomly binarized activation function to obtain error information of the local network;
wherein the value of the error information of the current-layer network is -1, 0 or 1.
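The sign-then-multiply step can be sketched as follows (hypothetical example values; the helper name is illustrative):

```python
import numpy as np

def layer_error(back_error, deriv_b):
    """Sign the back-propagated error of the next layer and multiply by the
    binarized activation derivative; the result takes only -1, 0 or 1."""
    return np.sign(back_error).astype(np.int8) * deriv_b

# Hypothetical values: real-valued errors from the next layer and a
# randomly binarized derivative for each node of this layer.
back_error = np.array([0.7, -1.2, 0.0, 0.3])
deriv_b = np.array([1, 1, 1, 0], dtype=np.int8)
delta = layer_error(back_error, deriv_b)   # → [1, -1, 0, 0]
```

Because both factors are small integers, no high-precision multiplication is needed in this step.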
In one implementation, the training of the network of the current layer according to the error information of the network of the current layer and the random binarization output generated by the network of the previous layer includes:
calculating to obtain the gradient of the integral output error function of the network relative to the weight change in the network according to the error information of the network and the random binary output generated by the network of the previous layer;
and adjusting the weight according to the gradient of the weight change and a gradient descending algorithm.
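A sketch of this weight update under the same assumptions (ternary error information, binary previous-layer output; sizes and learning rate are illustrative):

```python
import numpy as np

def update_weights(W, delta, x_prev_b, lr=0.1):
    """One gradient-descent step. delta is the ternary error information of
    this layer ({-1, 0, 1}) and x_prev_b the binary output of the previous
    layer ({0, 1}), so the outer product needs no real multiplications."""
    grad = np.outer(delta, x_prev_b)   # gradient of the output error w.r.t. W
    return W - lr * grad

W = np.zeros((2, 3))                          # hypothetical 3-to-2 layer
delta = np.array([1, -1], dtype=np.int8)
x_prev_b = np.array([1, 0, 1], dtype=np.int8)
W_new = update_weights(W, delta, x_prev_b)
# W_new == [[-0.1, 0.0, -0.1], [0.1, 0.0, 0.1]]
```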
In one implementation, the method further comprises:
obtaining a binary numerical value with a value of 0 or 1 for each layer of network node information which is transmitted in the forward direction in a deterministic manner;
and transmitting the obtained binary numerical value to the last layer of the neural network layer by layer in the forward direction to obtain an inference result of the neural network.
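One deterministic way to obtain such a binary value (a sketch assuming a logistic squashing function; thresholding at 0.5 is an assumption consistent with the sampling probability used during training):

```python
import numpy as np

def deterministic_binarize(y, a=1.0):
    """Deterministic forward binarization for inference: instead of
    Bernoulli sampling, threshold the squashed value at 0.5."""
    z = 1.0 / (1.0 + np.exp(-a * y))        # logistic squashing (assumed)
    return (z >= 0.5).astype(np.int8)

# For the logistic, z >= 0.5 exactly when y >= 0.
out = deterministic_binarize(np.array([-2.0, -0.1, 0.0, 1.5]))
# out == [0, 0, 1, 1]
```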
In one implementation, the method further comprises:
and carrying out a forward propagation process of repeated random binarization for multiple times, and obtaining a final inference result of the neural network according to voting results of multiple inference results.
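The voting scheme can be sketched as follows; `z_out` stands in for the final-layer squashed values produced by each stochastic forward pass (a simplifying assumption, since the full multi-layer pass is omitted):

```python
import numpy as np

rng = np.random.default_rng(7)

def vote_inference(z_out, n_passes, rng):
    """Repeat the stochastic output sampling n_passes times and take a
    per-node majority vote over the binary outcomes."""
    votes = (rng.random((n_passes, z_out.size)) < z_out).astype(np.int8)
    return (votes.sum(axis=0) * 2 > n_passes).astype(np.int8)

z_out = np.array([0.9, 0.1, 0.8])
result = vote_inference(z_out, n_passes=101, rng=rng)
# With many passes the vote converges to thresholding z_out at 0.5.
```

More passes cost more computation but make the voted result more stable, which is the accuracy/resource trade-off described above.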
In a second aspect, the present invention provides a neural network training and reasoning apparatus, including:
the forward information random binarization module is used for mapping the forward-propagated network node information according to the activation function, sampling in the Bernoulli process according to the mapped numerical value to obtain a random binarization numerical value generated by the network of the current layer, and taking the obtained random binarization numerical value as the input of the network of the next layer;
the derivative information random binarization module is used for carrying out Bernoulli process sampling on the derivative of the activation function to obtain the derivative of the activation function after random binarization;
the error symbolization processing module is used for symbolizing the counter-propagating error of the next layer of network and calculating the error information of the network according to the symbolized value and the derivative of the activation function after random binarization;
the training module is used for training the network of the local layer according to the error information of the network of the local layer and the random binary output generated by the network of the previous layer;
and the reasoning module is used for reasoning the neural network according to the random binary numerical values propagated layer by layer.
In a third aspect, the present invention provides a terminal, including: a processor and a memory, the memory storing a neural network training and reasoning program, the neural network training and reasoning program, when executed by the processor, being configured to implement the operation of the neural network training and reasoning method according to the first aspect.
In a fourth aspect, the present invention further provides a storage medium, which is a computer-readable storage medium, and the storage medium stores a neural network training and reasoning program, where the neural network training and reasoning program is used to implement the operation of the neural network training and reasoning method according to the first aspect when executed by a processor.
In a fifth aspect, the present invention also provides an apparatus, comprising: a circuit module for implementing the operations of the neural network training and reasoning method according to the first aspect.
The invention adopts the technical scheme and has the following effects:
in the forward propagation process of the neural network information, the input of each layer is changed into a binary state through mapping processing and Bernoulli process sampling, so that the computational power requirement of the forward propagation of the information is greatly reduced; in addition, in the neural network training process, the derivative information of each network node is stored in a specific memory unit, so that the derivative information of the network nodes needing to be stored is converted into a binary value from a high-precision numerical value, and the storage requirement in the neural network training process is greatly reduced; in the error back propagation process, the error information of each network node is converted into a symbolic numerical state from a high-precision numerical value, so that the computational power requirement of error back propagation is greatly reduced; the invention improves the recognition precision of the neural network under the condition of reducing the computational power requirement.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the structures shown in the drawings without creative efforts.
FIG. 1 is a flow chart of a neural network training and reasoning method in one implementation of the invention.
Fig. 2 is a schematic diagram of a technical element one, a technical element two, and a technical element three adopted in a neural network training process in an implementation manner of the present invention.
FIG. 3 is a schematic illustration of a neural network forward propagating squeeze function in one implementation of the invention.
Fig. 4 is a schematic diagram of two equivalent methods using a technical element four in the neural network inference process in an implementation manner of the present invention.
Fig. 5 is a flowchart of the technical element five used in the neural network inference process in an implementation manner of the present invention.
Fig. 6 is a schematic diagram comparing the implementation effects of different complete-cycle combined technical solutions in an implementation manner of the present invention.
Fig. 7 is a functional schematic of a terminal in one implementation of the invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer and clearer, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Exemplary method
In traditional neural network training and reasoning, both the forward-propagated information and the back-propagated errors must be described with high-precision numerical values. However, storing and processing high-precision values in a computer is expensive, which places high demands on computing power and energy consumption during neural network training. These computing-power and energy-consumption problems have become a bottleneck for the further wide application of neural networks. In addition, when neural network acceleration is realized with a memristor array, information and errors described by high-precision values require complex peripheral circuits, increasing the cost and power consumption of hardware-accelerated neural network operation.
To solve or relieve the computing-power and energy-consumption bottlenecks in neural network training and reasoning, many technical methods have been invented, mainly including Neural Network Quantization and Neural Network Binarization. Neural network quantization reduces the computing-power demand of the reasoning process to a certain extent, but lowers the recognition accuracy of the network; and in the training process of a binarized network, the back-propagated error is still described by high-precision values, so the problem of reduced recognition accuracy during accelerated training remains.
In order to solve the above technical problems, the present embodiment provides a neural network training and reasoning method, in a forward propagation process of neural network information, by mapping processing and bernoulli process sampling, each layer of input is changed into a binary state, so that computational power requirements for information forward propagation are greatly reduced; in addition, in the neural network training process, the derivative information of each network node is stored in a specific memory unit, so that the derivative information of the network nodes needing to be stored is converted into a binary value from a high-precision numerical value, and the storage requirement in the neural network training process is greatly reduced; in the error back propagation process, the error information of each network node is converted into a symbolic numerical state from a high-precision numerical value, so that the computational power requirement of error back propagation is greatly reduced; the embodiment improves the recognition accuracy of the neural network under the condition of reducing the computational power requirement.
As shown in fig. 1, an embodiment of the present invention provides a neural network training and reasoning method, including the following steps:
and S100, mapping the network node information which is propagated in the forward direction according to the activation function, sampling in the Bernoulli process according to the mapped numerical value to obtain a random binary numerical value generated by the network of the local layer, and taking the obtained random binary numerical value as the input of the network of the next layer.
In this embodiment, the neural network training and reasoning method is applied to a terminal, and the terminal includes but is not limited to: computers, computer boards, application specific integrated circuits, and the like; the terminal is provided with a neural network training and reasoning based framework.
In this embodiment, the neural network that performs forward propagation and back propagation of the error of symbolization by using a randomly binarized signal mainly includes the following three technical elements that can be applied to neural network training:
the technical elements are as follows: for the network node information which is propagated in the forward direction, firstly, the network node information is mapped into a numerical value between 0 and 1 through an activation function, and then, bernoulli process sampling is carried out on the numerical value to obtain a binary random state which is used as the input of the next layer of network.
A second technical element: and carrying out Bernoulli process sampling on the derivative of the activation function to obtain a binary random state for the error back propagation process.
The technical elements are as follows: and performing symbolization processing on the error in the reverse propagation to obtain a symbolized state of which the value can only be-1,0 or 1.
The three technical elements in the embodiment are independently or jointly applied, so that the demand on calculation power in the information forward propagation and error backward propagation processes is effectively reduced, and meanwhile, the high accuracy of the training result is ensured.
In this embodiment, the following two technical elements applicable to the neural network inference process are mainly included:
Technical element four: for the forward-propagated network node information, deterministically obtaining a binary state with value 0 or 1 as the input of the next layer of the network.
Technical element five: carrying out forward propagation of the network information by the random binarization method of technical element one, and obtaining the final inference result by Voting over the network node outputs after multiple forward propagations.
In this embodiment, technical element four can greatly improve the computational efficiency of neural network inference and reduce inference latency, but causes a decrease in inference accuracy. With technical element five, the inference accuracy of the neural network gradually increases with the number of forward propagations, so a balance between computational resource consumption and inference accuracy can be obtained.
Applying technical element one alone or technical element four alone belongs to the prior art; technical elements two, three and five are the elements newly used in this embodiment. This embodiment provides technical solutions in which technical elements one through five are applied in combination. In the implementation process, technical elements one, two and three can be applied independently or jointly in place of the traditional high-precision calculation mode, forming 8 combined technical solutions for neural network training; technical elements four and five, together with the traditional high-precision reasoning process, form 3 technical solutions for neural network reasoning.
Combining the 8 neural network training schemes with the 3 neural network reasoning schemes yields 24 complete-cycle technical solutions. The traditional high-precision training-and-reasoning scheme, the existing scheme applying technical element one alone, and the existing scheme applying technical element four alone are special cases of the 24 combined solutions in this embodiment.
In one implementation of the embodiment, three technical elements applicable to neural network training and two technical elements applicable to neural network inference processes are included. These technical elements are combined with each other to form a new technical scheme for realizing neural network training and reasoning. The training of the neural network relates to a technical element I, a technical element II and a technical element III. These three technical elements will be described in detail below.
In this embodiment, in the process of training the neural network, when implementing technical element one, the forward-propagated information is subjected to mapping processing and random binarization processing to reduce the computing power required by the forward propagation process.
Specifically, in one implementation manner of the present embodiment, step S100 includes the following steps before:
step S101a, acquiring output information of each node connected to the network of the current layer in the previous layer network, and acquiring input of all nodes connected to the network of the current layer;
and step S101b, multiplying the inputs of all the nodes connected with the network of the local layer by the corresponding weights, and summing all the obtained products to obtain the input information of the nodes of the network of the local layer.
In this embodiment, the input of the node in the network of the current layer is the output of the network of the previous layer after the random binarization processing, that is, the output of the network of the previous layer is the output of the network of the previous layer after the random binarization processing by the method described in this embodiment.
As shown at 101 in fig. 2, when information is propagated in the forward direction, the output of the previous-layer network nodes is used as the input of the current layer, and the input information is denoted x_i^b (104). The input information of network node j of the current layer can be expressed as:

y_j = Σ_i w_ij · x_i^b

wherein x_i^b is the output information of the i-th node of the previous-layer network connected to network node j (106), and the superscript b indicates that the output information of the previous layer has been randomly binarized according to the present method; w_ij is the weight (105) connecting the i-th network node in the previous layer with network node j in the current layer. The above equation sums the products of all inputs connected to network node j and their corresponding weights.
Specifically, in one implementation manner of the present embodiment, the step S100 includes the following steps:
step S101, mapping the input information of the local network according to the activation function, and mapping the input information of the local network into a numerical value between 0 and 1;
s102, carrying out Bernoulli process sampling by taking the mapped numerical value as probability to obtain an output result of the corresponding random binary network node; wherein, the output result is a random binary numerical value generated by the network of the local layer;
and step S103, taking the obtained output result of the random binarization network node as the input of the next layer network.
In the present embodiment, after the input information y_j of network node j is obtained, the activation function (107) can be applied to it to obtain the result z_j (108):

z_j = f(y_j)
The function f (u) should be a monotonically increasing function with an output range of 0 to 1.
In this embodiment, such a function is also called a Squashing Function: when the input value is small, the output should be close to or equal to 0; when the input value is large, the output should be close to or equal to 1; as the input gradually increases from small to large values, the output increases monotonically from 0 to 1.
Typical squashing functions that meet these requirements are shown in fig. 3 and include:
the Logistic function: f(u) = 1/(1 + e^(−au));
the error function: f(u) = [1 + erf(au)]/2;
the clipped rectified linear unit (Clipped ReLU) function: f(u) = min(max(0, au), 1);
the symmetric clipped rectified linear unit function: f(u) = min(max(0, (au + 1)/2), 1);
and the like, wherein a is a constant greater than 0.
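For illustration, the squashing functions listed above can be sketched in Python as follows. This is a minimal sketch; the exact forms of the error-function and symmetric Clipped ReLU variants are reconstructed from context and should be treated as assumptions:

```python
import math

def logistic(u, a=1.0):
    # Logistic squashing function: output in (0, 1), equals 0.5 at u = 0
    return 1.0 / (1.0 + math.exp(-a * u))

def erf_squash(u, a=1.0):
    # Error-function squashing (assumed form): output in (0, 1)
    return 0.5 * (1.0 + math.erf(a * u))

def clipped_relu(u, a=1.0):
    # Clipped ReLU: linear with slope a, clipped to [0, 1]
    return min(max(0.0, a * u), 1.0)

def symmetric_clipped_relu(u, a=1.0):
    # Symmetric Clipped ReLU (assumed form): equals 0.5 at u = 0
    return min(max(0.0, 0.5 * (a * u + 1.0)), 1.0)
```

All four map any real input into [0, 1], so the output can be used directly as a Bernoulli sampling probability.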
Subsequently, in this embodiment, the value of z_j is used as the probability to perform Bernoulli process sampling (109), obtaining the output result x_j^b of the random binarized network node (110):

x_j^b = 1 with probability z_j; otherwise, x_j^b = 0

wherein the superscript b of x_j^b indicates that the output result information has been subjected to random binarization, i.e., its value can only take 0 or 1. The sampling result x_j^b serves as the input of the next layer network.
It is worth noting that when the constant a of the squashing function approaches infinity, the squashing function becomes a step function, so that the probability z_j can only take the value 0 or 1, and the random binarization sampling process degenerates into a deterministic binarization operation. Thus, deterministic binarization can be seen as a special case of random binarization.
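The forward-propagation steps S101a–S103 above can be sketched as follows. This is an illustrative Python sketch, not the patent's implementation; the function name, the list-based weight layout, and the default a = 4 are assumptions:

```python
import math
import random

def forward_stochastic_binary(x_b, w, a=4.0, rng=random):
    """One layer of forward propagation with random binarization.

    x_b : binary outputs (0/1) of the previous layer network
    w   : weight matrix, w[i][j] connects input node i to current node j
    Returns (y, z, xb_out): input information, squashed probabilities,
    and the Bernoulli-sampled binary outputs.
    """
    n_out = len(w[0])
    # y_j = sum_i w_ij * x_i^b  (sum of weighted binary inputs)
    y = [sum(x_b[i] * w[i][j] for i in range(len(x_b))) for j in range(n_out)]
    # Logistic squashing maps input information to a probability in (0, 1)
    z = [1.0 / (1.0 + math.exp(-a * yj)) for yj in y]
    # Bernoulli process sampling: output 1 with probability z_j, else 0
    xb_out = [1 if rng.random() < zj else 0 for zj in z]
    return y, z, xb_out
```

Because x_b contains only 0s and 1s, the inner product reduces to a masked sum of weights, which is the source of the computing-power savings described above.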
As shown in fig. 1, in an implementation manner of the embodiment of the present invention, the neural network training and reasoning method further includes the following steps:
and S200, carrying out Bernoulli process sampling on the derivative of the activation function to obtain the derivative of the activation function after random binarization.
In this embodiment, during neural network training, implementing technical element two requires random binarization of the derivative of the activation function before the derivative information of each network node is stored in a dedicated memory unit. The derivative information to be stored is thus converted from a high-precision numerical value into a binary numerical value, greatly reducing the storage requirement of neural network training.
Specifically, in one implementation manner of the present embodiment, the step S200 includes the following steps:
step S201, acquiring a derivative of the activation function;
step S202, carrying out Bernoulli process sampling by taking the obtained derivative as probability to obtain an output result of the corresponding random binary network node;
and step S203, taking the obtained output result of the random binary network node as an error information calculation value of the back propagation process.
In this embodiment, as shown at 102 in fig. 2, while the information propagates forward, the derivative (Derivative) information (111, 112) of the activation function can be obtained:

f′(y_j) = dz_j/dy_j

Then, in this embodiment, the value of f′(y_j) is used as the probability to perform Bernoulli process sampling (113), obtaining the output result f′^b(y_j) of the random binarized network node (114):

f′^b(y_j) = 1 with probability f′(y_j); otherwise, f′^b(y_j) = 0

wherein the superscript b of f′^b(y_j) indicates that the output result information has been subjected to random binarization, i.e., its value can only take 0 or 1. The sampling result f′^b(y_j) will be used in the error back-propagation process.
In principle, in this embodiment, an activation function whose derivative ranges from 0 to 1 should be selected so that a valid derivative sampling result can be obtained; in practice, it is found that proportionally scaling or approximating the magnitude of the activation function derivative only affects the convergence rate of the neural network, and does not affect its final training result.
Specifically, in one implementation manner of the present embodiment, step S202 includes the following steps:
step S202a, scaling or approximating the magnitude of the derivative of the activation function. In this embodiment, when the Logistic function is adopted with the constant a = 1, the activation function is

f(u) = 1/(1 + e^(−u))

and its derivative takes values in the interval 0 to 0.25. In this embodiment, two processing modes are available: (i) using the derivative value directly as the probability for Bernoulli process sampling; (ii) multiplying the derivative value by 4 and using the result as the probability for Bernoulli process sampling.
When the Logistic function is used with the constant a = 4, the activation function is

f(u) = 1/(1 + e^(−4u))

and its derivative takes values in the interval 0 to 1. In this embodiment, no further processing is needed, and the derivative value can be used directly as the probability for Bernoulli process sampling.
When the Logistic function is used with the constant a = 8, the activation function is

f(u) = 1/(1 + e^(−8u))

and its derivative takes values in the interval 0 to 2. In this embodiment, two processing modes are available: (i) dividing the derivative value by 2 and using the result as the probability for Bernoulli process sampling; (ii) assigning the value 1 to derivatives greater than 1, keeping the other derivative values unchanged, and then performing Bernoulli process sampling.
When the clipped rectified linear unit (Clipped ReLU) function is used with the constant a = 1, its derivative can only take the value 0 or 1, and the derivative Bernoulli process sampling degenerates into a deterministic binarization process.
When the clipped rectified linear unit (Clipped ReLU) function is used with the constant a = 2, its derivative can only take the value 0 or 0.5. In this embodiment, two processing modes are available: (i) using the derivative value directly as the probability for Bernoulli process sampling; (ii) multiplying the derivative value by 2 and then using it as the probability (which degenerates into a deterministic binarization process).
When the clipped rectified linear unit (Clipped ReLU) function is used with the constant a = 1/2, its derivative can only take the value 0 or 2. In this embodiment, the derivative value may be divided by 2 and then used as the probability for Bernoulli process sampling (which degenerates into a deterministic binarization process).
When the Logistic function is used as the forward-propagation activation function, its derivative can be approximated by a clipped rectified linear unit (Clipped ReLU) function, and deterministic binarization can then be used to obtain the binarized derivative information for the error back-propagation process.
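As an illustration of technical element two with the Logistic activation, the derivative computation and its Bernoulli sampling can be sketched as follows (a hedged Python sketch; function names are assumptions, and the default a = 4 is chosen so the derivative already lies in the unit interval, as described above):

```python
import math
import random

def logistic_derivative(y, a=4.0):
    # For f(u) = 1 / (1 + exp(-a*u)), the derivative is f'(u) = a * f(u) * (1 - f(u))
    z = 1.0 / (1.0 + math.exp(-a * y))
    return a * z * (1.0 - z)

def binarize_derivative(y, a=4.0, rng=random):
    d = logistic_derivative(y, a)
    # With a = 4 the derivative lies in (0, 1]; other choices of a would
    # require the rescaling or clipping described in step S202a first.
    d = min(d, 1.0)
    # Bernoulli process sampling with the derivative value as probability
    return 1 if rng.random() < d else 0
```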
As shown in fig. 1, in an implementation manner of the embodiment of the present invention, the neural network training and reasoning method further includes the following steps:
and step S300, performing symbolization processing on the counter-propagating error of the next layer network, and calculating the error information of the layer network according to the symbolized value and the derivative of the randomly binarized activation function.
In this embodiment, during neural network training, implementing technical element three requires sign processing of the back-propagated error, so that the error information of each network node is converted from a high-precision numerical value into a signed numerical state, greatly reducing the computing power required for error back-propagation.
Specifically, in one implementation manner of the present embodiment, the step S300 includes the following steps:
step S301, performing symbolization processing on the counter-propagating error of the next layer of network to obtain a symbolized error;
step S302, the obtained symbolized error is multiplied by the derivative of the activation function after random binarization, and the error information of the network of the local layer is obtained.
In this embodiment, sign processing of the back-propagated error means that the back-propagated error takes only the values −1, 0, or 1. As shown at 103 in fig. 2, when the Error propagates in the reverse direction, this embodiment performs sign (Sign) processing (116) on the error δz_j (115) back-propagated from the next layer, obtaining the signed error (Signed Error) δz_j^s (117):

δz_j^s = 1 when δz_j ≥ 0; δz_j^s = −1 when δz_j < 0

The superscript s of δz_j^s indicates that the error information has been sign-processed, so its value can only be −1 or 1. Multiplying (118) the signed error by the derivative of the randomly binarized activation function yields the error information δy_j^s of the current network node (119):

δy_j^s = δz_j^s · f′^b(y_j)

The error information δy_j^s of the current network node is signed information whose value can only be −1, 0, or 1. The signed error continues to propagate back along the neural network (119, 105, 120). The error of node i in the previous layer of the neural network is the sum of the products of the errors and weights of all current-layer nodes connected to it:

δx_i = Σ_j w_ij · δy_j^s
After the error information is back-propagated to this layer, combined with the binary node state information generated during forward propagation, the gradient of the network's overall output error function with respect to the weights (105) of this layer can be obtained, so the weights can be adjusted by a gradient descent algorithm to complete one round of neural network training.
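The sign processing of steps S301–S302 and the back-propagation sum above can be sketched as follows (an illustrative Python sketch under the notation of this embodiment; function names and the list-based layout are assumptions):

```python
def sign(v):
    # Sign processing: a value >= 0 maps to +1, a value < 0 maps to -1
    return 1 if v >= 0 else -1

def backprop_signed(dz, fprime_b, w):
    """Signed-error back-propagation for one layer.

    dz       : errors back-propagated from the next layer, one per node j
    fprime_b : binarized activation derivatives (0/1), one per node j
    w        : weight matrix, w[i][j] connects previous-layer node i to node j
    Returns (dy_s, dx): this layer's signed error information (values in
    {-1, 0, 1}) and the errors propagated to the previous layer.
    """
    dy_s = [sign(dz_j) * fb for dz_j, fb in zip(dz, fprime_b)]
    # delta_x_i = sum_j w_ij * delta_y_j^s
    dx = [sum(w[i][j] * dy_s[j] for j in range(len(dy_s)))
          for i in range(len(w))]
    return dy_s, dx
```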
As shown in fig. 1, in an implementation manner of the embodiment of the present invention, the neural network training and reasoning method further includes the following steps:
and S400, training the network of the current layer according to the error information of the network of the current layer and the random binary output generated by the network of the previous layer.
In the present embodiment, the above three technical elements may be employed independently or in combination during neural network training. The three technical elements correspond to the three indispensable processes of neural network training: forward propagation of information, calculation of the activation function derivative, and backward propagation of errors. Each process can either use the traditional high-precision calculation mode or adopt the corresponding technical element, forming 2×2×2 = 8 combined technical schemes.
Specifically, in one implementation manner of the present embodiment, the step S400 includes the following steps:
step S401, calculating to obtain the gradient of the integral output error function of the network relative to the weight change in the network of the current layer according to the error information of the network of the current layer and the random binary output generated by the network of the previous layer;
and S402, adjusting the weight according to the gradient of the weight change and a gradient descending algorithm.
In this embodiment, in the neural network training, the technical solution includes:
the first training technical scheme is as follows: the case that none of the three technical elements is adopted, namely the case that all the technical elements adopt the high-precision calculation mode, is a training method of the traditional neural network.
The second training technical scheme: only technical element two is adopted; the forward-propagated information and the back-propagated error in the neural network use high-precision numerical values.
The third training technical scheme is as follows: only by using the technical element three, the derivatives of the forward propagating information and the activation function in the neural network adopt high-precision numerical values.
The training technical scheme is four: and a technical element II and a technical element III are adopted, and the forward propagation information in the neural network adopts a high-precision numerical value.
The fifth training technical scheme: only technical element one is adopted; the activation function derivative and the back-propagated error in the neural network use high-precision numerical values. This is the calculation method of the existing binary neural network.
The training technical scheme is six: and a technical element I and a technical element II are adopted, and the error of back propagation in the neural network adopts a high-precision numerical value.
The training technical scheme is seven: and adopting a technical element I and a technical element III, and adopting a high-precision numerical value for the derivative of the activation function in the neural network.
The eighth training technical scheme: technical element one, technical element two and technical element three are all adopted; no node information in the neural network uses high-precision numerical values.
In this embodiment, through the 8 training technical solutions, the trained neural network can be obtained, and then the trained neural network can be used for reasoning in the practice process; the reasoning process based on the trained neural network can be a traditional high-precision reasoning process or a technical scheme of neural network reasoning adopting a technical element four and a technical element five.
As shown in fig. 1, in an implementation manner of the embodiment of the present invention, the neural network training and reasoning method further includes the following steps:
and step S500, reasoning of the neural network is carried out according to the random binary numerical values propagated layer by layer.
After the neural network training is completed, the inference of the neural network can adopt a traditional high-precision calculation mode. In order to reduce the computational power consumption in the inference process, the inference process of the neural network also needs to be subjected to binarization processing. Because the reasoning process of the neural network is the same as the information forward propagation process in the training process of the neural network, the technical element I can be directly applied to realize the neural network reasoning of random binarization. However, due to the existence of randomness and the loss of information during binarization, the accuracy of neural network reasoning is degraded. In the reasoning process, the following technical elements four or five can be selected to improve the reasoning accuracy:
specifically, in an implementation manner of the embodiment of the present invention, the neural network inference method further includes the following steps:
step S601, obtaining a binary numerical value with a value of 0 or 1 for the network node information propagated in the forward direction in a deterministic manner;
step S602, transmitting the obtained binary numerical value to the last layer of the neural network to obtain the inference result of the neural network.
In this embodiment, in the neural network inference process, when implementing technical element four, inference needs to be performed based on a deterministic binarization network; in the information forward propagation process, in this embodiment, sampling is no longer performed with the output value of the activation function as a probability, but rather, a deterministic binarization process is performed, and a binarization network node output result (shown as 201, 202 in fig. 4) is obtained:
if z_j ≥ 0.5, then x_j^b = 1; otherwise, x_j^b = 0

Equivalently, the activation function can be omitted in this embodiment, and the binarized network node output can be obtained directly from the input information y_j of the current network node (203, 204 shown in fig. 4):

if y_j ≥ 0, then x_j^b = 1; otherwise, x_j^b = 0

The binarization result x_j^b serves as the input of the next layer network. Finally, the binarized information is propagated to the last layer to obtain the inference result of the neural network.
Specifically, in an implementation manner of the embodiment of the present invention, the neural network inference method further includes the following steps:
and step S701, performing a forward propagation process of repeated random binarization for multiple times, and obtaining a final inference result of the neural network according to a voting result of the inference results of multiple times.
In this embodiment, during neural network inference, implementing technical element five requires inference based on repeated random binarization results: the random binarization forward-propagation process can be repeated multiple times, and the final inference result is obtained by voting over the multiple inference results (as shown in fig. 5). As the number of repetitions of the random binarization forward-propagation inference process increases, the inference accuracy improves continuously.
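The repeated-inference voting of technical element five can be sketched as follows (illustrative Python; the `forward_fn` callable is a hypothetical stand-in for one full stochastic forward pass returning a predicted class label):

```python
import random
from collections import Counter

def vote_inference(forward_fn, x, repeats=32, rng=random):
    """Technical element five: repeat the random-binarization forward
    pass `repeats` times and take the majority vote over the predicted
    classes. forward_fn(x, rng) is assumed to return one class label."""
    votes = Counter(forward_fn(x, rng) for _ in range(repeats))
    return votes.most_common(1)[0][0]
```

Increasing `repeats` averages out the sampling noise, which is why the inference accuracy improves continuously with the number of repetitions.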
Therefore, in the neural network reasoning process, the two technical elements enable three reasoning technical schemes to be selected.
The first inference technical scheme is as follows: neural network reasoning (conventional scheme) is performed using high precision forward propagation.
The second reasoning technical scheme is as follows: and (4) carrying out neural network reasoning by adopting a technical element four, namely a deterministic binarization method (the prior technical scheme).
The third reasoning technical scheme is as follows: and (4) carrying out neural network reasoning by adopting a technical element five, namely a repetitive random binarization method.
In the full-cycle application of the neural network (i.e., training plus inference), the 8 training technical schemes and the 3 inference technical schemes can be combined with each other to form 24 full-cycle technical schemes. Among them, the combination of training scheme one with inference scheme one is the traditional high-precision neural network training and inference method. The existing neural network quantization or binarization methods adopt only technical element one or technical element four, forming the combination of training scheme five with inference scheme two.
The following takes an application of Multi-layer perceptron (MLP) neural network learning and recognizing handwritten numbers as an example to show the implementation and effects of the above technical elements and solutions.
A fully-connected multilayer neural network with the structure (784-500-200-10) is used to learn and recognize the MNIST handwritten digit set. The MNIST handwritten digit set consists of a training set of 60000 handwritten digit images and a test set of 10000 handwritten digit images. Images in the training set are used for training (or learning) of the neural network; data in the test set are used to test the recognition accuracy (inference accuracy) of the neural network. Each digit image in MNIST consists of 28x28 pixels, corresponding to the 784 input nodes of the first layer of the neural network; the 10 output nodes of the last layer of the neural network correspond to the 10 classes of digit images (0, 1, …, 9). The neural network training also follows the following settings:
(1) A fixed learning rate η =0.1 is used.
(2) The small Batch (Mini Batch) training mode is adopted, 60000 sample images in a training set are divided into 600 batches, and each Batch of training samples is 100. The average of the gradients obtained for 100 samples in each batch was used as a weight update.
(3) Completing one pass of learning over all samples in the training set constitutes one training period. After each training period, the 10000 sample images in the test set are inferred according to inference technical scheme one to obtain the recognition accuracy (or inference error rate).
(4) The hidden layer adopts a Logistic function with the parameter a =4 as the activation function.
(5) The output layer adopts the Softmax function as the activation function and Cross Entropy as the objective function of the output error.
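The experimental settings above can be summarized as a configuration sketch (key names are illustrative assumptions, not from the patent):

```python
# Hypothetical configuration mirroring the experimental settings above.
config = {
    "layers": [784, 500, 200, 10],        # fully-connected MLP structure
    "learning_rate": 0.1,                  # fixed learning rate eta
    "batch_size": 100,                     # mini-batch training
    "batches_per_epoch": 600,              # 60000 training images / 100
    "hidden_activation": ("logistic", {"a": 4}),
    "output_activation": "softmax",
    "loss": "cross_entropy",
}
```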
Comparing implementation effects, the training schemes adopting forward information binarization (technical element one), i.e., schemes five to eight, are generally superior to the training schemes using high-precision forward propagation of information, i.e., schemes one to four. The difference in training results between the schemes with and without technical elements two and three is small.
In this embodiment, the inference error rates of the three inference technical schemes under four different training technical schemes are listed. For a single inference, inference scheme one is superior to inference scheme two, and inference scheme two is superior to inference scheme three. Inference schemes one and two are deterministic inference processes, so repeating the inference does not improve the result. Inference scheme three is a stochastic inference process whose effect improves continuously as the number of repetitions increases; after many repetitions, its effect surpasses that of inference scheme two and approaches or surpasses that of inference scheme one.
As shown in fig. 6, from the combined effect of different training schemes and different inference schemes, the training scheme with the technical element one, the technical element two, and the technical element three is introduced, and after the training is completed, the different inference effects are generally superior to the training scheme without the technical element one, the technical element two, and the technical element three.
The embodiment is only used as an exemplary illustration for applying the technical scheme proposed by the invention, and cannot be used as a limitation for applying the invention to the neural network training and reasoning process.
The technical scheme provided by the invention can be applied to many neural networks that use error back-propagation and gradient descent algorithms as their basic algorithms, such as convolutional neural networks, long short-term memory (LSTM) neural networks, feedback neural networks, and reinforcement learning networks. These neural networks can be used in various application scenarios, such as image recognition, speech recognition, natural language processing, human-machine game playing, and automatic driving.
The embodiment achieves the following technical effects through the technical scheme:
in the process of forward propagation of the neural network information, the first technical element enables each layer of input to be in a binary state, the product operation of the input and the weight of the network layer is changed from the product of two high-precision numerical values to the multiplication of a numerical value with a value of 0 or 1 and another high-precision numerical value, and the calculation force requirement of the forward propagation of the information is greatly reduced. In addition, in the training process of the neural network, the input state of each layer network in the forward propagation process needs to be stored for calculating the weight gradient after the error is propagated backwards to the layer. By adopting the technical element one, the state of the network node to be stored is converted from a high-precision numerical value into a binary numerical value, and the storage requirement in the neural network training process is greatly reduced.
In the neural network training process, the derivative information of each network node also needs to be stored in a specific memory unit for the error back propagation process. By adopting the second technical element, the derivative information of the network node to be stored is converted from a high-precision numerical value into a binary numerical value, and the storage requirement in the neural network training process is greatly reduced.
In the error back-propagation process, with technical element two and technical element three, the error information of each network node is converted from a high-precision numerical value into a signed numerical state taking only the values −1, 0, or 1. The error-weight product operation changes from the product of two high-precision numerical values to the multiplication of a value of −1, 0 or 1 with another high-precision numerical value, greatly reducing the computing power required for error back-propagation.
In this embodiment, adopting technical element one, technical element two and technical element three simultaneously changes the calculation of the weight gradient from the product of two high-precision numerical values to the product of a value taking 0 or 1 and a value taking −1, 0 or 1, reducing the computing power required by the weight update process. Moreover, since the weight gradient can only take the value −1, 0 or 1, the weights change in fixed units, which facilitates weight quantization. The weights of the neural network can be described entirely by fixed-point or integer numerical values.
By adopting the technical scheme in the embodiment, when the neural network acceleration is realized by utilizing the memristor array, the complexity of a peripheral circuit of the memristor array is effectively reduced by the technical element I, the technical element II, the technical element III, the technical element IV and the technical element V, and the cost and the power consumption required by hardware acceleration neural network operation are effectively reduced.
Exemplary device
Based on the above embodiment, the present invention further provides a neural network training and reasoning apparatus, including:
the forward information random binarization module is used for mapping the forward propagated network node information according to the activation function, sampling in the Bernoulli process according to the mapped numerical value to obtain a random binarization numerical value generated by the network of the current layer, and taking the obtained random binarization numerical value as the input of the network of the next layer;
the derivative information random binarization module is used for carrying out Bernoulli process sampling on the derivative of the activation function to obtain the derivative of the activation function after random binarization;
the error symbolization processing module is used for symbolizing the counter-propagating error of the next layer of network and calculating the error information of the network according to the symbolized value and the derivative of the activation function after random binarization;
the training module is used for training the network of the current layer according to the error information of the network of the current layer and the random binary output generated by the network of the previous layer;
and the inference module is used for carrying out inference of the neural network according to the random binary numerical values propagated layer by layer.
Based on the above embodiment, the present invention further provides a terminal, and a functional block diagram of the terminal may be as shown in fig. 7.
The terminal includes: the system comprises a processor, a memory, an interface, a display screen and a communication module which are connected through a system bus; wherein the processor of the terminal is configured to provide computing and control capabilities; the memory of the terminal comprises a storage medium and an internal memory; the storage medium stores an operating system and a computer program; the internal memory provides an environment for the operation of an operating system and a computer program in the storage medium; the interface is used for connecting external equipment, such as mobile terminals, computers and the like; the display screen is used for displaying corresponding information; the communication module is used for communicating with a cloud server or a mobile terminal.
The computer program is executed by a processor to implement the operations of a neural network training and reasoning method.
It will be understood by those skilled in the art that the block diagram of fig. 7 shows only part of the structure associated with the inventive arrangements and does not limit the terminals to which the inventive arrangements may be applied; a particular terminal may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a terminal is provided, which includes: the neural network training and reasoning method comprises a processor and a memory, wherein the memory stores a neural network training and reasoning program, and the neural network training and reasoning program is used for realizing the operation of the neural network training and reasoning method when being executed by the processor.
In one embodiment, a storage medium is provided, wherein the storage medium stores a neural network training and reasoning program, and the neural network training and reasoning program is used for implementing the operation of the neural network training and reasoning method as above when executed by a processor.
In one embodiment, there is provided an apparatus comprising: a circuit module for implementing the operations of the neural network training and reasoning method as above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware; the program may be stored in a non-volatile storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory.
In summary, the present invention provides a neural network training and reasoning method, apparatus, terminal and storage medium. The method comprises: mapping the forward-propagated network node information according to an activation function, performing Bernoulli process sampling according to the mapped value to obtain a random binarization value generated by the current-layer network, and taking the obtained random binarization value as the input of the next-layer network; performing Bernoulli process sampling on the derivative of the activation function to obtain the randomly binarized derivative of the activation function; performing symbolization processing on the back-propagated error of the next-layer network, and calculating the error information of the current-layer network according to the symbolized value and the randomly binarized derivative of the activation function; training the current-layer network according to the error information of the current-layer network and the random binarization output generated by the previous-layer network; and performing inference of the neural network according to the random binarization values propagated layer by layer. By employing a neural network that forward-propagates random binary signals and back-propagates symbolized errors, the invention reduces the computational resources required and improves recognition accuracy.
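The summarized steps can be sketched for a single fully connected layer as follows. This is an illustrative reconstruction, not the patented implementation: the function names, the logistic activation, and the factor-of-4 scaling of the logistic derivative (whose maximum is 0.25, so the scaled value is a valid sampling probability — one way of realizing the "scaled or approximated" derivative described above) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x_bin, W):
    """Forward step: weighted sum, squash to [0, 1], Bernoulli-sample a binary output."""
    z = x_bin @ W                   # sum of binary inputs times their weights
    p = logistic(z)                 # activation maps node information to a probability
    a_bin = (rng.random(p.shape) < p).astype(np.float64)  # Bernoulli sample -> {0, 1}
    return z, a_bin

def backward(z, delta_next):
    """Backward step: sign the incoming error and Bernoulli-sample the derivative,
    so the current-layer error takes values in {-1, 0, 1}."""
    dp = logistic(z) * (1.0 - logistic(z))   # logistic derivative, at most 0.25
    dp_bin = (rng.random(dp.shape) < 4.0 * dp).astype(np.float64)  # scaled to a probability
    return np.sign(delta_next) * dp_bin

# One training step with toy shapes: batch of 4, 8 inputs, 3 outputs.
x_bin = rng.integers(0, 2, size=(4, 8)).astype(np.float64)  # binary output of previous layer
W = rng.normal(0.0, 0.1, size=(8, 3))
z, a_bin = forward(x_bin, W)
delta_next = rng.normal(size=(4, 3))          # error back-propagated from the next layer
delta = backward(z, delta_next)               # ternary error information: -1, 0 or 1
grad = x_bin.T @ delta / len(x_bin)           # binary inputs x ternary errors -> cheap gradient
W -= 0.01 * grad                              # plain gradient descent update
```

Because both the forward signal and the error signal are restricted to {0, 1} and {-1, 0, 1}, the gradient accumulation reduces to signed additions rather than full floating-point multiplications.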
It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.
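The two inference modes described above (layer-by-layer stochastic propagation with repeated voting, and a deterministic forward pass) can likewise be sketched. This is a minimal sketch under assumptions: the logistic activation, the 31 voting rounds, the majority-vote rule, and the 0.5 threshold of the deterministic variant are illustrative choices, not values fixed by the invention.

```python
import numpy as np

rng = np.random.default_rng(1)

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def stochastic_forward(x, weights):
    """Propagate random binarization values layer by layer to the last layer."""
    for W in weights:
        p = logistic(x @ W)
        x = (rng.random(p.shape) < p).astype(np.float64)
    return x

def infer_by_voting(x, weights, n_rounds=31):
    """Repeat the stochastic forward pass and take a majority vote per output unit."""
    votes = sum(stochastic_forward(x, weights) for _ in range(n_rounds))
    return (votes > n_rounds / 2).astype(np.float64)

def deterministic_forward(x, weights):
    """Deterministic variant: threshold each activation at 0.5 instead of sampling."""
    for W in weights:
        x = (logistic(x @ W) >= 0.5).astype(np.float64)
    return x

# Toy two-layer network: 6 inputs -> 4 hidden -> 2 outputs, batch of 5.
x = rng.random((5, 6))
weights = [rng.normal(0.0, 0.5, (6, 4)), rng.normal(0.0, 0.5, (4, 2))]
y_vote = infer_by_voting(x, weights)
y_det = deterministic_forward(x, weights)
```

Voting over repeated stochastic passes trades extra forward computation for reduced sampling noise, while the deterministic pass gives a single-shot result with no randomness at inference time.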

Claims (14)

1. A neural network training and reasoning method, characterized by comprising the following steps:
mapping the forward-propagated network node information according to an activation function, performing Bernoulli process sampling according to the mapped value to obtain a random binarization value generated by the current-layer network, and taking the obtained random binarization value as the input of the next-layer network;
performing Bernoulli process sampling on the derivative of the activation function to obtain the randomly binarized derivative of the activation function;
performing symbolization processing on the back-propagated error of the next-layer network, and calculating the error information of the current-layer network according to the symbolized value and the randomly binarized derivative of the activation function;
training the current-layer network according to the error information of the current-layer network and the random binarization output generated by the previous-layer network;
and performing inference of the neural network according to the random binarization values propagated layer by layer.
2. The neural network training and reasoning method of claim 1, wherein before the mapping of the forward-propagated network node information according to the activation function, the method comprises:
acquiring the output information of each node in the previous-layer network connected to the current-layer network, and obtaining the inputs of all nodes connected to the current-layer network, wherein the input of a node of the current-layer network is the randomly binarized output of the previous-layer network;
and multiplying the inputs of all nodes connected to the current-layer network by the corresponding weights, and summing the resulting products to obtain the input information of the nodes of the current-layer network.
3. The neural network training and reasoning method of claim 1, wherein the mapping the forward-propagated network node information according to an activation function, performing Bernoulli process sampling according to the mapped value to obtain a random binarization value generated by the current-layer network, and taking the obtained random binarization value as the input of the next-layer network comprises:
mapping the input information of the current-layer network according to the activation function into a value between 0 and 1;
performing Bernoulli process sampling with the mapped value as the probability to obtain the output result of the corresponding random binarization network node, wherein the output result is the random binarization value generated by the current-layer network;
and taking the obtained output result of the random binarization network node as the input of the next-layer network.
4. The neural network training and reasoning method of claim 1, wherein the activation function is a squashing function, comprising: a Logistic function, an error function, a clipped rectified linear unit function, and a symmetric clipped rectified linear unit function.
5. The neural network training and reasoning method of claim 1, wherein the performing Bernoulli process sampling on the derivative of the activation function to obtain the randomly binarized derivative of the activation function comprises:
obtaining the derivative of the activation function;
performing Bernoulli process sampling with the obtained derivative as the probability to obtain the output result of the corresponding random binarization network node;
and taking the obtained output result of the random binarization network node as the derivative value used in error information calculation during back propagation.
6. The neural network training and reasoning method of claim 5, wherein the performing Bernoulli process sampling with the obtained derivative as the probability to obtain the output result of the corresponding random binarization network node comprises:
scaling or approximating the magnitude of the derivative of the activation function so that it can serve as a sampling probability.
7. The neural network training and reasoning method of claim 1, wherein the performing symbolization processing on the back-propagated error of the next-layer network and calculating the error information of the current-layer network according to the symbolized value and the randomly binarized derivative of the activation function comprises:
performing symbolization processing on the back-propagated error of the next-layer network to obtain a symbolized error;
multiplying the obtained symbolized error by the randomly binarized derivative of the activation function to obtain the error information of the current-layer network;
wherein the error information of the current-layer network takes the value -1, 0 or 1.
8. The neural network training and reasoning method of claim 1, wherein the training the current-layer network according to the error information of the current-layer network and the random binarization output generated by the previous-layer network comprises:
calculating the gradient of the overall output error function of the network with respect to the weights of the current-layer network according to the error information of the current-layer network and the random binarization output generated by the previous-layer network;
and adjusting the weights according to the calculated gradient and a gradient descent algorithm.
9. The neural network training and reasoning method of claim 1, further comprising:
deterministically obtaining, for the forward-propagated node information of each layer of the network, a binary value of 0 or 1;
and propagating the obtained binary values forward layer by layer to the last layer of the neural network to obtain the inference result of the neural network.
10. The neural network training and reasoning method of claim 1, further comprising:
repeating the randomly binarized forward propagation process multiple times, and obtaining the final inference result of the neural network according to a vote over the multiple inference results.
11. A neural network training and reasoning apparatus, characterized by comprising:
a forward information random binarization module, configured to map the forward-propagated network node information according to an activation function, perform Bernoulli process sampling according to the mapped value to obtain a random binarization value generated by the current-layer network, and take the obtained random binarization value as the input of the next-layer network;
a derivative information random binarization module, configured to perform Bernoulli process sampling on the derivative of the activation function to obtain the randomly binarized derivative of the activation function;
an error symbolization processing module, configured to perform symbolization processing on the back-propagated error of the next-layer network, and calculate the error information of the current-layer network according to the symbolized value and the randomly binarized derivative of the activation function;
a training module, configured to train the current-layer network according to the error information of the current-layer network and the random binarization output generated by the previous-layer network;
and an inference module, configured to perform inference of the neural network according to the random binarization values propagated layer by layer.
12. A terminal, comprising: a processor and a memory, wherein the memory stores a neural network training and reasoning program which, when executed by the processor, implements the operations of the neural network training and reasoning method as claimed in any one of claims 1-10.
13. A storage medium, wherein the storage medium is a computer-readable storage medium storing a neural network training and reasoning program which, when executed by a processor, implements the operations of the neural network training and reasoning method as claimed in any one of claims 1-10.
14. An apparatus, comprising: a circuit module for implementing the operations of the neural network training and reasoning method as claimed in any one of claims 1-10.
CN202211356224.7A 2022-11-01 2022-11-01 Neural network training and reasoning method, device, terminal and storage medium Pending CN115906936A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202211356224.7A CN115906936A (en) 2022-11-01 2022-11-01 Neural network training and reasoning method, device, terminal and storage medium
PCT/CN2022/133546 WO2024092896A1 (en) 2022-11-01 2022-11-22 Neural network training and reasoning method and device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211356224.7A CN115906936A (en) 2022-11-01 2022-11-01 Neural network training and reasoning method, device, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN115906936A true CN115906936A (en) 2023-04-04

Family

ID=86495757

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211356224.7A Pending CN115906936A (en) 2022-11-01 2022-11-01 Neural network training and reasoning method, device, terminal and storage medium

Country Status (2)

Country Link
CN (1) CN115906936A (en)
WO (1) WO2024092896A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116664958A (en) * 2023-07-27 2023-08-29 鹏城实验室 Image classification method based on binary neural network model and related equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102294745B1 (en) * 2019-08-20 2021-08-27 한국과학기술원 Apparatus for training deep neural network
CN110837887A (en) * 2019-11-12 2020-02-25 西安微电子技术研究所 Compression and acceleration method of deep convolutional neural network, neural network model and application thereof
CN110956263A (en) * 2019-11-14 2020-04-03 深圳华侨城文化旅游科技集团有限公司 Construction method of binarization neural network, storage medium and terminal equipment
CN111523637A (en) * 2020-01-23 2020-08-11 北京航空航天大学 Method and device for generating information retention network
CN113159273B (en) * 2021-01-30 2024-04-30 华为技术有限公司 Neural network training method and related equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116664958A (en) * 2023-07-27 2023-08-29 鹏城实验室 Image classification method based on binary neural network model and related equipment
CN116664958B (en) * 2023-07-27 2023-11-14 鹏城实验室 Image classification method based on binary neural network model and related equipment

Also Published As

Publication number Publication date
WO2024092896A1 (en) 2024-05-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination