WO2024092896A1 - Neural network training and inference method, apparatus, terminal and storage medium - Google Patents

Neural network training and inference method, apparatus, terminal and storage medium

Info

Publication number
WO2024092896A1
WO2024092896A1 · PCT/CN2022/133546
Authority
WO
WIPO (PCT)
Prior art keywords
network
layer
neural network
training
derivative
Prior art date
Application number
PCT/CN2022/133546
Other languages
English (en)
French (fr)
Inventor
王伟
李阳
姜文峰
汪令飞
耿玓
刘明
Original Assignee
鹏城实验室
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 鹏城实验室 filed Critical 鹏城实验室
Publication of WO2024092896A1 publication Critical patent/WO2024092896A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Definitions

  • the present invention relates to the field of artificial intelligence technology, and in particular to a neural network training and reasoning method, device, terminal and storage medium.
ANNs: artificial neural networks
  • Neural networks usually contain multiple layers of interconnected nonlinear network nodes, and the connection strength between nodes is called weight.
  • the information that the neural network needs to process is input from the input node, propagated layer by layer in the neural network, and finally reaches the output layer. This process is called forward propagation of information.
  • Forward propagation of information is the process of neural network processing input information, also known as reasoning process.
  • Neural networks can adjust the weights between the nodes in the network through specific algorithms and processes to make the reasoning results as accurate as possible. This process is called training or learning process.
  • Error backpropagation and gradient descent are important technical inventions for implementing neural network training.
  • Neural network training based on error backpropagation and gradient descent includes the following four steps:
  • the gradient information of the final output of the neural network relative to the connection weights in the network is calculated, and the connection weights in the network are adjusted according to the gradient descent algorithm.
  • the reasoning process of the neural network only includes the first step mentioned above, that is, the forward propagation of information.
  • neural network quantization reduces the demand for computing power in the neural network reasoning process to a certain extent, but reduces the recognition accuracy of the neural network, and in the training process of the binary network, the back propagation error is still described by high-precision numerical values, and the problem of reduced neural network recognition accuracy still exists in the process of accelerated training.
  • the technical problem to be solved by the present invention is that, in view of the defects of the prior art, the present invention provides a neural network training and reasoning method, device, terminal and storage medium to solve the technical problem of reduced recognition accuracy in the existing neural network training and reasoning methods in computing power bottleneck scenarios.
  • the present invention provides a neural network training and reasoning method, comprising:
  • the network node information of the forward propagation is mapped according to the activation function, and the Bernoulli process sampling is performed according to the mapped value to obtain the random binary value generated by the current layer of the network, and the obtained random binary value is used as the input of the next layer of the network;
  • the neural network is inferred based on the random binary values propagated layer by layer.
  • mapping process of the network node information of the forward propagation according to the activation function includes:
  • the inputs of all nodes connected to the network of the current layer are multiplied by the corresponding weights, and all the products are summed to obtain the input information of the nodes of the network of the current layer.
  • the forward propagation network node information is mapped according to the activation function, and the mapped value is sampled by the Bernoulli process to obtain the random binary value generated by the current layer network, and the obtained random binary value is used as the input of the next layer network, including:
  • the Bernoulli process sampling is performed with the mapped value as probability to obtain the output result of the corresponding random binary network node; wherein the output result is the random binary value generated by the network at this layer;
  • the output results of the randomly binarized network nodes are used as the input of the next layer of network.
the activation function is a squashing function, including one or a combination of a logistic function, an error function, a clipped rectified linear unit (Clipped ReLU) function, and a symmetric clipped ReLU function.
  • the performing of Bernoulli process sampling on the derivative of the activation function to obtain the derivative of the activation function after random binarization includes:
  • the obtained derivative is used as the probability to perform Bernoulli process sampling to obtain the output result of the corresponding random binary network node;
  • the output results of the randomly binarized network nodes are used as the error information calculation values of the back-propagation process.
  • the method of performing Bernoulli process sampling using the obtained derivative as probability to obtain the output result of the corresponding random binary network node includes:
  • the magnitude of the derivative of the activation function is scaled or approximated.
  • the symbolizing of the back-propagated error of the next layer network, and calculating the error information of the current layer network according to the symbolized value and the derivative of the random binarized activation function includes:
  • the error information of the network at this layer has a value of -1, 0 or 1.
  • the training of the network at the current layer according to the error information of the network at the current layer and the random binary output generated by the network at the previous layer includes:
  • the gradient of the overall network output error function relative to the weight change in the network in the current layer is calculated
  • the weights are adjusted according to the gradient of the weight change and the gradient descent algorithm.
  • the method further includes:
  • the acquired binary values are forward transmitted layer by layer to the last layer of the neural network to obtain the inference result of the neural network.
  • the method further includes:
  • the forward propagation process of random binarization is repeated multiple times, and the final inference result of the neural network is obtained according to the voting results of the multiple inference results.
  • the present invention provides a neural network training and reasoning device, comprising:
  • the forward information random binarization module is used to map the network node information of the forward propagation according to the activation function, and perform Bernoulli process sampling according to the mapped value to obtain the random binarized value generated by the current layer network, and use the obtained random binarized value as the input of the next layer network;
  • a derivative information random binarization module used for performing Bernoulli process sampling on the derivative of the activation function to obtain the derivative of the activation function after random binarization
  • An error symbolization processing module used for symbolizing the back-propagation error of the next layer network, and calculating the error information of the current layer network according to the symbolized value and the derivative of the random binarized activation function;
  • a training module used for training the network of the current layer according to the error information of the network of the current layer and the random binary output generated by the network of the previous layer;
  • the inference module is used to perform neural network inference based on the random binary values propagated layer by layer.
  • the present invention provides a terminal, comprising: a processor and a memory, wherein the memory stores a neural network training and reasoning program, and when the neural network training and reasoning program is executed by the processor, it is used to implement the operation of the neural network training and reasoning method as described in the first aspect.
  • the present invention further provides a storage medium, which is a computer-readable storage medium, and which stores a neural network training and reasoning program.
  • a storage medium which is a computer-readable storage medium, and which stores a neural network training and reasoning program.
  • the neural network training and reasoning program is executed by a processor, it is used to implement the operation of the neural network training and reasoning method as described in the first aspect.
  • the present invention further provides a device, comprising: a circuit module, wherein the circuit module is used to implement the operation of the neural network training and reasoning method as described in the first aspect.
  • the present invention transforms each layer of input into a binary state through mapping processing and Bernoulli process sampling, thereby greatly reducing the computing power requirement for forward information propagation; and, in the neural network training process, stores the derivative information of each network node in a specific memory unit, so that the network node derivative information that needs to be stored is transformed from a high-precision numerical value to a binary numerical value, thereby greatly reducing the storage requirement in the neural network training process; and in the error back propagation process, transforms the error information of each network node from a high-precision numerical value to a symbolic numerical state, thereby greatly reducing the computing power requirement for error back propagation; the present invention improves the recognition accuracy of the neural network while reducing the computing power requirement.
  • FIG1 is a flow chart of a neural network training and reasoning method in one implementation of the present invention.
  • FIG2 is a schematic diagram of technical element one, technical element two and technical element three used in a neural network training process in an implementation of the present invention.
FIG. 3 is a schematic diagram of squashing functions used for forward propagation of a neural network in one implementation of the present invention.
  • FIG. 4 is a schematic diagram of two equivalent methods of using technical element four in the neural network reasoning process in one implementation of the present invention.
  • FIG5 is a flow chart of technical element five used in the neural network reasoning process in one implementation of the present invention.
  • FIG6 is a schematic diagram showing a comparison of the implementation effects of different full-cycle combination technical solutions in an implementation of the present invention.
  • FIG. 7 is a functional schematic diagram of a terminal in an implementation of the present invention.
  • neural network quantization reduces the demand for computing power in the neural network reasoning process to a certain extent, but reduces the recognition accuracy of the neural network, and in the training process of the binary network, the back propagation error is still described by high-precision numerical values, and the problem of reduced neural network recognition accuracy still exists in the process of accelerated training.
  • a neural network training and reasoning method is provided in the present embodiment.
  • the input of each layer is converted into a binary state through mapping processing and Bernoulli process sampling, which greatly reduces the computing power requirement for the forward propagation of information; and, in the neural network training process, the derivative information of each network node is stored in a specific memory unit, so that the derivative information of the network node that needs to be stored is converted from a high-precision numerical value to a binary value, which greatly reduces the storage requirement in the neural network training process; and, in the error back propagation process, the error information of each network node is converted from a high-precision numerical value to a symbolic numerical state, which greatly reduces the computing power requirement for the error back propagation; the present embodiment improves the recognition accuracy of the neural network while reducing the computing power requirement.
  • an embodiment of the present invention provides a neural network training and reasoning method, comprising the following steps:
  • Step S100 mapping the network node information of the forward propagation according to the activation function, and performing Bernoulli process sampling according to the mapped values to obtain the random binary values generated by the current layer of the network, and using the obtained random binary values as the input of the next layer of the network.
  • the neural network training and reasoning method is applied to a terminal, which includes but is not limited to: a computer, a computer board, a dedicated integrated circuit and other equipment; the terminal is provided with a neural network training and reasoning framework.
  • the neural network using random binary signals for forward propagation and symbolized errors for back propagation mainly includes the following three technical elements that can be applied to neural network training:
  • Technical element 1 For the network node information in the forward propagation, it is first mapped into a value between 0 and 1 through an activation function, and then the Bernoulli process sampling is performed with this value to obtain a binary random state as the input of the next layer of the network.
  • Technical element 2 Perform Bernoulli process sampling on the derivative of the activation function to obtain a binary random state for the error back propagation process.
  • the independent application or combined application of the above three technical elements in this embodiment effectively reduces the demand for computing power during information forward propagation and error back propagation, while ensuring the high accuracy of the training results.
  • Technical Element 5 Use the random binarization method in Technical Element 1 to perform forward propagation of network information, and use multiple forward propagations to output network node voting (Voting) to obtain the final inference result.
  • the use of technical element four can greatly improve the computational efficiency of neural network inference and reduce inference latency, but it causes a decrease in inference accuracy.
  • with technical element five, the inference accuracy of the neural network gradually increases with the number of forward propagations, which allows a balanced trade-off between computing-resource consumption and inference accuracy during the inference process.
  • technical element 1 or technical element 4 alone is an existing technical solution.
  • Technical element 2, technical element 3 and technical element 5 are the technical elements used in this embodiment.
  • a technical solution of combining and applying technical element 1, technical element 2, technical element 3 and technical element 4 or technical element 5 is proposed at the same time.
  • technical element 1, technical element 2 and technical element 3 can be applied alone or in combination to replace the traditional high-precision calculation mode to form a combined technical solution of 8 neural network training; technical element 4 and technical element 5 form 3 technical solutions of neural network reasoning with the traditional high-precision reasoning process.
  • the above 8 neural network training and 3 neural network reasoning schemes are combined to form 24 neural network full-cycle technical schemes.
  • the traditional high-precision neural network training and reasoning scheme, the existing technical scheme of applying technical element one alone, and the existing technical scheme of applying technical element four alone are special cases of the 24 combined technical schemes in this embodiment.
  • three technical elements applicable to neural network training and two technical elements applicable to neural network reasoning processes are included. These technical elements are combined with each other to form a new technical solution for implementing neural network training and reasoning.
  • the training of the neural network involves technical element one, technical element two, and technical element three. These three technical elements are described in detail below.
  • step S100 the following steps are included before step S100:
  • Step S101a obtaining output information of each node in the upper layer network connected to the current layer network, and obtaining inputs of all nodes connected to the current layer network;
  • Step S101b multiplying the inputs of all nodes connected to the current layer network by corresponding weights, and summing up all the obtained products to obtain the input information of the nodes of the current layer network.
  • the input of the nodes of the current network layer is the output of the previous network layer after random binarization processing, that is, the output of the previous network layer is randomly binarized by the method described in this embodiment.
  • the output of the previous layer's nodes is used as the input of the current layer, and this input information is denoted x_i^b (104).
  • the input information of network node j at this layer can be expressed as y_j = Σ_i w_ij · x_i^b,
  • where w_ij is the weight connecting the i-th network node in the previous layer to the current network node j (105). The formula sums the products of all inputs connected to the current network node j and their weights.
  • step S100 includes the following steps:
  • Step S101 mapping the input information of the current layer network according to the activation function, mapping the input information of the current layer network into a value between 0 and 1;
  • Step S102 performing Bernoulli process sampling with the mapped value as probability to obtain the output result of the corresponding random binary network node; wherein the output result is the random binary value generated by the network at the current layer;
  • Step S103 using the obtained output results of the randomly binarized network nodes as the input of the next layer of network.
  • the activation function (107) can be applied to it to obtain the result z_j = f(y_j) (108):
  • the function f(u) should be a monotonically increasing function with an output range between 0 and 1.
  • this type of function is also called a squashing function: when the input value is small, the output should be close to or equal to 0; when the input value is large, the output should be close to or equal to 1; when the input value gradually increases from a smaller value to a larger value, the output increases monotonically from 0 to 1.
  • Typical squashing functions that meet these requirements are shown in FIG. 3 and include the logistic function, the error function, the clipped rectified linear unit (Clipped ReLU) function, and the symmetric clipped ReLU function, where a is a constant greater than 0.
  • the value of z_j is used as the probability for Bernoulli process sampling (109) to obtain the random binarized network node output x_j^b (110): x_j^b takes the value 1 with probability z_j, and the value 0 otherwise.
  • the superscript b indicates that the output information has been randomly binarized, that is, its value can only be 0 or 1. The sampling result x_j^b is used as the input of the next layer of the neural network.
  • the neural network training and reasoning method further includes the following steps:
  • Step S200 performing Bernoulli process sampling on the derivative of the activation function to obtain the derivative of the activation function after random binarization.
  • step S200 includes the following steps:
  • Step S201 obtaining the derivative of the activation function
  • Step S202 using the obtained derivative as probability to perform Bernoulli process sampling to obtain the output result of the corresponding random binary network node
  • Step S203 Using the obtained output results of the randomly binarized network nodes as the error information calculation value of the back propagation process.
  • the derivative information f'(y_j) of the activation function (111, 112) can be obtained;
  • the value of f'(y_j) is used as the probability for Bernoulli process sampling (113) to obtain the random binarized network node output f'^b(y_j) (114): f'^b(y_j) takes the value 1 with probability f'(y_j), and the value 0 otherwise.
  • an activation function whose derivative value range is between 0 and 1 should be selected to obtain a valid derivative sampling result; in practice, it is found that proportionally scaling or approximating the amplitude of the derivative of the activation function only affects the convergence speed of the neural network, and will not affect the final training result of the neural network.
  • step S202 the following steps are included before step S202:
  • Step S202a scaling or approximating the amplitude of the derivative of the activation function.
  • when the logistic function is used and the constant a is set to 1, the activation function is f(u) = 1/(1+e^(-u)) and its derivative ranges from 0 to 0.25.
  • when the constant a is set to 4, the activation function is f(u) = 1/(1+e^(-4u)) and its derivative ranges from 0 to 1; in this embodiment it does not need further processing, and this value can be used directly as the probability for Bernoulli process sampling.
  • when the constant a is set to 8, the activation function is f(u) = 1/(1+e^(-8u)) and its derivative ranges from 0 to 2; the value can be divided by 2 before the Bernoulli process sampling.
  • when the Clipped ReLU function is used, the Bernoulli sampling of the derivative can degenerate into a deterministic binarization process.
  • when the logistic function is used as the activation function for forward propagation, its derivative can also be approximated by a Clipped ReLU function, and the deterministic binarization method is used to obtain the binarized derivative information used in the error back-propagation process.
  • the neural network training and reasoning method further includes the following steps:
  • Step S300 symbolize the back-propagation error of the next layer network, and calculate the error information of the current layer network according to the symbolized value and the derivative of the random binarized activation function.
  • the back-propagation error needs to be symbolized, so that the error information of each network node can be converted from a high-precision numerical value to a symbolic numerical state, greatly reducing the computing power required for error back-propagation.
  • step S300 includes the following steps:
  • Step S301 symbolizing the back propagation error of the next layer network to obtain a symbolized error
  • Step S302 multiplying the obtained symbolic error by the derivative of the random binarized activation function to obtain the error information of the current layer network.
  • the back-propagated error is symbolized, which means that the back-propagated error takes a value of -1, 0 or 1.
  • when the error is back-propagated, in this embodiment the error δz_j (115) transmitted back from the next layer is symbolized (Sign) (116) to obtain a symbolized error δz_j^s (117): δz_j^s = 1 when δz_j ≥ 0, and δz_j^s = -1 when δz_j < 0.
  • the superscript s indicates that the error information has been symbolized and its value can only be -1 or 1.
  • the symbolized error is multiplied by the randomly binarized derivative of the activation function (118) to obtain the error information of the network node, δy_j = δz_j^s · f'^b(y_j) (119).
  • the error information of this network node is symbolic information, and the value can only be -1, 0 or 1.
  • the symbolic error will continue to propagate back along the neural network (119, 105, 120).
  • the error of node i in the previous layer of the neural network is the sum, over all nodes j of this layer connected to it, of the products of their errors and the corresponding weights: δx_i = Σ_j w_ij · δy_j.
  • the gradient of the overall network output error function relative to the change of the weight (105) in the neural network of this layer can be obtained, so that the gradient descent algorithm can be used to adjust the weight and complete a neural network training.
  • the neural network training and reasoning method further includes the following steps:
  • Step S400 training the network of the current layer according to the error information of the network of the current layer and the random binary output generated by the network of the previous layer.
  • the above three technical elements can be used independently or in combination in neural network training.
  • the above three technical elements correspond to the three indispensable processes of neural network training: forward propagation of information, calculation of activation function derivatives, and back propagation of errors.
  • step S400 includes the following steps:
  • Step S401 calculating the gradient of the overall network output error function relative to the weight change in the current layer network according to the error information of the current layer network and the random binary output generated by the previous layer network;
  • Step S402 adjusting the weight according to the gradient of the weight change and the gradient descent algorithm.
  • the technical solution includes:
  • Training technology solution 1 The situation where none of the above three technical elements are used, that is, the situation where all high-precision calculation modes are used, is the traditional neural network training method.
  • Training technology solution 2 Only technical element 2 is used, and high-precision values are used for the forward propagation information and the backward propagation error in the neural network.
  • Training technology solution three Only technical element three is used, and high-precision numerical values are used for the forward propagation information and the derivatives of the activation function in the neural network.
  • Training technology solution 4 Using technical elements 2 and 3, the information forward propagated in the neural network uses high-precision numerical values.
  • Training technical solution five only adopts technical element one.
  • the derivative of the activation function and the error of back propagation in the neural network use high-precision numerical values, which is the calculation method of the existing binary neural network.
  • Training technology solution six Using technical elements one and two, the error of back propagation in the neural network uses high-precision numerical values.
  • Training Technical Solution 7 Using Technical Elements 1 and 3, the derivatives of the activation functions in the neural network use high-precision values.
  • Training technical solution eight Using technical elements one, two and three, all node information in the neural network has no high-precision numerical values.
  • a trained neural network can be obtained, and then the trained neural network can be used for reasoning in practice; wherein, the reasoning process based on the trained neural network can be a traditional high-precision reasoning process, or it can be a technical scheme of neural network reasoning using technical elements four and five.
  • the neural network training and reasoning method further includes the following steps:
  • Step S500 performing neural network inference according to the random binary values propagated layer by layer.
  • the reasoning of the neural network can adopt the traditional high-precision computing mode.
  • the reasoning process of the neural network also needs to be binarized. Since the reasoning process of the neural network is the same as the information forward propagation process during the neural network training process, technical element one can be directly applied to realize the random binarization of the neural network reasoning. However, due to the existence of randomness and the loss of information during the binarization process, the accuracy of the neural network reasoning decreases. During the reasoning process, the following technical elements four or five can be selected to improve the reasoning accuracy:
  • the neural network reasoning method further includes the following steps:
  • Step S601 obtaining a binary value of 0 or 1 for the network node information in a deterministic manner
  • Step S602 transmitting the acquired binary value to the last layer of the neural network to obtain the inference result of the neural network.
  • the activation function can be omitted, and the binary network node output (203, 204 as shown in FIG. 4) can be directly obtained from the input information yj of the network node j of this layer:
  • Binarization result As the input of the next layer of neural network, the final binary information is transmitted to the last layer to obtain the inference result of the neural network.
  • the neural network reasoning method further includes the following steps:
  • Step S701 repeating the random binarization forward propagation process multiple times, and obtaining the final inference result of the neural network according to the voting results of the multiple inference results.
  • the random binarization forward propagation process can be repeated multiple times, and the final reasoning result is obtained by voting on multiple reasoning results (as shown in FIG5 ). Continuously increasing the number of repetitions of the random binarization forward propagation reasoning process can continuously improve the reasoning accuracy.
  • Reasoning technology solution 2 Use technical element 4, namely the deterministic binarization method, to perform neural network reasoning (existing technical solution).
  • Reasoning technology solution three Use technical element five, namely the repeated random binarization method, to perform neural network reasoning.
  • training technology solution 1 and reasoning technology solution 1 are a traditional high-precision neural network training and reasoning method.
  • the existing neural network quantization or binarization method only uses technical element 1 or technical element 4 to form a combination of training technology solution 5 and reasoning technology solution 2.
  • MLP: multi-layer perceptron
  • a fully connected multi-layer neural network with a [784-500-200-10] structure is used to learn and recognize the MNIST handwritten digit set.
  • the MNIST handwritten digit set consists of a training set consisting of 60,000 handwritten digit images and a test set consisting of 10,000 handwritten digit images.
  • the images in the training set are used for training (or learning) the neural network; the data in the test set is used to test the recognition accuracy (inference accuracy) of the neural network.
  • Each digit image in MNIST consists of 28x28 pixels, corresponding to the 784 input nodes of the first layer of the neural network; the 10 output nodes of the last layer of the neural network correspond to the 10 categories of digit images (0, 1, ..., 9).
  • Neural network training also follows the following settings:
  • the mini-batch training mode is adopted.
  • the 60,000 sample images in the training set are divided into 600 batches, and each batch contains 100 training samples.
  • the average value of the gradients obtained from the 100 samples in each batch is used for weight update.
  • the output layer uses Softmax function as the activation function and Cross Entropy as the objective function of the output error.
  • the training schemes (Schemes 5 to 8) that use forward propagation information binarization (Technical Element 1) are generally better than the training schemes (Schemes 1 to 4) that use high-precision forward propagation of information.
  • the training results of the technical schemes that use Technical Elements 2 and 3 are slightly different from those that do not use Technical Elements 2 and 3.
  • in the single-inference case, inference scheme 1 is better than inference scheme 2, and
  • inference scheme 2 is better than inference scheme 3.
  • Inference schemes 1 and 2 are deterministic inference processes, and repeated inference will not improve the inference effect.
  • Inference scheme 3 is a random inference process, and increasing the number of repetitions of the inference process will continuously improve the inference effect; after several repeated inferences, the inference effect is better than the effect of inference technology scheme 2, and approaches or exceeds the effect of inference technology scheme 1.
  • the training scheme that introduces technical elements one, two, and three generally has better reasoning effects after training than the training scheme that does not introduce technical elements one, two, and three.
  • This embodiment is only an exemplary description of the application of the technical solution proposed by the present invention, and cannot be used as a limitation of the application of the present invention in the neural network training and reasoning process.
  • the technical solution proposed in the present invention can be applied to various neural networks that use error back propagation and gradient descent algorithms as basic algorithms, such as convolutional neural networks, long short-term memory neural networks, recurrent neural networks, and reinforcement learning networks.
  • These neural networks can be used in a variety of different application scenarios, such as image recognition, speech recognition, natural language processing, human-computer chess, automatic driving, and other application scenarios.
  • technical element one makes the input of each layer become a binary state, and the product operation of the input and weight of the network layer is changed from the product of two high-precision numerical values to the multiplication of a numerical value of 0 or 1 and another high-precision numerical value, which greatly reduces the computing power required for the forward propagation of information.
  • it is necessary to store the input state of each layer of the network during the forward propagation process for the calculation of the weight gradient after the error is back-propagated to this layer.
  • the use of technical element one makes the network node state that needs to be stored change from a high-precision numerical value to a binary value, which greatly reduces the storage demand during the training process of the neural network.
  • the derivative information of each network node also needs to be stored in a specific memory unit for the error back propagation process.
  • the network node derivative information that needs to be stored is converted from a high-precision value to a binary value, which greatly reduces the storage requirements during the neural network training process.
  • this embodiment adopts technical elements 2 and 3, so that the error information of each network node is transformed from a high-precision numerical value to a symbolic numerical state, which can only take values of -1, 0 or 1.
  • the product operation of error and weight is changed from the product of two high-precision numerical values to the multiplication of a numerical value taking values of -1, 0 or 1 and another high-precision numerical value, which greatly reduces the computing power requirement for error back propagation.
  • the present invention further provides a neural network training and reasoning device, comprising:
  • the forward information random binarization module is used to map the network node information of the forward propagation according to the activation function, and perform Bernoulli process sampling according to the mapped value to obtain the random binarized value generated by the current layer network, and use the obtained random binarized value as the input of the next layer network;
  • a derivative information random binarization module used for performing Bernoulli process sampling on the derivative of the activation function to obtain the derivative of the activation function after random binarization
  • An error symbolization processing module used for symbolizing the back-propagation error of the next layer network, and calculating the error information of the current layer network according to the symbolized value and the derivative of the random binarized activation function;
  • a training module used for training the network of the current layer according to the error information of the network of the current layer and the random binary output generated by the network of the previous layer;
  • the inference module is used to perform neural network inference based on the random binary values propagated layer by layer.
  • the present invention further provides a terminal, whose principle block diagram may be shown in FIG. 7 .
  • the terminal includes: a processor, a memory, an interface, a display screen and a communication module connected via a system bus; wherein the processor of the terminal is used to provide computing and control capabilities; the memory of the terminal includes a storage medium and an internal memory; the storage medium stores an operating system and a computer program; the internal memory provides an environment for the operation of the operating system and the computer program in the storage medium; the interface is used to connect to external devices, such as mobile terminals and computers; the display screen is used to display corresponding information; and the communication module is used to communicate with a cloud server or a mobile terminal.
  • the computer program is used to implement the operation of a neural network training and reasoning method when executed by a processor.
  • FIG7 is only a block diagram of a partial structure related to the solution of the present invention, and does not constitute a limitation on the terminal to which the solution of the present invention is applied.
  • the specific terminal may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • a terminal which includes: a processor and a memory, the memory storing a neural network training and reasoning program, and the neural network training and reasoning program is used to implement the operations of the above neural network training and reasoning method when executed by the processor.
  • a storage medium stores a neural network training and reasoning program, which, when executed by a processor, is used to implement the operations of the above neural network training and reasoning method.
  • a device including: a circuit module, wherein the circuit module is used to implement the operations of the above neural network training and reasoning method.
  • any reference to a memory, a database or other medium used in the embodiments provided by the present invention can include non-volatile and/or volatile memory.
  • the present invention provides a neural network training and reasoning method, device, terminal and storage medium, the method includes: mapping the network node information of forward propagation according to the activation function, and performing Bernoulli process sampling according to the mapped value to obtain the random binary value generated by the current network, and using the obtained random binary value as the input of the next network; performing Bernoulli process sampling on the derivative of the activation function to obtain the derivative of the random binary activation function; symbolizing the error of the back propagation of the next network, and calculating the error information of the current network according to the symbolized value and the derivative of the random binary activation function; training the current network according to the error information of the current network and the random binary output generated by the previous network; and reasoning the neural network according to the random binary values propagated layer by layer.
  • the present invention uses a neural network that uses random binary signals for forward propagation and symbolized errors for back propagation, which reduces computing resources and improves recognition accuracy.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a neural network training and inference method, apparatus, terminal and storage medium. The method includes: mapping the forward-propagated network node information with an activation function, performing Bernoulli process sampling on the mapped values, and using the resulting stochastically binarized values as the input of the next layer; performing Bernoulli process sampling on the derivative of the activation function to obtain a stochastically binarized derivative of the activation function; symbolizing the error back-propagated from the next layer and computing the error information of the current layer from the symbolized value and the binarized derivative; training the current layer according to its error information and the stochastically binarized output produced by the previous layer; and performing neural network inference with the stochastically binarized values propagated layer by layer. By using stochastically binarized signals for forward propagation and symbolized errors for back propagation, the invention reduces the computing resources required and improves recognition accuracy.

Description

Neural network training and inference method, apparatus, terminal and storage medium — Technical Field
The present invention relates to the field of artificial intelligence, and in particular to a neural network training and inference method, apparatus, terminal and storage medium.
Background Art
Advances in artificial neural networks (hereinafter simply "neural networks") have been a major driving force of technological development in recent years. Neural networks are widely used to process images, audio, text and other information.
A neural network typically contains multiple layers of interconnected nonlinear network nodes; the connection strength between nodes is called a weight. The information to be processed enters at the input nodes, propagates through the network layer by layer and finally reaches the output layer; this process is called forward propagation of information. Forward propagation is how the network processes its input and is also known as the inference process. Through specific algorithms and procedures a neural network can adjust the weights connecting its nodes so that the inference results become as accurate as possible; this process is called training or learning.
Error backpropagation and the gradient descent algorithm are the key technical inventions that make neural network training possible. Training based on error backpropagation and gradient descent comprises the following four steps:
1) Sample data from the training set are fed into the network and propagated forward, yielding the state information of every node and the final output result;
2) The output result is compared with the sample labels to obtain the output error;
3) The output error is fed in at the network output and propagated backwards from the last layer of the network to the first;
4) Using the forward-propagated information and the back-propagated error, the gradient of the network's final output with respect to the connection weights in the network is computed, and the connection weights are adjusted according to the gradient descent algorithm. The inference process of a neural network consists only of the first step above, i.e. the forward propagation of information.
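For comparison with the binarized scheme introduced later, the four steps above can be sketched for a single fully connected layer as follows. This is only an illustrative NumPy sketch under assumed names and shapes (x, target, W, logistic activation); it is not code from the patent.

```python
import numpy as np

def logistic(u, a=1.0):
    """Squashing activation f(u) = 1 / (1 + exp(-a*u))."""
    return 1.0 / (1.0 + np.exp(-a * u))

def train_step_fp32(x, target, W, lr=0.1, a=1.0):
    """One conventional high-precision training step for a single dense layer."""
    # 1) forward propagation of information
    y = W @ x                      # weighted-sum input of every node
    z = logistic(y, a)             # high-precision node outputs
    # 2) compare with the label to obtain the output error
    delta_z = z - target
    # 3) propagate the error backwards through the activation (chain rule)
    delta_y = delta_z * a * z * (1.0 - z)
    delta_x = W.T @ delta_y        # error handed to the previous layer
    # 4) gradient w.r.t. the weights, then a gradient-descent update
    W = W - lr * np.outer(delta_y, x)
    return W, delta_x
```

Every quantity in this baseline (x, z, delta_y) is a high-precision value, which is exactly the cost the technical elements below remove.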
In conventional neural network training and inference, both the forward-propagated information and the back-propagated error are described by high-precision values. High-precision values, however, are expensive to store and process in a computer, which gives neural network training a large appetite for computing power and energy; computing power and energy consumption have become a bottleneck for the wider application of neural networks. In addition, when memristor arrays are used to accelerate neural networks, handling information and errors described by high-precision values requires complex peripheral circuitry, increasing the cost and power consumption of hardware-accelerated neural network computation.
Various techniques have been invented to solve or relieve the computing-power and energy bottlenecks of neural network training and inference, chiefly neural network quantization and neural network binarization. Quantization reduces the computing power needed for inference to some extent but lowers the recognition accuracy of the network, while in the training of binarized networks the back-propagated error is still described by high-precision values, so the loss of recognition accuracy persists when training is accelerated.
The prior art therefore still leaves room for improvement.
Summary of the Invention
In view of the above defects of the prior art, the technical problem to be solved by the present invention is to provide a neural network training and inference method, apparatus, terminal and storage medium that overcome the loss of recognition accuracy suffered by existing training and inference methods in computing-power-constrained scenarios.
The technical solutions adopted by the present invention to solve this problem are as follows:
In a first aspect, the present invention provides a neural network training and inference method, comprising:
mapping the forward-propagated network node information with an activation function, performing Bernoulli process sampling on the mapped values to obtain the stochastically binarized values produced by the current layer, and using the obtained binarized values as the input of the next layer;
performing Bernoulli process sampling on the derivative of the activation function to obtain a stochastically binarized derivative of the activation function;
symbolizing (reducing to its sign) the error back-propagated from the next layer, and computing the error information of the current layer from the symbolized value and the stochastically binarized derivative of the activation function;
training the current layer according to its error information and the stochastically binarized output produced by the previous layer;
performing neural network inference with the stochastically binarized values propagated layer by layer.
In one implementation, before the mapping of the forward-propagated node information with the activation function, the method comprises:
obtaining the output information of each node in the previous layer connected to the current layer, yielding the inputs of all nodes connected to the current layer, wherein these inputs are the stochastically binarized outputs of the previous layer;
multiplying the inputs of all nodes connected to the current layer by the corresponding weights and summing all the products to obtain the input information of the current-layer nodes.
In one implementation, the mapping, sampling and forwarding step comprises:
mapping the input information of the current layer with the activation function into a value between 0 and 1;
performing Bernoulli process sampling with the mapped value as the probability to obtain the output of the corresponding stochastically binarized node, this output being the binarized value produced by the current layer;
using the obtained binarized node outputs as the input of the next layer.
In one implementation, the activation function is a squashing function, comprising one or a combination of the logistic function, the error function, the clipped rectified linear unit (Clipped ReLU) function and the symmetric clipped ReLU function.
In one implementation, the Bernoulli sampling of the activation-function derivative comprises:
obtaining the derivative of the activation function;
performing Bernoulli process sampling with the obtained derivative as the probability to obtain the output of the corresponding stochastically binarized node;
using the obtained binarized node outputs as the error-computation values of the back-propagation process.
In one implementation, before the Bernoulli sampling with the obtained derivative as probability, the method comprises:
proportionally rescaling or approximating the magnitude of the derivative of the activation function.
In one implementation, the symbolization and error-computation step comprises:
symbolizing the error back-propagated from the next layer to obtain a symbolized error;
multiplying the symbolized error by the stochastically binarized derivative of the activation function to obtain the error information of the current layer;
wherein the error information of the current layer takes the value -1, 0 or 1.
In one implementation, the training step comprises:
computing, from the error information of the current layer and the stochastically binarized output of the previous layer, the gradient of the overall network output error function with respect to the weights of the current layer;
adjusting the weights according to this gradient and the gradient descent algorithm.
In one implementation, the method further comprises:
obtaining, in a deterministic manner, a binary value of 0 or 1 for the node information of every layer during forward propagation;
transmitting the obtained binary values forward layer by layer to the last layer of the neural network to obtain the inference result of the neural network.
In one implementation, the method further comprises:
repeating the stochastically binarized forward-propagation process several times, and obtaining the final inference result of the neural network by voting over the multiple inference results.
In a second aspect, the present invention provides a neural network training and inference apparatus, comprising:
a forward-information stochastic binarization module for mapping the forward-propagated node information with an activation function, performing Bernoulli process sampling on the mapped values to obtain the binarized values produced by the current layer, and using them as the input of the next layer;
a derivative-information stochastic binarization module for performing Bernoulli process sampling on the derivative of the activation function to obtain a stochastically binarized derivative of the activation function;
an error symbolization module for symbolizing the error back-propagated from the next layer and computing the error information of the current layer from the symbolized value and the binarized derivative;
a training module for training the current layer according to its error information and the stochastically binarized output of the previous layer;
an inference module for performing neural network inference with the stochastically binarized values propagated layer by layer.
In a third aspect, the present invention provides a terminal comprising a processor and a memory, the memory storing a neural network training and inference program which, when executed by the processor, implements the operations of the method of the first aspect.
In a fourth aspect, the present invention further provides a storage medium, which is a computer-readable storage medium storing a neural network training and inference program which, when executed by a processor, implements the operations of the method of the first aspect.
In a fifth aspect, the present invention further provides a device comprising a circuit module configured to implement the operations of the method of the first aspect.
The above technical solutions of the present invention have the following effects:
During the forward propagation of information, the mapping and Bernoulli sampling turn the input of every layer into a binary state, greatly reducing the computing power needed for forward propagation. During training, the derivative information of every network node is stored in a dedicated memory unit, and because it is converted from a high-precision value into a binary value, the storage requirement of training is greatly reduced. During error back-propagation, the error information of every node is converted from a high-precision value into a symbolized value, greatly reducing the computing power needed for back-propagation. The invention thus improves the recognition accuracy of the neural network while reducing its computing-power requirements.
Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for their description are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a flow chart of the neural network training and inference method in one implementation of the present invention.
FIG. 2 is a schematic diagram of technical elements one, two and three used in the training process in one implementation of the present invention.
FIG. 3 is a schematic diagram of squashing functions used for forward propagation in one implementation of the present invention.
FIG. 4 is a schematic diagram of two equivalent ways of using technical element four in the inference process in one implementation of the present invention.
FIG. 5 is a flow chart of technical element five used in the inference process in one implementation of the present invention.
FIG. 6 compares the results of different full-cycle combined technical schemes in one implementation of the present invention.
FIG. 7 is a functional block diagram of a terminal in one implementation of the present invention.
The realization of the objectives, the functional features and the advantages of the present invention are further described below with reference to the embodiments and the accompanying drawings.
Detailed Description
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
Exemplary Method
In conventional neural network training and inference, the forward-propagated information and the back-propagated error are described by high-precision values. High-precision values, however, are expensive to store and process, giving training a large appetite for computing power and energy; these have become a bottleneck for the wider application of neural networks. In addition, when memristor arrays are used to accelerate neural networks, handling information and errors described by high-precision values requires complex peripheral circuitry, increasing the cost and power consumption of hardware acceleration.
Various techniques have been invented to relieve these bottlenecks, chiefly neural network quantization and neural network binarization. Quantization reduces the computing power needed for inference to some extent but lowers recognition accuracy, while in the training of binarized networks the back-propagated error is still described by high-precision values, so the loss of recognition accuracy persists when training is accelerated.
To address these problems, this embodiment provides a neural network training and inference method. During the forward propagation of information, mapping and Bernoulli sampling turn the input of every layer into a binary state, greatly reducing the computing power needed for forward propagation; during training, the derivative information of every node is stored in a dedicated memory unit as a binary rather than a high-precision value, greatly reducing the storage requirement; and during error back-propagation, the error information of every node is converted from a high-precision value into a symbolized value, greatly reducing the computing power needed for back-propagation. This embodiment thus improves recognition accuracy while reducing computing-power requirements.
As shown in FIG. 1, an embodiment of the present invention provides a neural network training and inference method comprising the following steps:
Step S100: map the forward-propagated network node information with an activation function, perform Bernoulli process sampling on the mapped values to obtain the stochastically binarized values produced by the current layer, and use the obtained binarized values as the input of the next layer.
In this embodiment the method runs on a terminal, which includes but is not limited to a computer, a computer board, an application-specific integrated circuit and similar equipment; the terminal is provided with a neural network training and inference framework.
In this embodiment, a neural network that uses stochastically binarized signals for forward propagation and symbolized errors for back propagation mainly involves the following three technical elements applicable to training:
Technical element one: the forward-propagated node information is first mapped by the activation function into a value between 0 and 1, and Bernoulli process sampling is then performed with this value to obtain a binary random state that serves as the input of the next layer.
Technical element two: Bernoulli process sampling is performed on the derivative of the activation function to obtain a binary random state used in the error back-propagation process.
Technical element three: the back-propagated error is symbolized, yielding a state that can only take the value -1, 0 or 1.
Applying these three technical elements individually or in combination effectively reduces the computing power needed for information forward propagation and error back-propagation while maintaining highly accurate training results.
This embodiment further involves the following two technical elements applicable to the inference process:
Technical element four: the forward-propagated node information is binarized deterministically into the value 0 or 1 and used as the input of the next layer.
Technical element five: forward propagation of the network information uses the stochastic binarization of technical element one, and the final inference result is obtained by voting over the output nodes of multiple forward propagations.
In this embodiment, technical element four greatly improves the computational efficiency of inference and reduces latency, but degrades inference accuracy. With technical element five, the inference accuracy rises gradually as the number of forward propagations increases, allowing a trade-off between computing-resource consumption and inference accuracy.
Applying technical element one alone or technical element four alone are existing technical solutions. Technical elements two, three and five are the elements adopted in this embodiment, which also proposes combining technical elements one, two and three with technical element four or five. In practice, technical elements one, two and three can replace the conventional high-precision computation mode individually or in combination, giving 8 combined training schemes; technical elements four and five, together with the conventional high-precision inference process, give 3 inference schemes.
Combining the 8 training schemes with the 3 inference schemes yields 24 full-cycle neural network schemes. The conventional high-precision training-and-inference scheme, the existing scheme using technical element one alone, and the existing scheme using technical element four alone are special cases of these 24 combined schemes.
One implementation of this embodiment thus contains three technical elements applicable to training and two applicable to inference, which are combined to form new schemes for implementing neural network training and inference. Training involves technical elements one, two and three, described in detail below.
In this embodiment, when technical element one is implemented during training, the forward-propagated information is mapped and stochastically binarized to reduce the computing power of the forward-propagation process.
Specifically, in one implementation of this embodiment, step S100 is preceded by the following steps:
Step S101a: obtain the output information of each node in the previous layer connected to the current layer, yielding the inputs of all nodes connected to the current layer;
Step S101b: multiply the inputs of all nodes connected to the current layer by the corresponding weights and sum all the products to obtain the input information of the current-layer nodes.
In this embodiment, the input of the current-layer nodes is the stochastically binarized output of the previous layer, i.e. the output of the previous layer has been binarized by the method described in this embodiment.
As shown at 101 in FIG. 2, during forward propagation the outputs of the previous-layer nodes serve as the inputs of the current layer; the input information is denoted x_i^b (104). The input information of node j of the current layer can be expressed as
y_j = Σ_i w_ij · x_i^b,
where x_i^b is the output of the i-th node of the previous layer connected to node j (106) of the current layer, the superscript b indicating that the previous layer's output has been stochastically binarized by this method, and w_ij is the weight (105) connecting the i-th node of the previous layer to node j of the current layer. The formula sums the products of all inputs connected to node j and their weights.
Specifically, in one implementation of this embodiment, step S100 comprises the following steps:
Step S101: map the input information of the current layer with the activation function into a value between 0 and 1;
Step S102: perform Bernoulli process sampling with the mapped value as the probability to obtain the output of the corresponding stochastically binarized node, this output being the binarized value produced by the current layer;
Step S103: use the obtained binarized node outputs as the input of the next layer.
In this embodiment, once the input information y_j of node j is obtained, the activation function (107) can be applied to it to obtain the result z_j (108):
z_j = f(y_j)
The function f(u) should be monotonically increasing with an output range between 0 and 1.
Such functions are also called squashing functions: when the input is small the output should be close or equal to 0; when the input is large the output should be close or equal to 1; and as the input grows from a small value to a large value the output increases monotonically from 0 to 1.
Typical squashing functions meeting these requirements are shown in FIG. 3 and include:
the logistic function: f(u) = 1 / (1 + e^(-au));
the error function (an erf-based squashing function);
the clipped rectified linear unit (Clipped ReLU) function: f(u) = min(max(0, au), 1);
the symmetric clipped ReLU function (a shifted variant of the Clipped ReLU);
and so on, where a is a constant greater than 0.
Then, in this embodiment, the value of z_j is used as the probability for Bernoulli process sampling (109), yielding the stochastically binarized node output x_j^b (110):
x_j^b takes the value 1 with probability z_j, and the value 0 otherwise,
where the superscript b in x_j^b indicates that the output has been stochastically binarized, i.e. it can only take the value 0 or 1. The sampling result x_j^b is used as the input of the next layer of the neural network.
It is worth noting that when the constant a of the squashing function tends to infinity, the squashing function becomes a step function, the probability z_j takes only the value 0 or 1, and the stochastic binarization degenerates into a deterministic binarization. Deterministic binarization can therefore be regarded as a special case of stochastic binarization.
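A minimal sketch of technical element one for a single layer is given below, assuming NumPy and a logistic squashing function; the names (forward_binary, x_b, W, a) are illustrative assumptions, not identifiers from the patent.

```python
import numpy as np

rng = np.random.default_rng()

def squash(u, a=4.0):
    """Logistic squashing function f(u) = 1 / (1 + exp(-a*u)), output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a * u))

def forward_binary(x_b, W, a=4.0):
    """Stochastically binarized forward pass through one layer.

    x_b : binarized outputs of the previous layer (values 0 or 1)
    W   : weight matrix of shape (n_out, n_in)
    Returns the node inputs y, the mapped probabilities z, and the sampled
    binary outputs that feed the next layer.
    """
    y = W @ x_b                                               # y_j = sum_i w_ij * x_i^b
    z = squash(y, a)                                          # map into (0, 1)
    x_out_b = (rng.random(z.shape) < z).astype(np.float64)    # Bernoulli sample
    return y, z, x_out_b
```

Because x_b contains only zeros and ones, the multiply-accumulate in W @ x_b reduces to summing the weights whose input bit is 1, which is where the saving in computing power comes from.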
As shown in FIG. 1, in one implementation of this embodiment the neural network training and inference method further comprises the following step:
Step S200: perform Bernoulli process sampling on the derivative of the activation function to obtain a stochastically binarized derivative of the activation function.
In this embodiment, when technical element two is implemented during training, the derivative of the activation function is stochastically binarized so that the derivative information of each node, which must be stored in a dedicated memory unit, is converted from a high-precision value into a binary value, greatly reducing the storage requirement of training.
Specifically, in one implementation of this embodiment, step S200 comprises the following steps:
Step S201: obtain the derivative of the activation function;
Step S202: perform Bernoulli process sampling with the obtained derivative as the probability to obtain the output of the corresponding stochastically binarized node;
Step S203: use the obtained binarized node outputs as the error-computation values of the back-propagation process.
In this embodiment, as shown at 102 in FIG. 2, the derivative information of the activation function (111, 112) can be obtained at the same time as the forward propagation:
f'(y_j) = dz_j / dy_j.
Then, in this embodiment, the value of f'(y_j) is used as the probability for Bernoulli process sampling (113), yielding the stochastically binarized node output f'^b(y_j) (114):
f'^b(y_j) takes the value 1 with probability f'(y_j), and the value 0 otherwise,
where the superscript b indicates that the result has been stochastically binarized, i.e. it can only take the value 0 or 1. The sampling result f'^b(y_j) will be used in the error back-propagation process.
In principle, an activation function whose derivative ranges between 0 and 1 should be chosen in this embodiment so that valid derivative samples are obtained. In practice it is found that proportionally rescaling or approximating the magnitude of the derivative of the activation function only affects the convergence speed of the neural network and does not affect its final training result.
Specifically, in one implementation of this embodiment, step S202 is preceded by the following step:
Step S202a: proportionally rescale or approximate the magnitude of the derivative of the activation function.
In this embodiment, when the logistic function is used with the constant a = 1, the activation function is f(u) = 1/(1+e^(-u)) and its derivative lies between 0 and 0.25. Two options are available: (i) use this value directly as the probability for Bernoulli sampling; or (ii) multiply the derivative by 4 and use the result as the probability.
When the logistic function is used with a = 4, the activation function is f(u) = 1/(1+e^(-4u)) and its derivative lies between 0 and 1; no processing is needed and the value can be used directly as the sampling probability.
When the logistic function is used with a = 8, the activation function is f(u) = 1/(1+e^(-8u)) and its derivative lies between 0 and 2. Two options are available: (i) divide the derivative by 2 and use the result as the probability; or (ii) set derivatives greater than 1 to 1, leave the others unchanged, and then perform Bernoulli sampling.
When the Clipped ReLU function is used with a = 1, its derivative can only be 0 or 1, and the Bernoulli sampling of the derivative degenerates into a deterministic binarization.
When the Clipped ReLU function is used with a = 2, its derivative can only be 0 or 0.5. Two options are available: (i) use this value directly as the sampling probability; or (ii) multiply the derivative by 2 and then perform the Bernoulli sampling (which degenerates into a deterministic binarization).
When the Clipped ReLU function is used with a correspondingly chosen constant a, its derivative can only be 0 or 2; in this embodiment the value can be divided by 2 before the Bernoulli sampling (again degenerating into a deterministic binarization).
When the logistic function is used as the activation function for forward propagation, its derivative can also be approximated by a Clipped ReLU function, and the deterministic binarization method is used to obtain the binarized derivative information used in the error back-propagation process.
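A hedged sketch of technical element two follows: the activation derivative is proportionally rescaled into [0, 1] where necessary and then Bernoulli-sampled into a single stored bit per node. The factor 4/a below reproduces the options described for the logistic cases a = 1 (multiply by 4), a = 4 (unchanged) and a = 8 (divide by 2); the function names are assumptions.

```python
import numpy as np

rng = np.random.default_rng()

def logistic_derivative(y, a=1.0):
    """Derivative of f(u) = 1/(1+exp(-a*u)); its maximum value is a/4."""
    z = 1.0 / (1.0 + np.exp(-a * y))
    return a * z * (1.0 - z)

def binarize_derivative(y, a=1.0):
    """Bernoulli-sample the activation derivative into a 0/1 bit per node."""
    d = logistic_derivative(y, a)
    d = np.clip(d * (4.0 / a), 0.0, 1.0)    # proportional rescaling into [0, 1]
    return (rng.random(d.shape) < d).astype(np.float64)   # stored for backprop
```

The returned bits are what a hardware implementation would keep in the dedicated memory unit instead of the high-precision derivative values.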
As shown in FIG. 1, in one implementation of this embodiment the neural network training and inference method further comprises the following step:
Step S300: symbolize the error back-propagated from the next layer, and compute the error information of the current layer from the symbolized value and the stochastically binarized derivative of the activation function.
In this embodiment, when technical element three is implemented during training, the back-propagated error is symbolized so that the error information of each node is converted from a high-precision value into a symbolized value, greatly reducing the computing power needed for error back-propagation.
Specifically, in one implementation of this embodiment, step S300 comprises the following steps:
Step S301: symbolize the error back-propagated from the next layer to obtain a symbolized error;
Step S302: multiply the symbolized error by the stochastically binarized derivative of the activation function to obtain the error information of the current layer.
In this embodiment, symbolizing the back-propagated error means that the back-propagated error takes the value -1, 0 or 1. As shown at 103 in FIG. 2, during error back-propagation the error δz_j (115) returned from the next layer is symbolized (Sign) (116) to obtain the symbolized error δz_j^s (117):
δz_j^s = 1 when δz_j ≥ 0, and δz_j^s = -1 when δz_j < 0.
The superscript s indicates that the error information has been symbolized and can only take the value -1 or 1. The symbolized error is multiplied by the stochastically binarized derivative of the activation function (118) to obtain the error information of this network node, δy_j (119):
δy_j = δz_j^s · f'^b(y_j).
This node error is itself symbolic information and can only take the value -1, 0 or 1. The symbolized error then continues to propagate backwards along the neural network (119, 105, 120): the error of node i of the previous layer is the sum, over all current-layer nodes j connected to it, of the products of their errors and the corresponding weights,
δx_i = Σ_j w_ij · δy_j.
Once the error information has been propagated back to the current layer, it is combined with the binarized node states produced during forward propagation to obtain the gradient of the overall network output error function with respect to the weights (105) of this layer, so that the gradient descent algorithm can be used to adjust the weights and complete one training pass.
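Combining technical elements two and three, the backward pass of one layer can be sketched as below; delta_z is the error handed back from the next layer, d_b the binarized derivative bits from the previous sketch, and x_b the binarized inputs stored during the forward pass. The single-sample SGD update and all names are illustrative assumptions.

```python
import numpy as np

def backward_signed(delta_z, d_b, x_b, W, lr=0.1):
    """Symbolized error back-propagation and weight update for one layer."""
    delta_z_s = np.where(delta_z >= 0, 1.0, -1.0)   # Sign(): value -1 or 1
    delta_y = delta_z_s * d_b                        # node error: -1, 0 or 1
    delta_x = W.T @ delta_y                          # error for the previous layer
    grad_W = np.outer(delta_y, x_b)                  # entries are only -1, 0 or 1
    W = W - lr * grad_W                              # gradient-descent adjustment
    return W, delta_x
```

In the mini-batch setting described later, the per-sample grad_W values would be averaged over the batch before the weights are updated.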
As shown in FIG. 1, in one implementation of this embodiment the neural network training and inference method further comprises the following step:
Step S400: train the current layer according to its error information and the stochastically binarized output produced by the previous layer.
In this embodiment, the three technical elements above can be used in training independently or in combination. They correspond to the three indispensable processes of training: forward propagation of information, computation of the activation-function derivatives, and back-propagation of errors. Each process can either keep the conventional high-precision computation mode or be replaced by the corresponding technical element, giving 2 × 2 × 2 = 8 combined technical schemes.
Specifically, in one implementation of this embodiment, step S400 comprises the following steps:
Step S401: compute, from the error information of the current layer and the stochastically binarized output of the previous layer, the gradient of the overall network output error function with respect to the weights of the current layer;
Step S402: adjust the weights according to this gradient and the gradient descent algorithm.
In this embodiment, the training technical schemes are:
Training scheme one: none of the three technical elements is used, i.e. everything uses the high-precision computation mode; this is the conventional training method.
Training scheme two: only technical element two is used; the forward-propagated information and the back-propagated error use high-precision values.
Training scheme three: only technical element three is used; the forward-propagated information and the activation-function derivatives use high-precision values.
Training scheme four: technical elements two and three are used; the forward-propagated information uses high-precision values.
Training scheme five: only technical element one is used; the activation-function derivatives and the back-propagated error use high-precision values. This is the computation mode of existing binarized neural networks.
Training scheme six: technical elements one and two are used; the back-propagated error uses high-precision values.
Training scheme seven: technical elements one and three are used; the activation-function derivatives use high-precision values.
Training scheme eight: technical elements one, two and three are used; no node information in the network uses high-precision values.
Any of these eight training schemes yields a trained neural network that can then be used for inference in practice. The inference based on the trained network may be the conventional high-precision inference process, or one of the inference schemes that use technical elements four and five.
As shown in FIG. 1, in one implementation of this embodiment the neural network training and inference method further comprises the following step:
Step S500: perform neural network inference with the stochastically binarized values propagated layer by layer.
After training, inference may use the conventional high-precision computation mode. To reduce the computing power consumed during inference, the inference process can also be binarized. Since inference is identical to the forward propagation of information during training, technical element one can be applied directly to obtain stochastically binarized inference. However, because of the randomness and the information lost during binarization, the inference accuracy decreases. Technical element four or five below can therefore be chosen during inference to improve accuracy:
Specifically, in one implementation of this embodiment, the inference method further comprises the following steps:
Step S601: obtain, in a deterministic manner, a binary value of 0 or 1 for the forward-propagated node information;
Step S602: transmit the obtained binary values to the last layer of the neural network to obtain the inference result of the neural network.
In this embodiment, when technical element four is implemented during inference, inference is based on a deterministic binarized network. During forward propagation, the output of the activation function is no longer used as a sampling probability; instead a deterministic binarization is applied to obtain the binarized node output (201, 202 in FIG. 4):
x_j^b = 1 if z_j ≥ 0.5, and x_j^b = 0 otherwise.
Equivalently, in this embodiment the activation function can be omitted and the binarized node output obtained directly from the input information y_j of node j (203, 204 in FIG. 4):
x_j^b = 1 if y_j ≥ 0, and x_j^b = 0 otherwise.
The binarization result x_j^b is used as the input of the next layer of the neural network. The binarized information is finally transmitted to the last layer, yielding the inference result of the neural network.
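A minimal sketch of technical element four follows. Thresholding the raw node input y_j at 0 is the equivalent form given above for skipping the activation function; keeping the last layer high-precision is an assumption consistent with the softmax output layer used in the example later, and the names are illustrative.

```python
import numpy as np

def infer_deterministic(x_b, weights):
    """Deterministic binarized inference through a stack of dense layers.

    x_b     : binarized input vector (0/1 values)
    weights : list of weight matrices, one per layer
    """
    h = x_b
    for W in weights[:-1]:
        y = W @ h
        h = (y >= 0.0).astype(np.float64)   # x_j^b = 1 if y_j >= 0, else 0
    return weights[-1] @ h                   # scores of the output layer
```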
Specifically, in one implementation of this embodiment, the inference method further comprises the following step:
Step S701: repeat the stochastically binarized forward-propagation process several times, and obtain the final inference result of the neural network by voting over the multiple inference results.
In this embodiment, when technical element five is implemented during inference, inference is based on repeated stochastic binarization: the stochastically binarized forward propagation can be repeated several times and the final result obtained by voting over the repeated inference results (as shown in FIG. 5). Continuously increasing the number of repetitions of the stochastic forward-propagation inference steadily improves the inference accuracy.
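Technical element five can be sketched as repeated stochastic forward passes followed by a vote; voting by argmax over the per-pass predicted classes is one plausible reading of the voting step and is an assumption, as are the names and the default of 16 repetitions.

```python
import numpy as np

rng = np.random.default_rng()

def infer_by_voting(x_b, weights, n_repeats=16, a=4.0):
    """Repeated stochastic binarized inference with voting over the results."""
    votes = np.zeros(weights[-1].shape[0])
    for _ in range(n_repeats):
        h = x_b
        for W in weights[:-1]:
            z = 1.0 / (1.0 + np.exp(-a * (W @ h)))             # squashing function
            h = (rng.random(z.shape) < z).astype(np.float64)   # Bernoulli sample
        scores = weights[-1] @ h
        votes[np.argmax(scores)] += 1                          # one vote per pass
    return int(np.argmax(votes))                               # voted class

```

Increasing n_repeats trades extra computation for accuracy, matching the trade-off described above.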
Therefore, during inference the two technical elements above give three inference technical schemes to choose from.
Inference scheme one: high-precision forward propagation is used for neural network inference (conventional scheme).
Inference scheme two: technical element four, i.e. the deterministic binarization method, is used for neural network inference (existing technical scheme).
Inference scheme three: technical element five, i.e. the repeated stochastic binarization method, is used for neural network inference.
Over the full cycle of a neural network (training plus inference), the eight training schemes and three inference schemes can be combined to form 24 full-cycle neural network schemes. The combination of training scheme one with inference scheme one is the conventional high-precision training and inference method; existing quantization or binarization methods use only technical element one or technical element four, forming the combination of training scheme five with inference scheme two.
以下以多层感知器(Multi-layer perceptron,MLP)神经网络学习和识别手写数字的应用作为实例,展示上述技术要素和技术方案的实施方式和效果。
采用【784-500-200-10】结构的全连接的多层神经网络,对MNIST手写数字集进行学习和识别。MNIST手写数字集包括由60000个手写数字图像组成的训练集和10000个手写数字图像组成的测试集组成。训练集中的图像用于神经网络的训练(或学习);测试集中的数据用于测试神经网络的识别精度(推理精度)。MNIST中每个数字图像由32x32像素点组成,对应神经网络第一层的784个输入节点;神经网络的最后一层的10个输出节点对应数字图像的10个分类(0,1,…,9)。神经网络训练还遵循下列设定:
(1) A fixed learning rate η = 0.1 is used.
(2) Mini-batch training is used: the 60000 training images are divided into 600 batches of 100 samples each. The average of the gradients obtained from the 100 samples in each batch is used for the weight update.
(3) One pass over all samples in the training set constitutes one training epoch. After each epoch, inference is performed on the 10000 test images according to inference scheme one to obtain the recognition accuracy (or inference error rate).
(4) The hidden layers use the Logistic function with parameter a = 4 as the activation function.
(5) The output layer uses the Softmax function as the activation function and the cross entropy as the objective function of the output error.
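The settings above translate into the following skeleton (illustrative only; it fixes the hyper-parameters listed in (1)–(5) but omits the data loading and the training loop itself):

```python
import numpy as np

layer_sizes = [784, 500, 200, 10]   # fully connected MLP used for MNIST
eta         = 0.1                   # (1) fixed learning rate
batch_size  = 100                   # (2) 60000 training images -> 600 mini-batches per epoch
a           = 4                     # (4) steepness of the hidden-layer Logistic activation

def logistic(y):                    # hidden-layer activation, output in (0, 1)
    return 1.0 / (1.0 + np.exp(-a * y))

def softmax(y):                     # (5) output-layer activation
    e = np.exp(y - np.max(y))
    return e / e.sum()

def cross_entropy(p, one_hot_target):   # (5) objective function of the output error
    return -np.sum(one_hot_target * np.log(p + 1e-12))
```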
Comparing the results, the training schemes that binarize the forward-propagated information with technical element one (schemes five to eight) are generally better than the training schemes that forward-propagate the information in high-precision mode (schemes one to four). The schemes that use technical elements two and three show only small differences in training results compared with the schemes that do not.
This embodiment lists the inference error rates of the three inference schemes under four different training schemes. For a single inference pass, inference scheme one is better than inference scheme two, and inference scheme two is better than inference scheme three. Inference schemes one and two are deterministic processes, so repeating the inference does not improve the result. Inference scheme three is a stochastic process: increasing the number of repetitions keeps improving the result; after a few repetitions it becomes better than inference scheme two and approaches or even exceeds inference scheme one.
As shown in Fig. 6, comparing the combinations of different training and inference schemes, the training schemes that introduce technical elements one, two and three generally give better inference results, after training, than the training schemes that do not.
This embodiment is only an exemplary illustration of applying the technical solution proposed by the present invention and does not limit the application of the present invention to neural network training and inference.
The technical solution proposed by the present invention can be applied to convolutional neural networks, long short-term memory (LSTM) neural networks, recurrent neural networks, reinforcement-learning networks, and other neural networks that use error back propagation and gradient descent as their basic algorithms. These networks can serve many application scenarios, such as image recognition, speech recognition, natural language processing, game playing, and autonomous driving.
Through the above technical solution, this embodiment achieves the following technical effects:
During forward propagation of neural network information, technical element one turns the input of each layer into a binarized state, so that the product of a layer input and a weight changes from the product of two high-precision values into the product of a value taking 0 or 1 and another high-precision value, which greatly reduces the computing power required for forward propagation. In addition, during training the input state of every layer in the forward pass has to be stored for computing the weight gradients once the error has propagated back to that layer. With technical element one, the network node states that need to be stored change from high-precision values into binary values, which greatly reduces the storage required during training.
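As an illustration of the storage saving (not part of the original disclosure), the binarized node states of a layer can be bit-packed, e.g. with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.random(500)                             # activation outputs of a 500-node layer
z_binary = rng.binomial(1, p).astype(np.uint8)  # element one: one bit of state per node
packed   = np.packbits(z_binary)                # 500 bits -> 63 bytes instead of 500 floats
restored = np.unpackbits(packed)[:500]          # recovered when the error reaches this layer
assert np.array_equal(restored, z_binary)
```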
During training, the derivative information of every network node also has to be stored in dedicated memory units for the error back-propagation process. With technical element two, the node derivative information that needs to be stored changes from high-precision values into binary values, which greatly reduces the storage required during training.
During error back propagation, technical elements two and three turn the error information of every network node from a high-precision value into a symbolized state taking only -1, 0 or 1. The product of an error and a weight changes from the product of two high-precision values into the product of a value taking -1, 0 or 1 and another high-precision value, which greatly reduces the computing power required for error back propagation.
When technical elements one, two and three are used together, the computation of a weight gradient changes from the product of two high-precision values into the product of a value taking 0 or 1 and a value taking -1, 0 or 1, which reduces the computing power required for the weight update. Moreover, since the weight gradient can only take the value -1, 0 or 1, the weights change in fixed units, which facilitates weight quantization: the weights of the neural network can be represented entirely by fixed-point or integer values.
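A sketch of the integer weight representation this enables (illustrative; keeping the counts in units of η / batch size is an assumption, not something specified by the disclosure):

```python
import numpy as np

def accumulate_batch(weight_counts, binary_inputs, signed_errors):
    """weight_counts : integer weight representation, shape (inputs, outputs)
    binary_inputs    : (batch, inputs), values in {0, 1}
    signed_errors    : (batch, outputs), values in {-1, 0, 1}
    The accumulated gradient is integer-valued, so the weights move in whole units."""
    grad_counts = binary_inputs.T @ signed_errors   # integer-valued gradient summed over the batch
    return weight_counts - grad_counts              # gradient-descent step in integer units

# effective high-precision weight, when needed: w = weight_counts * (eta / batch_size)
```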
With the technical solution of this embodiment, when a memristor array is used to accelerate the neural network, technical elements one, two, three, four and five effectively reduce the complexity of the peripheral circuits of the memristor array and reduce the cost and power consumption of hardware-accelerated neural network computation.
Exemplary device
Based on the above embodiments, the present invention further provides a neural network training and inference device, including:
a forward-information stochastic-binarization module, configured to map the forward-propagated network node information with the activation function, perform Bernoulli-process sampling according to the mapped value to obtain the stochastically binarized value produced by the current-layer network, and use the obtained stochastically binarized value as the input of the next-layer network;
a derivative-information stochastic-binarization module, configured to perform Bernoulli-process sampling on the derivative of the activation function to obtain the stochastically binarized derivative of the activation function (see the sketch following this list);
an error symbolization module, configured to sign-process the error back-propagated from the next-layer network and compute the error information of the current-layer network from the sign-processed value and the stochastically binarized derivative of the activation function;
a training module, configured to train the current-layer network according to the error information of the current-layer network and the stochastically binarized output produced by the previous-layer network;
an inference module, configured to perform neural network inference according to the stochastically binarized values propagated layer by layer.
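A minimal sketch of the derivative-information stochastic-binarization module (illustrative; the Logistic activation with parameter a and the rescaling of the derivative by its peak value a/4 are assumptions consistent with the proportional scaling mentioned for element two):

```python
import numpy as np

def binarize_derivative(y, rng, a=4):
    """Bernoulli-process sampling of the activation derivative (element two).
    The Logistic derivative a*z*(1-z) peaks at a/4, so dividing by a/4 maps it into [0, 1]
    and the result can be used directly as a sampling probability."""
    z = 1.0 / (1.0 + np.exp(-a * y))
    deriv = a * z * (1.0 - z)
    p = np.clip(deriv / (a / 4.0), 0.0, 1.0)
    return rng.binomial(1, p)   # binarized derivative, values in {0, 1}
```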
Based on the above embodiments, the present invention further provides a terminal, whose functional block diagram may be as shown in Fig. 7.
The terminal includes a processor, a memory, an interface, a display screen and a communication module connected through a system bus. The processor of the terminal provides computing and control capabilities. The memory of the terminal includes a storage medium and an internal memory; the storage medium stores an operating system and a computer program, and the internal memory provides the environment in which they run. The interface connects external devices such as mobile terminals and computers; the display screen displays the corresponding information; the communication module communicates with a cloud server or a mobile terminal.
When executed by the processor, the computer program implements the operations of a neural network training and inference method.
Those skilled in the art will understand that the block diagram shown in Fig. 7 is only a block diagram of part of the structure related to the solution of the present invention and does not limit the terminal to which the solution is applied; a specific terminal may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a terminal is provided, including a processor and a memory, wherein the memory stores a neural network training and inference program, and the program, when executed by the processor, implements the operations of the neural network training and inference method described above.
In one embodiment, a storage medium is provided, wherein the storage medium stores a neural network training and inference program, and the program, when executed by a processor, implements the operations of the neural network training and inference method described above.
In one embodiment, a device is provided, including a circuit module, wherein the circuit module is configured to implement the operations of the neural network training and inference method described above.
Those of ordinary skill in the art will understand that all or part of the processes in the above method embodiments can be implemented by a computer program instructing the related hardware. The computer program may be stored in a non-volatile storage medium, and when executed it may include the processes of the above method embodiments. Any reference to memories, databases or other media used in the embodiments provided by the present invention may include non-volatile and/or volatile memory.
In summary, the present invention provides a neural network training and inference method, device, terminal and storage medium. The method includes: mapping the forward-propagated network node information with an activation function, performing Bernoulli-process sampling according to the mapped value to obtain the stochastically binarized value produced by the current-layer network, and using it as the input of the next-layer network; performing Bernoulli-process sampling on the derivative of the activation function to obtain the stochastically binarized derivative; sign-processing the error back-propagated from the next-layer network and computing the error information of the current-layer network from the sign-processed value and the stochastically binarized derivative; training the current-layer network according to its error information and the stochastically binarized output produced by the previous-layer network; and performing neural network inference according to the stochastically binarized values propagated layer by layer. By using stochastically binarized signals for forward propagation and symbolized errors for back propagation, the present invention reduces the computing resources required and improves the recognition accuracy.
It should be understood that the application of the present invention is not limited to the above examples; those of ordinary skill in the art may make improvements or modifications in light of the above description, and all such improvements and modifications shall fall within the scope of protection of the appended claims of the present invention.

Claims (14)

  1. A neural network training and inference method, comprising:
    mapping forward-propagated network node information with an activation function, performing Bernoulli-process sampling according to the mapped value to obtain a stochastically binarized value produced by a current-layer network, and using the obtained stochastically binarized value as an input of a next-layer network;
    performing Bernoulli-process sampling on a derivative of the activation function to obtain a stochastically binarized derivative of the activation function;
    sign-processing an error back-propagated from the next-layer network, and computing error information of the current-layer network according to the sign-processed value and the stochastically binarized derivative of the activation function;
    training the current-layer network according to the error information of the current-layer network and a stochastically binarized output produced by a previous-layer network; and
    performing neural network inference according to the stochastically binarized values propagated layer by layer.
  2. The neural network training and inference method according to claim 1, wherein, before the mapping of the forward-propagated network node information with the activation function, the method comprises:
    obtaining output information of each node of the previous-layer network connected to the current-layer network, to obtain the inputs of all nodes connected to the current-layer network, wherein the input of a node of the current-layer network is the stochastically binarized output of the previous-layer network; and
    multiplying the inputs of all nodes connected to the current-layer network by the corresponding weights, and summing all the resulting products to obtain input information of the node of the current-layer network.
  3. The neural network training and inference method according to claim 1, wherein the mapping of the forward-propagated network node information with the activation function, the performing of Bernoulli-process sampling according to the mapped value to obtain the stochastically binarized value produced by the current-layer network, and the using of the obtained stochastically binarized value as the input of the next-layer network comprise:
    mapping the input information of the current-layer network with the activation function, so that the input information of the current-layer network is mapped to a value between 0 and 1;
    performing Bernoulli-process sampling with the mapped value as the probability to obtain an output result of the corresponding stochastically binarized network node, wherein the output result is the stochastically binarized value produced by the current-layer network; and
    using the obtained output result of the stochastically binarized network node as the input of the next-layer network.
  4. The neural network training and inference method according to claim 1, wherein the activation function is a squashing function comprising one or a combination of a Logistic function, an error function, a clipped rectified linear unit function and a symmetrically clipped rectified linear unit function.
  5. The neural network training and inference method according to claim 1, wherein the performing of Bernoulli-process sampling on the derivative of the activation function to obtain the stochastically binarized derivative of the activation function comprises:
    obtaining the derivative of the activation function;
    performing Bernoulli-process sampling with the obtained derivative as the probability to obtain an output result of the corresponding stochastically binarized network node; and
    using the obtained output result of the stochastically binarized network node as the value used for error-information computation in the back-propagation process.
  6. The neural network training and inference method according to claim 5, wherein, before the performing of Bernoulli-process sampling with the obtained derivative as the probability to obtain the output result of the corresponding stochastically binarized network node, the method comprises:
    proportionally scaling or approximating the amplitude of the derivative of the activation function.
  7. The neural network training and inference method according to claim 1, wherein the sign-processing of the error back-propagated from the next-layer network and the computing of the error information of the current-layer network according to the sign-processed value and the stochastically binarized derivative of the activation function comprise:
    sign-processing the error back-propagated from the next-layer network to obtain a symbolized error; and
    multiplying the obtained symbolized error by the stochastically binarized derivative of the activation function to obtain the error information of the current-layer network,
    wherein the error information of the current-layer network takes the value -1, 0 or 1.
  8. The neural network training and inference method according to claim 1, wherein the training of the current-layer network according to the error information of the current-layer network and the stochastically binarized output produced by the previous-layer network comprises:
    computing a gradient of the overall network output-error function with respect to weight changes in the current-layer network according to the error information of the current-layer network and the stochastically binarized output produced by the previous-layer network; and
    adjusting the weights according to the gradient of the weight changes and a gradient-descent algorithm.
  9. The neural network training and inference method according to claim 1, further comprising:
    obtaining, in a deterministic manner, binarized values of 0 or 1 for the forward-propagated node information of each layer; and
    transmitting the obtained binarized values forward layer by layer to the last layer of the neural network to obtain the inference result of the neural network.
  10. The neural network training and inference method according to claim 1, further comprising:
    repeating the stochastically binarized forward-propagation process multiple times, and obtaining the final inference result of the neural network by voting over the multiple inference results.
  11. A neural network training and inference device, comprising:
    a forward-information stochastic-binarization module, configured to map forward-propagated network node information with an activation function, perform Bernoulli-process sampling according to the mapped value to obtain a stochastically binarized value produced by a current-layer network, and use the obtained stochastically binarized value as an input of a next-layer network;
    a derivative-information stochastic-binarization module, configured to perform Bernoulli-process sampling on a derivative of the activation function to obtain a stochastically binarized derivative of the activation function;
    an error symbolization module, configured to sign-process an error back-propagated from the next-layer network and compute error information of the current-layer network according to the sign-processed value and the stochastically binarized derivative of the activation function;
    a training module, configured to train the current-layer network according to the error information of the current-layer network and a stochastically binarized output produced by a previous-layer network; and
    an inference module, configured to perform neural network inference according to the stochastically binarized values propagated layer by layer.
  12. A terminal, comprising a processor and a memory, wherein the memory stores a neural network training and inference program, and the neural network training and inference program, when executed by the processor, implements the operations of the neural network training and inference method according to any one of claims 1 to 10.
  13. A storage medium, wherein the storage medium is a computer-readable storage medium storing a neural network training and inference program, and the neural network training and inference program, when executed by a processor, implements the operations of the neural network training and inference method according to any one of claims 1 to 10.
  14. A device, comprising a circuit module, wherein the circuit module is configured to implement the operations of the neural network training and inference method according to any one of claims 1 to 10.
PCT/CN2022/133546 2022-11-01 2022-11-22 Neural network training and inference method, device, terminal and storage medium WO2024092896A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211356224.7A CN115906936A (zh) 2022-11-01 2022-11-01 Neural network training and inference method, device, terminal and storage medium
CN202211356224.7 2022-11-01

Publications (1)

Publication Number Publication Date
WO2024092896A1 true WO2024092896A1 (zh) 2024-05-10

Family

ID=86495757

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/133546 WO2024092896A1 (zh) 2022-11-01 2022-11-22 Neural network training and inference method, device, terminal and storage medium

Country Status (2)

Country Link
CN (1) CN115906936A (zh)
WO (1) WO2024092896A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116664958B * 2023-07-27 2023-11-14 鹏城实验室 Image classification method based on a binary neural network model and related device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110837887A * 2019-11-12 2020-02-25 西安微电子技术研究所 Compression and acceleration method for a deep convolutional neural network, neural network model and application thereof
CN110956263A * 2019-11-14 2020-04-03 深圳华侨城文化旅游科技集团有限公司 Construction method of a binarized neural network, storage medium and terminal device
CN111523637A * 2020-01-23 2020-08-11 北京航空航天大学 Method and apparatus for generating an information-retention network
US20210056427A1 * 2019-08-20 2021-02-25 Korea Advanced Institute Of Science And Technology Apparatus and method for training deep neural network
CN113159273A * 2021-01-30 2021-07-23 华为技术有限公司 Neural network training method and related device

Also Published As

Publication number Publication date
CN115906936A (zh) 2023-04-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22964194

Country of ref document: EP

Kind code of ref document: A1