CN110782030A - Deep learning weight updating method, system, computer device and storage medium - Google Patents

Deep learning weight updating method, system, computer device and storage medium

Info

Publication number
CN110782030A
Authority
CN
China
Prior art keywords
neural network
weight vector
updating
training
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910872174.XA
Other languages
Chinese (zh)
Inventor
王健宗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910872174.XA priority Critical patent/CN110782030A/en
Priority to PCT/CN2019/117553 priority patent/WO2021051556A1/en
Publication of CN110782030A publication Critical patent/CN110782030A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G06N3/084 — Backpropagation, e.g. using gradient descent

Abstract

The embodiment of the invention provides a deep learning weight updating method based on parameter rewriting, which comprises the following steps: constructing a deep neural network model according to a plurality of neuron output functions; performing parameter updating on each weight vector in the deep neural network model to obtain each updated weight vector; inputting a training sample into the deep neural network model and obtaining a calculation output from the deep neural network model; and updating each weight vector according to the calculation output. With the embodiment of the invention, the weight parameters are rewritten, the limitation of batch normalization on the number of samples is avoided, the convergence speed of the neural network is improved, and the training process of the neural network is accelerated.

Description

Deep learning weight updating method, system, computer device and storage medium
Technical Field
The embodiment of the invention relates to the field of artificial neural networks, and in particular to a method, a system, a computer device and a computer-readable storage medium for updating deep learning weights.
Background
Batch normalization is a common feature normalization method used when training a neural network model: the sample data are normalized using their mean and variance, so that the data distribution is improved and the training speed of the neural network is accelerated. However, batch normalization imposes a limit on the number of training samples; when the number of samples is 1, batch normalization does not work.
The use of batch normalization requires storing the mean and variance of each mini-batch at every time step, which is inefficient, occupies memory, and to some extent slows down the convergence of the neural network.
The invention therefore aims to solve the problems of the sample-number limitation of batch normalization and the slow convergence of the neural network.
Disclosure of Invention
In view of this, an object of the embodiments of the present invention is to provide a method, a system, a computer device and a computer-readable storage medium for updating deep learning weights based on parameter rewriting, which are free from the sample-number limitation of batch normalization and can accelerate the convergence of a neural network model.
In order to achieve the above object, an embodiment of the present invention provides a method for updating a deep learning weight, where the method includes:
constructing a deep neural network model according to a plurality of neuron output functions, wherein the output function of each neuron is y = Φ(WX + b), wherein y represents the output value of the corresponding neuron, Φ represents an excitation function, X represents multidimensional input features, W represents a weight vector, and b represents a deviation scalar of the corresponding neuron;
performing parameter updating on each weight vector in the deep neural network model to obtain each updated weight vector, wherein an updating formula for parameter updating is as follows:
W_n = (g / ||v_{n-1}||) · v_{n-1}
wherein W_n represents the updated weight vector of the corresponding neuron, v represents the unit vector of W_n, g represents the scalar of W_n, g = ||W_n||, and v_{n-1} represents the unit vector of each weight vector at the (n-1)-th training of the deep neural network model;
inputting a training sample into the deep neural network model, and obtaining calculation output from the deep neural network model;
and updating each weight vector according to the calculation output.
Further, before the step of constructing the deep neural network model according to the plurality of neuron output functions, the method further includes:
and initializing each weight vector W and each deviation scalar b.
Further, the step of updating each weight vector according to the calculation output includes:
calculating a training error by using the calculated output and a preset target output according to a training error formula, wherein the training error formula is as follows:
J(W) = (1/2) Σ_{k=1}^{c} (t_k - z_k)²
wherein J(W) represents the training error, t_k represents the target output of the k-th training, z_k represents the calculation output of the k-th training, k is a positive integer, and k = 1, 2 … c;
judging whether back propagation needs to be executed or not according to the training error;
and when the reverse propagation is not required to be executed, taking each weight vector as each weight vector after the deep neural network model is updated.
Further, the step of determining whether back propagation needs to be performed according to the training error includes: comparing the training error with a preset expected value; and
and when the training error is larger than the preset expected value, executing the back propagation to update each weight vector.
Further, after the step of comparing the training error with the preset expected value, the method further includes:
and when the training error is not greater than the preset expected value, obtaining each weight vector without executing the back propagation, and taking each weight vector as each weight vector after the deep neural network model is updated.
Further, when the training error is greater than the preset expected value, the step of performing the back propagation to update each weight vector further includes:
updating each weight vector according to a weight updating formula, wherein the weight updating formula is as follows: W(n+1) = W(n) + ΔW(n), where W(n) represents the weight vector of the corresponding neuron at the n-th training of the deep neural network model, W(n+1) represents the weight vector of the corresponding neuron at the (n+1)-th training of the deep neural network model, ΔW(n) represents the change of the weight vector of the corresponding neuron in the gradient descent direction at the n-th training of the deep neural network model, and η represents the learning rate, with
ΔW(n) = -η · ∂J(W)/∂W(n)
where ∂J(W)/∂W(n) represents the partial derivative of the training error with respect to the weight vector of the corresponding neuron.
Further, when the training error is greater than the preset expected value, after the step of performing the back propagation to update each weight vector, the method further includes:
updating each weight vector W according to the change values of the vector v and the scalar g, wherein the change value of the scalar g in the gradient descent direction is:
∇_g L = (∇_W L · v) / ||v||
wherein ∇_g L represents the partial derivative of the error function with respect to the parameter g and ∇_W L represents the partial derivative of the error function with respect to the weight W, and the change value of the vector v in the gradient descent direction is:
∇_v L = (g / ||v||) · ∇_W L - (g · ∇_g L / ||v||²) · v
where ∇_v L represents the partial derivative of the error function with respect to the parameter v.
In order to achieve the above object, an embodiment of the present invention further provides a deep learning weight updating system, including:
a building module, configured to construct a deep neural network model according to a plurality of neuron output functions, wherein the output function of each neuron is y = Φ(WX + b), wherein y represents the output value of the corresponding neuron, Φ represents an excitation function, X represents multidimensional input characteristics, W represents a weight vector, and b represents a deviation scalar of the corresponding neuron;
a parameter updating module, configured to perform parameter updating on each weight vector in the deep neural network model to obtain each updated weight vector, where an updating formula for parameter updating is as follows:
W_n = (g / ||v_{n-1}||) · v_{n-1}
wherein W_n represents the updated weight vector of the corresponding neuron, v represents the unit vector of W_n, g represents the scalar of W_n, g = ||W_n||, and v_{n-1} represents the unit vector of each weight vector at the (n-1)-th training of the deep neural network model;
the training module is used for inputting training samples into the deep neural network model and obtaining calculation output from the deep neural network model;
and the updating module is used for updating each weight vector according to the calculation output.
In order to achieve the above object, an embodiment of the present invention further provides a computer device, where the computer device includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the steps of the deep learning weight updating method as described above when executing the computer program.
In order to achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, where the computer program is executable by at least one processor to cause the at least one processor to execute the steps of the deep learning weight update method described above.
The deep learning weight updating method, the deep learning weight updating system, the computer equipment and the computer readable storage medium provided by the embodiment of the invention update the weight of the deep neural network model based on parameter rewriting, are free from the problem of limitation of batch normalization on the number of samples, and accelerate the convergence speed of the neural network model.
The invention is described in detail below with reference to the drawings and specific examples, but the invention is not limited thereto.
Drawings
Fig. 1 is a flowchart illustrating steps of a deep learning weight updating method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a program module of a deep learning weight update system according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The technical solutions of the various embodiments may be combined with each other, provided that such a combination can be realized by a person skilled in the art; when technical solutions are contradictory or a combination cannot be realized, the combination should be considered not to exist and does not fall within the protection scope of the present invention.
Example one
Referring to fig. 1, a flowchart illustrating a method for updating deep learning weights according to a first embodiment of the present invention is shown. It is to be understood that the flow charts in the embodiments of the present method are not intended to limit the order in which the steps are performed. The following description is given by taking a computer device as an execution subject, specifically as follows:
and S100, constructing a deep neural network model according to a plurality of neuron output functions.
Specifically, the output function of each neuron is y = Φ(WX + b), where y represents the output value of the neuron, Φ represents an excitation function, X represents a multi-dimensional input feature, W represents a weight vector, i.e., the weights that the inputs carry at the neuron, and b represents a deviation scalar of the neuron. The neuron produces an output when its input exceeds the threshold of the excitation function.
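Purely as an illustrative sketch of the output function described above (the tanh excitation function, the input dimension and the initialization scale are assumptions made for this example, not features fixed by the embodiment), a single neuron output may be computed as follows:

```python
import numpy as np

def neuron_output(W, x, b, phi=np.tanh):
    """y = phi(W·X + b) for one neuron: W is the weight vector, x the
    multidimensional input feature, b the deviation (bias) scalar and
    phi the excitation function (tanh here is an assumption)."""
    return phi(np.dot(W, x) + b)

# Hypothetical usage with assumed dimensions
x = np.array([0.2, -0.5, 0.7])   # multidimensional input feature
W = np.random.randn(3) * 0.1     # randomly initialized weight vector
b = 0.0                          # initialized deviation scalar
y = neuron_output(W, x, b)
```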
Usually, a neural network is composed of an input layer, one or more hidden layers and an output layer, and the number of the hidden layers of the deep neural network is greater than or equal to 2.
In a preferred embodiment, before the deep neural network model is constructed according to a plurality of neuron output functions, the weight vectors W and the bias scalars b are initialized, and the initialization refers to randomly taking values for the weight vectors W and the bias scalars b within a preset value range.
Step S102, performing parameter updating on each weight vector in the deep neural network model to obtain each updated weight vector.
Specifically, the update formula for updating the parameters is as follows:
W_n = (g / ||v_{n-1}||) · v_{n-1}
wherein W_n represents the updated weight vector of the corresponding neuron, v represents the unit vector of W_n, g represents the scalar of W_n, and g = ||W_n||; v_{n-1} represents the unit vector of each weight vector at the (n-1)-th training of the deep neural network model, v_0 is the initial value of the unit vector v, and g / ||v_0|| is also the initial coefficient of W_n. In this embodiment, v_0 takes the value of v when the weight vector W is initialized.
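For illustration, the parameter rewriting above has the same form as the weight-normalization reparameterization W = (g/||v||)·v discussed in the non-patent literature cited below; a minimal sketch under that reading (NumPy, with v_0 taken from the initialized weight vector as in this embodiment) might be:

```python
import numpy as np

def rewrite_weight(v_prev, g):
    """Parameter rewriting W_n = (g / ||v_{n-1}||) * v_{n-1}."""
    return (g / np.linalg.norm(v_prev)) * v_prev

# v_0 is taken from the weight vector W at initialization, as in this embodiment
W_init = np.random.randn(3) * 0.1
v0 = W_init / np.linalg.norm(W_init)   # unit vector of the initialized W
g0 = np.linalg.norm(W_init)            # scalar g = ||W||
W1 = rewrite_weight(v0, g0)            # updated (rewritten) weight vector
```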
Step S104, inputting the training sample into the deep neural network model, and obtaining the calculation output from the deep neural network model.
Specifically, forward propagation calculation is performed using the weight vectors to obtain the calculation output. Forward propagation calculation means that the training samples are propagated forward, layer by layer, through the deep neural network model, and the calculation output is then produced by the output layer.
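A sketch of such layer-by-layer forward propagation is given below; the three-layer architecture, the tanh excitation function and the parameter scales are assumptions for the example only:

```python
import numpy as np

def forward(x, layers, phi=np.tanh):
    """Propagate a training sample forward, layer by layer, through (W, b) pairs;
    the output of the last layer is the calculation output."""
    a = x
    for W, b in layers:
        a = phi(W @ a + b)   # every neuron computes y = phi(WX + b)
    return a

# Assumed toy architecture: 3 inputs -> two hidden layers of 4 -> 1 output
rng = np.random.default_rng(0)
layers = [(rng.normal(scale=0.1, size=(4, 3)), np.zeros(4)),
          (rng.normal(scale=0.1, size=(4, 4)), np.zeros(4)),
          (rng.normal(scale=0.1, size=(1, 4)), np.zeros(1))]
z = forward(np.array([0.2, -0.5, 0.7]), layers)   # calculation output
```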
In a preferred embodiment, in the step of updating the weight vectors according to the calculated output, the calculated output and a preset target output are further input into a preset training error formula to calculate a training error, where the training error formula is:
J(W) = (1/2) Σ_{k=1}^{c} (t_k - z_k)²
wherein W represents the corresponding weight vector, J(W) represents the training error, t_k represents the target output of the k-th training, z_k represents the calculation output of the k-th training, k is a positive integer, and k = 1, 2 … c. Whether back propagation needs to be performed is then judged according to the training error, and when back propagation does not need to be performed, each weight vector is taken as the updated weight vector of the deep neural network model. Illustratively, in the 1st training, the preset target output is 0.5, the calculation output is 0.4, and the training error is J(W) = 1/2 × (0.5 - 0.4)² = 0.005.
In another preferred embodiment, the training error is compared with a preset expected value before determining whether back propagation is required according to the training error. If the training error is greater than the preset expected value, back propagation is required; if the training error is not greater than the preset expected value, training of the deep neural network is stopped and each weight vector is taken as the updated weight vector of the deep neural network model. Illustratively, in the 1st training, the training error is 0.005 and the preset expected value is 0.1; since the training error is not greater than the expected value, training of the deep neural network is stopped and each weight vector is the updated weight vector of the deep neural network.
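A minimal sketch of this error check follows; the sum-of-squares form reproduces the training error formula, and the expected value 0.1 and the target/output pair are taken from the example above:

```python
import numpy as np

def training_error(targets, outputs):
    """J(W) = 1/2 * sum_k (t_k - z_k)^2 over the c outputs."""
    t = np.asarray(targets, dtype=float)
    z = np.asarray(outputs, dtype=float)
    return 0.5 * np.sum((t - z) ** 2)

expected_value = 0.1                  # preset expected value (from the example)
J = training_error([0.5], [0.4])      # 0.005, as in the 1st-training example
need_backprop = J > expected_value    # False here: stop training, keep the weights
```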
In another preferred embodiment, if the training error is greater than the preset expected value, back propagation needs to be performed, and each weight vector is updated according to a weight updating formula, wherein the weight updating formula is: W(n+1) = W(n) + ΔW(n), where W(n) represents the weight vector of the corresponding neuron at the n-th training of the deep neural network model, W(n+1) represents the weight vector of the corresponding neuron at the (n+1)-th training of the deep neural network model, ΔW(n) represents the change of the weight vector of the corresponding neuron in the gradient descent direction at the n-th training of the deep neural network model, and η represents the learning rate, with
ΔW(n) = -η · ∂J(W)/∂W(n)
where ∂J(W)/∂W(n) represents the partial derivative of the training error with respect to the weight vector of the corresponding neuron.
It should be noted that the gradient descent direction refers to the training direction that makes the training error fall below the expected value in the shortest time. Back propagation returns the training error to every neuron of every layer, the partial derivative is solved from the training error and the weight of each neuron, and each weight vector is updated according to the solution of the partial derivative.
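The following sketch illustrates the update W(n+1) = W(n) + ΔW(n) with ΔW(n) = -η·∂J/∂W; the finite-difference gradient helper is only an illustrative stand-in for the partial derivatives that back propagation would return, and the toy neuron and learning rate are assumptions:

```python
import numpy as np

def numerical_grad(loss_fn, W, eps=1e-6):
    """Finite-difference estimate of dJ/dW, an illustrative stand-in for the
    partial derivatives that back propagation would return to each neuron."""
    grad = np.zeros_like(W)
    for i in range(W.size):
        d = np.zeros_like(W)
        d.flat[i] = eps
        grad.flat[i] = (loss_fn(W + d) - loss_fn(W - d)) / (2 * eps)
    return grad

def weight_update(W, loss_fn, lr=0.1):
    """W(n+1) = W(n) + ΔW(n), with ΔW(n) = -lr * dJ/dW(n)."""
    return W - lr * numerical_grad(loss_fn, W)

# Assumed toy case: one neuron, squared error against target t = 0.5
x, t = np.array([0.2, -0.5, 0.7]), 0.5
loss = lambda W: 0.5 * (t - np.tanh(W @ x)) ** 2
W_next = weight_update(np.random.randn(3) * 0.1, loss)
```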
In another preferred embodiment, if the training error is greater than the preset expected value, back propagation needs to be performed, and each weight vector may be further updated according to the change values of the vector v and the scalar g, where the change value of the scalar g in the gradient descent direction is:
∇_g L = (∇_W L · v) / ||v||
wherein ∇_g L represents the partial derivative of the error function with respect to the parameter g and ∇_W L represents the partial derivative of the error function with respect to the weight W, and the change value of the vector v in the gradient descent direction is:
∇_v L = (g / ||v||) · ∇_W L - (g · ∇_g L / ||v||²) · v
wherein ∇_v L represents the partial derivative of the error function with respect to the parameter v. Since the weight W has been rewritten in terms of the parameters, the change originally calculated for the weight W can be converted into changes of the parameters v and g.
Illustratively, when the back propagation calculation is performed, the change value of the scalar g and the change value of the vector v are obtained by evaluating the partial derivative of the error function with respect to the parameter g and the partial derivative of the error function with respect to the parameter v. The scalar g and the vector v are then updated with these change values, and finally each weight vector is updated according to the updated scalar g and vector v.
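A sketch of this step is given below: it computes ∇_g L and ∇_v L from ∇_W L according to the formulas above, updates g and v, and rebuilds the weight vector; the learning-rate value and the plain gradient-descent update are assumptions for the example:

```python
import numpy as np

def update_g_and_v(v, g, grad_W, lr=0.1):
    """Update the scalar g and the vector v from the gradient with respect to the
    rewritten weight W = (g / ||v||) * v, then rebuild the weight vector."""
    norm_v = np.linalg.norm(v)
    grad_g = np.dot(grad_W, v) / norm_v                            # ∇_g L
    grad_v = (g / norm_v) * grad_W - (g * grad_g / norm_v**2) * v  # ∇_v L
    g_new = g - lr * grad_g
    v_new = v - lr * grad_v
    W_new = (g_new / np.linalg.norm(v_new)) * v_new                # rebuilt weight
    return g_new, v_new, W_new
```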
In another preferred embodiment, after each weight vector has been updated according to the gradients of the vector v and the scalar g, the deep neural network model continues to be trained with the updated weight vectors to obtain a corresponding calculation output, and the corresponding training error is then recalculated from the calculation output and the target output according to the training error formula. Training of the neural network is stopped when the training error is not greater than the preset expected value or when the number of training iterations reaches the preset number.
Step S106, updating each weight vector according to the calculation output.
Specifically, calculation output is obtained from the deep neural network model, and then each weight vector is updated according to the calculation output.
Illustratively, a deep neural network is used to classify the blue points and red points in a certain image data set. When the weight vectors are assigned by a random initialization method, their values are drawn from a standard normal distribution, and training the deep neural network with these weight vectors yields the following training effect: the gradient descent takes 41.9968 s and the classification accuracy is 93%. When the present weight updating method is used to update each weight vector at every iteration of the deep neural network, the training effect obtained is: the gradient descent takes 40.8717 s, which is 1.12 s faster than before, and the classification accuracy is 96%, which is 3% higher than before.
The invention updates the weight of the deep neural network model based on parameter rewriting, is not limited by batch normalization to the number of samples, and can accelerate the convergence rate of the neural network model.
Example two
Please refer to fig. 2, which illustrates a schematic diagram of a processing module of a deep learning weight updating system according to a second embodiment of the present invention. In this embodiment, the deep learning weight update system 20 may include or be divided into one or more program modules, and the one or more program modules are stored in a storage medium and executed by one or more processors to implement the deep learning weight update method. The program module referred to in the embodiments of the present invention refers to a series of computer program instruction segments capable of performing specific functions, and is more suitable for describing the execution process of the deep learning weight updating system 20 in the storage medium than the program itself. The following description will specifically describe the functions of the program modules of the present embodiment:
a building module 200, configured to build a deep neural network model according to the plurality of neuron output functions.
Specifically, the output function of each neuron is y = Φ(WX + b), where y represents the output value of the neuron, Φ is an excitation function, X represents a multidimensional input feature, W is a weight vector, i.e., the weights that the inputs carry at the neuron, and b represents a deviation scalar of the neuron. The neuron produces an output when its input exceeds the threshold of the excitation function.
Usually, a neural network is composed of an input layer, one or more hidden layers and an output layer, and the number of the hidden layers of the deep neural network is greater than or equal to 2.
In a preferred embodiment, before the deep neural network model is constructed according to a plurality of neuron output functions, the weight vectors W and the bias scalars b are initialized, and the initialization refers to randomly taking values for the weight vectors W and the bias scalars b within a preset value range.
A parameter updating module 202, configured to perform parameter updating on each weight vector in the deep neural network model to obtain each updated weight vector.
Specifically, the update formula for updating the parameters is as follows:
W_n = (g / ||v_{n-1}||) · v_{n-1}
wherein W_n represents the updated weight vector of the corresponding neuron, v represents the unit vector of W_n, g represents the scalar of W_n, and g = ||W_n||; v_{n-1} represents the unit vector of each weight vector at the (n-1)-th training of the deep neural network model, v_0 is the initial value of the unit vector v, and g / ||v_0|| is also the initial coefficient of W_n. In this embodiment, v_0 takes the value of v when the weight vector W is initialized.
And the training module 204 is configured to input a training sample into the deep neural network model, and obtain a calculation output from the deep neural network model.
Specifically, the training module 204 performs forward propagation calculation using each weight vector to obtain the calculation output. Forward propagation calculation means that the training samples are propagated forward, layer by layer, through the deep neural network model, and the calculation output is then produced by the output layer.
In a preferred embodiment, in the step of updating the weight vectors according to the calculated output, the training module 204 further inputs the calculated output and a preset target output into a preset training error formula to calculate a training error, where the training error formula is:
J(W) = (1/2) Σ_{k=1}^{c} (t_k - z_k)²
wherein W represents the corresponding weight vector, J(W) represents the training error, t_k represents the target output of the k-th training, z_k represents the calculation output of the k-th training, k is a positive integer, and k = 1, 2 … c. Whether back propagation needs to be performed is then judged according to the training error, and when back propagation does not need to be performed, each weight vector is taken as the updated weight vector of the deep neural network model. Illustratively, in the 1st training, the preset target output is 0.5, the calculation output is 0.4, and the training error is J(W) = 1/2 × (0.5 - 0.4)² = 0.005.
In another preferred embodiment, the training module 204 further compares the training error with a preset expected value before determining whether back propagation is required according to the training error. If the training error is greater than the preset expected value, back propagation is required; if the training error is not greater than the preset expected value, training of the deep neural network is stopped and each weight vector W is taken as the updated weight vector of the deep neural network model. Illustratively, in the 1st training, the training error is 0.005 and the preset expected value is 0.1; since the training error is not greater than the expected value, training of the deep neural network is stopped and each weight vector is the updated weight vector of the deep neural network.
In another preferred embodiment, if the training error is greater than the preset expected value, back propagation needs to be performed, and the training module 204 updates each weight vector according to a weight update formula, wherein the weight update formula is: W(n+1) = W(n) + ΔW(n), where W(n) represents the weight vector of the corresponding neuron at the n-th training of the deep neural network model, W(n+1) represents the weight vector of the corresponding neuron at the (n+1)-th training of the deep neural network model, ΔW(n) represents the change of the weight vector of the corresponding neuron in the gradient descent direction at the n-th training of the deep neural network model, and η represents the learning rate, with
ΔW(n) = -η · ∂J(W)/∂W(n)
where ∂J(W)/∂W(n) represents the partial derivative of the training error with respect to the weight vector of the corresponding neuron.
It should be noted that the gradient descent direction refers to the training direction that makes the training error fall below the expected value in the shortest time. Back propagation returns the training error to every neuron of every layer, the partial derivative is solved from the training error and the weight of each neuron, and each weight vector is updated according to the solution of the partial derivative.
In another preferred embodiment, if the training error is greater than the preset expected value and back propagation needs to be performed, the training module 204 may further update each weight vector according to the change values of the vector v and the scalar g, where the change value of the scalar g in the gradient descent direction is:
∇_g L = (∇_W L · v) / ||v||
wherein ∇_g L represents the partial derivative of the error function with respect to the parameter g and ∇_W L represents the partial derivative of the error function with respect to the weight W, and the change value of the vector v in the gradient descent direction is:
∇_v L = (g / ||v||) · ∇_W L - (g · ∇_g L / ||v||²) · v
wherein ∇_v L represents the partial derivative of the error function with respect to the parameter v. Since parameter rewriting has been performed on the weight W, the change of the original weight W can be converted into changes of the parameters v and g.
Illustratively, when performing the back propagation calculation, the training module 204 obtains the change value of the scalar g and the change value of the vector v by evaluating the partial derivative of the error function with respect to the parameter g and the partial derivative of the error function with respect to the parameter v. The scalar g and the vector v are then updated with these change values, and finally each weight vector is updated according to the updated scalar g and vector v.
In another preferred embodiment, after each weight vector has been updated according to the gradients of the vector v and the scalar g, the training module 204 continues training the deep neural network model with the updated weight vectors to obtain a corresponding calculation output, and the corresponding training error is then recalculated from the calculation output and the target output according to the training error formula. Training of the neural network is stopped when the training error is not greater than the preset expected value or when the number of training iterations reaches the preset number.
An updating module 206, configured to update each weight vector according to the calculation output.
Specifically, the updating module 206 obtains the calculation output from the deep neural network model, and then updates each weight vector according to the calculation output.
Illustratively, a deep neural network is used to classify the blue points and red points in a certain image data set. When the weight vectors are assigned by a random initialization method, their values are drawn from a standard normal distribution, and training the deep neural network with these weight vectors yields the following training effect: the gradient descent takes 41.9968 s and the classification accuracy is 93%. When the present weight updating method is used to update each weight vector at every iteration of the deep neural network, the training effect obtained is: the gradient descent takes 40.8717 s, which is 1.12 s faster than before, and the classification accuracy is 96%, which is 3% higher than before.
The invention updates the weight of the deep neural network model based on parameter rewriting, is not limited by batch normalization to the number of samples, and can accelerate the convergence rate of the neural network model.
EXAMPLE III
Fig. 3 is a schematic diagram of a hardware architecture of a computer device according to a third embodiment of the present invention. In the present embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing in accordance with preset or stored instructions. The computer device 2 may be a rack server, a blade server, a tower server or a cabinet server (including an independent server or a server cluster composed of a plurality of servers), and the like. As shown in fig. 3, the computer device 2 includes, but is not limited to, at least a memory 21, a processor 22, a network interface 23, and a deep learning weight updating system 20, which are communicatively connected to each other through a system bus. Wherein:
in this embodiment, the memory 21 includes at least one type of computer-readable storage medium including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the storage 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like provided on the computer device 2. Of course, the memory 21 may also comprise both internal and external memory units of the computer device 2. In this embodiment, the memory 21 is generally used to store an operating system and various application software installed on the computer device 2, for example, the program code of the deep learning weight updating system 20 in the second embodiment. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 22 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is configured to run a program code stored in the memory 21 or process data, for example, run the deep learning weight updating system 20, so as to implement the deep learning weight updating method of the first embodiment.
The network interface 23 may comprise a wireless network interface or a wired network interface, and the network interface 23 is generally used for establishing a communication connection between the computer device 2 and other electronic apparatuses. For example, the network interface 23 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be a wireless or wired network such as an Intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, and the like.
It is noted that fig. 3 only shows the computer device 2 with components 20-23, but it is to be understood that not all shown components are required to be implemented, and that more or less components may be implemented instead.
In this embodiment, the deep learning weight update system 20 stored in the memory 21 can be further divided into one or more program modules, and the one or more program modules are stored in the memory 21 and executed by one or more processors (in this embodiment, the processor 22) to complete the present invention.
For example, fig. 2 shows a schematic diagram of the program modules for implementing the deep learning weight updating system 20. In this embodiment, the deep learning weight updating system 20 may be divided into a building module 200, a parameter updating module 202, a training module 204, and an updating module 206. The program module referred to in the present invention refers to a series of computer program instruction segments capable of performing specific functions, and is more suitable than the program itself for describing the execution process of the deep learning weight updating system 20 in the computer device 2. The specific functions of the program modules 200 to 206 have been described in detail in the second embodiment and are not repeated here.
Example four
The present embodiment also provides a computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application mall, etc., on which a computer program is stored, which when executed by a processor implements corresponding functions. The computer-readable storage medium of this embodiment is used for storing a deep learning weight updating system 20, and when being executed by a processor, the deep learning weight updating system implements the deep learning weight updating method of the first embodiment.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method for updating deep learning weights, characterized in that the method comprises the following steps:
constructing a deep neural network model according to a plurality of neuron output functions, wherein the output function of each neuron is y = Φ(WX + b), wherein y represents the output value of the corresponding neuron, Φ represents an excitation function, X represents multidimensional input features, W represents a weight vector, and b represents a deviation scalar of the corresponding neuron;
performing parameter updating on each weight vector in the deep neural network model to obtain each updated weight vector, wherein an updating formula for parameter updating is as follows:
W_n = (g / ||v_{n-1}||) · v_{n-1}
wherein W_n represents the updated weight vector of the corresponding neuron, v represents the unit vector of W_n, g represents the scalar of W_n, g = ||W_n||, and v_{n-1} represents the unit vector of each weight vector at the (n-1)-th training of the deep neural network model;
inputting a training sample into the deep neural network model, and obtaining calculation output from the deep neural network model;
and updating each weight vector according to the calculation output.
2. The method for updating deep learning weights according to claim 1, wherein before the step of constructing the deep neural network model according to the plurality of neuron output functions, the method further comprises:
and initializing each weight vector and each deviation scalar.
3. The method for updating deep learning weights according to claim 1, wherein the step of updating the weight vectors according to the computation output comprises:
calculating a training error by using the calculated output and a preset target output according to a training error formula, wherein the training error formula is as follows:
J(W) = (1/2) Σ_{k=1}^{c} (t_k - z_k)²
wherein J(W) represents the training error, t_k represents the target output of the k-th training of the deep neural network model, z_k represents the calculation output of the k-th training of the deep neural network model, k is a positive integer, and k = 1, 2 … c;
judging whether back propagation needs to be executed or not according to the training error;
and when the reverse propagation is not required to be executed, taking each weight vector as each weight vector after the deep neural network model is updated.
4. The method of claim 3, wherein the step of determining whether back propagation needs to be performed according to the training error comprises:
comparing the training error with a preset expected value; and
and when the training error is larger than the preset expected value, executing the back propagation to update each weight vector.
5. The method of claim 4, wherein the step of comparing the training error with a preset expected value is followed by:
and when the training error is not greater than the preset expected value, obtaining each weight vector without executing the back propagation, and taking each weight vector as each weight vector after the deep neural network model is updated.
6. The method of claim 4, wherein the step of performing the back propagation to update the weight vectors when the training error is greater than the predetermined expected value comprises:
updating each weight vector according to a weight updating formula, wherein the weight updating formula is: W(n+1) = W(n) + ΔW(n), where W(n) represents the weight vector of the corresponding neuron at the n-th training of the deep neural network model, W(n+1) represents the weight vector of the corresponding neuron at the (n+1)-th training of the deep neural network model, ΔW(n) represents the change of the weight vector of the corresponding neuron in the gradient descent direction at the n-th training of the deep neural network model, and η represents the learning rate, with
ΔW(n) = -η · ∂J(W)/∂W(n)
where ∂J(W)/∂W(n) represents the partial derivative of the training error with respect to the weight vector of the corresponding neuron.
7. The method of claim 4, wherein the step of performing the back propagation to update the weight vectors when the training error is greater than the preset expected value further comprises:
updating each weight vector according to the vector v and the change value of the scalar g, wherein the change value of the scalar g in the gradient descending direction is as follows:
∇_g L = (∇_W L · v) / ||v||
wherein ∇_g L represents the partial derivative of the error function with respect to the parameter g and ∇_W L represents the partial derivative of the error function with respect to the weight W, and the change value of the vector v in the gradient descent direction is:
∇_v L = (g / ||v||) · ∇_W L - (g · ∇_g L / ||v||²) · v
where ∇_v L represents the partial derivative of the error function with respect to the parameter v.
8. A deep learning weight updating system is characterized by comprising:
a building module, configured to construct a deep neural network model according to a plurality of neuron output functions, wherein the output function of each neuron is y = Φ(WX + b), wherein y represents the output value of the corresponding neuron, Φ represents an excitation function, X represents multidimensional input characteristics, W represents a weight vector, and b represents a deviation scalar of the corresponding neuron;
a parameter updating module, configured to perform parameter updating on each weight vector in the deep neural network model to obtain each updated weight vector, where an updating formula for parameter updating is as follows:
W_n = (g / ||v_{n-1}||) · v_{n-1}
wherein W_n represents the updated weight vector of the corresponding neuron, v represents the unit vector of W_n, g represents the scalar of W_n, g = ||W_n||, and v_{n-1} represents the unit vector of each weight vector at the (n-1)-th training of the deep neural network model;
the training module is used for inputting training samples into the deep neural network model and obtaining calculation output from the deep neural network model;
and the updating module is used for updating each weight vector according to the calculation output.
9. A computer device, characterized in that the computer device comprises a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the deep learning weight updating method according to any one of claims 1 to 7.
10. A computer-readable storage medium, having stored therein a computer program, the computer program being executable by at least one processor to cause the at least one processor to perform the steps of the deep learning weight update method according to any one of claims 1-7.
CN201910872174.XA 2019-09-16 2019-09-16 Deep learning weight updating method, system, computer device and storage medium Pending CN110782030A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910872174.XA CN110782030A (en) 2019-09-16 2019-09-16 Deep learning weight updating method, system, computer device and storage medium
PCT/CN2019/117553 WO2021051556A1 (en) 2019-09-16 2019-11-12 Deep learning weight updating method and system, and computer device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910872174.XA CN110782030A (en) 2019-09-16 2019-09-16 Deep learning weight updating method, system, computer device and storage medium

Publications (1)

Publication Number Publication Date
CN110782030A true CN110782030A (en) 2020-02-11

Family

ID=69383461

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910872174.XA Pending CN110782030A (en) 2019-09-16 2019-09-16 Deep learning weight updating method, system, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN110782030A (en)
WO (1) WO2021051556A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340205A (en) * 2020-02-18 2020-06-26 中国科学院微小卫星创新研究院 Anti-irradiation system and method of neural network chip for space application
CN111860828A (en) * 2020-06-15 2020-10-30 北京仿真中心 Neural network training method, storage medium and equipment
CN113505832A (en) * 2021-07-09 2021-10-15 合肥云诊信息科技有限公司 BGRN normalization method for batch grouping response of neural network
CN113642592A (en) * 2020-04-27 2021-11-12 武汉Tcl集团工业研究院有限公司 Training method of training model, scene recognition method and computer equipment
CN114979033A (en) * 2022-06-13 2022-08-30 华北理工大学 Intranet neural computing system based on programmable data plane

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107392310A (en) * 2016-05-16 2017-11-24 北京陌上花科技有限公司 neural network model training method and device
WO2019056470A1 (en) * 2017-09-19 2019-03-28 平安科技(深圳)有限公司 Driving model training method, driver recognition method and apparatus, device, and medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106997484A (en) * 2016-01-26 2017-08-01 阿里巴巴集团控股有限公司 A kind of method and device for optimizing user credit model modeling process
CN109472345A (en) * 2018-09-28 2019-03-15 深圳百诺名医汇网络技术有限公司 A kind of weight update method, device, computer equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107392310A (en) * 2016-05-16 2017-11-24 北京陌上花科技有限公司 neural network model training method and device
WO2019056470A1 (en) * 2017-09-19 2019-03-28 平安科技(深圳)有限公司 Driving model training method, driver recognition method and apparatus, device, and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TIM SALIMANS et al.: "Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks", arXiv, page 2 *
ZHAO ZHANGMING et al.: "Ant colony neural network training algorithm with heuristic information", Computer Science, vol. 44, no. 11, page 285 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340205A (en) * 2020-02-18 2020-06-26 中国科学院微小卫星创新研究院 Anti-irradiation system and method of neural network chip for space application
CN111340205B (en) * 2020-02-18 2023-05-12 中国科学院微小卫星创新研究院 Neural network chip anti-irradiation system and method for space application
CN113642592A (en) * 2020-04-27 2021-11-12 武汉Tcl集团工业研究院有限公司 Training method of training model, scene recognition method and computer equipment
CN111860828A (en) * 2020-06-15 2020-10-30 北京仿真中心 Neural network training method, storage medium and equipment
CN111860828B (en) * 2020-06-15 2023-11-28 北京仿真中心 Neural network training method, storage medium and equipment
CN113505832A (en) * 2021-07-09 2021-10-15 合肥云诊信息科技有限公司 BGRN normalization method for batch grouping response of neural network
CN113505832B (en) * 2021-07-09 2023-10-10 合肥云诊信息科技有限公司 BGRN normalization method for neural network batch grouping response of image classification task
CN114979033A (en) * 2022-06-13 2022-08-30 华北理工大学 Intranet neural computing system based on programmable data plane

Also Published As

Publication number Publication date
WO2021051556A1 (en) 2021-03-25

Similar Documents

Publication Publication Date Title
CN110782030A (en) Deep learning weight updating method, system, computer device and storage medium
CN111445007B (en) Training method and system for countermeasure generation neural network
CN108733508B (en) Method and system for controlling data backup
WO2022095432A1 (en) Neural network model training method and apparatus, computer device, and storage medium
CN112101530A (en) Neural network training method, device, equipment and storage medium
WO2020224106A1 (en) Text classification method and system based on neural network, and computer device
CN110705718A (en) Model interpretation method and device based on cooperative game and electronic equipment
CN112232426A (en) Training method, device and equipment of target detection model and readable storage medium
EP4343616A1 (en) Image classification method, model training method, device, storage medium, and computer program
CN111126555A (en) Neural network model training method, device, equipment and storage medium
CN110414620B (en) Semantic segmentation model training method, computer equipment and storage medium
CN113743650B (en) Power load prediction method, device, equipment and storage medium
US20210037084A1 (en) Management device, management method, and management program
CN113011532A (en) Classification model training method and device, computing equipment and storage medium
US20220405561A1 (en) Electronic device and controlling method of electronic device
TWI767122B (en) Model constructing method, system, and non-transitory computer readable storage medium
CN110782017B (en) Method and device for adaptively adjusting learning rate
Barbot et al. Importance sampling for model checking of continuous time markov chains
CN113591398B (en) Intelligent operation batch method and device based on deep reinforcement learning and electronic equipment
WO2018198298A1 (en) Parameter estimation device, parameter estimation method, and computer-readable recording medium
Matsubara et al. Dynamic linear bellman combination of optimal policies for solving new tasks
Al-Dabbagh et al. An integration of compact Genetic algorithm and local search method for optimizing ARMA (1, 1) model of likelihood estimator
WO2019194285A1 (en) Calculation device, calculation method, and calculation program
WO2020170358A1 (en) Tensor decomposition processing system, method and program
CN117475968A (en) Gamma voltage setting method and device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination