CN111937011A - Method and device for determining weight parameters of a neural network model


Info

Publication number: CN111937011A
Application number: CN201880092139.XA
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: neural network, weight parameter, network model, error value, value
Inventors: 杨帆, 钟刚
Current and original assignee: Huawei Technologies Co., Ltd.
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods


Abstract

A method for determining weight parameters of a neural network model comprises the following steps: processing sample data based on undetermined weight parameters of the neural network model to obtain an output result; calculating an original error value between the output result and a preset expected result, wherein the original error value is a numerical representation of the difference between the output result and the expected result; correcting the original error value based on a correction value to obtain a corrected error value; and determining a model weight parameter of the neural network model based on the corrected error value and the undetermined weight parameter; wherein the correction value is obtained according to the formula R = (w_k - Q(w_k)) × Q(w_k), where R represents the correction value, w_k represents the k-th undetermined weight parameter of the neural network model, Q(w_k) represents the quantized value of the k-th undetermined weight parameter, and k is a non-negative integer.

Description

Method and device for determining weight parameters of a neural network model
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for determining weight parameters of a neural network model.
Background
In recent years, neural network models have attracted much attention because they deliver excellent performance in applications such as computer vision and speech processing. The cost that accompanies this success is a large number of parameters and a large amount of computation. Quantization of the relevant model parameters of a neural network model can reduce the precision redundancy of those parameters and achieve model compression while limiting the adverse effect on model accuracy.
Model compression reduces the memory bandwidth occupied and the energy consumed by data access, and low-precision arithmetic generally consumes less energy per operation. For computing units that support calculations at multiple precisions, more low-precision calculations than high-precision calculations can be completed per unit of time.
Disclosure of Invention
The embodiments of the present application provide a method and a device for determining weight parameters of a neural network model. In the various data processing scenarios in which neural network models are applied, such as image recognition, speech recognition and image super-resolution processing, the method introduces an appropriate correction value into the error between the output result of model training and the expected result, thereby reducing the quantization error and avoiding the overfitting problem caused by a few weight parameters with large values dominating the inference result of the neural network.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:
In a first aspect, an embodiment of the present application provides a method for determining weight parameters of a neural network model, including: processing sample data based on undetermined weight parameters of the neural network model to obtain an output result; calculating an original error value between the output result and a preset expected result, wherein the original error value is a numerical representation of the difference between the output result and the expected result; correcting the original error value based on a correction value to obtain a corrected error value; and determining a model weight parameter of the neural network model based on the corrected error value and the undetermined weight parameter; wherein the correction value is obtained according to the following formula:

R = (w_k - Q(w_k)) × Q(w_k)

where R represents the correction value, w_k represents the k-th undetermined weight parameter of the neural network model, Q(w_k) represents the quantized value of the k-th undetermined weight parameter, and k is a non-negative integer.
According to the embodiment of the application, a proper correction value is introduced into the error between the output result and the expected result of model training, so that the quantization error is reduced, and the problem of overfitting caused by the fact that part of weight parameters with large numerical values dominate the reasoning result of a neural network is solved.
In a first possible implementation manner of the first aspect, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_{k=1}^{m} F((w_k - Q(w_k)) × Q(w_k))

where E1 represents the corrected error value, E0 represents the original error value, α is a constant, m is the total number of undetermined weight parameters used for processing the sample data and is a positive integer, and F((w_k - Q(w_k)) × Q(w_k)) represents a function that takes the correction value as its argument.
In a second possible embodiment of the first aspect, the function with the correction value as an argument is calculating an absolute value of the correction value; correspondingly, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_{k=1}^{m} |(w_k - Q(w_k)) × Q(w_k)|

where |(w_k - Q(w_k)) × Q(w_k)| denotes the absolute value of (w_k - Q(w_k)) × Q(w_k).
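For illustration only, the corrected error value above can be sketched in Python/NumPy as follows; the quantization function quantize, the weight array w and the value of alpha are hypothetical placeholders rather than part of the claimed method.

```python
import numpy as np

def corrected_error(e0, w, quantize, alpha=1e-4):
    """Sketch of E1 = E0 + alpha * sum_k |(w_k - Q(w_k)) * Q(w_k)| (assumptions noted above)."""
    qw = quantize(w)                    # Q(w_k) for every undetermined weight
    penalty = np.abs((w - qw) * qw)     # |(w_k - Q(w_k)) * Q(w_k)| per weight
    return e0 + alpha * np.sum(penalty)
```

Note that the penalty vanishes for any weight that already equals its quantized value, which is what pulls the undetermined weights toward the quantization grid during training.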
In a third possible implementation manner of the first aspect, the neural network model includes p network layers, each of the network layers includes q pending weight parameters, and the kth pending weight parameter is a jth pending weight parameter of an ith network layer in the neural network model; correspondingly, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_{i=1}^{p} Σ_{j=1}^{q} |(w_{i,j} - Q(w_{i,j})) × Q(w_{i,j})|
wherein p and q are positive integers and i and j are non-negative integers.
The above possible implementations of the embodiment of the present application give, by way of example, different methods and forms for calculating the correction value used to correct the error value of the training result, thereby reducing the quantization error and avoiding the overfitting problem caused by a few weight parameters with large values dominating the inference result of the neural network.
It should be understood that in some neural network models, some network layers do not have the pending weight parameter, and it is apparent that network layers without the pending weight parameter cannot participate in the calculation of the corrected error value in the above formula.
In a fourth possible implementation manner of the first aspect, the processing sample data based on the undetermined weight parameter of the neural network model includes: obtaining the undetermined weight parameter; quantizing the obtained undetermined weight parameter to obtain a quantized weight parameter, wherein the quantized weight parameter is a quantized value of the undetermined weight parameter; taking the quantization weight parameter as a model weight parameter of the neural network model, and processing the sample data by adopting a forward propagation algorithm; obtaining the output result from an output layer of the neural network model.
It should be understood that a plurality of forward propagation algorithms may be used in the embodiments of the present application, and the embodiments are not limited in this respect. The input sample data is inferred based on the neural network model through a forward propagation algorithm to obtain the output result.
In a fifth possible implementation manner of the first aspect, the model weight parameters of the neural network model are obtained by iterative training, and when the iterative training satisfies an end condition, the determining a model weight parameter of the neural network model based on the corrected error value and the undetermined weight parameter includes: taking the quantized weight parameter as a model weight parameter of the neural network model.
According to the embodiment of the application, the weight parameters are quantized, so that the neural network model is compressed, the occupation and energy consumption of memory bandwidth are reduced, and the operation efficiency of the processor is improved.
In a sixth possible implementation manner of the first aspect, when the iterative training does not satisfy the end condition, the determining a model weight parameter of the neural network model based on the corrected error value and the pending weight parameter includes: and adjusting the undetermined weight parameters layer by layer for the network layer of the neural network model by adopting a back propagation algorithm according to the corrected error value until reaching the input layer of the neural network model so as to obtain the adjusted weight parameters of the neural network model.
In a seventh possible implementation manner of the first aspect, the undetermined weight parameter of the neural network model is adjusted according to the following formula:
w1_k = w0_k - β × ∂E1/∂w0_k

where w0_k denotes the k-th undetermined weight parameter, w1_k denotes the k-th adjusted weight parameter, and β is a positive constant.
It should be understood that a plurality of back propagation algorithms (also referred to as error back propagation algorithms) may be used, and the embodiments of the present application are not limited in this respect. The weight parameters are trained through a back propagation algorithm, and the updated weight parameters further optimize the neural network model.
In an eighth possible implementation manner of the first aspect, for an Nth training period in the iterative training, where N is an integer greater than 1 and M is a positive integer smaller than N, the end condition includes one or a combination of more of the following conditions: the original error value in the Nth training period is smaller than a preset first threshold; the corrected error value in the Nth training period is smaller than a preset second threshold; the difference between the original error value in the Nth training period and the original error value in the (N-M)-th training period is smaller than a preset third threshold; the difference between the corrected error value in the Nth training period and the corrected error value in the (N-M)-th training period is smaller than a preset fourth threshold; the difference between the undetermined weight parameter in the Nth training period and the undetermined weight parameter in the (N-M)-th training period is smaller than a preset fifth threshold; and N is greater than a preset sixth threshold.
In a ninth possible implementation of the first aspect, when the Nth training period does not satisfy the end condition, a combination of one or more of the following quantities is stored: the original error value in the Nth training period; the corrected error value in the Nth training period; the undetermined weight parameter in the Nth training period; and the period number N of the Nth training period.
In the embodiment of the application, the training efficiency is improved by reasonably setting the training ending condition, and the balance between the training effect and the resources consumed by training is achieved.
In a tenth possible implementation manner of the first aspect, the obtaining the pending weight parameter includes: during a first training period of the iterative training, taking a preset initial weight parameter as the undetermined weight parameter; and when the iterative training is not in the first training period, taking the adjusted weight parameter of the neural network model as the undetermined weight parameter.
In an eleventh possible implementation of the first aspect, the neural network model is used for image recognition; correspondingly, the sample data comprises an image sample; correspondingly, the output result comprises a recognition result of the image recognition which is characterized in a probability form.
In a twelfth possible implementation of the first aspect, the neural network model is used for voice recognition; correspondingly, the sample data comprises a sound sample; correspondingly, the output result comprises a recognition result of the voice recognition which is characterized in a probability form.
In a thirteenth possible implementation manner of the first aspect, the neural network model is used for super-resolution image acquisition; correspondingly, the sample data comprises an image sample; correspondingly, the output result comprises the pixel value of the super-resolution processed image.
The practical implementation manner of the embodiment of the present application exemplarily provides a specific application scenario of the neural network model in the embodiment of the present application, and through the application of the neural network model, the recognition rates of image recognition and voice recognition can be improved, the image quality of image super-resolution processing can be improved, and meanwhile, a significant beneficial effect can be obtained in other application fields.
In a second aspect, an embodiment of the present application provides an apparatus for determining weight parameters of a neural network model, including: the forward propagation module is used for processing the sample data based on undetermined weight parameters of the neural network model to obtain an output result; a comparison module, configured to calculate an original error value between the output result and a preset expected result, where the original error value is a numerical representation of a difference between the output result and the expected result; the correction module is used for correcting the original error value based on the correction value so as to obtain a corrected error value; a determining module for determining a model weight parameter of the neural network model based on the corrected error value and the undetermined weight parameter; wherein the correction value is obtained according to the following formula:
R = (w_k - Q(w_k)) × Q(w_k)

where R represents the correction value, w_k represents the k-th undetermined weight parameter of the neural network model, Q(w_k) represents the quantized value of the k-th undetermined weight parameter, and k is a non-negative integer.
In a first possible embodiment of the second aspect, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_{k=1}^{m} F((w_k - Q(w_k)) × Q(w_k))

where E1 represents the corrected error value, E0 represents the original error value, α is a constant, m is the total number of undetermined weight parameters used for processing the sample data and is a positive integer, and F((w_k - Q(w_k)) × Q(w_k)) represents a function that takes the correction value as its argument.
In a second possible embodiment of the second aspect, the function with the correction value as an argument is calculating an absolute value of the correction value; correspondingly, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_{k=1}^{m} |(w_k - Q(w_k)) × Q(w_k)|

where |(w_k - Q(w_k)) × Q(w_k)| denotes the absolute value of (w_k - Q(w_k)) × Q(w_k).
In a third possible implementation manner of the second aspect, the neural network model includes p network layers, each of the network layers includes q pending weight parameters, and the kth pending weight parameter is the jth pending weight parameter of the ith network layer in the neural network model; correspondingly, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_{i=1}^{p} Σ_{j=1}^{q} |(w_{i,j} - Q(w_{i,j})) × Q(w_{i,j})|
wherein p and q are positive integers and i and j are non-negative integers.
In a fourth possible implementation manner of the second aspect, the forward propagation module is specifically configured to: obtaining the undetermined weight parameter; quantizing the obtained undetermined weight parameter to obtain a quantized weight parameter, wherein the quantized weight parameter is a quantized value of the undetermined weight parameter; taking the quantization weight parameter as a model weight parameter of the neural network model, and processing the sample data by adopting a forward propagation algorithm; obtaining the output result from an output layer of the neural network model.
In a fifth possible implementation manner of the second aspect, the model weight parameters of the neural network model are obtained by using iterative training, and when the iterative training satisfies an end condition, the determining module is specifically configured to: and taking the quantization weight parameter as a model weight parameter of the neural network model.
In a sixth possible implementation manner of the second aspect, the apparatus further includes a back propagation module, and when the iterative training does not satisfy the end condition, the back propagation module is specifically configured to: adjust the undetermined weight parameters layer by layer for the network layers of the neural network model by using a back propagation algorithm according to the corrected error value, until the input layer of the neural network model is reached, so as to obtain the adjusted weight parameters of the neural network model.
In a seventh possible implementation manner of the second aspect, the undetermined weight parameter of the neural network model is adjusted according to the following formula:
w1_k = w0_k - β × ∂E1/∂w0_k

where w0_k denotes the k-th undetermined weight parameter, w1_k denotes the k-th adjusted weight parameter, and β is a positive constant.
In an eighth possible implementation manner of the second aspect, for an Nth training period in the iterative training, where N is an integer greater than 1 and M is a positive integer smaller than N, the end condition includes one or a combination of more of the following conditions: the original error value in the Nth training period is smaller than a preset first threshold; the corrected error value in the Nth training period is smaller than a preset second threshold; the difference between the original error value in the Nth training period and the original error value in the (N-M)-th training period is smaller than a preset third threshold; the difference between the corrected error value in the Nth training period and the corrected error value in the (N-M)-th training period is smaller than a preset fourth threshold; the difference between the undetermined weight parameter in the Nth training period and the undetermined weight parameter in the (N-M)-th training period is smaller than a preset fifth threshold; and N is greater than a preset sixth threshold.
In a ninth possible implementation of the second aspect, when the Nth training period does not satisfy the end condition, a combination of one or more of the following quantities is stored: the original error value in the Nth training period; the corrected error value in the Nth training period; the undetermined weight parameter in the Nth training period; and the period number N of the Nth training period.
In a tenth possible implementation manner of the second aspect, the forward propagation module is specifically configured to: during a first training period of the iterative training, taking a preset initial weight parameter as the undetermined weight parameter; and when the iterative training is not in the first training period, taking the adjusted weight parameter of the neural network model as the undetermined weight parameter.
In an eleventh possible implementation of the second aspect, the neural network model is used for image recognition; correspondingly, the sample data comprises an image sample; correspondingly, the output result comprises a recognition result of the image recognition which is characterized in a probability form.
In a twelfth possible implementation of the second aspect, the neural network model is used for voice recognition; correspondingly, the sample data comprises a sound sample; correspondingly, the output result comprises a recognition result of the voice recognition which is characterized in a probability form.
In a thirteenth possible implementation of the second aspect, the neural network model is used for super-resolution image acquisition; correspondingly, the sample data comprises an image sample; correspondingly, the output result comprises the pixel value of the super-resolution processed image.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors and one or more memories. One or more memories coupled to the one or more processors for storing computer program code comprising computer instructions which, when executed by the one or more processors, cause the electronic device to perform the method of determining neural network model weight parameters as in any one of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer storage medium, which includes computer instructions, and when the computer instructions are executed on an electronic device, the electronic device is caused to execute the data processing method according to any one of the first aspect.
In a fifth aspect, the present application provides a computer program product, which when run on a computer, causes the computer to execute the method for determining the neural network model weight parameters according to any one of the first aspect.
In a sixth aspect, embodiments of the present application provide a chip comprising a processor and a memory, the memory being configured to store computer program code, the computer program code comprising computer instructions, which when executed by the processor, cause an electronic device to perform the method for determining neural network model weight parameters according to any one of the first aspect.
For the beneficial effects of the second to sixth aspects, reference may be made to the description of the first aspect, and details are not repeated here.
Drawings
FIG. 1 is a schematic diagram of an exemplary neural network architecture;
FIG. 2 is a schematic diagram of an exemplary neuron;
FIG. 3 is a schematic diagram of an exemplary alternative neural network architecture;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure;
fig. 5 is an exemplary flowchart of a method for determining weight parameters of a neural network model according to an embodiment of the present application;
fig. 6 is a block diagram illustrating an exemplary structure of an apparatus for determining weight parameters of a neural network model according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of another electronic device according to an embodiment of the present application.
Detailed Description
Neural networks (also known as deep neural networks) may be used to process various data, such as image data, audio data, and the like. The neural network may include one or more network layers (also referred to as neural network layers), which may be convolutional layers, fully-connected layers, deconvolution layers, cyclic layers, or the like. A typical neural network model is shown in figure 1.
For the purpose of facilitating understanding of embodiments of the present invention, concepts related to the embodiments of the present application are given, by way of example, for reference.
(1) Neural network model and forward propagation: for a training sample set (x^(i), y^(i)), a neural network algorithm can provide a complex, non-linear hypothesis model h_{W,b}(x) with parameters W and b that can be fitted to the data. To describe the neural network, we start from the simplest neural network model (herein the neural network model is also referred to simply as a neural network), which consists of only a single "neuron", as illustrated in Fig. 2.
The "neuron" is a neuron with x1,x 2,x 3And an arithmetic unit having an input value of intercept +1 and an output of
Figure PCTCN2018091652-APPB-000009
Wherein the function
Figure PCTCN2018091652-APPB-000010
Referred to as an "activation function".
So-called neural networks are networks that connect a number of individual "neurons" together so that the output of one "neuron" can be the input of another "neuron". Fig. 3 is a simple neural network.
Circles are used to represent the inputs to the neural network, and the circles labeled "+ 1" are referred to as bias nodes, i.e., intercept terms. The leftmost layer of the neural network is called the input layer and the rightmost layer is called the output layer (in this example, the output layer has only one node). The layer formed by all nodes in the middle is called the hidden layer (in other embodiments, the hidden layer may not exist, or multiple layers exist). It can also be seen that in the above example of the neural network, there are 3 input units (excluding the bias units), 3 hidden units and one output unit.
We use n_l to denote the number of layers of the network; in this example n_l = 3. Layer l is denoted L_l, so that L_1 is the input layer and L_{n_l} is the output layer.
The neural network has parameters (W, b) = (W^(1), b^(1), W^(2), b^(2)), where W_{ij}^{(l)} is the connection parameter (i.e., the weight on the connection) between the j-th unit of layer l and the i-th unit of layer l+1, and b_i^{(l)} is the bias term of the i-th unit of layer l+1. No other unit is connected to a bias unit (i.e., a bias unit has no input), since bias units always output +1. We also use s_l to denote the number of nodes of layer l (not counting the bias unit), and a_i^{(l)} to denote the activation value (output value) of the i-th unit of layer l. When l = 1, a_i^{(1)} = x_i, i.e., the i-th input value (the i-th feature of the input). For a given set of parameters W, b, the neural network computes its output according to the function h_{W,b}(x). The calculation steps of the neural network in this example are as follows:
a_1^{(2)} = f(W_{11}^{(1)} x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)})
a_2^{(2)} = f(W_{21}^{(1)} x_1 + W_{22}^{(1)} x_2 + W_{23}^{(1)} x_3 + b_2^{(1)})
a_3^{(2)} = f(W_{31}^{(1)} x_1 + W_{32}^{(1)} x_2 + W_{33}^{(1)} x_3 + b_3^{(1)})
h_{W,b}(x) = a_1^{(3)} = f(W_{11}^{(2)} a_1^{(2)} + W_{12}^{(2)} a_2^{(2)} + W_{13}^{(2)} a_3^{(2)} + b_1^{(2)})

We also use z_i^{(l)} to denote the weighted sum of the inputs to the i-th unit of layer l (including the bias term), e.g., z_i^{(2)} = Σ_{j=1}^{n} W_{ij}^{(1)} x_j + b_i^{(1)}, so that a_i^{(l)} = f(z_i^{(l)}).
The above calculation step is called forward propagation.
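As an illustration of the forward propagation just described, a minimal NumPy sketch for the small network of Fig. 3 (3 inputs, 3 hidden units, 1 output) is given below; the sigmoid activation and the parameter shapes are assumptions.

```python
import numpy as np

def f(z):
    # assumed sigmoid activation function
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """h_{W,b}(x) for a 3-3-1 network.

    x: shape (3,); W1: shape (3, 3); b1: shape (3,); W2: shape (1, 3); b2: shape (1,)
    """
    z2 = W1 @ x + b1    # z^{(2)}: weighted sums of layer 2
    a2 = f(z2)          # a^{(2)}: activations of layer 2
    z3 = W2 @ a2 + b2   # z^{(3)}
    return f(z3)        # h_{W,b}(x) = a^{(3)}
```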
(2) Back propagation and the error back propagation algorithm: the back propagation algorithm mainly iterates through two phases (excitation propagation and weight updating) repeatedly, until the response of the network to the input reaches a preset target range.
The learning process of the error back propagation algorithm consists of a forward propagation process and a back propagation process. In the forward propagation process, the input information passes from the input layer through the hidden layers, is processed layer by layer, and is transmitted to the output layer. If the expected output value cannot be obtained at the output layer, some characterization of the error between the output and the expectation (for example, the sum of squares) is taken as the objective function and back propagation is carried out: the partial derivative of the objective function with respect to each neuron weight is calculated layer by layer to form the gradient of the objective function with respect to the weight vector, and this gradient serves as the basis for modifying the weights. Network learning is completed in the process of modifying the weights; when the error reaches the expected value, network learning ends.
In the excitation propagation phase, each iteration comprises two steps: the training input is sent into the network to obtain the excitation response (forward propagation step); the excitation response is then subtracted from the target output corresponding to the training input to obtain the response errors of the hidden layers and the output layer (back propagation step).
In the weight updating phase, the weight on each neuron is updated as follows: the input excitation and the response error are multiplied to obtain the gradient of the weight; this gradient is multiplied by a proportion, negated, and added to the weight. This proportion affects the speed and effectiveness of the training process and is therefore called the "training factor". The direction of the gradient indicates the direction in which the error increases, so the gradient needs to be negated when updating the weight, thereby reducing the error caused by the weight.
Matt Mazur, in "A Step by Step Backpropagation Example" (available at https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/, and incorporated herein in its entirety), gives an example implementation of an error back propagation algorithm, which can be applied to the embodiments of the present application and is not described in detail here.
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. In the description of the embodiments herein, "/" means "or" unless otherwise specified, for example, a/B may mean a or B; "and/or" herein is merely an association describing an associated object, and means that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, in the description of the embodiments of the present application, "a plurality" means two or more than two.
The data processing device according to the embodiment of the present application is an electronic device that processes data such as an image and a voice using a convolutional neural network, and may be, for example, a server or a terminal. For example, when the electronic device is a terminal, the electronic device may specifically be a desktop computer, a laptop computer, a Personal Digital Assistant (PDA), a tablet computer, an embedded device, a mobile phone, an intelligent peripheral (e.g., an intelligent watch, a bracelet, glasses, etc.), a television set-top box, a monitoring camera, and the like. The embodiment of the application does not limit the specific type of the electronic equipment.
For example, fig. 4 shows a hardware structure diagram of an electronic device 400 according to an embodiment of the present application. The electronic device 400 may include at least one processor 401, a communication bus 402, and memory 403. The electronic device 400 may also include at least one communication interface 404.
The processor 401 may be a general-purpose Central Processing Unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a Graphics Processing Unit (GPU), a Field Programmable Gate Array (FPGA), or one or more integrated circuits for controlling the execution of the programs of the solutions of the present application.
Communication bus 402 may include a path that transfers information between the above components.
The communication interface 404 may be any device, such as a transceiver, for communicating with other devices or communication networks, such as an ethernet, a Radio Access Network (RAN), a Wireless Local Area Network (WLAN), etc.
The memory 403 may be a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a Random Access Memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disk storage, optical disk storage (including compact disc, laser disc, optical disc, digital versatile disc, blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these. The memory may be self-contained and coupled to the processor via a bus. The memory may also be integral to the processor.
The memory 403 is used for storing the application program code for executing the solutions provided by the embodiments of the present application, as well as the neural network model structure, weights and intermediate results obtained when the processor 401 operates according to those solutions, and execution is controlled by the processor 401. The processor 401 is configured to execute the application program code stored in the memory 403, so as to implement the data processing method provided by the following embodiments of the present application.
In particular implementations, processor 401 may include one or more CPUs such as CPU0 and CPU1 in fig. 4 as an example.
In particular implementations, electronic device 400 may include multiple processors, such as processor 401 and processor 407 in FIG. 4, for example, as an embodiment. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
In particular implementations, electronic device 400 may also include an output device 405 and an input device 406, as one embodiment. An output device 405 is in communication with the processor 401 and may display information in a variety of ways. For example, the output device 405 may be a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display device, a Cathode Ray Tube (CRT) display device, a projector (projector), or the like. The input device 406 is in communication with the processor 401 and can accept user input in a variety of ways. For example, the input device 406 may be a mouse, keyboard, camera, microphone, touch screen device, or sensing device, among others.
As described above, a neural network is composed of network layers, and each network layer processes its input data and transmits the result to the next network layer. Within a network layer, different weights are used to perform operations such as convolution, multiplication and addition on the input data, according to the properties of the network layer (convolutional layer, fully connected layer, and the like). The way these operations are performed is determined by the attributes of the network layer, but the values of the weights used by the operations are obtained by training. Different data processing results can be obtained by adjusting the weight values.
In the training process of the model weight parameters, over-emphasizing the fit to the training data set may result in poor generalization: the model fits the training data set well and achieves high accuracy on it, but cannot fit data outside the training data set well, so the effect is poor and the accuracy drops severely.
Neural network weight parameters are redundant in precision; recording the weight parameters in a low-precision data format (such as INT8 or binary) instead of a high-precision data format (such as FP32 or FP64) realizes compression of the weight parameters.
Quantization of the weight parameters is a feasible way to achieve this compression; compressing the precision of neural network parameters can strike a balance among storage, energy consumption and accuracy. The data that can be quantized includes, without limitation, the weights, feature tensors (activations) and gradients of the neural network model.
In one possible embodiment, the weight coefficients of the neural network with high precision (without quantization, typically with precision of FP32 or FP64) are obtained by training first, and then the weight coefficients expressed in the high-precision data format are expressed in the low-precision data format. In this case, since the low-precision data format cannot express details of high-precision data, differences in numerical values are caused, and the differences are accumulated a plurality of times during calculation, which eventually causes the accuracy of the final calculation result of the quantized model to be lower than that of the original model before quantization. In order to solve the above problems, the quantized model parameters are generally trained again on the premise of ensuring the low-precision data format thereof, and the model accuracy close to the high precision is finally achieved under the low precision through retraining and adjusting the weight value.
Specifically, in the training process, sample data is input to the neural network, the difference between the output and the expected data is calculated, the gradients of all the weights in the neural network (i.e., the direction in which each weight should be adjusted) are calculated from this difference, and the weights in the neural network are adjusted, so as to reduce the difference and achieve higher accuracy. However, for a quantized neural network, the gradient values used to adjust the weights during training are extremely small, and with high probability much smaller than the minimum expressible interval at the weight precision, so the gradient cannot actually change the value of a weight. For example, if the INT4 data format maps its value range onto 0 to 1, the minimum expressible interval is 2^-4, whereas the gradient is likely to be much smaller than this value, say 2^-6. Adding the gradient to any value A expressible under INT4 should give A + 2^-6, but since the minimum interval of the INT4 data format cannot express 2^-6, the actual result is still A. Therefore, the quantized weights cannot be used directly for training.
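The effect described above can be reproduced with a small numeric sketch; the 4-bit uniform quantizer below is only a hypothetical illustration, not the quantization scheme of the embodiments.

```python
import numpy as np

def quantize_int4(x):
    # hypothetical uniform 4-bit quantizer for values in [0, 1]: step size 2**-4
    step = 2.0 ** -4
    return np.round(x / step) * step

w = 0.5              # a weight value expressible under INT4
grad = 2.0 ** -6     # a gradient much smaller than the quantization step

# the update is swallowed by quantization: both calls return the same value
print(quantize_int4(w + grad), quantize_int4(w))
```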
To solve the above problem, a common training method is to use high-precision reference weights to carry the information of the unquantized weights: the quantized weights are derived from the reference weights, the error is calculated using the quantized weights, and the gradient obtained from this error is applied to the unquantized weights (also called reference weights). Only after the accumulated adjustment of a reference weight exceeds a certain amount is the change reflected in the quantized weight. (For example, suppose the quantized weight can only express integers from 0 to 255 and the reference weight is initially 1; the value floats slightly during adjustment, and the quantized weight does not change to 2 until the reference weight exceeds 1.5, and does not change to 0 until the reference weight falls below 0.5; otherwise the quantized weight remains 1.)
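A minimal sketch of this reference-weight training scheme is shown below for illustration; the quantizer, the gradient function and the learning rate are assumed placeholders.

```python
def reference_weight_step(w_ref, quantize, grad_fn, lr=0.01):
    """One training update of high-precision reference weights w_ref.

    quantize : Q(.), derives the quantized weights from the reference weights
    grad_fn  : returns the gradient of the error computed with the quantized weights
    lr       : assumed learning rate
    """
    w_q = quantize(w_ref)       # quantized weights are used in the calculation
    grad = grad_fn(w_q)         # error/gradient obtained from the quantized weights
    return w_ref - lr * grad    # but the gradient is applied to the reference weights
```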
The above method has the following technical problems. Unlike an unquantized neural network, a quantized neural network uses the quantized weights rather than the reference weights in its calculations; the difference between a reference weight and its quantized weight accumulates in the calculation of the neural network and ultimately affects the calculation result, producing a quantization error. In addition, the gradient used to adjust a reference weight is derived from the inference error computed with the quantized weights and is therefore not accurate enough, so a gradient error also exists. These problems may make it difficult for model training to reach a globally optimal solution.
The embodiment of the application provides a method for determining weight parameters of a neural network model, as shown in fig. 5, the method specifically includes:
s501, processing the sample data based on the undetermined weight parameters of the neural network model to obtain an output result.
Specifically, in one possible embodiment, the present step includes:
s5011, obtaining the undetermined weight parameter.
Without loss of generality, assume that in the embodiments of the present application the model weight parameters of the neural network model are obtained by iterative training.
When the training is started for the first time, that is, during the first training period of the iterative training, the step S5011 is executed to obtain a default initial value, for example, a constant such as 0,1 is assigned to the undetermined weight parameter, or obtain a predetermined value according to an empirical value and assign the predetermined value to the undetermined weight parameter, for example, a stored weight parameter of a pre-trained neural network model.
When the training is in the iterative training process, that is, in a non-first training period of the iterative training, the step S5011 is executed to obtain the undetermined weight parameter updated by the back propagation algorithm in the previous training period (that is, the adjusted weight parameter) as the undetermined weight parameter obtained in this step. The specific implementation will be detailed in the following steps, which are not described in detail.
S5012, quantizing the obtained undetermined weight parameter to obtain a quantized weight parameter, wherein the quantized weight parameter is a quantized value of the undetermined weight parameter.
The quantization of the weighting parameter to be determined may use different preset quantization schemes, for example, the high-precision expression mode of the weighting parameter is converted into the low-precision expression mode described above, which is not limited in this step.
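As one hypothetical example of such a preset quantization scheme (not the only possibility), a symmetric uniform quantizer to n bits could be sketched as follows:

```python
import numpy as np

def uniform_quantize(w, n_bits=8):
    """Hypothetical symmetric uniform quantizer Q(.), for illustration only."""
    levels = 2 ** (n_bits - 1) - 1            # e.g. 127 representable magnitudes for 8 bits
    scale = np.max(np.abs(w)) / levels + 1e-12
    return np.round(w / scale) * scale        # snap each weight to the nearest grid point
```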
S5013, the quantization weight parameters are used as model weight parameters of the neural network model, and a forward propagation algorithm is adopted to process the sample data.
In the step, quantized undetermined weight parameters are used as model parameters of a neural network model, sample data is used as input, and the forward propagation algorithm is used as a calculation criterion for calculation. It should be understood that this step does not limit the specific forward propagation algorithm.
S5014, obtaining the output result from an output layer of the neural network model.
A calculation result for the sample data is output from the output layer of the neural network model. For example, if the neural network model is used for image recognition, the sample data includes an image sample, and the output result includes a recognition result of the image recognition characterized in probability form, for example, the probability that the sample image is judged to be the target image is 90%. If the neural network model is used for voice recognition, the sample data includes a sound sample, and the output result includes a recognition result of the voice recognition characterized in probability form, for example, the probability that the sample sound is judged to be the target sound is 20%. If the neural network model is used for acquiring a super-resolution image, the sample data includes an image sample, and the output result includes the pixel values of the super-resolution processed image.
It should be understood that the neural network model may also be used in other applications involving the field of artificial intelligence, and correspondingly, the sample data as input data and the output result as output data may also be other types of physical quantities, without limitation.
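Steps S5011 to S5014 can be summarized with the following sketch; the model object, its methods and the quantizer are hypothetical placeholders used only to make the flow concrete.

```python
def forward_with_quantized_weights(model, pending_weights, quantize, sample_data):
    """Sketch of S5011-S5014: quantize the undetermined weights, then run forward propagation."""
    quantized = [quantize(w) for w in pending_weights]   # S5012: quantized weight parameters
    model.set_weights(quantized)                         # S5013: use them as model weight parameters
    return model.forward(sample_data)                    # S5014: output result from the output layer
```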
S502, calculating an original error value of the output result and a preset expected result, wherein the original error value is a numerical representation of the difference between the output result and the expected result.
In this step, corresponding to step S5014, the difference between the expected output result (i.e., the expected result) and the actual output result is calculated, and the difference is represented in numerical form. For example, the difference may be a difference between recognition results, e.g., the recognition result is 90% while the expected result is 100%, so the original error value is 10%; or it may be a pixel difference between the original image before super-resolution processing of the sample image and the image after super-resolution processing, such as the peak signal-to-noise ratio (PSNR) between the two images, e.g., -0.2 decibel (dB), or the variance between the pixels of the two images, depending on the specific application of the neural network model, and this is not limited here.
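As a simple illustration, an original error value for a probability-form recognition result could be computed as below; the sum of squared differences is only one possible measure (assumed here), and measures such as PSNR would be used for image super-resolution instead.

```python
import numpy as np

def original_error(output, expected):
    """Numerical representation of the difference between the output result and the expected result."""
    output = np.asarray(output, dtype=float)
    expected = np.asarray(expected, dtype=float)
    return float(np.sum((output - expected) ** 2))

# e.g. recognition probability 0.9 against an expected 1.0 gives an error of 0.01
```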
S503, correcting the original error value based on the correction value to obtain a corrected error value.
In this step, a correction value is first obtained. In a possible embodiment, the correction value is obtained according to the following formula:

R = (w_k - Q(w_k)) × Q(w_k)

where R represents the correction value, w_k represents the k-th undetermined weight parameter of the neural network model, Q(w_k) represents the quantized value of the k-th undetermined weight parameter, and k is a non-negative integer.
Correspondingly, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_{k=1}^{m} F((w_k - Q(w_k)) × Q(w_k))

where E1 represents the corrected error value, E0 represents the original error value, α is a constant, m is the total number of undetermined weight parameters used for processing the sample data and is a positive integer, and F((w_k - Q(w_k)) × Q(w_k)) represents a function that takes the correction value as its argument.
In a possible embodiment, the function with the correction value as an argument is to calculate an absolute value of the correction value;
correspondingly, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_{k=1}^{m} |(w_k - Q(w_k)) × Q(w_k)|

where |(w_k - Q(w_k)) × Q(w_k)| denotes the absolute value of (w_k - Q(w_k)) × Q(w_k).
In another possible embodiment, the neural network model includes p network layers, each of the network layers includes q pending weight parameters, and the kth pending weight parameter is the jth pending weight parameter of the ith network layer in the neural network model;
correspondingly, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_{i=1}^{p} Σ_{j=1}^{q} |(w_{i,j} - Q(w_{i,j})) × Q(w_{i,j})|
wherein p and q are positive integers and i and j are non-negative integers.
It should be understood that in some possible embodiments, a certain network layer of the neural network model may not include the pending weight parameter, i.e. when q corresponding to the network layer is 0, it is obvious that the network layer is not used for calculation of the correction error value.
In the embodiment of the application, a difference value between the weight parameter and the quantized weight parameter is used as a penalty term by using a correction value (regularization function), and the unquantized weight parameter is guided to be close to the quantized weight value in the training process, so that the quantization error is reduced. Meanwhile, the difference value serving as the penalty term is multiplied by the quantized weight parameter, so that the over-fitting problem caused by the inference result that part of weights with larger numerical values dominate the neural network is avoided.
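To see why the penalty term guides an unquantized weight toward its quantized value, one can treat Q(w_k) as locally constant within a quantization interval (an assumption made only for this illustration) and look at the gradient of the penalty:

```python
import numpy as np

def penalty_gradient(w, quantize):
    """d/dw of |(w - Q(w)) * Q(w)|, treating Q(w) as locally constant.

    Gradient descent with this term moves w toward Q(w), and the step is scaled by
    |Q(w)|, so weights with larger quantized values are regularized more strongly,
    which counteracts a few large weights dominating the inference result.
    """
    qw = quantize(w)
    return np.sign((w - qw) * qw) * qw
```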
Next, a model weight parameter of the neural network model is determined based on the corrected error value and the pending weight parameter.
And S504, judging whether the iterative training meets an end condition.
Step S504 is described, without loss of generality, for the Nth training period of the iterative training, where N is an integer greater than 1 and M is a positive integer smaller than N; the end condition includes one or a combination of more of the following conditions:
the original error value in the Nth training period is smaller than a preset first threshold value;
the corrected error value in the Nth training period is smaller than a preset second threshold;
the difference between the original error value in the Nth training period and the original error value in the (N-M)-th training period is smaller than a preset third threshold;
the difference between the corrected error value in the Nth training period and the corrected error value in the (N-M)-th training period is smaller than a preset fourth threshold;
the difference between the undetermined weight parameter in the Nth training period and the undetermined weight parameter in the (N-M)-th training period is smaller than a preset fifth threshold; and
n is larger than a preset sixth threshold value.
It should be understood that when M is 1, the relevant end conditions compare the difference of the corresponding quantities between the two most recent adjacent training periods.
It should also be understood that step S504 may be executed in every training period, or once every M training periods; the execution frequency of step S504 is not limited in the embodiments of the present application.
A training period may be understood as the process of calculating a corrected error value, adjusting the weight parameters according to the corrected error value, and then obtaining a new training result using the adjusted weight parameters.
Corresponding to step S504, when the nth training period does not satisfy the end condition, storing a combination of one or more of the following physical quantities:
an original error value in the nth training period;
a corrected error value in the nth training period;
a pending weight parameter in the nth training period; and
a number N of cycles of the Nth training cycle.
The stored physical quantity is called when step S504 is subsequently executed.
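Purely as an illustration, the end-condition check of step S504 might be sketched as follows; the record dictionaries and the threshold names are assumptions, not part of the embodiments.

```python
import numpy as np

def should_stop(n, record_n, record_prev, thresholds):
    """Sketch of S504 using quantities stored for training periods N and N - M.

    record_n, record_prev : dicts holding 'e0' (original error), 'e1' (corrected error)
                            and 'w' (undetermined weight parameters) for the two periods
    thresholds            : dict of the preset first to sixth thresholds
    """
    return (
        record_n["e0"] < thresholds["first"]
        or record_n["e1"] < thresholds["second"]
        or abs(record_n["e0"] - record_prev["e0"]) < thresholds["third"]
        or abs(record_n["e1"] - record_prev["e1"]) < thresholds["fourth"]
        or np.max(np.abs(record_n["w"] - record_prev["w"])) < thresholds["fifth"]
        or n > thresholds["sixth"]
    )
```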
And S505, when the iterative training meets an end condition, taking the quantitative weight parameter as a model weight parameter of the neural network model.
Generally, the end of the iterative training means that the quantization weight parameters have been optimized to a desired degree by training, and can be determined as the model weight parameters of the neural network model.
In some possible embodiments, the model is trained using a training sample set A and tested using a test sample set B. After the model has been trained for N training periods using A, the model is tested on the test data set B to obtain a first test result X; after the model has been trained for a further M training periods using A, the model is tested on the test data set B to obtain a second test result Y. When the difference between X and Y is smaller than a threshold, training ends; otherwise, the model continues to be trained using A.
Correspondingly, in this embodiment, the ending condition includes that the difference between the first test result X and the second test result Y is smaller than the threshold.
S506, when the iterative training does not meet the end condition, adjusting the undetermined weight parameters layer by layer for the network layer of the neural network model by adopting a back propagation algorithm according to the corrected error value until the input layer of the neural network model so as to obtain the adjusted weight parameters of the neural network model.
The back propagation algorithm has already been introduced and will not be described in detail here. Illustratively, the undetermined weight parameter of the neural network model is adjusted according to the following formula:
w1_k = w0_k - β × ∂E1/∂w0_k

where w0_k denotes the k-th undetermined weight parameter, w1_k denotes the k-th adjusted weight parameter, and β is a positive constant.
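A sketch of this layer-by-layer adjustment is given below; the per-layer gradient arrays are assumed to come from the back propagation algorithm, and beta is an assumed positive constant.

```python
def adjust_weights(pending_weights, e1_gradients, beta=0.01):
    """w1_k = w0_k - beta * dE1/dw0_k, applied layer by layer back to the input layer.

    pending_weights : list of per-layer weight arrays w0
    e1_gradients    : matching list of gradients of the corrected error E1
    """
    return [w0 - beta * g for w0, g in zip(pending_weights, e1_gradients)]
```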
When the weights have been adjusted all the way back to the input layer, all the weights of the neural network model have been adjusted, i.e., the adjusted weight parameters of the neural network model are obtained.
Then, the adjusted weight parameters are taken as the undetermined weight parameters, step S5011 is executed, and the iterative training continues.
The embodiments of the present application provide a method and a device for determining weight parameters of a neural network model. In the various data processing scenarios in which neural network models are applied, such as image recognition, speech recognition and image super-resolution processing, the method introduces an appropriate correction value into the error between the output result of model training and the expected result, thereby reducing the quantization error and avoiding the overfitting problem caused by a few weight parameters with large values dominating the inference result of the neural network.
It is understood that the electronic device, in order to perform the above-described method, comprises corresponding hardware structures and/or software modules for performing the respective functions. Those of skill in the art will readily appreciate that the present application is capable of hardware or a combination of hardware and computer software implementing the various illustrative algorithm steps described in connection with the embodiments disclosed herein. Whether a function is performed as hardware or computer software drives hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiment of the present application, the electronic device may be divided into the functional modules according to the method example, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. It should be noted that, in the embodiment of the present application, the division of the module is schematic, and is only one logic function division, and there may be another division manner in actual implementation.
Fig. 6 shows a possible composition diagram of the electronic device involved in the above-described embodiment, in the case of dividing each functional module by corresponding functions.
A neural network model weight parameter determination apparatus 600, comprising:
the forward propagation module 601 is configured to process sample data based on undetermined weight parameters of the neural network model to obtain an output result;
a comparing module 602, configured to calculate an original error value between the output result and a preset expected result, where the original error value is a numerical representation of a difference between the output result and the expected result;
a correction module 603, configured to correct the original error value based on the correction value to obtain a corrected error value;
a determining module 604, configured to determine a model weight parameter of the neural network model based on the corrected error value and the undetermined weight parameter;
wherein the correction value is obtained according to the following formula:
R = (w_k - Q(w_k)) × Q(w_k)
where R represents the correction value, w_k represents the kth pending weight parameter of the neural network model, Q(w_k) represents the quantized value of the kth pending weight parameter, and k is a non-negative integer.
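For illustration only, the following sketch computes the correction value R elementwise; the quantize function is an assumed uniform rounding quantizer and does not represent the specific quantization scheme Q of this application.

```python
import numpy as np

def quantize(weights, step=2.0 ** -4):
    """Assumed quantizer Q: uniform rounding to a fixed-point grid with the given step."""
    return np.round(weights / step) * step

def correction_values(weights):
    """R_k = (w_k - Q(w_k)) * Q(w_k), computed for every pending weight parameter."""
    q = quantize(weights)                # Q(w_k)
    return (weights - q) * q             # correction value R
```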
In one possible embodiment, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_k F((w_k - Q(w_k)) × Q(w_k)) (k = 0, 1, …, m-1)
where E1 represents the corrected error value, E0 represents the original error value, α is a constant, m is the total number of pending weight parameters used for processing the sample data, F((w_k - Q(w_k)) × Q(w_k)) represents a function with the correction value as its argument, and m is a positive integer.
In a possible embodiment, the function with the correction value as an argument is to calculate an absolute value of the correction value; correspondingly, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_k |(w_k - Q(w_k)) × Q(w_k)| (k = 0, 1, …, m-1)
where |(w_k - Q(w_k)) × Q(w_k)| denotes the absolute value of (w_k - Q(w_k)) × Q(w_k).
In a possible embodiment, the neural network model includes p network layers, each of the network layers includes q pending weight parameters, and the kth pending weight parameter is the jth pending weight parameter of the ith network layer in the neural network model; correspondingly, the corrected error value is obtained according to the following formula:
E1 = E0 + α × Σ_i Σ_j F((w_ij - Q(w_ij)) × Q(w_ij)) (i = 0, 1, …, p-1; j = 0, 1, …, q-1)
wherein p and q are positive integers and i and j are non-negative integers.
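For illustration only, the following sketch evaluates one plausible reading of the corrected error value, taking F as the absolute value and summing over all pending weight parameters of the p network layers; the scaling constant α and the exact form of F are assumptions made purely for illustration.

```python
import numpy as np

def quantize(weights, step=2.0 ** -4):
    """Assumed uniform quantizer Q (same placeholder as in the earlier sketch)."""
    return np.round(weights / step) * step

def corrected_error(original_error, layer_weights, alpha=1e-4):
    """E1 = E0 + alpha * sum over all layers and weights of |(w - Q(w)) * Q(w)|."""
    total = 0.0
    for w in layer_weights:                  # one weight array per network layer
        q = quantize(w)
        total += np.abs((w - q) * q).sum()   # F taken as the absolute value
    return original_error + alpha * total    # corrected error value E1
```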
In a possible implementation, the forward propagation module 601 is specifically configured to: obtain the undetermined weight parameters; quantize the obtained undetermined weight parameters to obtain quantized weight parameters, where a quantized weight parameter is the quantized value of the corresponding undetermined weight parameter; use the quantized weight parameters as the model weight parameters of the neural network model and process the sample data with a forward propagation algorithm; and obtain the output result from the output layer of the neural network model.
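For illustration only, the following sketch quantizes the pending weights and then runs a forward pass with the quantized values; a fully connected network with ReLU activations is assumed, since no particular network structure is fixed here.

```python
import numpy as np

def quantize(weights, step=2.0 ** -4):
    return np.round(weights / step) * step   # assumed uniform quantizer Q

def forward_with_quantized_weights(sample, pending_weights):
    """Use the quantized weight parameters as the model weights and propagate the
    sample data forward; the last array in pending_weights is the output layer."""
    quantized = [quantize(w) for w in pending_weights]
    x = sample
    for w in quantized[:-1]:
        x = np.maximum(x @ w, 0.0)           # hidden layers (ReLU assumed)
    return x @ quantized[-1]                 # output result from the output layer
```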
In a possible implementation, the model weight parameters of the neural network model are obtained through iterative training, and when the iterative training satisfies an end condition, the determining module 604 is specifically configured to: take the quantized weight parameters as the model weight parameters of the neural network model.
In a possible implementation, the device further includes a back propagation module 605, and when the iterative training does not satisfy the end condition, the back propagation module 605 is specifically configured to: adjust the undetermined weight parameters of the network layers of the neural network model layer by layer, using a back propagation algorithm and the corrected error value, until the input layer of the neural network model is reached, so as to obtain the adjusted weight parameters of the neural network model.
In a possible embodiment, the undetermined weight parameter of the neural network model is adjusted according to the following formula:
w1_k = w0_k - β × ∂E1/∂w0_k
where w0_k denotes the kth pending weight parameter, w1_k denotes the kth adjusted weight parameter, and β is a positive constant.
In a possible embodiment, for the Nth training period in the iterative training, where N is an integer greater than 1 and M is a positive integer less than N, the end condition includes one or more of the following conditions: the original error value in the Nth training period is smaller than a preset first threshold; the corrected error value in the Nth training period is smaller than a preset second threshold; the difference between the original error value in the Nth training period and the original error value in the (N-M)th training period is smaller than a preset third threshold; the difference between the corrected error value in the Nth training period and the corrected error value in the (N-M)th training period is smaller than a preset fourth threshold; the difference between the undetermined weight parameters in the Nth training period and the undetermined weight parameters in the (N-M)th training period is smaller than a preset fifth threshold; and N is greater than a preset sixth threshold.
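For illustration only, the following sketch checks the listed end conditions for the Nth training period; the dictionary keys, the stored values from the (N-M)th period, and the use of any() to combine the conditions are assumptions, since the embodiment only requires that one or more of the conditions hold.

```python
def end_condition_met(state, thresholds):
    """state holds error values and weight changes for the Nth and (N-M)th periods;
    thresholds holds the six preset threshold values."""
    return any([
        state["e0_n"] < thresholds["first"],                               # original error, period N
        state["e1_n"] < thresholds["second"],                              # corrected error, period N
        abs(state["e0_n"] - state["e0_n_minus_m"]) < thresholds["third"],  # change in original error
        abs(state["e1_n"] - state["e1_n_minus_m"]) < thresholds["fourth"], # change in corrected error
        state["weight_change"] < thresholds["fifth"],                      # change in pending weights
        state["period"] > thresholds["sixth"],                             # number of training periods
    ])
```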
In a possible embodiment, when the Nth training period does not satisfy the end condition, a combination of one or more of the following physical quantities is stored: the original error value in the Nth training period; the corrected error value in the Nth training period; the undetermined weight parameters in the Nth training period; and the period number N of the Nth training period.
In a possible implementation, the forward propagation module 601 is specifically configured to: in the first training period of the iterative training, take a preset initial weight parameter as the undetermined weight parameter; and when the iterative training is not in the first training period, take the adjusted weight parameter of the neural network model as the undetermined weight parameter.
In one possible embodiment, the neural network model is used for image recognition; correspondingly, the sample data includes an image sample; correspondingly, the output result includes a recognition result of the image recognition, expressed in the form of a probability.
In one possible embodiment, the neural network model is used for voice recognition; correspondingly, the sample data includes a sound sample; correspondingly, the output result includes a recognition result of the voice recognition, expressed in the form of a probability.
In one possible embodiment, the neural network model is used for super-resolution image acquisition; correspondingly, the sample data includes an image sample; correspondingly, the output result includes the pixel values of the super-resolution processed image.
It should be noted that all relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again.
Of course, the electronic device includes, but is not limited to, the above listed unit modules, for example, the electronic device may further include a communication unit, and the communication unit may include a transmitting unit for transmitting data or signals to other devices, a receiving unit for receiving data or signals transmitted by other devices, and the like. In addition, the functions that can be specifically realized by the functional units also include, but are not limited to, the functions corresponding to the method steps of the above example, and the detailed description of the corresponding method steps may be referred to for the detailed description of other units of the electronic device, which is not described herein again in this embodiment of the present application.
The processing unit 701 in fig. 7 may be a processor or a controller, for example a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a graphics processing unit (GPU), another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The storage unit 702 may be a memory. The communication unit may be a transceiver, a radio frequency circuit, a communication interface, or the like. The processing unit 701 performs the method for determining the weight parameters of the neural network model shown in fig. 5.
Embodiments of the present application also include a computer storage medium including computer instructions, which, when executed on an electronic device, cause the electronic device to perform a method for determining neural network model weight parameters as shown in fig. 5.
Embodiments of the present application also include a computer program product, which when run on a computer causes the computer to execute the method for determining the neural network model weight parameters as shown in fig. 5.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative; the division into modules or units is only a logical function division, and there may be other division manners in actual implementation; for example, a plurality of units or components may be combined or integrated into another apparatus, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
Units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, may be located in one place, or may be distributed to a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such an understanding, the technical solutions of the embodiments of this application essentially, or the part thereof contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (31)

  1. A method for determining weight parameters of a neural network model is characterized by comprising the following steps:
    processing the sample data based on undetermined weight parameters of the neural network model to obtain an output result;
    calculating an original error value of the output result and a preset expected result, wherein the original error value is a numerical representation of the difference between the output result and the expected result;
    correcting the original error value based on the correction value to obtain a corrected error value;
    determining a model weight parameter for the neural network model based on the corrected error value and the pending weight parameter;
    wherein the correction value is obtained according to the following formula:
    R = (w_k - Q(w_k)) × Q(w_k)
    where R represents the correction value, w_k represents the kth pending weight parameter of the neural network model, Q(w_k) represents the quantized value of the kth pending weight parameter, and k is a non-negative integer.
  2. The method of claim 1, wherein the corrected error value is obtained according to the following formula:
    E1 = E0 + α × Σ_k F((w_k - Q(w_k)) × Q(w_k)) (k = 0, 1, …, m-1)
    where E1 represents the corrected error value, E0 represents the original error value, α is a constant, m is the total number of pending weight parameters used for processing the sample data, F((w_k - Q(w_k)) × Q(w_k)) represents a function with the correction value as its argument, and m is a positive integer.
  3. A method according to claim 2, characterized in that the function with the correction value as an argument calculates the absolute value of the correction value;
    correspondingly, the corrected error value is obtained according to the following formula:
    E1 = E0 + α × Σ_k |(w_k - Q(w_k)) × Q(w_k)| (k = 0, 1, …, m-1)
    where |(w_k - Q(w_k)) × Q(w_k)| denotes the absolute value of (w_k - Q(w_k)) × Q(w_k).
  4. The method of any of claims 1 to 3, wherein the neural network model comprises p network layers, each of the network layers comprising q of the pending weight parameters, the kth pending weight parameter being the jth pending weight parameter of the ith network layer in the neural network model;
    correspondingly, the corrected error value is obtained according to the following formula:
    E1 = E0 + α × Σ_i Σ_j F((w_ij - Q(w_ij)) × Q(w_ij)) (i = 0, 1, …, p-1; j = 0, 1, …, q-1)
    wherein p and q are positive integers and i and j are non-negative integers.
  5. The method according to any one of claims 1 to 4, wherein the processing the sample data based on the undetermined weight parameters of the neural network model comprises:
    obtaining the undetermined weight parameter;
    quantizing the obtained undetermined weight parameter to obtain a quantized weight parameter, wherein the quantized weight parameter is a quantized value of the undetermined weight parameter;
    taking the quantization weight parameter as a model weight parameter of the neural network model, and processing the sample data by adopting a forward propagation algorithm;
    obtaining the output result from an output layer of the neural network model.
  6. The method of claim 5, wherein the model weight parameters of the neural network model are obtained by iterative training, and when the iterative training satisfies an end condition, the determining the model weight parameters of the neural network model based on the corrected error value and the undetermined weight parameters comprises:
    and taking the quantization weight parameter as a model weight parameter of the neural network model.
  7. The method of claim 6, wherein determining a model weight parameter for the neural network model based on the revised error value and the pending weight parameter when the iterative training does not satisfy the end condition comprises:
    and adjusting the undetermined weight parameters layer by layer for the network layer of the neural network model by adopting a back propagation algorithm according to the corrected error value until reaching the input layer of the neural network model so as to obtain the adjusted weight parameters of the neural network model.
  8. The method of claim 7, wherein the undetermined weight parameters of the neural network model are adjusted according to the following formula:
    w1_k = w0_k - β × ∂E1/∂w0_k
    where w0_k denotes the kth pending weight parameter, w1_k denotes the kth adjusted weight parameter, and β is a positive constant.
  9. The method according to any one of claims 6 to 8, wherein, for an Nth training period in the iterative training, N is an integer greater than 1, M is a positive integer less than N, and the end condition comprises one or more of the following conditions in combination:
    the original error value in the Nth training period is smaller than a preset first threshold value;
    the corrected error value in the Nth training period is smaller than a preset second threshold value;
    the difference between the original error value in the Nth training period and the original error value in the (N-M)th training period is smaller than a preset third threshold;
    the difference between the corrected error value in the Nth training period and the corrected error value in the (N-M)th training period is smaller than a preset fourth threshold;
    the difference between the undetermined weight parameter in the Nth training period and the undetermined weight parameter in the (N-M)th training period is smaller than a preset fifth threshold; and
    n is larger than a preset sixth threshold value.
  10. The method according to claim 9, characterized in that, when the Nth training period does not satisfy the end condition, a combination of one or more of the following physical quantities is stored:
    an original error value in the Nth training period;
    a corrected error value in the Nth training period;
    a pending weight parameter in the Nth training period; and
    a period number N of the Nth training period.
  11. The method of any of claims 6 to 10, wherein said obtaining said pending weight parameter comprises:
    during a first training period of the iterative training, taking a preset initial weight parameter as the undetermined weight parameter;
    and when the iterative training is not in the first training period, taking the adjusted weight parameter of the neural network model as the undetermined weight parameter.
  12. The method of any one of claims 1 to 11, wherein the neural network model is used for image recognition;
    correspondingly, the sample data comprises an image sample;
    correspondingly, the output result comprises a recognition result of the image recognition which is characterized in a probability form.
  13. The method of any one of claims 1 to 11, wherein the neural network model is used for voice recognition;
    correspondingly, the sample data comprises a sound sample;
    correspondingly, the output result comprises a recognition result of the voice recognition which is characterized in a probability form.
  14. The method according to any one of claims 1 to 11, wherein the neural network model is used for super-resolution image acquisition;
    correspondingly, the sample data comprises an image sample;
    correspondingly, the output result comprises the pixel value of the super-resolution processed image.
  15. An apparatus for determining weight parameters of a neural network model, comprising:
    the forward propagation module is used for processing the sample data based on undetermined weight parameters of the neural network model to obtain an output result;
    a comparison module, configured to calculate an original error value between the output result and a preset expected result, where the original error value is a numerical representation of a difference between the output result and the expected result;
    the correction module is used for correcting the original error value based on the correction value so as to obtain a corrected error value;
    a determining module for determining a model weight parameter of the neural network model based on the corrected error value and the undetermined weight parameter;
    wherein the correction value is obtained according to the following formula:
    R = (w_k - Q(w_k)) × Q(w_k)
    where R represents the correction value, w_k represents the kth pending weight parameter of the neural network model, Q(w_k) represents the quantized value of the kth pending weight parameter, and k is a non-negative integer.
  16. The apparatus of claim 15, wherein the corrected error value is obtained according to the following formula:
    E1 = E0 + α × Σ_k F((w_k - Q(w_k)) × Q(w_k)) (k = 0, 1, …, m-1)
    where E1 represents the corrected error value, E0 represents the original error value, α is a constant, m is the total number of pending weight parameters used for processing the sample data, F((w_k - Q(w_k)) × Q(w_k)) represents a function with the correction value as its argument, and m is a positive integer.
  17. The apparatus according to claim 16, wherein the function with the correction value as an argument calculates an absolute value of the correction value;
    correspondingly, the corrected error value is obtained according to the following formula:
    E1 = E0 + α × Σ_k |(w_k - Q(w_k)) × Q(w_k)| (k = 0, 1, …, m-1)
    where |(w_k - Q(w_k)) × Q(w_k)| denotes the absolute value of (w_k - Q(w_k)) × Q(w_k).
  18. The apparatus of any of claims 15 to 17, wherein the neural network model comprises p network layers, each of the network layers comprising q of the pending weight parameters, the kth pending weight parameter being the jth pending weight parameter of the ith network layer in the neural network model;
    correspondingly, the corrected error value is obtained according to the following formula:
    E1 = E0 + α × Σ_i Σ_j F((w_ij - Q(w_ij)) × Q(w_ij)) (i = 0, 1, …, p-1; j = 0, 1, …, q-1)
    wherein p and q are positive integers and i and j are non-negative integers.
  19. The apparatus according to any of claims 15 to 18, wherein the forward propagation module is specifically configured to:
    obtaining the undetermined weight parameter;
    quantizing the obtained undetermined weight parameter to obtain a quantized weight parameter, wherein the quantized weight parameter is a quantized value of the undetermined weight parameter;
    taking the quantization weight parameter as a model weight parameter of the neural network model, and processing the sample data by adopting a forward propagation algorithm;
    obtaining the output result from an output layer of the neural network model.
  20. The apparatus according to claim 19, wherein the model weight parameters of the neural network model are obtained by iterative training, and when the iterative training satisfies an end condition, the determining module is specifically configured to:
    and taking the quantization weight parameter as a model weight parameter of the neural network model.
  21. The device of claim 20, further comprising a back propagation module which, when the iterative training does not satisfy the end condition, is specifically configured to:
    and adjusting the undetermined weight parameters layer by layer for the network layer of the neural network model by adopting a back propagation algorithm according to the corrected error value until reaching the input layer of the neural network model so as to obtain the adjusted weight parameters of the neural network model.
  22. The apparatus of claim 21, wherein the pending weight parameters of the neural network model are adjusted according to the following formula:
    w1_k = w0_k - β × ∂E1/∂w0_k
    where w0_k denotes the kth pending weight parameter, w1_k denotes the kth adjusted weight parameter, and β is a positive constant.
  23. The apparatus according to any one of claims 20 to 22, wherein, for an Nth training period in the iterative training, N is an integer greater than 1, M is a positive integer less than N, and the end condition comprises one or more of the following conditions in combination:
    the original error value in the Nth training period is smaller than a preset first threshold value;
    the corrected error value in the Nth training period is smaller than a preset second threshold value;
    the difference between the original error value in the Nth training period and the original error value in the (N-M)th training period is smaller than a preset third threshold;
    the difference between the corrected error value in the Nth training period and the corrected error value in the (N-M)th training period is smaller than a preset fourth threshold;
    the difference between the undetermined weight parameter in the Nth training period and the undetermined weight parameter in the (N-M)th training period is smaller than a preset fifth threshold; and
    n is larger than a preset sixth threshold value.
  24. The apparatus according to claim 23, characterized in that, when the Nth training period does not satisfy the end condition, a combination of one or more of the following physical quantities is stored:
    an original error value in the Nth training period;
    a corrected error value in the Nth training period;
    a pending weight parameter in the Nth training period; and
    a period number N of the Nth training period.
  25. The apparatus according to any of claims 20 to 24, wherein the forward propagation module is specifically configured to:
    during a first training period of the iterative training, taking a preset initial weight parameter as the undetermined weight parameter;
    and when the iterative training is not in the first training period, taking the adjusted weight parameter of the neural network model as the undetermined weight parameter.
  26. The apparatus of any one of claims 15 to 25, wherein the neural network model is used for image recognition;
    correspondingly, the sample data comprises an image sample;
    correspondingly, the output result comprises a recognition result of the image recognition which is characterized in a probability form.
  27. The apparatus of any one of claims 15 to 25, wherein the neural network model is used for voice recognition;
    correspondingly, the sample data comprises a sound sample;
    correspondingly, the output result comprises a recognition result of the voice recognition which is characterized in a probability form.
  28. The apparatus according to any one of claims 15 to 25, wherein the neural network model is used for super-resolution image acquisition;
    correspondingly, the sample data comprises an image sample;
    correspondingly, the output result comprises the pixel value of the super-resolution processed image.
  29. An electronic device, comprising: one or more processors and one or more memories;
    the one or more memories coupled to the one or more processors for storing computer program code comprising computer instructions which, when executed by the one or more processors, cause the electronic device to perform the method of determining neural network model weight parameters of any of claims 1-14.
  30. A computer storage medium comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the method of determining neural network model weight parameters of any one of claims 1 to 14.
  31. A computer program product, characterized in that it, when run on a computer, causes the computer to carry out the method of determination of neural network model weight parameters according to any one of claims 1 to 14.
CN201880092139.XA 2018-06-15 2018-06-15 Method and equipment for determining weight parameters of neural network model Pending CN111937011A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/091652 WO2019237357A1 (en) 2018-06-15 2018-06-15 Method and device for determining weight parameters of neural network model

Publications (1)

Publication Number Publication Date
CN111937011A true CN111937011A (en) 2020-11-13

Family

ID=68841775

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880092139.XA Pending CN111937011A (en) 2018-06-15 2018-06-15 Method and equipment for determining weight parameters of neural network model

Country Status (2)

Country Link
CN (1) CN111937011A (en)
WO (1) WO2019237357A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836435A (en) * 2021-03-02 2021-05-25 上海交通大学 Coarse grid numerical simulation result correction method and device and electronic equipment
CN115983506A (en) * 2023-03-20 2023-04-18 华东交通大学 Water quality early warning method and system and readable storage medium
WO2023202484A1 (en) * 2022-04-19 2023-10-26 华为技术有限公司 Neural network model repair method and related device
CN118075418A (en) * 2024-04-25 2024-05-24 深圳市慧明捷科技有限公司 Video conference content output optimization method, device, equipment and storage medium thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6269351B1 (en) * 1999-03-31 2001-07-31 Dryken Technologies, Inc. Method and system for training an artificial neural network
US20040260662A1 (en) * 2003-06-20 2004-12-23 Carl Staelin Neural network trained with spatial errors
CN106062786A (en) * 2014-09-12 2016-10-26 微软技术许可有限责任公司 Computing system for training neural networks
CN106096723A (en) * 2016-05-27 2016-11-09 北京航空航天大学 A kind of based on hybrid neural networks algorithm for complex industrial properties of product appraisal procedure
CN108009640A (en) * 2017-12-25 2018-05-08 清华大学 The training device and its training method of neutral net based on memristor
US20180144246A1 (en) * 2016-11-16 2018-05-24 Indian Institute Of Technology Delhi Neural Network Classifier

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104899641B (en) * 2015-05-25 2018-07-13 杭州朗和科技有限公司 Deep neural network learning method, processor and deep neural network learning system
CN106372724A (en) * 2016-08-31 2017-02-01 西安西拓电气股份有限公司 Artificial neural network algorithm
CN106951960A (en) * 2017-03-02 2017-07-14 平顶山学院 A kind of learning method of neutral net and the neutral net

Also Published As

Publication number Publication date
WO2019237357A1 (en) 2019-12-19

Similar Documents

Publication Publication Date Title
CN111937011A (en) Method and equipment for determining weight parameters of neural network model
CN108345939A (en) Neural network based on fixed-point calculation
WO2019238029A1 (en) Convolutional neural network system, and method for quantifying convolutional neural network
US20240005210A1 (en) Data protection method, apparatus, medium and device
US20210158088A1 (en) Image processing method and apparatus, computer device, and computer storage medium
CN109242092B (en) Image processing method and device, electronic equipment and storage medium
US9047566B2 (en) Quadratic regularization for neural network with skip-layer connections
CN113238989A (en) Apparatus, method and computer-readable storage medium for quantizing data
WO2022111002A1 (en) Method and apparatus for training neural network, and computer readable storage medium
CN112085175A (en) Data processing method and device based on neural network calculation
US11809995B2 (en) Information processing device and method, and recording medium for determining a variable data type for a neural network
CN117351299A (en) Image generation and model training method, device, equipment and storage medium
CN115983324A (en) Neural network quantization method and device and electronic equipment
TWI732467B (en) Method of training sparse connected neural network
US20220399946A1 (en) Selection of physics-specific model for determination of characteristics of radio frequency signal propagation
CN113157453B (en) Task complexity-based high-energy-efficiency target detection task dynamic scheduling method
CN114580625A (en) Method, apparatus, and computer-readable storage medium for training neural network
CN116472538A (en) Method and system for quantifying neural networks
CN114511069A (en) Method and system for improving performance of low bit quantization model
CN113238988A (en) Processing system, integrated circuit and board card for optimizing parameters of deep neural network
CN114580624A (en) Method, apparatus, and computer-readable storage medium for training neural network
CN113238975A (en) Memory, integrated circuit and board card for optimizing parameters of deep neural network
CN114692865A (en) Neural network quantitative training method and device and related products
CN111814955A (en) Method and apparatus for quantizing neural network model, and computer storage medium
CN113705801A (en) Training device and method of neural network model and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination