CN111639745B - Data processing method and device

Info

Publication number
CN111639745B
Authority
CN
China
Prior art keywords
neural network
trained
quantization
loss
sample
Prior art date
Legal status
Active
Application number
CN202010404466.3A
Other languages
Chinese (zh)
Other versions
CN111639745A (en)
Inventor
刘宇达
申浩
王赛
王子为
鲁继文
周杰
Current Assignee
Tsinghua University
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Tsinghua University
Beijing Sankuai Online Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Tsinghua University and Beijing Sankuai Online Technology Co Ltd
Priority to CN202010404466.3A
Publication of CN111639745A
Application granted
Publication of CN111639745B
Status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10: File systems; File servers
    • G06F 16/17: Details of further file system functions
    • G06F 16/174: Redundancy elimination performed by the file system
    • G06F 16/1744: Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

This specification discloses a data processing method and apparatus. A second quantization model is trained so that the trained second quantization model has a quantization effect equivalent to that of a first quantization model; and because the trained second quantization model quantizes at least some of the parameters of the neural network while preserving gradients, the loss obtained during the adjustment (training) of the neural network retains a usable gradient. This avoids the phenomenon in which no gradient can be obtained, due to quantization, while the neural network is being adjusted. In addition, the method and apparatus of this specification can obtain the true loss of the neural network during the adjustment, so the adjusted neural network obtained from that true loss also has good data processing capability, which helps guarantee the accuracy of the data processing results.

Description

Data processing method and device
Technical Field
The present application relates to the field of internet technologies, and in particular, to a data processing method and apparatus.
Background
Artificial intelligence technology has been widely developed and applied in recent years, and research on and application of various neural network technologies have become technical hotspots. For example, in autonomous driving, Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) are widely used for perception computations such as the detection of vehicles, pedestrians, and traffic lights.
To process the many kinds of information in a traffic environment with high quality, a neural network used for information processing is often complex; concretely, it has many parameters and its data processing is computationally expensive, so the storage space and computational cost it requires are high. If such a neural network is deployed in a hardware environment with very limited storage and computing capacity but high requirements on data processing quality, such as an unmanned vehicle or an intelligent terminal, the neural network's performance is severely constrained, the use of its data processing results is affected, and in turn the user experience suffers.
Therefore, how to effectively compress at least some of the parameters of a neural network and reduce the complexity of its data processing, without harming the performance of the neural network, has become a problem to be solved.
Disclosure of Invention
The embodiments of this specification provide a data processing method and apparatus to partially solve the above problems in the prior art.
The embodiment of the specification adopts the following technical scheme:
A data processing method provided in this specification comprises:
determining a pre-trained neural network, a preset sample, and a first quantization model;
quantizing at least part of the parameters of the pre-trained neural network with a second quantization model to be trained to obtain a first to-be-stabilized neural network, the second quantization model to be trained being generated from the first quantization model;
inputting the sample into the first to-be-stabilized neural network to obtain the processing result output by the first to-be-stabilized neural network as an intermediate result;
determining the loss of the first to-be-stabilized neural network as an intermediate loss according to the intermediate result and the label corresponding to the sample;
training the second quantization model to be trained, with minimizing the intermediate loss and maximizing the gradient of the intermediate loss as training targets, to obtain a trained second quantization model;
quantizing at least part of the parameters of the pre-trained neural network with the trained second quantization model to obtain a second undetermined neural network;
adjusting the second undetermined neural network according to the preset sample, with minimizing the loss incurred when the second undetermined neural network processes the sample as the training target, to obtain a third undetermined neural network;
when data to be processed needs to be processed, quantizing the third undetermined neural network with the first quantization model to obtain a quantized neural network, and inputting the data to be processed into the quantized neural network to obtain the processing result of the data to be processed.
Optionally, the second quantization model includes a first quantization sub-model for quantizing the weights of the neural network;
and quantizing at least part of the parameters of the pre-trained neural network with the second quantization model to be trained to obtain the first to-be-stabilized neural network specifically includes:
quantizing the weights of the pre-trained neural network with the first quantization sub-model to be trained to obtain a neural network with quantized weights, which serves as the first to-be-stabilized neural network.
Optionally, the second quantization model includes a second quantization sub-model for quantizing at least part of the activation values generated by the neural network;
and inputting the sample into the first to-be-stabilized neural network to obtain the processing result output by the first to-be-stabilized neural network specifically includes:
inputting the sample into the first to-be-stabilized neural network;
for each layer of the first to-be-stabilized neural network, quantizing the activation values output by that layer with the second quantization sub-model to obtain the quantized activation values output by that layer;
and obtaining the processing result output by the first to-be-stabilized neural network from the quantized activation values output by each layer.
Optionally, training the second quantization model to be trained, with minimizing the intermediate loss and maximizing the gradient of the intermediate loss as training targets, to obtain the trained second quantization model specifically includes:
inputting the sample into the pre-trained neural network to obtain the processing result output by the pre-trained neural network as a reference result;
determining the reference loss incurred when the pre-trained neural network processes the sample according to the reference result and the label corresponding to the sample;
determining the quantization loss caused by quantizing the pre-trained neural network with the second quantization model to be trained according to the reference loss and the intermediate loss;
and training the second quantization model to be trained, with minimizing the quantization loss and maximizing the gradient of the intermediate loss as training targets, to obtain the trained second quantization model.
Optionally, determining the quantization loss caused by quantizing the pre-trained neural network with the second quantization model to be trained according to the reference loss and the intermediate loss specifically includes:
determining the gradient of the reference loss from the reference loss;
determining the difference between the gradient of the reference loss and the gradient of the intermediate loss;
and determining the quantization loss caused by quantizing the neural network with the second quantization model to be trained based on the difference and the gradient of the intermediate loss, wherein the quantization loss is positively correlated with the difference and negatively correlated with the gradient of the intermediate loss.
Optionally, adjusting the second undetermined neural network according to the preset sample, with minimizing the loss incurred when the second undetermined neural network processes the sample as the training target, to obtain the third undetermined neural network specifically includes:
inputting the preset sample into the second undetermined neural network to obtain the processing result output by the second undetermined neural network;
determining the loss incurred when the second undetermined neural network processes the sample according to the processing result output by the second undetermined neural network and the label corresponding to the sample;
adjusting the trained second quantization model and the second undetermined neural network, with minimizing the loss incurred when the second undetermined neural network processes the sample as the target, to obtain an intermediate second quantization model and an intermediate second undetermined neural network;
judging whether the intermediate second undetermined neural network meets a preset condition;
if so, determining the intermediate second undetermined neural network as the third undetermined neural network;
and if not, quantizing at least part of the parameters of the intermediate second undetermined neural network with the intermediate second quantization model to obtain a quantized intermediate second undetermined neural network, and continuing to train the intermediate second quantization model and the intermediate second undetermined neural network according to the quantized intermediate second undetermined neural network.
Optionally, the pre-trained neural network is obtained as follows:
acquiring a neural network to be trained and pre-training samples for pre-training the neural network;
quantizing at least part of the parameters of the neural network to be trained with the first quantization model to obtain a quantized neural network to be trained;
and training the quantized neural network to be trained according to the pre-training samples to obtain the pre-trained neural network;
and determining the pre-trained neural network and the preset sample specifically includes:
determining the neural network obtained by this pre-training as the pre-trained neural network, and determining the pre-training samples as the preset sample.
A data processing apparatus provided in this specification includes:
a preparation module, configured to determine a pre-trained neural network, a preset sample, and a first quantization model;
a first to-be-stabilized neural network determining module, configured to quantize at least part of the parameters of the pre-trained neural network with a second quantization model to be trained to obtain a first to-be-stabilized neural network, the second quantization model to be trained being generated from the first quantization model;
an intermediate result determining module, configured to input the sample into the first to-be-stabilized neural network to obtain the processing result output by the first to-be-stabilized neural network as an intermediate result;
an intermediate loss determining module, configured to determine the loss of the first to-be-stabilized neural network as an intermediate loss according to the intermediate result and the label corresponding to the sample;
a second quantization model training module, configured to train the second quantization model to be trained, with minimizing the intermediate loss and maximizing the gradient of the intermediate loss as training targets, to obtain a trained second quantization model;
a second undetermined neural network determining module, configured to quantize at least part of the parameters of the pre-trained neural network with the trained second quantization model to obtain a second undetermined neural network;
a third undetermined neural network determining module, configured to adjust the second undetermined neural network according to the preset sample, with minimizing the loss incurred when the second undetermined neural network processes the sample as the training target, to obtain a third undetermined neural network;
and a data processing module, configured to, when data to be processed needs to be processed, quantize the third undetermined neural network with the first quantization model to obtain a quantized neural network, and input the data to be processed into the quantized neural network to obtain the processing result of the data to be processed.
A computer readable storage medium is provided in the present specification, the storage medium storing a computer program which, when executed by a processor, implements a method of data processing as described above.
An electronic device provided in the present specification includes a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing a method of data processing as described above when executing the program.
The at least one technical solution adopted in the embodiments of this specification can achieve the following beneficial effects:
In the data processing method and apparatus of the embodiments of this specification, a second quantization model is trained, and the trained second quantization model has a quantization effect equivalent to that of the first quantization model; and because the trained second quantization model quantizes at least some of the parameters of the neural network while preserving gradients, the loss obtained during the adjustment (training) of the neural network retains a usable gradient. This avoids the phenomenon in which no gradient can be obtained, due to quantization, while the neural network is being adjusted. In addition, the method and apparatus of this specification can obtain the true loss of the neural network during the adjustment, so the adjusted neural network obtained from that true loss also has good data processing capability, which helps guarantee the accuracy of the data processing results. That is, the quantized neural network obtained by the method and apparatus of this specification is effectively compressed in terms of parameters and data processing complexity, with no obvious harm to its data processing capability or effect. The data processing method and apparatus of this specification are therefore well suited to hardware environments with limited storage space and high requirements on computing performance and quality, such as unmanned vehicles and intelligent terminals.
Drawings
The accompanying drawings, which are included to provide a further understanding of this specification and constitute a part of it, illustrate the exemplary embodiments of this specification and, together with their description, explain this specification without unduly limiting it. In the drawings:
FIG. 1 is a diagram illustrating a data processing process according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a data processing procedure according to an embodiment of the present disclosure;
FIG. 3a is a schematic diagram of a first training process for a neural network according to an embodiment of the present disclosure;
FIG. 3b is a schematic diagram of a first training process for a second quantization model according to an embodiment of the present disclosure;
FIG. 3c is a schematic diagram of a second training procedure for a neural network according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a data processing apparatus according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of part of an electronic device corresponding to FIG. 1 according to an embodiment of the present disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the disclosure, are intended to be within the scope of the disclosure herein.
The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.
Fig. 1 is a process of data processing according to an embodiment of the present disclosure, which may specifically include the following steps:
s100: a pre-trained neural network, a pre-set sample, and a first quantization model are determined.
The first quantization model in this specification is used to quantize at least some of the parameters of a neural network. The parameters of the neural network may be at least one of the weights of the pre-trained neural network and the activation values generated when the pre-trained neural network performs data processing. The quantization of a parameter may be binarization (e.g., quantizing the parameter to "1" or "-1") or ternarization (e.g., quantizing the parameter to "1", "0", or "-1").
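For illustration only, a minimal Python (PyTorch) sketch of such binarization and ternarization operations follows; the ternarization threshold `delta` is an assumption, since the specification does not fix one:

```python
import torch

def binarize(w: torch.Tensor) -> torch.Tensor:
    # Quantize every parameter to "1" or "-1" according to its sign.
    # torch.sign maps 0 to 0, so zeros are pushed to +1 here.
    return torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))

def ternarize(w: torch.Tensor, delta: float = 0.05) -> torch.Tensor:
    # Quantize every parameter to "1", "0" or "-1"; values with
    # |w| <= delta are zeroed. The threshold delta is illustrative.
    return torch.sign(w) * (w.abs() > delta).float()
```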
Optionally, the neural network is a convolutional neural network. The convolutional neural network may include several layers, and the first quantization model may separately quantize at least some of the parameters of the convolutional neural network.
The data processing process in this specification can be applied in the field of unmanned driving. For example, it may be applied in an unmanned vehicle or in road infrastructure (e.g., a monitoring facility deployed in the road environment).
Taking as an example the scenario in which an unmanned vehicle determines a driving strategy from collected images, the pre-trained neural network can be used to perform a multi-classification task; the preset sample may be an image historically collected from a traffic environment, and the label corresponding to the sample may be the classification result of each dynamic obstacle in the image. Alternatively, in this scenario, the pre-trained neural network may be used to generate a driving strategy, the preset sample may be the motion states of the obstacles in the environment of the unmanned vehicle, and the label corresponding to the sample may be the driving strategy.
S102: quantizing at least part of parameters of the pre-trained neural network by adopting a second quantization model to be trained to obtain a first neural network to be stabilized; the second quantization model to be trained is generated from the first quantization model.
After the first quantization model is determined, a second quantization model to be trained may be further obtained according to the first quantization model. The second quantization model to be trained may also be used to quantize at least some of the parameters of the neural network.
The quantization effect of the second quantization model to be trained on the parameters is similar to that of the first quantization model, but its processing of the parameters is not a binarization or ternarization: at least some of the parameters of the first to-be-stabilized neural network quantized by the second quantization model to be trained may still be floating-point numbers. When the neural network is quantized with the second quantization model, the true gradient of the loss incurred when the neural network processes data can therefore still be obtained.
Specifically, at least part of the parameters of the pre-trained neural network may be input into the second quantization model to be trained to obtain the quantized parameters it outputs, and the quantized parameters are then used as the parameters of the pre-trained neural network to obtain the first to-be-stabilized neural network.
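As a minimal sketch of such a gradient-preserving quantizer, assuming a scaled tanh as the smooth surrogate for the sign function (the specification leaves the concrete quantization equations open; see formulas (7) to (9) and (11) to (13) below):

```python
import torch
import torch.nn as nn

class SoftSign(nn.Module):
    # An assumed second-quantization-model unit: a smooth surrogate
    # for sign(w). Outputs remain floating point, approach +/-1 as
    # alpha grows, and keep a nonzero, true gradient everywhere.
    def __init__(self, alpha: float = 3.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha))  # trainable

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.alpha * w)
```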
S104: and inputting the sample into the first to-be-stabilized neural network to obtain a processing result output by the first to-be-stabilized neural network as an intermediate result.
Optionally, if the quantization range of the second quantization model to be trained includes the activation values generated by the pre-trained neural network, then in this step the second quantization model to be trained may quantize at least part of the activation values generated by the first to-be-stabilized neural network while the first to-be-stabilized neural network performs data processing.
S106: and determining the loss of the first to-be-stabilized neural network as an intermediate loss according to the intermediate result and the label corresponding to the sample.
Because the parameters of the neural network are quantized, the data processing capability of the neural network is necessarily affected to some extent; the data processing capability of the first to-be-stabilized neural network in this specification is likewise affected by the quantization of the second quantization model to be trained.
The intermediate loss of the first to-be-stabilized neural network obtained through this step may therefore include: the loss caused by the data processing capability of the pre-trained neural network itself, and the loss caused by the quantization of the second quantization model to be trained.
S108: and training the second quantization model to be trained by taking the minimum intermediate loss and the maximum gradient of the intermediate loss as training targets to obtain a trained second quantization model.
In the training process for the second quantization model, minimizing the intermediate loss is therefore used as a training target while at least part of the parameters of the second quantization model are adjusted. The trained second quantization model obtained in this way at least reduces the negative influence of quantization on the pre-trained neural network, i.e., the loss caused by the quantization of the second quantization model to be trained.
Furthermore, the training targets for the second quantization model in this specification also include maximizing the gradient of the intermediate loss. In methods that train a neural network by back-propagating gradients, whether the gradient of the loss is true, and whether it is effective (the magnitude of the gradient is positively correlated with its effectiveness), largely determines the training effect. The process in this specification can not only avoid the phenomenon in which no true gradient can be determined, as caused by binarization or ternarization, but can also increase the effectiveness of the gradient through the trained second quantization model, thereby improving the effect of the subsequent adjustment of the pre-trained neural network.
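The point about gradient validity can be checked directly. In the following small experiment (illustrative, not part of the specification), the hard sign function yields an all-zero gradient while a smooth surrogate does not:

```python
import torch

w = torch.linspace(-1.0, 1.0, 5, requires_grad=True)

torch.sign(w).sum().backward()
print(w.grad)   # all zeros: hard binarization kills the gradient

w.grad = None
torch.tanh(3.0 * w).sum().backward()
print(w.grad)   # nonzero everywhere: a usable training signal
```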
S110: and quantifying at least part of parameters of the pre-trained neural network by adopting the trained second quantification model to obtain a second undetermined neural network.
Specifically, for at least part of parameters of the pre-trained neural network, the parameters may be input into a trained second quantization model, so as to obtain quantized parameters output by the trained second quantization model. And taking the obtained quantized parameters as parameters of the pre-trained neural network to obtain a second undetermined neural network.
The parameters quantized by the trained second quantization model in this step may be at least one of the weights of the pre-trained neural network and the activation values generated when the second undetermined neural network performs data processing.
Optionally, if the quantization range of the trained second quantization model includes the activation values generated by the pre-trained neural network, then in this step the trained second quantization model may quantize at least part of the activation values generated by the second undetermined neural network while the second undetermined neural network performs data processing.
In addition, the second undetermined neural network obtained in this step still retains all the parameters of the pre-trained neural network; when the second undetermined neural network performs forward propagation, the corresponding computations are carried out with its quantized parameters.
S112: and according to the preset sample, the second undetermined neural network is adjusted to obtain a third undetermined neural network by taking the minimization of loss when the second undetermined neural network is adopted to process the sample as a training target.
The trained second quantization model obtained through the preceding training steps has good quantization capability, which is embodied in the following: the trained second quantization model is closer to the first quantization model in its quantization effect on the parameters; the loss caused by its quantization of the pre-trained neural network is lower; and when it quantizes the pre-trained neural network, the loss incurred when the pre-trained neural network processes data is guaranteed to have an effective, true gradient.
The adjustment (training) of the second undetermined neural network in this step therefore, on the one hand, reduces the negative influence caused by the quantization of the trained second quantization model; on the other hand, the trained second quantization model does not cause missing or severely distorted gradients for the second undetermined neural network during the adjustment, so the adjustment in this step can yield a third undetermined neural network with good data processing capability. Moreover, when the second undetermined neural network is adjusted through this step, it also converges faster.
Specifically, the adjustment may proceed as follows: input the preset sample into the second undetermined neural network to obtain the processing result it outputs, and determine the loss of the second undetermined neural network from that processing result.
Then judge whether the loss of the second undetermined neural network meets a preset condition. If so, take the second undetermined neural network as the third undetermined neural network. If not, adjust at least part of the parameters of the second undetermined neural network according to the determined loss, with reducing that loss as the aim (the object of the adjustment is the retained parameters of the pre-trained neural network, not the quantized parameters); then quantize at least part of the parameters of the adjusted second undetermined neural network with the trained second quantization model, repeating until the loss of the second undetermined neural network meets the preset condition.
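One way to realize "adjust the retained full-precision parameters while computing with their quantized values" is PyTorch's weight-parametrization mechanism; the sketch below reuses the assumed tanh surrogate and is not the specification's prescribed implementation:

```python
import torch
import torch.nn as nn
from torch.nn.utils import parametrize

class SoftSign(nn.Module):
    def __init__(self, alpha: float = 3.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(alpha))
    def forward(self, w):
        return torch.tanh(self.alpha * w)

net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
for m in net:
    if isinstance(m, nn.Linear):
        # The layer keeps its full-precision weight; every forward
        # pass uses the quantized value, and gradients flow back to
        # the full-precision copy (the object of adjustment here).
        parametrize.register_parametrization(m, "weight", SoftSign())

opt = torch.optim.SGD(net.parameters(), lr=1e-3)
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
loss = nn.CrossEntropyLoss()(net(x), y)
opt.zero_grad()
loss.backward()
opt.step()   # updates the full-precision weights (and alpha)
```

Restricting the optimizer to the network's original weights, rather than also including each SoftSign's alpha, would match the variant in which only the network is adjusted.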
S114: when the data to be processed is required to be processed, the first quantization model is adopted to quantize the third to-be-determined neural network to obtain a quantized neural network, and the data to be processed is input into the quantized neural network to obtain a processing result of the data to be processed.
The parameters quantized by the first quantization model in this step may be at least one of the weights of the third undetermined neural network and the activation values generated when the quantized neural network performs data processing.
The process of quantizing the third undetermined neural network with the first quantization model to obtain the quantized neural network may specifically be: input at least part of the parameters of the third undetermined neural network into the first quantization model to obtain the quantized parameters output by the first quantization model, and take the quantized parameters as the parameters of the quantized neural network.
Taking the scenario that the unmanned vehicle determines the driving strategy according to the collected image as an example, the data to be processed may be the collected environmental image, and the processing result of the data to be processed may be the classification result of each dynamic obstacle in the collected environmental image. Or, the data to be processed may be a motion state of each obstacle in the environment where the unmanned vehicle is located, and the processing result of the data to be processed may be a driving policy.
The third undetermined neural network obtained through the above steps has good data processing capability, and at least some of its parameters may still be floating-point numbers.
Because floating-point computation consumes many resources and has high complexity when used online, at least part of the parameters of the third undetermined neural network are quantized with the first quantization model to obtain the quantized neural network. Compared with the third undetermined neural network, the quantized neural network has more compact parameters, higher data processing efficiency, and lower consumption of computing resources.
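A sketch of this deployment-time hardening, assuming the first quantization model binarizes by sign as in the earlier examples:

```python
import torch

@torch.no_grad()
def harden_for_deployment(net: torch.nn.Module) -> torch.nn.Module:
    # Replace each retained full-precision parameter of the third
    # undetermined neural network with its binarized value, so that
    # online inference runs on {+1, -1} parameters only.
    for p in net.parameters():
        p.copy_(torch.where(p >= 0, torch.ones_like(p), -torch.ones_like(p)))
    return net
```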
The procedure of the data processing described in the present specification is described in detail below.
As is clear from the foregoing, to achieve the technical objective of this specification, implementing the technical solution involves, besides the design of the quantization models (including but not limited to the second quantization model), training/adjusting the neural network and the quantization models at least once.
"Training" and "design" are described separately below.
1. At least part of the training process referred to in the data processing process of the present specification.
At least some of the training processes that may be involved are now described in one possible order of execution.
It should be noted that none of the training processes described below is necessarily required by the data processing method in this specification, and the order in which they are described does not limit the order in which they may be executed.
1) First training (pre-training) for the neural network.
The pre-trained neural network obtained through this process has a certain data processing capability, which gives the subsequent training processes a degree of directionality and improves their training efficiency. This first training of the neural network may be performed before step S100.
Specifically, as shown in fig. 2 and 3a, the pre-training process may be:
(1) obtaining a neural network to be trained and a pre-training sample for pre-training the neural network.
(2) And quantizing at least part of parameters of the neural network to be trained by adopting the first quantization model to obtain the quantized neural network to be trained.
The parameter quantized by the first quantization model in this step may be at least one of a weight of the neural network to be trained and an activation value generated when the quantized neural network to be trained performs data processing.
Optionally, as shown in fig. 3a, this pre-training quantizes both the weights of each layer (e.g., w_k and w_{k+1}) and the activation values output by each layer (e.g., a_k and a_{k+1}).
(3) And inputting each pre-training sample into the neural network to be trained to obtain the output of the neural network to be trained.
(4) And determining the loss of the neural network to be trained according to the output of the neural network to be trained and the label corresponding to the pre-training sample.
(5) And adjusting at least part of parameters of the neural network to be trained with the aim of minimizing the loss of the neural network to be trained, so as to obtain the adjusted neural network to be trained.
In this step, at least part of the parameters of the neural network to be trained may be adjusted by back-propagating the gradient of the loss of the neural network to be trained.
In an alternative embodiment of this specification, the first quantization model may be a sign function, and the gradient of the loss of the neural network to be trained may be obtained by straight-through estimation (Straight-Through Estimator, STE); a sketch of such an estimator is given at the end of this subsection.
(6) Judge whether the adjusted neural network to be trained meets a preset pre-training condition. If so, determine the adjusted neural network to be trained as the pre-trained neural network. If not, quantize at least part of the parameters of the adjusted neural network to be trained with the first quantization model, and continue to train the quantized, adjusted neural network to be trained according to the pre-training samples until it meets the preset pre-training condition.
Optionally, the pre-training sample used for pre-training the neural network in this step may be used as a pre-set sample for other training processes related to this specification.
Furthermore, this pre-training process for the neural network is not a required part of this specification.
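A minimal sketch of such a straight-through estimator, assuming the sign function as the first quantization model; the gradient-clipping window is a common STE variant, not mandated by the specification:

```python
import torch

class SignSTE(torch.autograd.Function):
    # Forward: hard sign quantization. Backward: pass the incoming
    # gradient straight through, clipped to the region |w| <= 1.
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        return grad_output * (w.abs() <= 1).float()

w = torch.randn(4, requires_grad=True)
SignSTE.apply(w).sum().backward()   # w.grad is 1 wherever |w| <= 1
```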
2) The first training for the second quantization model.
Corresponding to step S108, the training performed on the second quantization model determines, to a large extent, the effectiveness of the finally obtained quantized neural network. As shown in fig. 2 and 3b, the process of training the second quantization model in this specification may be:
(1) and inputting the sample into the pre-trained neural network to obtain a processing result output by the pre-trained neural network as a reference result.
(2) And determining the reference loss when the pre-trained neural network processes the sample according to the reference result and the label corresponding to the sample.
The pre-trained neural network is not quantized at this time, and the reference loss can represent the original data processing capacity of the pre-trained neural network before quantization.
(3) And determining quantization loss caused by quantizing the pre-trained neural network through the second quantization model to be trained according to the reference loss and the intermediate loss.
Optionally, as shown in fig. 3b, this first training process for the second quantization model quantizes both the weights of each layer (e.g., w_k and w_{k+1}) and the activation values output by each layer (e.g., a_k and a_{k+1}).
As noted above, the intermediate loss obtained in step S106 comprises two parts: the loss caused by the data processing capability of the pre-trained neural network itself (i.e., the reference loss), and the loss caused by the quantization of the second quantization model to be trained (i.e., the quantization loss). The quantization loss can therefore be calculated from the intermediate loss and the reference loss.
In an alternative embodiment of the present disclosure, the process of determining quantization loss may be: and determining the gradient of the reference loss according to the reference loss. From the resulting intermediate losses, a gradient of the intermediate losses is determined. Then, a difference between the gradient of the reference loss and the gradient of the intermediate loss is determined. Determining a quantization loss for quantizing the neural network by the quantization model based on the difference and the gradient of the intermediate loss.
In an ideal situation, the trained second quantization model would not affect the data processing capacity of the pre-trained neural network, at which point the difference should be 0. And the gradient of the intermediate loss is also larger, which is beneficial to ensuring the subsequent training effect on the neural network. Based on this idea, the quantization loss in the present specification is positively correlated with the difference, and the quantization loss is negatively correlated with the gradient of the intermediate loss.
The step of determining the gradient of the reference loss and the step of determining the gradient of the intermediate loss may be performed in either order.
For example, a quantization loss min J_1 consistent with these correlations can be defined as:

    min J_1 = ||Δf - Δf_r||_2 - q ||Δf_r||_2    (1)

where f is a predetermined loss function, which may be a cross-entropy loss, for determining the difference between the output of the pre-trained neural network and the label; Δf_r is the gradient of the intermediate loss and Δf is the gradient of the reference loss; ||Δf_r||_2 is the 2-norm of the gradient of the intermediate loss; ||Δf - Δf_r||_2 is the 2-norm of the difference between the gradient of the reference loss and the gradient of the intermediate loss; and q is a hyper-parameter that can be set empirically.
As another example, the quantization loss min J_1 for the pre-trained neural network can also be written with the sample made explicit:

    min J_1 = ||Δf(X) - Δf_r(X)||_2 - q ||Δf_r(X)||_2    (2)

where X is a sample input into the pre-trained neural network.
(4) And training the second quantization model to be trained by taking the minimum quantization loss and the maximum gradient of the intermediate loss as training targets to obtain a trained second quantization model.
Specifically, the process of training the second quantization model may be: according to the determined quantization loss and the gradient of the intermediate loss, adjust at least part of the parameters of the second quantization model, with minimizing the quantization loss and maximizing the gradient of the intermediate loss as the training targets, to obtain an adjusted second quantization model.
Then judge whether the adjusted second quantization model meets a preset second-quantization-model training condition. If so, take the adjusted second quantization model as the trained second quantization model. If not, quantize the pre-trained neural network with the adjusted second quantization model to obtain a re-determined first to-be-stabilized neural network, and continue to train the adjusted second quantization model according to the re-determined first to-be-stabilized neural network and the sample.
For example, if the loss of the first to-be-stabilized neural network is smaller than a preset threshold, the adjusted second quantization model meets the preset second-quantization-model training condition.
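Combining steps S104 to S108, a hypothetical outer training loop for the second quantization model might look as follows; `forward_quantized` is an assumed helper that runs the pre-trained network with its parameters passed through the current quantizer (for example via the parametrization mechanism sketched earlier), and the stopping threshold is likewise an assumption:

```python
import torch

def train_second_quant_model(quantizer, pretrained_net, samples, labels,
                             loss_fn, steps=200, lr=1e-3, threshold=0.1):
    # Only the quantizer's parameters are updated; the pre-trained
    # network is re-quantized with the current quantizer each step.
    # forward_quantized is a hypothetical helper (see lead-in).
    opt = torch.optim.Adam(quantizer.parameters(), lr=lr)
    for _ in range(steps):
        outputs = forward_quantized(pretrained_net, quantizer, samples)
        intermediate_loss = loss_fn(outputs, labels)
        # For brevity this minimizes the intermediate loss alone; the
        # J1 objective of formula (1) could be substituted via the
        # quantization_loss() sketch above.
        if intermediate_loss.item() < threshold:
            break  # preset training condition met
        opt.zero_grad()
        intermediate_loss.backward()
        opt.step()
    return quantizer
```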
3) A second training for the neural network.
Alternatively, the second training for the neural network may be performed on the basis of the trained second quantization model, corresponding to step S112.
Specifically, as shown in fig. 2 and 3c, the second training process for the neural network in this specification may be:
(1) and quantifying at least part of parameters of the pre-trained neural network by adopting the trained second quantification model to obtain a second undetermined neural network.
(2) And inputting the preset sample into the second undetermined neural network to obtain a processing result output by the second undetermined neural network.
(3) And determining the loss of the second undetermined neural network when the second undetermined neural network processes the sample according to the processing result output by the second undetermined neural network and the label corresponding to the sample.
Optionally, as shown in fig. 3c, this second training process for the neural network quantizes both the weights of each layer (e.g., w_k and w_{k+1}) and the activation values output by each layer (e.g., a_k and a_{k+1}).
In an alternative embodiment of this specification, the loss min J_2 incurred when the second undetermined neural network processes the sample may be defined as:

    min J_2 = f(X)    (3)

In another alternative embodiment of this specification, the loss min J_2 incurred when the second undetermined neural network processes the sample may also be defined as a function of f(X) and P(Δf_r) (formula (4)), where P(Δf_r) is the rate of improvement of back propagation during the training of the second undetermined neural network.
(4) Adjust the second undetermined neural network, with minimizing the loss incurred when the second undetermined neural network processes the sample as the aim, to obtain an intermediate second undetermined neural network.
(5) Judge whether the intermediate second undetermined neural network meets a preset condition;
(6) If so, determine the intermediate second undetermined neural network as the third undetermined neural network; otherwise, quantize at least part of the parameters of the intermediate second undetermined neural network with the trained second quantization model to obtain a quantized intermediate second undetermined neural network, and continue training the intermediate second undetermined neural network according to the preset sample.
Through this second training of the neural network, adjusting at least part of the parameters of the pre-trained neural network yields a third undetermined neural network that matches the trained second quantization model well. As noted above, since the quantization effect of the trained second quantization model is to a certain extent equivalent to that of the first quantization model, this second training also enables the resulting third undetermined neural network to match the first quantization model well.
4) A third training for the neural network, and a second training for the second quantization model.
Alternatively, the third training for the neural network may be performed on the basis of the trained second quantization model, corresponding to step S112. The second training for the second quantization model may be training for the trained second quantization model.
In particular, the process of the third training for the neural network, and the second training for the second quantization model may be:
(1) and quantifying at least part of parameters of the pre-trained neural network by adopting the trained second quantification model to obtain a second undetermined neural network.
(2) And inputting the preset sample into the second undetermined neural network to obtain a processing result output by the second undetermined neural network.
(3) And determining the loss of the second undetermined neural network when the second undetermined neural network processes the sample according to the processing result output by the second undetermined neural network and the label corresponding to the sample.
In an alternative embodiment of this specification, the loss min J_3 incurred when the second undetermined neural network processes the sample may be defined analogously (formula (5)).
(4) and adjusting the trained second quantization model and the second undetermined neural network with the aim of minimizing the loss when the second undetermined neural network processes the sample, so as to obtain an intermediate second quantization model and an intermediate second undetermined neural network.
(5) Judge whether the intermediate second undetermined neural network meets a preset condition;
(6) If so, determine the intermediate second undetermined neural network as the third undetermined neural network; if not, quantize at least part of the parameters of the intermediate second undetermined neural network with the intermediate second quantization model to obtain a quantized intermediate second undetermined neural network, and continue training the intermediate second quantization model and the intermediate second undetermined neural network according to the quantized intermediate second undetermined neural network.
The third undetermined neural network obtained through at least part of the above training processes has good data processing capability. When the third undetermined neural network is quantized by the first quantization model or by the trained second quantization model, the data processing capability of the resulting quantized neural network is not obviously affected by the quantization, which guarantees its data processing effect when used online.
Optionally, either the process "second training for the neural network" or the process "third training for the neural network, and second training for the second quantization model" may be executed, chosen according to the actual usage scenario.
2. The data processing process of the present specification involves the design of at least part of a quantization model.
Since the quantization model design involved in this specification is mainly directed at the second quantization model, the several parts to which the second quantization model relates are described separately below.
1) Design of the model for quantizing the weights of the neural network in the second quantization model.
As noted above, the second quantization model has a quantization function for at least some parameters of the neural network, including but not limited to the weights of the neural network. The "neural network" described here may be at least one of the pre-trained neural network and the second undetermined neural network.
In an alternative embodiment of this specification, the second quantization model includes a first quantization sub-model for quantizing at least part of the weights of the neural network, as shown in fig. 3b and 3c. The process of quantizing the neural network with the second quantization model may be a process of quantizing at least part of the parameters of the neural network with the first quantization sub-model.
Optionally, the neural network in this specification is a deep convolutional neural network; when it is used to solve multi-classification or regression problems, it may comprise a cascade of a convolutional network and a fully connected network. The first quantization sub-model is used to quantize at least part of the weights of the convolutional network.
As noted above, the first quantization sub-model, which belongs to the second quantization model, plays a significant role in the training process for the second undetermined neural network: it is similar to the first quantization model in its quantization of the weights, and it also guarantees the validity and authenticity of the gradients during back propagation through the neural network.
Optionally, the first quantization sub-model includes at least one weight quantization unit in one-to-one correspondence with the weights of the neural network, such as the k-th weight quantization unit shown in fig. 3b and 3c.
Taking as an example the quantization of at least part of the parameters of the pre-trained neural network by the second quantization model to be trained to obtain the first to-be-stabilized neural network, the process may specifically be: for at least part of the weights of the pre-trained neural network, input each weight into the weight quantization unit corresponding to it to obtain the quantized weight output by that unit.
In an alternative embodiment of this specification, the first quantization sub-model Q_w can be characterized by a formula of the following form:

    Q_w: ŵ_i = Q_w^i(w_i) = φ_i(w_i; α_i), 1 ≤ i ≤ n    (6)

where w_i is the weight corresponding to the output of the i-th layer of the neural network; Q_w^i is the weight quantization unit in the first quantization sub-model that quantizes the weight corresponding to the output of the i-th layer (hereinafter the i-th weight quantization unit); n is the total number of layers whose weights need to be quantized; α_i is a parameter whose value is related to i and can be obtained through training; and φ_i is the quantization equation of the i-th weight quantization unit.
Specifically, the weight w_i corresponding to the output of the i-th layer of the neural network is input into equation (6) to obtain the quantized result ŵ_i of the weight w_i.
Further, the quantization equation φ_i of the i-th weight quantization unit may take any one of the forms given as formulas (7) to (9); the parameters of these quantization equations, which are related to the value of i, can be obtained through training.
Optionally, at least some of these parameters can be obtained through the aforementioned first training of the second quantization model and/or second training of the second quantization model. In particular, during the training of the second quantization model, the parameters are adjusted according to the obtained loss until they meet the corresponding conditions.
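A sketch of one weight quantization unit consistent with the reconstructed form of formula (6), with a tanh-shaped quantization equation standing in for the unspecified formulas (7) to (9) (an assumption):

```python
import torch
import torch.nn as nn

class WeightQuantUnit(nn.Module):
    # The i-th weight quantization unit: layer-specific trainable
    # parameters (here alpha as a scale and beta as a steepness)
    # applied through a smooth quantization equation.
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(1.0))
        self.beta = nn.Parameter(torch.tensor(3.0))

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        return self.alpha * torch.tanh(self.beta * w)

# One unit per quantized layer, in one-to-one correspondence with
# the weights that need quantizing (n layers in total).
units = nn.ModuleList(WeightQuantUnit() for _ in range(4))
```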
2) Design of the model for quantizing the activation values generated by the neural network in the second quantization model.
In an alternative embodiment of this specification, the second quantization model includes a second quantization sub-model for quantizing the activation values generated by the neural network, as shown in fig. 3b and 3c.
Taking the example of quantifying at least part of the activation values generated by the first to-be-stabilized neural network by adopting a second quantization model to be trained, the process can be specifically as follows:
(1) Input the sample into the first to-be-stabilized neural network.
(2) For each layer of the first to-be-stabilized neural network, quantize the activation values output by that layer with the second quantization sub-model to obtain the quantized activation values output by that layer.
Optionally, the second quantization sub-model includes at least one activation value quantization unit in one-to-one correspondence with the layers of the neural network, such as the k-th activation value quantization unit shown in fig. 3b and 3c. The process of quantizing the activation values generated by the first to-be-stabilized neural network with the second quantization sub-model may be: for each layer of the first to-be-stabilized neural network, input the activation values generated by that layer into the activation value quantization unit corresponding to that layer to obtain the quantized activation values output by that unit.
(3) Obtain the processing result output by the first to-be-stabilized neural network from the quantized activation values output by each layer.
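A sketch of how an activation value quantization unit can be attached to each layer of the first to-be-stabilized neural network; the wrapper pattern is an assumption, and any of the quantization equations (11) to (13) could play the role of `act_quant`:

```python
import torch.nn as nn

class ActQuantLayer(nn.Module):
    # Wraps layer j together with its activation value quantization
    # unit: the layer's output (activation value) is quantized
    # before being passed to the next layer.
    def __init__(self, layer: nn.Module, act_quant: nn.Module):
        super().__init__()
        self.layer = layer
        self.act_quant = act_quant

    def forward(self, x):
        return self.act_quant(self.layer(x))
```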
In an alternative embodiment of this specification, the second quantization sub-model Q_a can be characterized by a formula of the following form:

    Q_a: â_j = Q_a^j(a_j) = ψ_j(a_j; β_j), 1 ≤ j ≤ m    (10)

where a_j is the activation value corresponding to the output of the j-th layer of the neural network; Q_a^j is the activation value quantization unit in the second quantization sub-model that quantizes the activation value corresponding to the output of the j-th layer (hereinafter the j-th activation value quantization unit); m is the total number of layers whose activation values need to be quantized; β_j is a parameter whose value is related to j and can be obtained through training; and ψ_j is the quantization equation of the j-th activation value quantization unit.
The activation value a_j corresponding to the output of the j-th layer of the neural network is input into equation (10) to obtain the quantized result â_j of the activation value a_j.
Further, the quantization equation ψ_j of the j-th activation value quantization unit may take any one of the forms given as formulas (11) to (13); the parameters of these quantization equations, which are related to the value of j, can be obtained through training.
Optionally, these parameters can be obtained through the aforementioned first training of the second quantization model and/or second training of the second quantization model. In particular, during the training of the second quantization model, the parameters are adjusted according to the obtained loss until they meet the corresponding conditions.
Through at least part of the training steps, the third to-be-determined neural network still having better data processing capability after being quantized by the trained second quantization model can be obtained. Because the training process aiming at the second quantization model can lead the quantization effect of the trained second quantization model to be equal to that of the first quantization model, the first quantization model is adopted to quantize the third to-be-determined neural network, and the obtained quantized neural network also has better data processing capability.
Alternatively, the quantized neural network may be a binary neural network (Binarized Neural Network, BNN) or a ternary neural network (Ternary Neural Network, TNN).
In an alternative embodiment of the present disclosure, the first quantization model for quantizing the third pending neural network may include: at least one of a weight quantization sub-model and an activation value quantization sub-model. The weight quantization sub-model is used for quantizing at least part of weights of the third undetermined neural network; the activation value quantization sub-model is used to quantize at least a portion of the activation values of the third pending neural network.
Taking the quantized neural network being a binary neural network as an example, the weight quantization sub-model $P_w$ it adopts can be characterized by the following formulas (14) and (15):

$$\hat{w}_i = P_w^i(w_i) \tag{14}$$

$$P_w^i(w_i) = \operatorname{sign}(w_i) = \begin{cases} +1, & w_i \ge 0 \\ -1, & w_i < 0 \end{cases} \tag{15}$$

wherein: $P_w^i$ is the weight binarization unit in the weight quantization sub-model for binarizing the weight corresponding to the output end of the $i$-th layer of the third pending neural network; $w_i$ is the weight corresponding to the output end of the $i$-th layer of the third pending neural network.

Specifically, the weight $w_i$ corresponding to the output end of the $i$-th layer of the third pending neural network may be input into the weight binarization unit shown in formula (15) to obtain the quantized result $\hat{w}_i$ of the weight $w_i$.
Further, the activation value quantization sub-model $P_a$ adopted by the quantized neural network can be characterized by the following formulas (16) and (17):

$$\hat{a}_j = P_a^j(a_j) \tag{16}$$

$$P_a^j(a_j) = \operatorname{sign}(a_j) = \begin{cases} +1, & a_j \ge 0 \\ -1, & a_j < 0 \end{cases} \tag{17}$$

wherein: $P_a^j$ is the activation value binarization unit in the activation value quantization sub-model for binarizing the activation value corresponding to the output end of the $j$-th layer of the third pending neural network; $a_j$ is the activation value of the output of the $j$-th layer of the third pending neural network.

Specifically, the activation value $a_j$ of the output of the $j$-th layer of the third pending neural network may be input into the activation value binarization unit shown in formula (17) to obtain the quantized result $\hat{a}_j$ of the activation value $a_j$.
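For illustration, the two binarization units, in the sign form reconstructed above, can be sketched as follows; an explicit comparison is used so that zero maps to +1 rather than to 0.

```python
import torch

def binarize_weight(w_i: torch.Tensor) -> torch.Tensor:
    """Weight binarization unit P_w^i for the i-th layer (cf. formula (15))."""
    return torch.where(w_i >= 0, torch.ones_like(w_i), -torch.ones_like(w_i))

def binarize_activation(a_j: torch.Tensor) -> torch.Tensor:
    """Activation value binarization unit P_a^j for the j-th layer (cf. formula (17))."""
    return torch.where(a_j >= 0, torch.ones_like(a_j), -torch.ones_like(a_j))
```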
Therefore, through the data processing process in the embodiments of the present disclosure, the second quantization model is trained, and the trained second quantization model has a quantization effect equivalent to that of the first quantization model. Moreover, because the trained second quantization model quantizes at least part of the parameters of the neural network, the loss obtained during the adjustment (training) of the neural network has a usable gradient, which avoids the phenomenon that no gradient can be obtained due to quantization during the adjustment of the neural network. In addition, the actual loss of the neural network can be obtained during the adjustment, so that the adjusted neural network, trained according to this actual loss, has better data processing capability, guaranteeing the accuracy of the data processing result.
That is, the quantized neural network obtained by the method and the device in this specification is effectively compressed in terms of parameter count and data processing complexity, without an obvious impact on data processing capability or effect. The data processing process in this specification is therefore well suited to hardware environments with limited storage space and high requirements on computing performance and quality, such as unmanned vehicles and intelligent terminals.
Optionally, where the pre-trained neural network in this specification is a deep neural network, at least some parameters of the neural network may be normalized using batch normalization (Batch Norm) during the training process.
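As a plain illustration of that option (standard PyTorch usage, not a construct specific to this specification):

```python
import torch.nn as nn

# A layer whose outputs are normalized per mini-batch with Batch Norm.
block = nn.Sequential(
    nn.Linear(128, 64),
    nn.BatchNorm1d(64),  # normalizes the layer's activations over each mini-batch
    nn.ReLU(),
)
```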
The data processing process provided by this specification is particularly applicable to delivery with unmanned vehicles, for example, scenarios in which unmanned vehicles deliver express parcels or takeaway orders. Specifically, in such scenarios, delivery may be performed by an autonomous driving fleet composed of multiple unmanned vehicles.
Based on the same idea, the embodiment of the present specification also provides a data processing apparatus corresponding to the process shown in fig. 1, where the data processing apparatus is shown in fig. 4.
Fig. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure, where the data processing apparatus includes:
a preparation module 400, configured to determine a pre-trained neural network, a preset sample, and a first quantization model;
a first to-be-stabilized neural network determining module 402, configured to quantize at least part of the parameters of the pre-trained neural network with a second quantization model to be trained, to obtain a first to-be-stabilized neural network, where the second quantization model to be trained is generated according to the first quantization model;
An intermediate result determining module 404, configured to input the sample into the first to-be-stabilized neural network, and obtain a processing result output by the first to-be-stabilized neural network as an intermediate result;
an intermediate loss determining module 406, configured to determine, as an intermediate loss, a loss of the first to-be-stabilized neural network according to the intermediate result and a label corresponding to the sample;
a second quantization model training module 408, configured to train the second quantization model to be trained with the intermediate loss minimized and the gradient of the intermediate loss maximized as a training target, to obtain a trained second quantization model;
a second pending neural network determining module 410, configured to quantize at least a portion of parameters of the pre-trained neural network using the trained second quantization model to obtain a second pending neural network;
the third pending neural network determining module 412 is configured to, according to the preset sample, adjust the second pending neural network to obtain a third pending neural network, with a loss minimization when the second pending neural network is used to process the sample as a training target;
and a data processing module 414, configured to, when data to be processed needs to be processed, quantize the third pending neural network with the first quantization model to obtain a quantized neural network, and input the data to be processed into the quantized neural network to obtain a processing result of the data to be processed.
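For illustration, the flow through modules 400 to 414 can be summarized as the following sketch; every function and parameter name here is hypothetical, and the concrete operations are injected as callables because the modules above define them.

```python
from typing import Any, Callable

def device_flow(data: Any, sample: Any, label: Any, pretrained_net: Any,
                first_quant: Any, second_quant: Any,
                quantize: Callable[[Any, Any], Any],
                train_quant: Callable[[Any, Any, Any], Any],
                finetune: Callable[[Any, Any, Any], Any]) -> Any:
    net1 = quantize(pretrained_net, second_quant)                # module 402
    intermediate = net1(sample)                                  # module 404
    trained_q2 = train_quant(second_quant, intermediate, label)  # modules 406-408
    net2 = quantize(pretrained_net, trained_q2)                  # module 410
    net3 = finetune(net2, sample, label)                         # module 412
    quantized_net = quantize(net3, first_quant)                  # module 414
    return quantized_net(data)                                   # processing result
```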
Optionally, the first to-be-stabilized neural network determining module 402 may include: a weight quantization sub-module.
And the weight quantization sub-module is used for quantizing the weights of the pre-trained neural network with the quantization first sub-model to be trained, to obtain a weight-quantized neural network as the first to-be-stabilized neural network.
Optionally, the intermediate result determination module 404 may include: an activation value quantization sub-module. The activation value quantization sub-module may include: an input unit, an activation value quantization unit, and a first processing result generation unit.
And the input unit is used for inputting the sample into the first to-be-stabilized neural network.
And the activation value quantization unit is used for quantizing, with the quantization second sub-model, the activation value output by each layer of the first to-be-stabilized neural network, to obtain the quantized activation value output by that layer.
And the first processing result generating unit is used for obtaining the processing result output by the first to-be-stabilized neural network according to the quantized activation values output by the layers.
Optionally, the second quantization model training module 408 may include: the reference result determination sub-module, the reference loss determination sub-module, the quantization loss determination sub-module, and the second quantization model training sub-module.
And the reference result determining submodule is used for inputting the sample into the pre-trained neural network to obtain a processing result output by the pre-trained neural network as a reference result.
And the reference loss determination submodule is used for determining the reference loss when the pre-trained neural network processes the sample according to the reference result and the label corresponding to the sample.
And the quantization loss determination submodule is used for determining quantization loss caused by quantizing the pre-trained neural network through the second quantization model to be trained according to the reference loss and the intermediate loss.
And the second quantization model training sub-module is used for training the second quantization model to be trained by taking the minimum quantization loss and the maximum gradient of the intermediate loss as a training target to obtain a trained second quantization model.
Optionally, the quantization loss determination submodule may include: a reference loss gradient determination unit, a difference determination unit, and a quantization loss determination unit.
And the reference loss gradient determining unit is used for determining the gradient of the reference loss according to the reference loss.
A difference determination unit for determining a difference between the gradient of the reference loss and the gradient of the intermediate loss.
And the quantization loss determination unit is used for determining, according to the difference and the gradient of the intermediate loss, the quantization loss caused by quantizing the pre-trained neural network through the second quantization model to be trained, wherein the quantization loss is positively correlated with the difference, and negatively correlated with the gradient of the intermediate loss.
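As a sketch only: the text above fixes the correlations but not the functional form, so the ratio below is one assumed way to make the quantization loss grow with the gradient difference and shrink as the intermediate-loss gradient grows.

```python
import torch

def quantization_loss(ref_grad: torch.Tensor, mid_grad: torch.Tensor,
                      eps: float = 1e-8) -> torch.Tensor:
    difference = torch.norm(ref_grad - mid_grad)       # positively correlated term
    return difference / (torch.norm(mid_grad) + eps)   # negative correlation with the gradient magnitude
```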
Optionally, the third pending neural network determination module 412 may include: the second processing result generation sub-module, the loss determination sub-module, the adjustment sub-module and the judgment sub-module.
And the second processing result generation submodule is used for inputting the preset sample into the second undetermined neural network to obtain a processing result output by the second undetermined neural network.
And the loss determination submodule is used for determining the loss of the second undetermined neural network when the second undetermined neural network processes the sample according to the processing result output by the second undetermined neural network and the label corresponding to the sample.
And the adjustment sub-module is used for adjusting the trained second quantization model and the second undetermined neural network to obtain an intermediate second quantization model and an intermediate second undetermined neural network by taking the loss minimization of the second undetermined neural network when the sample is processed as a target.
The judging sub-module is used for judging whether the intermediate second undetermined neural network meets preset conditions; if yes, determining the intermediate second undetermined neural network as the third undetermined neural network; if not, quantizing at least part of parameters of the intermediate second undetermined neural network with the intermediate second quantization model to obtain a quantized intermediate second undetermined neural network, and continuing to train the intermediate second quantization model and the intermediate second undetermined neural network according to the quantized intermediate second undetermined neural network.
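The judging sub-module's loop can be sketched as follows; the helper callables (adjust, meets_condition, quantize) and the iteration cap are assumptions standing in for the operations defined by the sub-modules above.

```python
from typing import Any, Callable, Tuple

def adjust_until_condition(net: Any, quant: Any,
                           adjust: Callable[[Any, Any], Tuple[Any, Any]],
                           meets_condition: Callable[[Any], bool],
                           quantize: Callable[[Any, Any], Any],
                           max_rounds: int = 100) -> Any:
    for _ in range(max_rounds):
        net, quant = adjust(net, quant)   # minimize loss when processing the sample
        if meets_condition(net):          # preset condition satisfied
            return net                    # this is the third undetermined neural network
        net = quantize(net, quant)        # quantize and continue training
    return net
```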
Optionally, the data processing apparatus in the present specification may further include a pre-training module. The pre-training module may include: the device comprises an acquisition sub-module, a quantization sub-module and a pre-training sub-module.
And the acquisition sub-module is used for acquiring the neural network to be trained and a pre-training sample for pre-training the neural network.
And the quantization sub-module is used for quantizing at least part of parameters of the neural network to be trained by adopting the first quantization model to obtain the quantized neural network to be trained.
And the pre-training sub-module is used for training the quantized neural network to be trained according to the pre-training sample to obtain the pre-trained neural network.
Optionally, the preparation module 400 may include: a pre-trained neural network determination submodule and a preset sample determination submodule.
The pre-trained neural network determining submodule is used for determining the neural network obtained through pre-training as the pre-trained neural network.
And the preset sample determination submodule is used for determining the pre-training sample as a preset sample.
An embodiment of the present specification also provides a computer-readable storage medium storing a computer program operable to perform the data processing process provided in fig. 1 above.
An embodiment of the present specification also provides a schematic structural diagram of the first electronic device, shown in fig. 5. At the hardware level, as shown in fig. 5, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile storage, and may of course also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile storage into the memory and then runs it, so as to implement the data processing procedure described above with respect to fig. 1. Of course, besides software implementations, this specification does not exclude other implementations, such as logic devices or combinations of hardware and software; that is, the execution subject of the processing flow is not limited to logic units, but may also be hardware or logic devices.
In the 1990s, an improvement to a technology could clearly be distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a single PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, nowadays, instead of manually manufacturing integrated circuit chips, such programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the original code before compiling must also be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many kinds, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language), among which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained merely by slightly logically programming the method flow into an integrated circuit using one of the above hardware description languages.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor and a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller; examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing the controller purely with computer readable program code, it is entirely possible to implement the same functionality by logically programming the method steps such that the controller takes the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included within it for performing various functions may also be regarded as structures within the hardware component. Or even the means for performing various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units. Of course, when implementing this specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.

Claims (10)

1. A method of data processing, the method comprising:
determining a pre-trained neural network, a preset sample and a first quantization model;
quantizing at least part of parameters of the pre-trained neural network by adopting a second quantization model to be trained to obtain a first to-be-stabilized neural network; the second quantization model to be trained is generated according to the first quantization model;
inputting the sample into the first to-be-stabilized neural network to obtain a processing result output by the first to-be-stabilized neural network as an intermediate result;
determining the loss of the first to-be-stabilized neural network as an intermediate loss according to the intermediate result and the label corresponding to the sample;
training the second quantization model to be trained by taking the minimum intermediate loss and the maximum gradient of the intermediate loss as training targets to obtain a trained second quantization model;
quantizing at least part of parameters of the pre-trained neural network by adopting the trained second quantization model to obtain a second undetermined neural network;
according to the preset sample, the second undetermined neural network is adjusted to obtain a third undetermined neural network by taking the minimization of loss when the second undetermined neural network is adopted to process the sample as a training target;
when the data to be processed is required to be processed, quantizing the third to-be-determined neural network by adopting the first quantization model to obtain a quantized neural network, and inputting the data to be processed into the quantized neural network to obtain a processing result of the data to be processed;
the pre-trained neural network is used for executing multi-classification tasks; the preset sample is an image historically collected for a traffic environment, and the label corresponding to the sample is the classification result of each dynamic obstacle in the image; or the pre-trained neural network is used for generating a driving strategy, the preset sample is the motion state of each obstacle in the environment where the unmanned vehicle is located, and the label corresponding to the sample is the driving strategy.
2. The method of claim 1, wherein the second quantization model comprises: a quantization first sub-model for quantizing weights of the neural network;

quantizing at least part of parameters of the pre-trained neural network by adopting the second quantization model to be trained to obtain the first to-be-stabilized neural network specifically comprises:

quantizing the weights of the pre-trained neural network by adopting the quantization first sub-model to be trained, to obtain a weight-quantized neural network as the first to-be-stabilized neural network.
3. The method of claim 2, wherein the second quantization model comprises: a quantization second sub-model for quantizing at least a portion of the activation values generated by the neural network;
Inputting the sample into the first to-be-stabilized neural network to obtain a processing result output by the first to-be-stabilized neural network, which specifically comprises the following steps:
inputting the sample into the first network to be stabilized;
for each layer of the first to-be-stabilized neural network, quantizing the activation value output by the layer by adopting the quantization second sub-model to obtain the quantized activation value output by the layer;
and obtaining a processing result output by the first to-be-stabilized neural network according to the quantized activation value output by each layer.
4. A method according to any one of claims 1-3, wherein training the second quantization model to be trained, with the intermediate loss minimized and the gradient of the intermediate loss maximized as the training target, to obtain the trained second quantization model specifically comprises:
inputting the sample into the pre-trained neural network to obtain a processing result output by the pre-trained neural network as a reference result;
determining reference loss when the pre-trained neural network processes the sample according to the reference result and the label corresponding to the sample;
determining a quantization loss caused by quantizing the pre-trained neural network through the second quantization model to be trained according to the reference loss and the intermediate loss;
And training the second quantization model to be trained by taking the minimum quantization loss and the maximum gradient of the intermediate loss as training targets to obtain a trained second quantization model.
5. The method of claim 4, wherein determining quantization loss caused by quantizing the pre-trained neural network by the second quantization model to be trained based on the reference loss and the intermediate loss, comprises:
determining a gradient of the reference loss from the reference loss;
determining a difference between the gradient of the reference loss and the gradient of the intermediate loss;
determining, according to the difference and the gradient of the intermediate loss, the quantization loss caused by quantizing the pre-trained neural network through the second quantization model to be trained, wherein the quantization loss is positively correlated with the difference, and the quantization loss is negatively correlated with the gradient of the intermediate loss.
6. A method according to any one of claims 1 to 3, wherein, according to the preset sample, with the loss minimization when the sample is processed by using the second undetermined neural network as a training target, the second undetermined neural network is adjusted to obtain a third undetermined neural network, which specifically includes:
inputting the preset sample into the second undetermined neural network to obtain a processing result output by the second undetermined neural network;
determining loss of the second undetermined neural network when the second undetermined neural network processes the sample according to a processing result output by the second undetermined neural network and a label corresponding to the sample;
adjusting the trained second quantization model and the second undetermined neural network with the aim of minimizing loss when the second undetermined neural network processes the sample to obtain an intermediate second quantization model and an intermediate second undetermined neural network;
judging whether the intermediate second undetermined neural network meets a preset adjustment condition;

if yes, determining the intermediate second undetermined neural network as the third undetermined neural network;

if not, quantizing at least part of parameters of the intermediate second undetermined neural network by adopting the intermediate second quantization model to obtain a quantized intermediate second undetermined neural network, and continuing to train the intermediate second quantization model and the intermediate second undetermined neural network according to the quantized intermediate second undetermined neural network.
7. A method according to any one of claims 1-3, characterized in that obtaining the pre-trained neural network specifically comprises:
acquiring a neural network to be trained and a pre-training sample for pre-training the neural network;
quantizing at least part of parameters of the neural network to be trained by adopting the first quantization model to obtain a quantized neural network to be trained;
training the quantized neural network to be trained according to the pre-training sample to obtain a pre-trained neural network;
determining a pre-trained neural network and a preset sample, wherein the method specifically comprises the following steps of:
determining the neural network obtained through the above pre-training as the pre-trained neural network; and determining the pre-training sample as the preset sample.
8. A data processing apparatus, the apparatus comprising:
the preparation module is used for determining a pre-trained neural network, a preset sample and a first quantization model;
the first to-be-stabilized neural network determining module is used for quantifying at least part of parameters of the pre-trained neural network by adopting a second to-be-trained quantifying model to obtain a first to-be-stabilized neural network; the second quantization model to be trained is generated according to the first quantization model;
the intermediate result determining module is used for inputting the sample into the first to-be-stabilized neural network to obtain a processing result output by the first to-be-stabilized neural network as an intermediate result;
The intermediate loss determining module is used for determining the loss of the first to-be-stabilized neural network as an intermediate loss according to the intermediate result and the label corresponding to the sample;
the second quantization model training module is used for training the second quantization model to be trained by taking the intermediate loss as a minimum and the gradient of the intermediate loss as a training target to obtain a trained second quantization model;
the second undetermined neural network determining module is used for quantifying at least part of parameters of the pre-trained neural network by adopting the trained second quantifying model to obtain a second undetermined neural network;
the third pending neural network determining module is configured to, according to the preset sample, adjust the second pending neural network to obtain a third pending neural network, with a training target that minimizes a loss when the second pending neural network is used to process the sample;
the data processing module is used for quantizing the third to-be-determined neural network by adopting the first quantization model when the to-be-processed data is required to be processed, so as to obtain a quantized neural network, and inputting the to-be-processed data into the quantized neural network, so as to obtain a processing result of the to-be-processed data;
the pre-trained neural network is used for executing multi-classification tasks; the preset sample is an image historically collected for a traffic environment, and the label corresponding to the sample is the classification result of each dynamic obstacle in the image; or the pre-trained neural network is used for generating a driving strategy, the preset sample is the motion state of each obstacle in the environment where the unmanned vehicle is located, and the label corresponding to the sample is the driving strategy.
9. A computer readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of the preceding claims 1-7 when executing the program.
CN202010404466.3A 2020-05-13 2020-05-13 Data processing method and device Active CN111639745B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010404466.3A CN111639745B (en) 2020-05-13 2020-05-13 Data processing method and device

Publications (2)

Publication Number Publication Date
CN111639745A CN111639745A (en) 2020-09-08
CN111639745B true CN111639745B (en) 2024-03-01

Family

ID=72332083

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010404466.3A Active CN111639745B (en) 2020-05-13 2020-05-13 Data processing method and device

Country Status (1)

Country Link
CN (1) CN111639745B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288032B (en) * 2020-11-18 2022-01-14 上海依图网络科技有限公司 Method and device for quantitative model training based on generation of confrontation network
CN113762403B (en) * 2021-09-14 2023-09-05 杭州海康威视数字技术股份有限公司 Image processing model quantization method, device, electronic equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11875251B2 (en) * 2018-05-03 2024-01-16 Samsung Electronics Co., Ltd. Neural network method and apparatus

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443165A (en) * 2019-07-23 2019-11-12 北京迈格威科技有限公司 Neural network quantization method, image-recognizing method, device and computer equipment
CN110969251A (en) * 2019-11-28 2020-04-07 中国科学院自动化研究所 Neural network model quantification method and device based on label-free data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Optimization Strategies in Quantized Neural Networks: A Review; Yuhan Lin, et al.; 2019 International Conference on Data Mining Workshops (ICDMW); full text *
Road scene understanding based on deep convolutional neural networks; Wu Zongsheng, Fu Weiping, Han Gaining; Computer Engineering and Applications (Issue 22); full text *
Traffic sign recognition method based on lightweight convolutional neural networks; Cheng Yue, Liu Zhigang; Computer Systems & Applications (Issue 02); full text *

Also Published As

Publication number Publication date
CN111639745A (en) 2020-09-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant