CN111639745A - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN111639745A
CN111639745A
Authority
CN
China
Prior art keywords
neural network
trained
quantization
loss
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010404466.3A
Other languages
Chinese (zh)
Other versions
CN111639745B (en)
Inventor
刘宇达
申浩
王赛
王子为
鲁继文
周杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Tsinghua University
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Beijing Sankuai Online Technology Co Ltd filed Critical Tsinghua University
Priority to CN202010404466.3A priority Critical patent/CN111639745B/en
Publication of CN111639745A publication Critical patent/CN111639745A/en
Application granted granted Critical
Publication of CN111639745B publication Critical patent/CN111639745B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The present specification discloses a data processing method and apparatus. A second quantization model is trained so that the trained second quantization model has a quantization effect equivalent to that of a first quantization model, and at least part of the parameters of a neural network are quantized with the trained second quantization model, so that the loss obtained while adjusting (training) the neural network has a meaningful gradient. This avoids the phenomenon that, due to quantization, no gradient can be obtained when the neural network is adjusted. In addition, the method and apparatus in the present specification can obtain the true loss of the neural network during adjustment, and the neural network adjusted according to this true loss has good data processing capability, which helps ensure the accuracy of the data processing results.

Description

Data processing method and device
Technical Field
The present application relates to the field of internet technologies, and in particular, to a data processing method and apparatus.
Background
Artificial intelligence technology has been widely developed and applied in recent years, and research on and application of various neural network technologies have become a technological hotspot. For example, in automatic driving, Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) are widely used for perception tasks such as vehicle, pedestrian, and traffic light detection.
To process the various kinds of information in a traffic environment with high quality, the neural network used for information processing is often complex: it has many parameters and high data processing complexity. This results in high storage and computation costs. If such a neural network is deployed in a hardware environment with very limited storage capacity and computing power but high requirements on data processing quality, such as an unmanned vehicle or an intelligent terminal, its performance is severely constrained, which affects the use of the data processing results and, in turn, the user experience.
Therefore, how to effectively compress at least part of the parameters of a neural network and reduce its data processing complexity without harming its performance has become an urgent problem.
Disclosure of Invention
The embodiments of the present specification provide a method and an apparatus for data processing, so as to partially solve the above problems in the prior art.
The embodiment of the specification adopts the following technical scheme:
the present specification provides a data processing method, including:
determining a pre-trained neural network, a preset sample and a first quantization model;
quantizing at least part of the parameters of the pre-trained neural network with a second quantization model to be trained, to obtain a first pending neural network, the second quantization model to be trained being generated according to the first quantization model;
inputting the sample into the first pending neural network to obtain a processing result output by the first pending neural network as an intermediate result;
determining the loss of the first pending neural network as an intermediate loss according to the intermediate result and the label corresponding to the sample;
training the second quantization model to be trained with minimization of the intermediate loss and maximization of the gradient of the intermediate loss as the training target, to obtain a trained second quantization model;
quantizing at least part of the parameters of the pre-trained neural network with the trained second quantization model, to obtain a second pending neural network;
according to the preset sample, adjusting the second pending neural network with minimization of its loss when processing the sample as the training target, to obtain a third pending neural network;
when data to be processed needs to be processed, quantizing the third pending neural network with the first quantization model to obtain a quantized neural network, and inputting the data to be processed into the quantized neural network to obtain a processing result of the data to be processed.
Optionally, the second quantization model comprises a first quantization sub-model for quantizing the weights of the neural network;
quantizing at least part of the parameters of the pre-trained neural network with the second quantization model to be trained to obtain a first pending neural network specifically comprises:
quantizing the weights of the pre-trained neural network with the first quantization sub-model to be trained, to obtain a neural network with quantized weights as the first pending neural network.
Optionally, the second quantization model comprises a second quantization sub-model for quantizing at least part of the activation values generated by the neural network;
inputting the sample into the first pending neural network to obtain a processing result output by the first pending neural network specifically comprises:
inputting the sample into the first pending neural network;
for each layer of the first pending neural network, quantizing the activation value output by that layer with the second quantization sub-model, to obtain a quantized activation value output by that layer;
and obtaining the processing result output by the first pending neural network according to the quantized activation values output by the layers.
Optionally, training the second quantization model to be trained with the minimization of the intermediate loss and the maximization of the gradient of the intermediate loss as a training target to obtain the trained second quantization model, specifically including:
inputting the sample into the pre-trained neural network to obtain a processing result output by the pre-trained neural network as a reference result;
determining the reference loss of the pre-trained neural network when the pre-trained neural network processes the sample according to the reference result and the label corresponding to the sample;
according to the reference loss and the intermediate loss, determining quantization loss caused by quantizing the pre-trained neural network through the second quantization model to be trained;
and training the second quantization model to be trained by taking the quantization loss minimization and the intermediate loss gradient maximization as a training target to obtain the trained second quantization model.
Optionally, determining, according to the reference loss and the intermediate loss, a quantization loss caused by quantizing the pre-trained neural network by using the second quantization model to be trained, specifically including:
determining a gradient of the reference loss according to the reference loss;
determining a difference between the gradient of the reference loss and the gradient of the intermediate loss;
and determining, according to the difference and the gradient of the intermediate loss, the quantization loss caused by quantizing the pre-trained neural network with the second quantization model to be trained, wherein the quantization loss is positively correlated with the difference and negatively correlated with the gradient of the intermediate loss.
Optionally, according to the preset sample, adjusting the second pending neural network with minimization of its loss when processing the sample as the training target to obtain a third pending neural network specifically comprises:
inputting the preset sample into the second pending neural network to obtain a processing result output by the second pending neural network;
determining the loss of the second pending neural network when processing the sample, according to the processing result output by the second pending neural network and the label corresponding to the sample;
adjusting the trained second quantization model and the second pending neural network with minimization of the loss of the second pending neural network when processing the sample as the target, to obtain an intermediate second quantization model and an intermediate second pending neural network;
judging whether the intermediate second pending neural network meets a preset condition;
if so, determining the intermediate second pending neural network as the third pending neural network;
otherwise, quantizing at least part of the parameters of the intermediate second pending neural network with the intermediate second quantization model to obtain a quantized intermediate second pending neural network, and continuing to train the intermediate second quantization model and the intermediate second pending neural network according to the quantized intermediate second pending neural network.
Optionally, pre-training the neural network specifically comprises:
acquiring a neural network to be trained and pre-training samples for pre-training the neural network;
quantizing at least part of the parameters of the neural network to be trained with the first quantization model, to obtain a quantized neural network to be trained;
training the quantized neural network to be trained according to the pre-training samples, to obtain a pre-trained neural network;
determining a pre-trained neural network and a preset sample specifically comprises:
determining the neural network obtained by the pre-training as the pre-trained neural network, and determining the pre-training samples as the preset samples.
The present specification provides a data processing apparatus, comprising:
a preparation module, configured to determine a pre-trained neural network, a preset sample and a first quantization model;
a first pending neural network determining module, configured to quantize at least part of the parameters of the pre-trained neural network with a second quantization model to be trained to obtain a first pending neural network, the second quantization model to be trained being generated according to the first quantization model;
an intermediate result determining module, configured to input the sample into the first pending neural network and obtain a processing result output by the first pending neural network as an intermediate result;
an intermediate loss determining module, configured to determine the loss of the first pending neural network as an intermediate loss according to the intermediate result and the label corresponding to the sample;
a second quantization model training module, configured to train the second quantization model to be trained with minimization of the intermediate loss and maximization of the gradient of the intermediate loss as the training target, to obtain a trained second quantization model;
a second pending neural network determining module, configured to quantize at least part of the parameters of the pre-trained neural network with the trained second quantization model, to obtain a second pending neural network;
a third pending neural network determining module, configured to adjust, according to the preset sample, the second pending neural network with minimization of its loss when processing the sample as the training target, to obtain a third pending neural network;
and a data processing module, configured to quantize, when data to be processed needs to be processed, the third pending neural network with the first quantization model to obtain a quantized neural network, and input the data to be processed into the quantized neural network to obtain a processing result of the data to be processed.
The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements a method of data processing as described above.
The present specification provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the above-mentioned data processing method.
The embodiment of the specification adopts at least one technical scheme which can achieve the following beneficial effects:
in the data processing method and apparatus of the embodiments of the present specification, a second quantization model is trained so that the trained second quantization model has a quantization effect equivalent to that of a first quantization model, and at least part of the parameters of the neural network are quantized with the trained second quantization model, so that the loss obtained while adjusting (training) the neural network has a meaningful gradient. This avoids the phenomenon that, due to quantization, no gradient can be obtained when the neural network is adjusted. In addition, the method and apparatus can obtain the true loss of the neural network during adjustment, and the neural network adjusted according to this true loss has good data processing capability, which helps ensure the accuracy of the data processing results. That is, the parameters and the data processing complexity of the quantized neural network obtained by the method and apparatus in this specification can be effectively compressed without significantly affecting its data processing capability and effect. The method and apparatus in this specification are therefore well suited to hardware environments with limited storage space and high requirements on computing performance and quality, such as unmanned vehicles and intelligent terminals.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification and constitute a part of it, illustrate embodiments of the specification and together with the description serve to explain the specification; they do not limit it. In the drawings:
fig. 1 is a data processing process provided by an embodiment of the present specification;
fig. 2 is a schematic diagram of a data processing process provided in an embodiment of the present specification;
fig. 3a is a schematic diagram of a first training process performed on a neural network according to an embodiment of the present disclosure;
fig. 3b is a schematic diagram of a first training process performed on a second quantization model according to an embodiment of the present disclosure;
fig. 3c is a schematic diagram of a second training process performed on a neural network according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a portion of an electronic device corresponding to fig. 1 provided in an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the present specification clearer, the technical solutions of the present specification will be described clearly and completely below with reference to specific embodiments and the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present specification, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this specification without creative effort fall within the protection scope of this specification.
The technical solutions provided by the embodiments of the present description are described in detail below with reference to the accompanying drawings.
Fig. 1 is a process of data processing provided in an embodiment of the present specification, which may specifically include the following steps:
s100: a pre-trained neural network, a preset sample, and a first quantization model are determined.
The first quantization model in this specification is used to quantize at least part of the parameters of the neural network. A parameter of the neural network may be at least one of a weight of the pre-trained neural network and an activation value generated when the pre-trained neural network performs data processing. The quantization of a parameter may be binarization (e.g., quantizing the parameter to "1" or "-1") or ternarization (e.g., quantizing the parameter to "1", "0", or "-1").
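For concreteness, the following is a minimal Python/PyTorch sketch of such binarization and ternarization applied to a parameter tensor; the ternarization threshold t is an assumed choice, not one prescribed by this specification:

    import torch

    def binarize(w: torch.Tensor) -> torch.Tensor:
        # Quantize each parameter to "1" or "-1" (non-negative values map to 1).
        return torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))

    def ternarize(w: torch.Tensor, t: float = 0.05) -> torch.Tensor:
        # Quantize each parameter to "1", "0" or "-1"; values with |w| <= t map to 0.
        q = torch.zeros_like(w)
        q[w > t] = 1.0
        q[w < -t] = -1.0
        return q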
Optionally, the neural network is a convolutional neural network. The convolutional neural network may include several layers, and the first quantization model may quantize at least part of the parameters of each layer respectively.
The data processing process in this specification can be applied in the field of unmanned driving. For example, it can be applied in unmanned vehicles or in roadside equipment (e.g., monitoring equipment deployed in the road environment).
Taking the scenario in which an unmanned vehicle determines a driving strategy from collected images as an example, the pre-trained neural network may be used to perform a multi-classification task, the preset sample may be an image of a traffic environment collected historically, and the label corresponding to the sample may be the classification result of each dynamic obstacle in the image. Alternatively, in this scenario, the pre-trained neural network may be used to generate a driving strategy, the preset sample may be the motion states of the obstacles in the environment where the unmanned vehicle is located, and the label corresponding to the sample may be a driving strategy.
S102: quantizing at least part of the parameters of the pre-trained neural network with a second quantization model to be trained to obtain a first pending neural network, the second quantization model to be trained being generated from the first quantization model.
After the first quantization model is determined, the second quantization model to be trained may be obtained according to the first quantization model. The second quantization model to be trained may also be used to quantize at least part of the parameters of the neural network.
Compared with the first quantization model, the second quantization model to be trained has a similar quantization effect on the parameters, but it does not binarize or ternarize them: at least some of the parameters of the first pending neural network quantized by the second quantization model to be trained may still be floating-point numbers. Therefore, when the second quantization model is used to quantize the neural network, the true gradient of the loss of the neural network during data processing can still be obtained.
Specifically, for at least part of the parameters of the pre-trained neural network, each parameter may be input into the second quantization model to be trained, to obtain the quantized parameter output by the second quantization model to be trained. The quantized parameters are then used as the parameters of the pre-trained neural network, to obtain the first pending neural network.
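The concrete quantization equations of this specification appear only as images (formulas (6) to (13) later in the description), so the following is only a minimal sketch of the idea, assuming a scaled tanh as the differentiable quantizer; the output stays a floating-point number while approximating binarization:

    import torch

    def soft_binarize(w: torch.Tensor, alpha: float = 4.0) -> torch.Tensor:
        # Differentiable stand-in for binarization: the output lies in (-1, 1),
        # approaches sign(w) as alpha grows, and has a nonzero gradient
        # everywhere, so the loss of the quantized network keeps a true gradient.
        return torch.tanh(alpha * w)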
S104: inputting the sample into the first pending neural network to obtain a processing result output by the first pending neural network as an intermediate result.
Optionally, if the parameters quantized by the second quantization model to be trained include activation values generated by the pre-trained neural network, then in this step, while the first pending neural network performs data processing, at least part of the activation values it generates may be quantized with the second quantization model to be trained.
S106: determining the loss of the first pending neural network as an intermediate loss according to the intermediate result and the label corresponding to the sample.
Since quantizing the parameters of a neural network necessarily affects its data processing capability to some extent, the data processing capability of the first pending neural network in this specification is likewise affected by the quantization of the second quantization model to be trained.
The intermediate loss of the first pending neural network obtained in this step may therefore include: a loss caused by the data processing capability of the pre-trained neural network itself, and a loss caused by the quantization of the second quantization model to be trained.
S108: training the second quantization model to be trained with minimization of the intermediate loss and maximization of the gradient of the intermediate loss as the training target, to obtain the trained second quantization model.
In the training process of the second quantization model, at least part of its parameters are adjusted with minimization of the intermediate loss as a training target, so that the trained second quantization model at least reduces the negative effect of quantization on the pre-trained neural network, i.e., reduces the loss caused by the quantization of the second quantization model to be trained.
Furthermore, the training target for the second quantization model in this specification also includes maximizing the gradient of the intermediate loss. When a neural network is trained by back-propagating gradients, whether the gradient of the loss is true and effective (the magnitude of the gradient is positively correlated with its effectiveness) largely determines the training effect. The process in this specification can therefore not only avoid the phenomenon that a true gradient cannot be determined due to binarization or ternarization, but can also increase the effectiveness of the gradient through the trained second quantization model, improving the subsequent adjustment of the pre-trained neural network.
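A hedged sketch of this two-part training target follows; the trade-off weight lam and the exact way the two terms are combined are assumptions, since the specification's loss formulas are given only as images:

    import torch

    def quantizer_training_objective(intermediate_loss, network_params, lam=0.1):
        # Minimize the intermediate loss while maximizing the norm of its
        # gradient: an optimizer that minimizes this objective pushes the
        # gradient norm up. create_graph=True lets us backpropagate through
        # the gradient computation into the quantizer's parameters.
        grads = torch.autograd.grad(intermediate_loss, network_params,
                                    create_graph=True)
        grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads))
        return intermediate_loss - lam * grad_norm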
S110: quantizing at least part of the parameters of the pre-trained neural network with the trained second quantization model, to obtain a second pending neural network.
Specifically, for at least part of the parameters of the pre-trained neural network, each parameter may be input into the trained second quantization model to obtain the quantized parameter it outputs. The quantized parameters are then used as the parameters of the pre-trained neural network, to obtain the second pending neural network.
The parameters quantized by the trained second quantization model in this step may be at least one of the weights of the pre-trained neural network and the activation values generated when the second pending neural network performs data processing.
Optionally, if the parameters quantized by the trained second quantization model include activation values generated by the pre-trained neural network, then in this step, while the second pending neural network performs data processing, at least part of the activation values it generates may be quantized with the trained second quantization model.
In addition, the second pending neural network obtained in this step still retains all the parameters of the pre-trained neural network; during forward propagation, the second pending neural network performs the corresponding calculations with its quantized parameters.
S112: according to the preset sample, adjusting the second pending neural network with minimization of its loss when processing the sample as the training target, to obtain a third pending neural network.
The trained second quantization model obtained through the above training steps has better quantization capability, which is embodied in the following: the trained second quantization model is closer to the first quantization model in its quantization effect on the parameters; the loss caused by quantizing the pre-trained neural network with the trained second quantization model is lower; and quantizing the pre-trained neural network with the trained second quantization model ensures that the loss of the pre-trained neural network during data processing has an effective and true gradient.
Then, in the adjustment (training) of the second pending neural network in this step, on the one hand, the negative influence caused by the quantization of the trained second quantization model can be reduced; on the other hand, because the trained second quantization model causes no loss or serious distortion of the gradient during the adjustment, a third pending neural network with better data processing capability can be obtained; furthermore, the second pending neural network also converges faster during this adjustment.
Specifically, the adjustment process may be: inputting the preset sample into the second pending neural network to obtain the processing result it outputs, and determining the loss of the second pending neural network according to that processing result.
Then, it is judged whether the loss of the second pending neural network meets a preset condition. If so, the second pending neural network is taken as the third pending neural network. If not, at least part of the parameters of the second pending neural network are adjusted according to the determined loss, with minimization of that loss as the target (what is adjusted are the retained parameters of the pre-trained neural network in the second pending neural network, not the parameters obtained by quantization); the trained second quantization model is then used to quantize at least part of the parameters of the adjusted second pending neural network, and the process repeats until the loss of the second pending neural network meets the preset condition.
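A minimal sketch of this retained-parameter arrangement (the layer and variable names are ours, not the specification's): the full-precision weight is the trainable parameter, and the trained second quantization model, here any differentiable callable `quantizer`, is applied inside the forward pass, so the gradient flows through it back to the retained weight:

    import torch
    import torch.nn as nn

    class QuantizedLinear(nn.Module):
        # Retains the full-precision weight of the pre-trained network as its
        # parameter; the forward pass computes with the quantized weight only.
        def __init__(self, in_features: int, out_features: int, quantizer):
            super().__init__()
            self.weight = nn.Parameter(0.1 * torch.randn(out_features, in_features))
            self.quantizer = quantizer  # e.g. the trained second quantization model

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x @ self.quantizer(self.weight).t()

Adjusting such a network with an ordinary optimizer then updates the retained full-precision weights according to the true gradient of the loss, precisely because the quantizer is differentiable.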
S114: when data to be processed needs to be processed, quantizing the third pending neural network with the first quantization model to obtain a quantized neural network, and inputting the data to be processed into the quantized neural network to obtain the processing result of the data to be processed.
The parameters quantized by the first quantization model in this step may be at least one of the weights of the third pending neural network and the activation values generated when the quantized neural network performs data processing.
The process of quantizing the third pending neural network with the first quantization model to obtain the quantized neural network may specifically be: for at least part of the parameters of the third pending neural network, each parameter is input into the first quantization model to obtain the quantized parameter it outputs, and the quantized parameters are used as the parameters of the quantized neural network.
Still taking the foregoing scenario in which the unmanned vehicle determines a driving strategy from collected images as an example, the data to be processed may be a collected environment image, and the processing result may be the classification result of each dynamic obstacle in that image. Alternatively, the data to be processed may be the motion states of the obstacles in the environment where the unmanned vehicle is located, and the processing result may be a driving strategy.
The third pending neural network obtained through the preceding steps has good data processing capability, but at least part of its parameters may still be floating-point numbers.
When used online, computation on floating-point numbers consumes many resources and has high computational complexity. Compared with the third pending neural network, the quantized neural network has more compact parameters, higher data processing efficiency, and lower consumption of computing resources.
The following describes in detail the procedure of data processing described in this specification.
As can be seen from the foregoing, to achieve the technical purpose of the solutions in this specification, the neural network and the quantization models involved are each trained/adjusted at least once after the quantization models (including but not limited to the second quantization model) are designed.
"Training" and "design" are described separately below.
First, the training processes at least partly involved in the data processing of this specification.
The training processes that may be involved are now described in a preferred order of execution.
It should be noted that the training processes described below are not all indispensable to the data processing method in this specification, and the order in which they are described does not limit the order in which they may be executed.
1) First training (pre-training) for the neural network.
The pre-trained neural network obtained through this process has a certain data processing capability, which gives the subsequent training processes directionality and improves their training efficiency. This first training of the neural network may be performed before step S100.
Specifically, as shown in fig. 2 and fig. 3a, the pre-training process may be:
Firstly, acquiring a neural network to be trained and pre-training samples for pre-training the neural network.
Secondly, quantizing at least part of the parameters of the neural network to be trained with the first quantization model, to obtain a quantized neural network to be trained.
The parameters quantized by the first quantization model in this step may be at least one of the weights of the neural network to be trained and the activation values generated when the quantized neural network to be trained performs data processing.
Optionally, as shown in fig. 3a, during the pre-training of the neural network in this specification, the weights corresponding to each layer (e.g., w_k and w_{k+1}) and the activation values output by each layer (e.g., a_k and a_{k+1}) are all quantized.
Thirdly, for each pre-training sample, inputting the pre-training sample into the neural network to be trained, to obtain the output of the neural network to be trained.
Fourthly, determining the loss of the neural network to be trained according to the output of the neural network to be trained and the label corresponding to the pre-training sample.
Fifthly, with minimization of the loss of the neural network to be trained as the target, adjusting at least part of the parameters of the neural network to be trained, to obtain an adjusted neural network to be trained.
In this step, at least part of the parameters of the neural network to be trained may be adjusted by back-propagating the gradient of the loss of the neural network to be trained.
In an alternative embodiment of this specification, the first quantization model may be a sign function, and the gradient of the loss of the neural network to be trained may be obtained by straight-through estimation (STE).
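A minimal sketch of sign quantization with straight-through estimation; the clipping window [-1, 1] in the backward pass is the customary choice and is assumed here:

    import torch

    class SignSTE(torch.autograd.Function):
        # Forward: hard sign quantization (the first quantization model here is
        # a sign function). Backward: sign() has zero gradient almost everywhere,
        # so the straight-through estimator passes the incoming gradient through
        # unchanged wherever |x| <= 1 and blocks it elsewhere.
        @staticmethod
        def forward(ctx, x):
            ctx.save_for_backward(x)
            return torch.sign(x)

        @staticmethod
        def backward(ctx, grad_output):
            (x,) = ctx.saved_tensors
            return grad_output * (x.abs() <= 1).to(grad_output.dtype)

A quantized weight is then obtained as w_q = SignSTE.apply(w).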
Sixthly, judging whether the adjusted neural network to be trained meets a preset pre-training condition. If so, the adjusted neural network to be trained is determined as the pre-trained neural network. If not, the first quantization model is used to quantize at least part of the parameters of the adjusted neural network to be trained, and training of the quantized, adjusted neural network to be trained continues according to the pre-training samples until the adjusted neural network to be trained meets the preset pre-training condition.
Optionally, the pre-training samples used for pre-training the neural network in this step may serve as the preset samples for the other training processes in this specification.
In addition, the pre-training process for the neural network is not an indispensable part of this specification.
2) First training for the second quantization model.
In step S108, the training of the second quantization model determines, to some extent, the effect of the finally obtained quantized neural network in use. As shown in fig. 2 and fig. 3b, the process of training the second quantization model in this specification may be:
Firstly, inputting the sample into the pre-trained neural network, to obtain the processing result output by the pre-trained neural network as a reference result.
Secondly, determining the reference loss of the pre-trained neural network when processing the sample, according to the reference result and the label corresponding to the sample.
Since the pre-trained neural network has not been quantized at this point, the reference loss can represent the original data processing capability of the pre-trained neural network before quantization.
Thirdly, determining, according to the reference loss and the intermediate loss, the quantization loss caused by quantizing the pre-trained neural network with the second quantization model to be trained.
Optionally, as shown in fig. 3b, in the first training process for the second quantization model in this specification, the weights corresponding to each layer (e.g., w_k and w_{k+1}) and the activation values output by each layer (e.g., a_k and a_{k+1}) are quantized.
As can be seen from the foregoing, the intermediate loss obtained in step S106 includes two parts: a loss due to the data processing capability of the pre-trained neural network itself (i.e., the reference loss), and a loss due to the quantization of the second quantization model to be trained (i.e., the quantization loss). The quantization loss can therefore be calculated from the intermediate loss and the reference loss.
In an alternative embodiment of this specification, the process of determining the quantization loss may be: determining the gradient of the reference loss according to the reference loss; determining the gradient of the intermediate loss according to the obtained intermediate loss; determining the difference between the gradient of the reference loss and the gradient of the intermediate loss; and determining, according to the difference and the gradient of the intermediate loss, the quantization loss caused by quantizing the neural network with the second quantization model to be trained.
Ideally, the trained second quantization model does not affect the data processing capability of the pre-trained neural network at all, so the difference should be 0; meanwhile, a larger gradient of the intermediate loss helps ensure the effect of the subsequent training of the neural network. Based on this idea, the quantization loss in this specification is positively correlated with the difference and negatively correlated with the gradient of the intermediate loss.
The step of determining the gradient of the reference loss and the step of determining the gradient of the intermediate loss may be executed in any order.
For example, the quantization loss min J1 may be defined as:

    [formula (1), shown as an image in the original]

in the formula: f is a preset loss function, for example a cross-entropy loss function, used to determine the difference between the output of the pre-trained neural network and the label; Δf_r is the gradient of the intermediate loss, and Δf is the gradient of the reference loss; ||Δf_r||_2 is the two-norm of the gradient of the intermediate loss; ||Δf - Δf_r||_2 is the two-norm of the difference between the gradient of the reference loss and the gradient of the intermediate loss; and q is a hyperparameter, which can be obtained empirically.
As another example, the quantization loss min J1 of the pre-trained neural network may also be defined as:

    [formula (2), shown as an image in the original]

in the formula: x is a sample input to the pre-trained neural network.
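Since formulas (1) and (2) survive only as images, the sketch below encodes just the stated relationships (the quantization loss grows with ||Δf - Δf_r||_2 and shrinks as ||Δf_r||_2 grows) in an assumed ratio form; it is not the specification's exact formula:

    import torch

    def quantization_loss(grad_ref, grad_mid, q: float = 1.0) -> torch.Tensor:
        # grad_ref: gradient tensors of the reference loss (Δf);
        # grad_mid: gradient tensors of the intermediate loss (Δf_r);
        # q: the empirically chosen hyperparameter.
        diff = torch.cat([(gr - gm).flatten()
                          for gr, gm in zip(grad_ref, grad_mid)])
        mid = torch.cat([gm.flatten() for gm in grad_mid])
        # Positively correlated with ||Δf - Δf_r||_2, negatively with ||Δf_r||_2.
        return diff.norm(2) / (mid.norm(2) + q)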
Fourthly, training the second quantization model to be trained with minimization of the quantization loss and maximization of the gradient of the intermediate loss as the training target, to obtain the trained second quantization model.
Specifically, the training process may be: with minimization of the quantization loss and maximization of the gradient of the intermediate loss as the training target, adjusting at least part of the parameters of the second quantization model according to the determined quantization loss and the determined gradient of the intermediate loss, to obtain an adjusted second quantization model.
Then, it is judged whether the adjusted second quantization model meets a preset training condition for the second quantization model. If so, the adjusted second quantization model is taken as the trained second quantization model. If not, the adjusted second quantization model is used to quantize the pre-trained neural network to obtain a re-determined first pending neural network, and training of the adjusted second quantization model continues according to the re-determined first pending neural network and the sample.
For example, if the loss of the first pending neural network is less than a preset threshold, the second quantization model meets the preset training condition for the second quantization model.
3) Second training for the neural network.
Optionally, corresponding to step S112, the second training of the neural network may be performed on the basis of the trained second quantization model.
Specifically, as shown in fig. 2 and fig. 3c, the process of the second training of the neural network in this specification may be:
Firstly, quantizing at least part of the parameters of the pre-trained neural network with the trained second quantization model, to obtain a second pending neural network.
Secondly, inputting the preset sample into the second pending neural network, to obtain the processing result output by the second pending neural network.
Thirdly, determining the loss of the second pending neural network when processing the sample, according to the processing result output by the second pending neural network and the label corresponding to the sample.
Optionally, as shown in fig. 3c, in the second training process for the neural network in this specification, the weights corresponding to each layer (e.g., w_k and w_{k+1}) and the activation values output by each layer (e.g., a_k and a_{k+1}) are quantized.
In an alternative embodiment of the present specification, the loss min J2 of the second pending neural network when processing the sample may be defined as:

    min J2 = f(x)    formula (3)

In another alternative embodiment of the present specification, the loss min J2 of the second pending neural network when processing the sample may be defined as:

    [formula (4), shown as an image in the original]

    [formula (5), shown as an image in the original]

in the formula: P(Δf_r) is the efficiency improvement of back propagation during the training of the second pending neural network.
Fourthly, adjusting the second pending neural network with minimization of its loss when processing the sample as the target, to obtain an intermediate second pending neural network.
Fifthly, judging whether the intermediate second pending neural network meets a preset condition.
If so, the intermediate second pending neural network is determined as the third pending neural network; otherwise, the trained second quantization model is used to quantize at least part of the parameters of the intermediate second pending neural network to obtain a quantized intermediate second pending neural network, and training of the intermediate second pending neural network continues according to the preset sample.
Through this second training of the neural network, a third pending neural network that matches the trained second quantization model well can be obtained by adjusting at least part of the parameters of the pre-trained neural network. As can be seen from the foregoing, the quantization effect of the trained second quantization model is, to a certain extent, equivalent to that of the first quantization model, so this second training also makes the obtained third pending neural network match the first quantization model better.
4) Third training for the neural network, and second training for the second quantization model.
Optionally, corresponding to step S112, the third training of the neural network may be performed on the basis of the trained second quantization model, and the second training of the second quantization model is training of the trained second quantization model.
Specifically, the process of the third training of the neural network and the second training of the second quantization model may be:
Firstly, quantizing at least part of the parameters of the pre-trained neural network with the trained second quantization model, to obtain a second pending neural network.
Secondly, inputting the preset sample into the second pending neural network, to obtain the processing result output by the second pending neural network.
Thirdly, determining the loss of the second pending neural network when processing the sample, according to the processing result output by the second pending neural network and the label corresponding to the sample.
In an alternative embodiment of the present specification, the loss min J3 of the second pending neural network when processing the sample may be defined as:

    [formula, shown as an image in the original]
Fourthly, adjusting the trained second quantization model and the second pending neural network with minimization of the loss of the second pending neural network when processing the sample as the target, to obtain an intermediate second quantization model and an intermediate second pending neural network.
Fifthly, judging whether the intermediate second pending neural network meets a preset condition.
If so, the intermediate second pending neural network is determined as the third pending neural network; otherwise, the intermediate second quantization model is used to quantize at least part of the parameters of the intermediate second pending neural network to obtain a quantized intermediate second pending neural network, and training of the intermediate second quantization model and the intermediate second pending neural network continues according to the quantized intermediate second pending neural network.
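A minimal sketch of one step of this joint adjustment, assuming, as in the QuantizedLinear sketch above, that the network calls the quantizer inside its forward pass so one optimizer can update both sets of parameters:

    import torch

    def joint_adjustment_step(network, quantizer, x, label, optimizer, loss_fn):
        # One step of "third training for the neural network + second training
        # for the second quantization model": the loss of the second pending
        # neural network on the sample drives updates of both the retained
        # network parameters and the quantizer's trainable parameters.
        loss = loss_fn(network(x), label)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

Here the optimizer may be built over both parameter sets, e.g. torch.optim.Adam(list(network.parameters()) + list(quantizer.parameters())).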
The third pending neural network obtained through at least part of the above training processes has good data processing capability. When the third pending neural network is quantized by the first quantization model or the trained second quantization model, the data processing capability of the resulting quantized network is not significantly affected by the quantization, which ensures its data processing effect in online use.
Optionally, the process "second training for the neural network" and the process "third training for the neural network, together with the second training for the second quantization model" may be selected between according to the actual usage scenario, or may be performed in a preset order.
Secondly, the design of at least part of the quantization models involved in the data processing of this specification.
Since the process in this specification involves the design of quantization models, mainly the second quantization model, the parts that the second quantization model may involve are described separately below.
1) Design of the model that quantizes the weights of the neural network in the second quantization model.
As can be seen from the foregoing, the second quantization model has at least a quantization function for at least part of the parameters of the neural network, including but not limited to its weights. The "neural network" described here may be at least one of the pre-trained neural network and the second pending neural network.
In an alternative embodiment of this specification, the second quantization model includes: a first quantization sub-model for quantizing at least part of the weights of the neural network, as shown in fig. 3b and 3c. The process of quantizing the neural network with the second quantization model may be a process of quantizing at least part of the parameters of the neural network with the first quantization sub-model.
Optionally, the neural network in this specification is a deep convolutional neural network; when it is used to solve multi-classification or regression problems, it may include a cascade of a convolutional network and a fully-connected network, and the first quantization sub-model is used to quantize at least part of the weights of the convolutional network.
As can be seen from the foregoing, the first quantization sub-model belonging to the second quantization model plays an important role in the training process for the second pending neural network. It is not only functionally similar to the first quantization model in quantizing the weights, but also ensures the validity and authenticity of the gradient during back propagation through the neural network.
Optionally, the first quantization sub-model includes at least one weight quantization unit, such as the k-th weight quantization unit shown in fig. 3b and 3c; the weight quantization units correspond one-to-one to the weights of the neural network.
Taking quantizing at least part of the parameters of the pre-trained neural network with the second quantization model to be trained to obtain the first pending neural network as an example, the process may specifically be: for at least part of the weights of the pre-trained neural network, each weight is input into the weight quantization unit corresponding to that weight, to obtain the quantized weight output by that unit.
In an alternative embodiment of the present specification, the first quantization sub-model Q_w can be characterized by the following formula:

    [formula (6), shown as an image in the original]

in the formula: w_i is the weight corresponding to the output of the i-th layer of the neural network; the unit in the first quantization sub-model that quantizes the weight corresponding to the output of the i-th layer is hereinafter referred to as the i-th weight quantization unit; N is the total number of layers whose weights need to be quantized; and the quantization equation of the i-th weight quantization unit contains a parameter related to the value of i, which can be obtained through training.
Specifically, the weight w_i corresponding to the output of the i-th layer of the neural network may be input into formula (6) to obtain the quantization result of w_i. Further, the quantization equation of the i-th weight quantization unit may take any one of formula (7) to formula (9):

    [formula (7), shown as an image in the original]

    [formula (8), shown as an image in the original]

    [formula (9), shown as an image in the original]

in the formulas: the parameters related to the value of i in the quantization equations can be obtained through training.
Optionally, these parameters may be obtained through the aforementioned "first training for the second quantization model" and/or "second training for the second quantization model". Specifically, during the aforementioned training for the second quantization model, the parameters may be adjusted according to the obtained loss until the corresponding conditions are met.
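Formulas (6) to (9) are not recoverable from the text, so the per-layer quantization unit below uses an assumed tanh form; it is meant to illustrate only the stated structure, namely one unit per layer, each with its own trainable parameter:

    import torch
    import torch.nn as nn

    class WeightQuantUnit(nn.Module):
        # The i-th weight quantization unit: differentiable, with a trainable
        # parameter related to the value of i (the tanh form is an assumption).
        def __init__(self):
            super().__init__()
            self.alpha = nn.Parameter(torch.tensor(1.0))

        def forward(self, w_i: torch.Tensor) -> torch.Tensor:
            return torch.tanh(self.alpha * w_i)

    class WeightQuantizer(nn.Module):
        # Q_w: one quantization unit for each of the N layers whose weights
        # are quantized; unit i quantizes the weight of layer i.
        def __init__(self, num_layers: int):
            super().__init__()
            self.units = nn.ModuleList(WeightQuantUnit() for _ in range(num_layers))

        def forward(self, i: int, w_i: torch.Tensor) -> torch.Tensor:
            return self.units[i](w_i)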
2) Design of the model that quantizes the activation values generated by the neural network in the second quantization model.
In an alternative embodiment of this specification, the second quantization model includes: a second quantization sub-model for quantizing the activation values generated by the neural network, as shown in fig. 3b and 3c.
Taking quantizing at least part of the activation values generated by the first pending neural network with the second quantization model to be trained to obtain quantized activation values as an example, the process may specifically be:
Firstly, inputting the sample into the first pending neural network.
Secondly, for each layer of the first pending neural network, quantizing the activation value output by that layer with the second quantization sub-model, to obtain the quantized activation value output by that layer.
Optionally, the second quantization sub-model includes at least one activation value quantization unit, such as the k-th activation value quantization unit shown in fig. 3b and 3c; the activation value quantization units correspond one-to-one to the layers of the neural network. The process of quantizing the activation values generated by the first pending neural network with the second quantization sub-model may be: for each layer of the first pending neural network, inputting the activation value generated by that layer into the activation value quantization unit corresponding to that layer, to obtain the quantized activation value output by that unit.
Thirdly, obtaining the processing result output by the first pending neural network according to the quantized activation values output by the layers.
In an alternative embodiment of the present description, the second quantization model QaIt can be characterized by the following formula:
Figure BDA0002490758620000162
in the formula: a isjIs the activation value corresponding to the j layer output end of the neural network;
Figure BDA0002490758620000163
an active value quantization unit (hereinafter referred to as j-th active value quantization unit) for quantizing an active value corresponding to an output end of a j-th layer of the neural network in the quantization second sub-model; m is the total number of layers that the activation value of the neural network needs to be quantized;
Figure BDA0002490758620000164
is a parameter related to the value of j and can be obtained through training;
Figure BDA0002490758620000165
is the quantization equation of the jth activation value quantization unit.
Corresponding activation value a to the output end of the j layer of the neural networkjInput into equation (10) to obtain the activation value ajResult of quantization
Figure BDA0002490758620000166
Further, the quantization equation of the jth activation value quantization unit
Figure BDA0002490758620000167
Any one of formula (11) to formula (13) may be used.
[Equations (11) to (13), given as images in the original]
In these formulas, the remaining quantities are parameters of the quantization equations that are related to the value of j and can be obtained through training.
Optionally, these parameters may be obtained through the aforementioned "first training for the second quantization model" and/or "second training for the second quantization model". Specifically, during the aforementioned training for the second quantization model, the parameters may be adjusted according to the obtained loss until the respective conditions are met.
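Since formulas (11) to (13) appear only as images, the following is a minimal sketch of one plausible form such a unit could take: a smooth, trainable surrogate quantizer whose parameters (here alpha and beta, both assumed names) are learned from the loss as described above. It is not the patent's actual equation.

```python
import torch
import torch.nn as nn

class SoftActQuantizer(nn.Module):
    """Hypothetical j-th activation value quantization unit: a smooth,
    differentiable stand-in for hard quantization whose parameters are
    obtained through training. Not the patent's actual equations."""

    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(1.0))  # output scale, trained
        self.beta = nn.Parameter(torch.tensor(1.0))   # steepness, trained

    def forward(self, a_j):
        # Differentiable everywhere, so the training loss keeps a usable
        # gradient with respect to both a_j and the unit's parameters.
        return self.alpha * torch.tanh(self.beta * a_j)
```

In a sketch of this kind, backpropagation on either training loss updates alpha and beta until the stopping conditions are met.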
Through at least some of the above training steps, a third to-be-determined neural network can be obtained that still has good data processing capability after being quantized by the trained second quantization model. Because the training process for the second quantization model makes its quantization effect equivalent to that of the first quantization model, quantizing the third to-be-determined neural network with the first quantization model likewise yields a quantized neural network with good data processing capability.
Optionally, the quantized neural network may be either a Binary Neural Network (BNN) or a Ternary Neural Network (TNN).
In an alternative embodiment of the present specification, the first quantization model used to quantize the third to-be-determined neural network may include at least one of a weight quantization submodel and an activation value quantization submodel. The weight quantization submodel is used to quantize at least part of the weights of the third to-be-determined neural network; the activation value quantization submodel is used to quantize at least part of its activation values.
Taking a binary neural network as an example of the quantized neural network, the weight quantization submodel P_w it adopts can be characterized by the following equations (14) and (15):
[Equations (14) and (15), given as images in the original]
In the formulas: the weight binarization unit is the component of the weight quantization submodel that binarizes the weight corresponding to the output end of the i-th layer of the third to-be-determined neural network; the remaining symbol is that weight itself.
Specifically, the weight corresponding to the output end of the i-th layer of the third to-be-determined neural network may be input into the weight binarization unit shown in formula (15) to obtain the quantized result of that weight.
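Formulas (14) and (15) are likewise given as images. As a sketch, under the assumption that the binarization unit follows the common sign-with-scaling scheme used in binary neural networks (the scaling choice is an assumption, not the patent's formula), the weight binarization might look like:

```python
import torch

def binarize_weight(w_i):
    """Assumed weight binarization unit for a binary neural network:
    sign of each weight, scaled by the mean absolute weight so the
    binarized tensor keeps roughly the original magnitude."""
    scale = w_i.abs().mean()        # per-tensor scaling factor (assumption)
    return scale * torch.sign(w_i)  # each entry mapped to {-scale, +scale}
```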
Further, the activation value quantization submodel P_a adopted by the quantized neural network can be characterized by the following equations (16) and (17):
[Equations (16) and (17), given as images in the original]
In the formulas: the activation value binarization unit is the component of the activation value quantization submodel that binarizes the activation value corresponding to the output end of the j-th layer of the third to-be-determined neural network; the remaining symbol is that activation value itself.
Specifically, the activation value at the output end of the j-th layer of the third to-be-determined neural network may be input into the activation value binarization unit shown in formula (17) to obtain the quantized result of that activation value.
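Putting the two submodels together, a deployment-time application of the first quantization model to the third to-be-determined neural network could be sketched as follows; the helper name and the sign-based binarization are assumptions for illustration.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def quantize_for_deployment(model):
    """Sketch: binarize the weights of each linear/convolutional layer in
    place; activation binarization would then be applied to each layer's
    output at inference time, in the manner of formulas (16) and (17)."""
    for m in model.modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            scale = m.weight.abs().mean()
            m.weight.copy_(scale * torch.sign(m.weight))
    return model
```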
As can be seen, through the data processing procedure in the embodiments of this specification, the second quantization model is trained so that the trained second quantization model has a quantization effect equivalent to that of the first quantization model. Quantizing at least part of the parameters of the neural network with the trained second quantization model ensures that the loss obtained while adjusting (training) the neural network retains a usable gradient, which avoids the phenomenon in which no gradient can be obtained due to quantization when the neural network is adjusted. In addition, the true loss of the neural network can be obtained during adjustment, and the neural network adjusted according to this true loss also has good data processing capability, which helps ensure the accuracy of the data processing results.
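The gradient point can be checked concretely: a hard quantizer such as sign() has zero gradient almost everywhere, while a smooth surrogate of the kind the trained second quantization model provides does not. The steepness constant 4.0 below is an arbitrary illustration.

```python
import torch

w = torch.tensor([0.3, -0.7], requires_grad=True)

# Hard quantization: the derivative of sign() is zero wherever it is
# defined, so the loss gives the weights no training signal.
torch.sign(w).sum().backward()
print(w.grad)  # tensor([0., 0.]) -- no usable gradient

# Smooth surrogate: the gradient is nonzero, so adjustment can proceed.
w.grad = None
torch.tanh(4.0 * w).sum().backward()
print(w.grad)  # nonzero values
```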
That is, the parameters and the data processing complexity of the quantized neural network obtained by the method and apparatus in this specification can be effectively compressed without significantly affecting its data processing capability or results. The data processing procedure in this specification is therefore well suited to hardware environments with limited storage space and demanding requirements on computing performance and quality, such as unmanned vehicles and intelligent terminals.
Optionally, when the pre-trained neural network in this specification is a deep neural network, at least part of the parameters of the neural network may be normalized using Batch Normalization (Batch Norm) during the training process.
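For instance, in a PyTorch-style definition, batch normalization is inserted between a layer and its activation; the layer sizes below are arbitrary.

```python
import torch.nn as nn

# Arbitrary illustrative block: convolution followed by Batch Norm over
# the 16 feature channels, then the nonlinearity.
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)
```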
The data processing procedure provided in this specification can be applied in the field of unmanned-vehicle delivery, for example in scenarios such as express and takeaway delivery performed by unmanned vehicles. Specifically, in such scenarios, delivery may be performed using an autonomous vehicle fleet composed of multiple unmanned vehicles.
Based on the same idea, the embodiments of the present specification further provide a data processing apparatus corresponding to the process shown in fig. 1, and the data processing apparatus is shown in fig. 4.
Fig. 4 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present specification, where the data processing apparatus includes:
a preparation module 400, configured to determine a pre-trained neural network, a preset sample, and a first quantization model;
a first to-be-determined neural network determining module 402, configured to quantize at least part of the parameters of the pre-trained neural network by using a second quantization model to be trained, so as to obtain a first to-be-determined neural network; the second quantization model to be trained is generated according to the first quantization model;
an intermediate result determining module 404, configured to input the sample into the first to-be-determined neural network and obtain the processing result output by the first to-be-determined neural network as an intermediate result;
an intermediate loss determining module 406, configured to determine, according to the intermediate result and the label corresponding to the sample, the loss of the first to-be-determined neural network as an intermediate loss;
a second quantization model training module 408, configured to train the second quantization model to be trained to obtain a trained second quantization model, with the intermediate loss minimized and the gradient of the intermediate loss maximized as a training target;
a second undetermined neural network determining module 410, configured to quantize at least part of the parameters of the pre-trained neural network by using the trained second quantization model, so as to obtain a second undetermined neural network;
a third undetermined neural network determining module 412, configured to adjust the second undetermined neural network to obtain a third undetermined neural network according to the preset sample, with a minimum loss in processing the sample by using the second undetermined neural network as a training target;
and the data processing module 414 is configured to, when data to be processed needs to be processed, quantize the third to-be-determined neural network by using the first quantization model to obtain a quantized neural network, and input the data to be processed into the quantized neural network to obtain a processing result of the data to be processed.
Optionally, the first to-be-determined neural network determining module 402 may include: and a weight quantization submodule.
And the weight quantization submodule is used for quantizing the weights of the pre-trained neural network by adopting the quantized first submodel to be trained, to obtain a weight-quantized neural network as the first to-be-determined neural network.
Optionally, the intermediate result determining module 404 may include an activation value quantization submodule. The activation value quantization submodule may include: an input unit, an activation value quantization unit, and a first processing result generation unit.
An input unit for inputting the sample into the first to-be-determined neural network.
And the activation value quantization unit is used for quantizing, for each layer of the first to-be-determined neural network, the activation value output by that layer by adopting the quantized second submodel, to obtain the quantized activation value output by that layer.
And the first processing result generation unit is used for obtaining the processing result output by the first to-be-determined neural network according to the quantized activation values output by each layer.
Optionally, the second quantization model training module 408 may include: a reference result determining submodule, a reference loss determining submodule, a quantization loss determining submodule, and a second quantization model training submodule.
And the reference result determining submodule is used for inputting the sample into the pre-trained neural network to obtain a processing result output by the pre-trained neural network as a reference result.
And the reference loss determining submodule is used for determining the reference loss when the pre-trained neural network processes the sample according to the reference result and the label corresponding to the sample.
And the quantization loss determining submodule is used for determining the quantization loss caused by quantizing the pre-trained neural network through the second quantization model to be trained according to the reference loss and the intermediate loss.
And the second quantization model training submodule is used for training the second quantization model to be trained by taking the minimization of the quantization loss and the maximization of the gradient of the intermediate loss as a training target to obtain the trained second quantization model.
Optionally, the quantization loss determination sub-module may include: a reference loss gradient determination unit, a difference determination unit and a quantization loss determination unit.
And the reference loss gradient determining unit is used for determining the gradient of the reference loss according to the reference loss.
A difference determination unit for determining a difference between the gradient of the reference loss and the gradient of the intermediate loss.
And the quantization loss determining unit is used for determining the quantization loss of the neural network quantized by the quantization model to be trained according to the difference and the gradient of the intermediate loss, wherein the quantization loss is positively correlated with the difference, and the quantization loss is negatively correlated with the gradient of the intermediate loss.
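As a sketch, one functional form realizing these correlations might be the following; it assumes both losses are computed on the same underlying weight tensors, and the ratio form and epsilon are chosen purely for illustration, not taken from the patent.

```python
import torch

def quantization_loss(reference_loss, intermediate_loss, params, eps=1e-8):
    """Sketch: grows with the difference between the gradients of the
    reference loss and the intermediate loss, and shrinks as the gradient
    of the intermediate loss grows. The ratio form and eps are assumptions."""
    ref_grads = torch.autograd.grad(reference_loss, params,
                                    retain_graph=True, allow_unused=True)
    mid_grads = torch.autograd.grad(intermediate_loss, params,
                                    retain_graph=True, create_graph=True,
                                    allow_unused=True)
    diff = mid_norm = 0.0
    for g_r, g_m in zip(ref_grads, mid_grads):
        if g_r is None or g_m is None:
            continue
        diff = diff + (g_r - g_m).pow(2).sum()  # gradient difference term
        mid_norm = mid_norm + g_m.pow(2).sum()  # intermediate-loss gradient
    return diff / (mid_norm + eps)  # +corr with diff, -corr with gradient
```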
Optionally, the third pending neural network determination module 412 may include: a second processing result generation submodule, a loss determination submodule, an adjustment submodule and a judgment submodule.
And the second processing result generation submodule is used for inputting the preset sample into the second undetermined neural network to obtain a processing result output by the second undetermined neural network.
And the loss determining submodule is used for determining the loss of the second undetermined neural network when the second undetermined neural network processes the sample according to the processing result output by the second undetermined neural network and the label corresponding to the sample.
And the adjusting submodule is used for adjusting the trained second quantization model and the second undetermined neural network to obtain an intermediate second quantization model and an intermediate second undetermined neural network by taking the minimization of the loss of the second undetermined neural network in processing the sample as a target.
The judgment submodule is used for judging whether the intermediate second undetermined neural network meets a preset condition; if so, the intermediate second undetermined neural network is determined to be the third undetermined neural network; otherwise, at least part of the parameters of the intermediate second undetermined neural network are quantized using the intermediate second quantization model to obtain a quantized intermediate second undetermined neural network, and training of the intermediate second quantization model and the intermediate second undetermined neural network continues according to the quantized intermediate second undetermined neural network.
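This alternating adjustment might be organized as in the following sketch; every name here (the callables, the condition check) is illustrative rather than the patent's API.

```python
def adjust_until_condition(quant_model, network, samples, labels,
                           loss_fn, optimizer, meets_condition):
    """Sketch of the alternating adjustment: quantize, measure the loss on
    the sample, update both the quantization model and the network, and
    repeat until the preset condition holds."""
    while True:
        quantized_net = quant_model(network)  # intermediate quantization
        loss = loss_fn(quantized_net(samples), labels)
        optimizer.zero_grad()
        loss.backward()   # the smooth quantizer keeps this gradient usable
        optimizer.step()  # adjusts quant_model and network jointly
        if meets_condition(network):
            return network  # the third undetermined neural network
```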
Optionally, the data processing apparatus in this specification may further include a pre-training module. The pre-training module may include: the device comprises an acquisition submodule, a quantization submodule and a pre-training submodule.
And the acquisition submodule is used for acquiring the neural network to be trained and a pre-training sample for pre-training the neural network.
And the quantization submodule is used for quantizing at least part of parameters of the neural network to be trained by adopting the first quantization model to obtain the quantized neural network to be trained.
And the pre-training sub-module is used for training the quantified neural network to be trained according to the pre-training sample to obtain the pre-trained neural network.
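The pre-training flow these submodules describe could be sketched as below; the function and argument names are illustrative only.

```python
def pretrain(network, first_quant_model, pretrain_loader, loss_fn,
             optimizer, epochs=10):
    """Sketch: quantize at least part of the network's parameters with the
    first quantization model, then train on the pre-training samples."""
    for _ in range(epochs):
        for samples, labels in pretrain_loader:
            quantized_net = first_quant_model(network)
            loss = loss_fn(quantized_net(samples), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    # The result serves as the pre-trained neural network; its training
    # samples double as the preset sample.
    return network
```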
Optionally, the preparation module 400 may include: a pre-trained neural network determining submodule and a preset sample determining submodule.
And the pre-trained neural network determining submodule is used for determining the neural network obtained through the above pre-training as the pre-trained neural network.
And the preset sample determining submodule is used for determining the pre-training sample as a preset sample.
Embodiments of the present specification also provide a computer-readable storage medium, which stores a computer program, and the computer program can be used to execute the process of data processing provided in fig. 1.
The embodiment of the present specification further provides a schematic structural diagram of the first electronic device shown in fig. 5. As shown in fig. 5, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, but may also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs the computer program to implement the data processing process described in fig. 1. Of course, besides the software implementation, the present specification does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may be hardware or logic devices.
In the 1990s, an improvement in a technology could be clearly distinguished as an improvement in hardware (for example, an improvement in circuit structures such as diodes, transistors, and switches) or an improvement in software (an improvement in a method flow). However, as technology has advanced, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement in a method flow cannot be realized by hardware entity modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. Designers program a digital system onto a single PLD by themselves, without needing a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, nowadays, instead of manually fabricating integrated circuit chips, this programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the original code to be compiled must be written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. Those skilled in the art will also appreciate that a hardware circuit implementing a logical method flow can easily be obtained merely by slightly logic-programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller; examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included within it for realizing the various functions may also be regarded as structures within the hardware component. Or, indeed, the means for realizing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functions of the various elements may be implemented in the same one or more software and/or hardware implementations of the present description.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.

Claims (10)

1. A method of data processing, the method comprising:
determining a pre-trained neural network, a preset sample and a first quantization model;
quantizing at least part of parameters of the pre-trained neural network by adopting a second quantization model to be trained to obtain a first to-be-determined neural network; the second quantization model to be trained is generated according to the first quantization model;
inputting the sample into the first to-be-determined neural network to obtain a processing result output by the first to-be-determined neural network as an intermediate result;
determining the loss of the first to-be-determined neural network as an intermediate loss according to the intermediate result and the label corresponding to the sample;
training the second quantization model to be trained by taking the minimization of the intermediate loss and the maximization of the gradient of the intermediate loss as a training target to obtain a trained second quantization model;
quantizing at least part of parameters of the pre-trained neural network by adopting the trained second quantization model to obtain a second undetermined neural network;
adjusting, according to the preset sample, the second undetermined neural network by taking minimization of the loss of the second undetermined neural network in processing the sample as a training target, to obtain a third undetermined neural network;
when data to be processed needs to be processed, quantizing the third to-be-determined neural network by adopting the first quantization model to obtain a quantized neural network, and inputting the data to be processed into the quantized neural network to obtain a processing result of the data to be processed.
2. The method of claim 1, wherein the second quantization model comprises: a quantized first sub-model for quantizing weights of the neural network;
quantizing at least part of parameters of the pre-trained neural network by adopting the second quantization model to be trained to obtain a first to-be-determined neural network specifically comprises:
quantizing the weights of the pre-trained neural network by adopting the quantized first submodel to be trained to obtain a weight-quantized neural network as the first to-be-determined neural network.
3. The method of claim 2, wherein the second quantization model comprises: a quantized second submodel for quantizing at least part of the activation values generated by the neural network;
inputting the sample into the first to-be-determined neural network to obtain a processing result output by the first to-be-determined neural network specifically comprises:
inputting the sample into the first to-be-determined neural network;
for each layer of the first to-be-determined neural network, quantizing the activation value output by that layer by adopting the quantized second submodel to obtain a quantized activation value output by that layer;
and obtaining the processing result output by the first to-be-determined neural network according to the quantized activation values output by each layer.
4. The method according to any one of claims 1 to 3, wherein the training of the second quantization model to be trained with the minimization of the intermediate loss and the maximization of the gradient of the intermediate loss as a training target to obtain a trained second quantization model comprises:
inputting the sample into the pre-trained neural network to obtain a processing result output by the pre-trained neural network as a reference result;
determining the reference loss of the pre-trained neural network when the pre-trained neural network processes the sample according to the reference result and the label corresponding to the sample;
according to the reference loss and the intermediate loss, determining quantization loss caused by quantizing the pre-trained neural network through the second quantization model to be trained;
and training the second quantization model to be trained by taking the quantization loss minimization and the intermediate loss gradient maximization as a training target to obtain the trained second quantization model.
5. The method of claim 4, wherein determining quantization losses due to quantization of the pre-trained neural network by the second quantization model to be trained based on the reference losses and the intermediate losses comprises:
determining a gradient of the reference loss according to the reference loss;
determining a difference between the gradient of the reference loss and the gradient of the intermediate loss;
and determining the quantization loss of the neural network quantized by the quantization model to be trained according to the difference and the gradient of the intermediate loss, wherein the quantization loss is positively correlated with the difference, and the quantization loss is negatively correlated with the gradient of the intermediate loss.
6. The method according to any one of claims 1 to 3, wherein the adjusting the second pending neural network to obtain a third pending neural network according to the preset sample with a minimum loss in processing the sample using the second pending neural network as a training target specifically comprises:
inputting the preset sample into the second undetermined neural network to obtain a processing result output by the second undetermined neural network;
determining the loss of the second pending neural network when the second pending neural network processes the sample according to the processing result output by the second pending neural network and the label corresponding to the sample;
adjusting the trained second quantization model and the second undetermined neural network to obtain an intermediate second quantization model and an intermediate second undetermined neural network by taking minimization of loss of the second undetermined neural network when the sample is processed as a target;
judging whether the middle second to-be-determined neural network meets a preset adjusting condition or not;
if so, determining the middle second undetermined neural network as a third undetermined neural network;
otherwise, quantizing at least part of parameters of the intermediate second to-be-determined neural network by adopting an intermediate second quantization model to obtain a quantized intermediate second to-be-determined neural network, and continuing to train the intermediate second quantization model and the intermediate second to-be-determined neural network according to the quantized intermediate second to-be-determined neural network.
7. The method of any one of claims 1-3, wherein obtaining the pre-trained neural network specifically comprises:
acquiring a neural network to be trained and a pre-training sample for pre-training the neural network;
quantizing at least part of parameters of the neural network to be trained by adopting the first quantization model to obtain a quantized neural network to be trained;
training the quantified neural network to be trained according to the pre-training sample to obtain a pre-trained neural network;
determining a pre-trained neural network and a preset sample, and specifically comprising the following steps:
determining the neural network obtained through the pre-training as the pre-trained neural network; and determining the pre-training sample as the preset sample.
8. A data processing apparatus, characterized in that the apparatus comprises:
a preparation module, used for determining a pre-trained neural network, a preset sample, and a first quantization model;
a first to-be-determined neural network determining module, used for quantizing at least part of parameters of the pre-trained neural network by adopting a second quantization model to be trained to obtain a first to-be-determined neural network; the second quantization model to be trained is generated according to the first quantization model;
an intermediate result determining module, used for inputting the sample into the first to-be-determined neural network and obtaining the processing result output by the first to-be-determined neural network as an intermediate result;
an intermediate loss determining module, used for determining the loss of the first to-be-determined neural network as an intermediate loss according to the intermediate result and the label corresponding to the sample;
a second quantization model training module, used for training the second quantization model to be trained by taking minimization of the intermediate loss and maximization of the gradient of the intermediate loss as a training target, to obtain a trained second quantization model;
the second undetermined neural network determining module is used for quantizing at least part of parameters of the pre-trained neural network by adopting the trained second quantization model to obtain a second undetermined neural network;
the third undetermined neural network determining module is used for adjusting the second undetermined neural network to obtain a third undetermined neural network according to the preset sample by taking the loss minimization of the second undetermined neural network when the sample is processed as a training target;
and a data processing module, used for, when data to be processed needs to be processed, quantizing the third to-be-determined neural network by adopting the first quantization model to obtain a quantized neural network, and inputting the data to be processed into the quantized neural network to obtain a processing result of the data to be processed.
9. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 1-7 when executing the program.
CN202010404466.3A 2020-05-13 2020-05-13 Data processing method and device Active CN111639745B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010404466.3A CN111639745B (en) 2020-05-13 2020-05-13 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010404466.3A CN111639745B (en) 2020-05-13 2020-05-13 Data processing method and device

Publications (2)

Publication Number Publication Date
CN111639745A true CN111639745A (en) 2020-09-08
CN111639745B CN111639745B (en) 2024-03-01

Family

ID=72332083

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010404466.3A Active CN111639745B (en) 2020-05-13 2020-05-13 Data processing method and device

Country Status (1)

Country Link
CN (1) CN111639745B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288032A (en) * 2020-11-18 2021-01-29 上海依图网络科技有限公司 Method and device for quantitative model training based on generation of confrontation network
CN113762403A (en) * 2021-09-14 2021-12-07 杭州海康威视数字技术股份有限公司 Image processing model quantization method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190340504A1 (en) * 2018-05-03 2019-11-07 Samsung Electronics Co., Ltd. Neural network method and apparatus
CN110443165A (en) * 2019-07-23 2019-11-12 北京迈格威科技有限公司 Neural network quantization method, image-recognizing method, device and computer equipment
CN110969251A (en) * 2019-11-28 2020-04-07 中国科学院自动化研究所 Neural network model quantification method and device based on label-free data

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190340504A1 (en) * 2018-05-03 2019-11-07 Samsung Electronics Co., Ltd. Neural network method and apparatus
CN110443165A (en) * 2019-07-23 2019-11-12 北京迈格威科技有限公司 Neural network quantization method, image-recognizing method, device and computer equipment
CN110969251A (en) * 2019-11-28 2020-04-07 中国科学院自动化研究所 Neural network model quantification method and device based on label-free data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YUHAN LIN, ET AL: "Optimization Strategies in Quantized Neural Networks: A Review", 2019 INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW) *
WU ZONGSHENG; FU WEIPING; HAN GAINING: "Road Scene Understanding Based on Deep Convolutional Neural Networks", Computer Engineering and Applications, no. 22 *
CHENG YUE; LIU ZHIGANG: "Traffic Sign Recognition Method Based on Lightweight Convolutional Neural Networks", Computer Systems & Applications, no. 02 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288032A (en) * 2020-11-18 2021-01-29 上海依图网络科技有限公司 Method and device for quantitative model training based on generation of confrontation network
CN112288032B (en) * 2020-11-18 2022-01-14 上海依图网络科技有限公司 Method and device for quantitative model training based on generation of confrontation network
CN113762403A (en) * 2021-09-14 2021-12-07 杭州海康威视数字技术股份有限公司 Image processing model quantization method and device, electronic equipment and storage medium
CN113762403B (en) * 2021-09-14 2023-09-05 杭州海康威视数字技术股份有限公司 Image processing model quantization method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111639745B (en) 2024-03-01

Similar Documents

Publication Publication Date Title
US20190236437A1 (en) Heterogeneous processor architecture for integrating cnn and rnn into single high-performance, low-power chip
CN111639745A (en) Data processing method and device
CN111160523B (en) Dynamic quantization method, system and medium based on characteristic value region
CN111126558A (en) Convolution neural network calculation acceleration method, device, equipment and medium
TW202141363A (en) Adaptive quantization for execution of machine learning models
CN114528924B (en) Image classification model reasoning method, device, equipment and medium
CN112200132A (en) Data processing method, device and equipment based on privacy protection
CN116304720B (en) Cost model training method and device, storage medium and electronic equipment
CN111639684B (en) Training method and device for data processing model
CN112766397B (en) Classification network and implementation method and device thereof
CN113988162A (en) Model training and image recognition method and device, storage medium and electronic equipment
CN117113174A (en) Model training method and device, storage medium and electronic equipment
CN115543945B (en) Model compression method and device, storage medium and electronic equipment
CN115409991B (en) Target identification method and device, electronic equipment and storage medium
CN113887719B (en) Model compression method and device
CN117649568B (en) Network compression method and device for image classification convolutional neural network
CN114120273A (en) Model training method and device
CN114416863A (en) Method, apparatus, and medium for performing model-based parallel distributed reasoning
CN117036869B (en) Model training method and device based on diversity and random strategy
CN117973481A (en) Model compression method and device, electronic equipment and storage medium
CN113204664B (en) Image clustering method and device
CN115545938B (en) Method, device, storage medium and equipment for executing risk identification service
CN117009729B (en) Data processing method and device based on softmax
CN117392694B (en) Data processing method, device and equipment
CN117786417B (en) Model training method, transient source identification method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant