CN114358280A - Data processing method and device, electronic equipment and computer readable storage medium - Google Patents


Info

Publication number
CN114358280A
Authority
CN
China
Prior art keywords
model
data model
original data
quantization
error
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111642847.6A
Other languages
Chinese (zh)
Inventor
邱亮
赖俊成
王继铭
邓帆
Current Assignee
Beijing Eswin Computing Technology Co Ltd
Original Assignee
Beijing Eswin Computing Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Eswin Computing Technology Co Ltd filed Critical Beijing Eswin Computing Technology Co Ltd
Priority to CN202111642847.6A
Publication of CN114358280A
Legal status: Pending

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The embodiments of the application provide a data processing method and device, an electronic device, and a computer-readable storage medium, relating to the field of computer technology. The method comprises the following steps: performing model acceleration processing on an original data model of an edge device to obtain a target data model, where the model acceleration processing comprises a model quantization operation and an activation function operator replacement operation, and the activation function operator replacement operation comprises replacing the activation function operator with a target parameter value recorded in a preset lookup table; and then processing data based on the target data model. According to the embodiments of the application, the original data model built into the edge device is improved and the improved target data model is used for data processing, which effectively reduces the computational complexity of the data model in the edge device, increases the computation speed of the data model, and thereby increases the data processing speed of the edge device.

Description

Data processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, an apparatus, an electronic device, and a computer-readable storage medium.
Background
With recent breakthroughs in deep learning, artificial intelligence applications and services have developed vigorously. Driven by artificial intelligence and the Internet of Things, the frontier of artificial intelligence needs to be pushed to the network edge to fully release the potential of edge big data. To address this trend, edge computing applies computation-intensive artificial intelligence on edge devices.
An edge device is a lightweight edge computing device with artificial intelligence computing capabilities. As algorithm models grow more complex and deployed applications multiply, the demands on the computing power of edge devices keep rising, and requirements on real-time performance, bandwidth utilization, processing capacity and the like must be met.
At present, the data processing efficiency of an edge device is improved by improving the algorithm model in the edge device. Existing improvement methods, such as model pruning, suffer a large precision loss and yield a poor acceleration effect on the algorithm model.
Disclosure of Invention
The purpose of the embodiments of the application is to solve the problem of large precision loss when improving an algorithm model in an edge device.
According to an aspect of an embodiment of the present application, there is provided a data processing method, including:
performing model acceleration processing on an original data model of an edge device to obtain a target data model; the model acceleration processing comprises a model quantization operation and an activation function operator replacement operation; the activation function operator replacement operation comprises replacing the activation function operator with a target parameter value recorded in a preset lookup table;
and processing data based on the target data model.
Optionally, performing model acceleration processing on the original data model of the edge device to obtain a target data model, including:
performing model quantization operation on the weight parameters in the original data model of the edge device to obtain a quantized data model;
and determining a target parameter value corresponding to an activation function operator in the quantized data model according to the lookup table, and executing an activation function operator replacement operation on the activation function operator to obtain a target data model.
Optionally, before performing a model quantization operation on the weight parameters in the original data model of the edge device to obtain a quantized data model, the method further includes:
carrying out model pruning operation on the original data model to obtain a compressed original data model;
performing model quantization operation on the weight parameters in the original data model of the edge device to obtain a quantized data model, wherein the model quantization operation comprises the following steps:
and carrying out model quantization operation on the weight parameters in the compressed original data model to obtain a quantized data model.
Optionally, performing model pruning on the original data model to obtain a compressed original data model, including:
performing sparse training on the original data model, and cutting channels below a preset threshold value in the original data model according to the result of the sparse training;
and recovering the precision value of the cut original data model to obtain a compressed original data model.
Optionally, performing sparse training on the original data model, and cutting a channel in the original data model, which is lower than a preset threshold, according to a result of the sparse training, includes:
introducing a scale factor into each channel of a batch normalization (BN) layer of the original data model, applying an L1 regularization constraint to the scale factor, and performing sparse training; wherein the scale factor comprises the weight of the channel;
and sequencing the scale factors after sparse training, and cutting channels corresponding to the scale factors lower than a preset threshold value.
Optionally, performing model quantization operation on the weight parameter in the original data model of the edge device to obtain a quantized data model, including:
converting the precision of the weight parameters in the original data model into target precision, and performing model quantization operation;
and determining a quantization error and a truncation error corresponding to the model quantization operation, and performing error correction operation based on the quantization error and the truncation error to obtain a quantized data model.
Optionally, determining a quantization error and a truncation error corresponding to the model quantization operation, and performing an error correction operation based on the quantization error and the truncation error, including:
calculating the difference of the means and the quotient of the variances of the weight parameters before and after the model quantization operation;
and updating the weight value of the weight parameter according to the difference and the quotient, and performing an error correction operation based on the updated weight.
Optionally, determining a quantization error and a truncation error corresponding to the model quantization operation, and performing an error correction operation based on the quantization error and the truncation error, further includes:
counting tensor distribution of each layer in an original data model, and calculating quantization parameters of each layer;
and determining a truncation error based on the tensor distribution and the quantization parameter, and performing error correction operation based on the truncation error.
According to another aspect of embodiments of the present application, there is provided a data processing apparatus including:
the acceleration module is used for performing model acceleration processing on an original data model of an edge device to obtain a target data model; the model acceleration processing comprises a model quantization operation and an activation function operator replacement operation; the activation function operator replacement operation comprises replacing the activation function operator with a target parameter value recorded in a preset lookup table;
and the processing module is used for processing data based on the target data model.
Optionally, the acceleration module comprises:
the model quantization module is used for performing model quantization operation on the weight parameters in the original data model of the edge device to obtain a quantized data model;
and the replacing module is used for determining a target parameter value corresponding to an activation function operator in the quantized data model according to the lookup table, and executing activation function operator replacing operation on the activation function operator to obtain the target data model.
Optionally, the apparatus further comprises:
the model pruning module is used for carrying out model pruning operation on the original data model to obtain a compressed original data model;
the model quantization module is specifically configured to:
and carrying out model quantization operation on the weight parameters in the compressed original data model to obtain a quantized data model.
Optionally, the model pruning module comprises:
the training module is used for carrying out sparse training on the original data model and cutting channels lower than a preset threshold value in the original data model according to the result of the sparse training;
and the recovery module is used for recovering the precision value of the cut original data model to obtain a compressed original data model.
Optionally, the training module comprises:
the training submodule is used for introducing a scale factor into each channel of a batch normalization (BN) layer of the original data model, applying an L1 regularization constraint to the scale factor, and performing sparse training; wherein the scale factor comprises the weight of the channel;
and the cutting module is used for sequencing the scale factors after the sparse training and cutting channels corresponding to the scale factors lower than a preset threshold value.
Optionally, the model quantization module comprises:
the conversion module is used for converting the precision of the weight parameters in the original data model into target precision and carrying out model quantization operation;
and the correction module is used for determining the quantization error and the truncation error corresponding to the model quantization operation, and performing error correction operation based on the quantization error and the truncation error to obtain a quantized data model.
Optionally, the correction module comprises:
the first calculation module is used for calculating the difference of the means and the quotient of the variances of the weight parameters before and after the model quantization operation;
and the first correction submodule is used for updating the weight value of the weight parameter according to the difference and the quotient, and performing an error correction operation based on the updated weight.
Optionally, the correction module further comprises:
the second calculation module is used for counting tensor distribution of each layer in the original data model and calculating quantization parameters of each layer;
and the second correction submodule is used for determining a truncation error based on the tensor distribution and the quantization parameter and performing error correction operation based on the truncation error.
According to another aspect of embodiments of the present application, there is provided an electronic device, which includes a memory, a processor, and a computer program stored on the memory, wherein the processor executes the computer program to implement the steps of the data processing method of any one of the above aspects.
According to a further aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data processing method of any of the above aspects.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
the original data model built in the edge device is improved, and the target data model obtained through improvement is adopted for data processing, so that the computational complexity of the data model in the edge device is effectively reduced, the computational speed of the data model is increased, and the data processing speed of the edge device is further increased.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic flow chart of a data processing method according to an embodiment of the present application;
fig. 2 is a second schematic flowchart of a data processing method according to an embodiment of the present application;
fig. 3 is a third schematic flowchart of a data processing method according to an embodiment of the present application;
fig. 4 is a fourth schematic flowchart of a data processing method according to an embodiment of the present application;
fig. 5 is a fifth flowchart illustrating a data processing method according to an embodiment of the present application;
fig. 6 is a sixth schematic flowchart of a data processing method according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an electronic device for data processing according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below in conjunction with the drawings in the present application. It should be understood that the embodiments set forth below in connection with the drawings are exemplary descriptions for explaining technical solutions of the embodiments of the present application, and do not limit the technical solutions of the embodiments of the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be further understood that the terms "comprises" and/or "comprising," when used in this specification in connection with embodiments of the present application, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates at least one of the items defined by the term, e.g., "A and/or B" may be implemented as "A", or as "B", or as "A and B".
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The technical solutions of the embodiments of the present application and the technical effects produced by the technical solutions of the present application will be described below through descriptions of several exemplary embodiments. It should be noted that the following embodiments may be referred to, referred to or combined with each other, and the description of the same terms, similar features, similar implementation steps and the like in different embodiments is not repeated.
An embodiment of the present application provides a data processing method, as shown in fig. 1, including:
s101, performing model acceleration processing on an original data model of the edge equipment to obtain a target data model; the model acceleration processing comprises model quantization operation and activating functional operator replacement operation; and the replacement operation of the activation function operator comprises replacing the activation function operator with a target parameter value recorded in a lookup table according to a preset lookup table.
The edge device is a lightweight edge computing device with artificial intelligence computing capability, and an original data model, for example, a neural network model for image recognition, a neural network model for voice recognition, and the like, is built in the edge device, which is not limited to this.
An original data model in the edge device is determined, and model acceleration processing is performed on it to obtain a target data model. Because the data processing efficiency of existing edge devices is poor, the original data model built into the edge device is optimized and improved by combining a model quantization operation with an activation function operator replacement operation, thereby obtaining the target data model and increasing the computation speed relative to the original data model.
In an application scenario, a binary script file for model acceleration processing, namely a binary script file including model quantization operation and activation function operator replacement operation, is generated and preset in an edge device. When the edge device is started to process data, the binary script file of the accelerated model processing is synchronously operated, and the optimization and improvement of the original data model are realized.
Specifically, the model quantization operation includes precision conversion of the weight parameters. And the replacement operation of the activation function operator comprises replacing the activation function operator with a target parameter value recorded in a lookup table according to a preset lookup table.
The model quantization operation is an operation of converting parameters in the neural network model from high precision to low precision, and the model quantization operation can reduce the size and the calculation amount of the neural network model, so that the storage pressure and the power consumption of the edge device are reduced. In the embodiment of the present application, the model quantization operation includes performing precision conversion on the weight parameter weight.
The activation function operator refers to an activation function operator in the neural network model. In the embodiment of the application, target parameter values corresponding to activating function operators in an original data model are calculated in advance, and a preset lookup table is generated to store the target parameter values. When the edge device is started to perform data processing, if the step of calculating the activation function operator is executed, the replacement operation of the activation function operator can be directly executed, and the corresponding target parameter value is searched from the preset lookup table for replacement. Therefore, the calculation complexity of the original data model in the edge device is reduced, and the data processing efficiency is improved.
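As a non-limiting illustration of the lookup-table replacement described above, the following sketch precomputes a sigmoid table for every possible 8-bit input code; the function names and the per-layer dequantization scale are hypothetical, and the patent does not fix a particular activation function or table layout:

```python
import numpy as np

def build_sigmoid_lut(scale=0.1):
    """Precompute sigmoid outputs for every possible int8 input code.

    With 8-bit quantization an activation input can take only 256 distinct
    values, so the activation function can be replaced by a table lookup.
    The dequantization scale (0.1 here) is a hypothetical per-layer value.
    """
    codes = np.arange(-128, 128, dtype=np.int32)
    real_inputs = codes * scale
    return 1.0 / (1.0 + np.exp(-real_inputs))  # entry i holds sigmoid((i-128)*scale)

def lut_sigmoid(q_codes, lut):
    """Apply the activation via table lookup instead of computing exp()."""
    return lut[q_codes.astype(np.int32) + 128]

lut = build_sigmoid_lut()
x = np.array([-128, 0, 127], dtype=np.int8)
out = lut_sigmoid(x, lut)
```

At inference time the expensive exponential is replaced by an index into a 256-entry table, which is the source of the complexity reduction claimed above.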
Step S102, data processing is carried out based on the target data model.
And processing data based on the optimized and improved target data model. In the embodiment of the application, the data processing process is implemented in the edge device, and the internet of things device, the gateway and the computing infrastructure are placed as close as possible to the data source and as close as possible to the system and personnel needing to make data-driven decisions. Therefore, the target data model built in the edge device is high in calculation speed, task automation can be achieved, and better user experience is provided.
Applying the data processing method provided by the embodiments of the application, model acceleration processing is performed on an original data model of the edge device to obtain a target data model; the model acceleration processing comprises a model quantization operation and an activation function operator replacement operation; the activation function operator replacement operation comprises replacing the activation function operator with a target parameter value recorded in a preset lookup table; data is then processed based on the target data model.
According to the embodiment of the application, the original data model built in the edge device is improved, the target data model obtained through improvement is adopted for data processing, the calculation complexity of the data model in the edge device is effectively reduced, the calculation speed of the data model is increased, and the data processing speed of the edge device is further increased.
An embodiment of the present application provides a data processing method, as shown in fig. 2, including:
step S201, performing model quantization operation on the weight parameters in the original data model of the edge device to obtain a quantized data model.
The original data model is an initial neural network model built in edge equipment, and the edge equipment is light-weight edge computing equipment with artificial intelligence computing capability. Edge devices are typically used for data recognition and detection, such as face recognition, video detection, vehicle detection, and the like, without limitation. According to the data processing method and device, the original data model is optimized and improved, and the data processing efficiency of the edge device is improved.
The model quantization operation is an operation of converting parameters in the neural network model from high precision to low precision, and the model quantization operation can reduce the size and the calculation amount of the neural network model, so that the storage pressure and the power consumption of the edge device are reduced. In the embodiment of the present application, the model quantization operation includes performing precision conversion on the weight parameter weight. Specifically, model quantization operation is performed on weight parameters in an original data model of the edge device, so that a quantized data model is obtained.
In a preferred embodiment of the present application, as shown in fig. 3, step S201 includes:
step S2011, the precision of the weight parameter in the original data model is converted into a target precision, and a model quantization operation is performed.
Currently, the parameters and intermediate operating data in a neural network model are usually 32-bit or 16-bit floating-point data. For an edge device, such high-precision data can cause memory pressure during processing, and power consumption is constrained.
Therefore, the precision of the weight parameters in the original data model is converted into the target precision, namely 8-bit precision, and the model quantization operation is performed.
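One common way to realize this precision conversion is symmetric per-tensor 8-bit quantization; the embodiment does not specify a particular scheme, so the following is only a minimal sketch under that assumption:

```python
import numpy as np

def quantize_weights_int8(w):
    """Symmetric per-tensor quantization of float weights to 8-bit integers.

    The scale maps the largest absolute weight onto the int8 range, so the
    integer codes lie in [-128, 127] and dequantization is a single multiply.
    """
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale

w = np.array([-1.0, -0.25, 0.0, 0.5, 1.27], dtype=np.float32)
q, s = quantize_weights_int8(w)
w_hat = dequantize(q, s)
```

The rounding step is what introduces the quantization error discussed in the next step; the correction methods below operate on the gap between `w` and `w_hat`.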
Step S2012, determining a quantization error and a truncation error corresponding to the model quantization operation, and performing an error correction operation based on the quantization error and the truncation error to obtain a quantized data model.
In the process of carrying out model quantization on the original data model, because the precision of the weight parameter is converted into the precision of 8 bits, the mean value and the variance of the weight parameter after the precision conversion and the weight parameter before the precision conversion have deviation, which is the quantization error caused by the model quantization operation. Quantization errors cause a loss of model accuracy of the original data model.
Therefore, after the precision conversion operation is performed on the weight parameters, the Bias-Correction method and the ACIQ (Analytical Clipping for Integer Quantization) method are adopted to reduce the quantization error corresponding to the model quantization operation.
The Bias-Correction method is mainly used for compensating the offset of the weight parameter before and after the model quantization operation and correcting the weight parameter.
The ACIQ method is a quantization threshold selection method: it assumes that the tensor distribution obeys a Laplace or Gaussian distribution, and then quantizes the tensor values into discrete intervals. In the embodiment of the application, the quantization range is truncated by the ACIQ method to determine and reduce the quantization error, avoiding the precision loss of the original data model caused by the model quantization operation. Truncating the quantization range also introduces a corresponding truncation error.
Therefore, a quantization error and a truncation error corresponding to the model quantization operation are determined, and an error correction operation is performed based on the quantization error and the truncation error to obtain a quantized data model.
In a preferred embodiment of the present application, as shown in fig. 4, step S2012 includes:
step S20121, calculating a difference value of the mean values and a divisor of the variance value of the weight parameters before and after the model quantization operation.
And calculating the mean value of the weight parameter before the model quantization operation and the mean value of the weight parameter after the model quantization operation by adopting a Bias-Correction method, then calculating the difference value of the front mean value and the back mean value, further calculating the variance value of the weight parameter before the model quantization operation and the variance value after the model quantization operation, and finally calculating the divisor of the front variance value and the back variance value.
Step S20122, updating the weight value of the weight parameter according to the difference and the quotient, and performing an error correction operation based on the updated weight.
The weight value after the model quantization operation is updated according to the difference and the quotient, and an error correction operation is performed on the weight parameter based on the updated weight, thereby reducing the quantization error introduced into the original data model by the model quantization operation.
Step S20123, the tensor distribution of each layer in the original data model is counted, and the quantization parameter of each layer is calculated.
The method comprises the steps of obtaining a preset test set in the edge device, training an original data model by adopting partial data in the test set, counting tensor distribution of output of each layer in the original data model, and calculating an output-scale parameter output by each layer by using an ACIQ method.
And step S20124, determining a truncation error based on the tensor distribution and the quantization parameter, and performing error correction operation based on the truncation error.
And intercepting the quantization error by an ACIQ method to minimize quantization loss, then determining the truncation error introduced by model quantization operation based on tensor distribution and quantization parameters, and performing error correction operation on the truncation error, thereby further ensuring the accuracy of the original data model.
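The ACIQ-style threshold selection above can be illustrated by numerically minimizing an approximate quantization MSE under the Laplace assumption; the error decomposition used here (truncation term plus uniform rounding-noise term) follows the standard ACIQ analysis, while the grid search is a simplification of the closed-form solution:

```python
import numpy as np

def aciq_clip_threshold(tensor, num_bits=8):
    """Pick a clipping threshold alpha by minimizing approximate MSE.

    Assumes the tensor roughly follows a Laplace(0, b) distribution.
    Clipping at alpha trades truncation error (~2*b^2*exp(-alpha/b))
    against rounding noise (~alpha^2 / (3 * 4**num_bits)).
    """
    b = float(np.mean(np.abs(tensor - np.mean(tensor))))  # Laplace scale estimate
    alphas = np.linspace(0.5 * b, 20.0 * b, 2000)
    trunc_err = 2.0 * b**2 * np.exp(-alphas / b)
    round_err = alphas**2 / (3.0 * 4.0**num_bits)
    return float(alphas[np.argmin(trunc_err + round_err)])

rng = np.random.default_rng(0)
x = rng.laplace(0.0, 1.0, size=100_000)
alpha = aciq_clip_threshold(x, num_bits=8)  # roughly 9.9*b for 8-bit under Laplace
```

Values outside `[-alpha, alpha]` are clipped before the integer mapping; the residual clipping is the truncation error whose correction the step above describes.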
Step S202, according to the lookup table, determining a target parameter value corresponding to an activation function operator in the quantized data model, and executing an activation function operator replacement operation on the activation function operator to obtain a target data model.
The activation function operator refers to an activation function operator in the neural network model. In the embodiment of the application, target parameter values corresponding to activating function operators in an original data model are calculated in advance, and a preset lookup table is generated to store the target parameter values. When the edge device is started to perform data processing, if the step of calculating the activation function operator is executed, the replacement operation of the activation function operator can be directly executed, and the corresponding target parameter value is searched from the preset lookup table for replacement. Therefore, the calculation complexity of the original data model in the edge device is reduced, and the data processing efficiency is improved.
In a preferred embodiment of the present application, before step S201, the method further includes:
and carrying out model pruning operation on the original data model to obtain the compressed original data model.
In other words, in the embodiment of the present application, a model quantization operation and an activation function operator replacement operation may be directly performed on an original data model to obtain a target data model, or a model pruning operation on the original data model may be performed before the model quantization operation and the activation function operator replacement operation are performed, so as to finally obtain the target data model. Model pruning is an optional optimization and improvement made before model quantization is performed on the original data model.
First, a model pruning operation is performed on the original data model. In general, model pruning refers to removing a large number of redundant parameters existing in the neural network model from the convolutional layer to the fully-connected layer, i.e., a large number of neurons with activation values approaching 0. After the neurons are removed, the neural network model can still show the same model expression capacity.
In the embodiment of the application, the model pruning operation on the original data model is realized by removing a large number of channels between neurons with the activation values approaching 0. Before the target data model is obtained, the original data model is subjected to sparse training to obtain a compressed original data model, then model quantization operation is carried out on the compressed original data model to obtain a quantized data model, and the target data model is further obtained.
In a preferred embodiment of the present application, as shown in fig. 5, performing a model pruning operation on an original data model to obtain a compressed original data model includes:
step S501, sparse training is conducted on the original data model, and channels lower than a preset threshold value in the original data model are cut according to the result of the sparse training.
Specifically, performing sparse training on the original data model refers to performing sparsity-regularized training on it. The sparse regularization may adopt an L1 regularization scheme, which reduces the absolute values of the weights so that more parameters approach 0.
Then, according to the result of the sparse training, i.e., the parameters of the original data model after sparse training, the channels between neurons in the original data model are cut: channels below a preset threshold are removed, yielding an original data model with sparse parameters.
In a preferred embodiment of the present application, as shown in fig. 6, step S501 includes:
Step S5011, a scale factor is introduced into each channel of a Batch Normalization (BN) layer of the original data model, an L1 regularization constraint is applied to the scale factors, and sparse training is performed; wherein each scale factor comprises the weight of its channel.
A BN (Batch Normalization) layer of the original data model is generally placed before an activation layer. It can increase the convergence speed of the original data model during model training, making the training process more stable and avoiding gradient explosion or gradient vanishing.
A scale factor is introduced for each channel of the BN layer, where the scale factor comprises the weight of the channel. In the embodiment of the present application, the scale factor γ is positively correlated with the output of each channel, that is, the importance of each channel can be determined based on γ.
An L1 regularization constraint is applied to γ, reducing the weight of the channel corresponding to each γ and thereby realizing sparse training of the original data model.
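As a minimal numerical sketch of the idea behind this sparse training (the penalty strength and step count below are assumed values for illustration), the effect of an L1 constraint on the scale factors γ can be shown with the L1 proximal (soft-threshold) step, which is what drives small scale factors exactly to zero:

```python
import numpy as np

def soft_threshold(gamma, lam=0.001):
    # L1 proximal operator: shrink every gamma toward 0 by lam, clipping at 0
    return np.sign(gamma) * np.maximum(np.abs(gamma) - lam, 0.0)

gamma = np.array([0.8, 0.002, -0.5, 0.0005])  # toy BN scale factors
for _ in range(100):                          # repeated regularized steps
    gamma = soft_threshold(gamma)
print(gamma)  # small scale factors are driven exactly to 0; large ones shrink slightly
```

Channels whose γ lands at (or near) zero contribute almost nothing to the layer output, which is what makes them candidates for pruning in the next step.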
Step S5012, the scale factors after sparse training are sorted, and the channels corresponding to the scale factors below a preset threshold are cut.
All γ values in the BN layers after sparse training are sorted, the γ values below the preset threshold are determined, and the channels corresponding to those γ values are removed, completing the channel pruning operation on the original data model.
The preset threshold may be set as required, or chosen with reference to conventional values; it is not limited here.
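One common way to realize the sorting-based cut (a sketch only; the 50% pruning ratio below is an assumed hyperparameter, not one fixed by this disclosure) is to derive the threshold from the sorted scale factors so that a fixed fraction of channels is removed:

```python
import numpy as np

def channel_keep_mask(gammas, prune_ratio=0.5):
    # sort scale-factor magnitudes and take the threshold at the prune ratio
    sorted_mags = np.sort(np.abs(gammas))
    threshold = sorted_mags[int(len(sorted_mags) * prune_ratio)]
    return np.abs(gammas) >= threshold  # True = channel survives pruning

gammas = np.array([0.7, 0.0, 0.4, 0.001, 0.9, 0.02])
mask = channel_keep_mask(gammas)
print(mask)  # channels with the smallest scale factors are cut
```

The resulting mask selects which channels (and their associated weights) are kept when the compressed model is rebuilt.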
And step S502, restoring the precision value of the cut original data model to obtain a compressed original data model.
The clipped original data model suffers from reduced precision, so precision-value recovery is performed on it to obtain the compressed original data model.
In the embodiment of the present application, the clipped original data model may be further trained using a fine-tuning (finetune) method. With the model precision value expected to be recovered known, the clipped original data model is taken as a pre-trained model and fine-tuned on that basis, finally obtaining a compressed original data model whose precision value is recovered.
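As a toy illustration of this fine-tuning step (all data, sizes, and rates below are assumed for the example), the pruned weights serve as the pre-trained initialization, and only the surviving channels are updated with a small learning rate until accuracy recovers:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))
w_true = np.array([1.0, 0.0, -2.0, 0.0])   # channels 1 and 3 were pruned away
y = X @ w_true

w = np.array([0.9, 0.0, -1.8, 0.0])        # pruned model, precision degraded
mask = w != 0                              # frozen zeros: pruned channels stay pruned
for _ in range(200):
    grad = X.T @ (X @ w - y) / len(X)      # least-squares gradient
    w -= 0.1 * grad * mask                 # small-step fine-tuning of survivors only
print(np.round(w, 3))                      # precision recovered on surviving channels
```

The same pattern carries over to a real network: the pruned architecture is fixed, and a few epochs of low-learning-rate training restore most of the lost accuracy.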
Step S203, data processing is performed based on the target data model.
The target data model is obtained based on the model quantization operation and the activation function operator replacement operation, or based on the model pruning operation together with the model quantization operation and the activation function operator replacement operation, and data processing is then performed.
The target data model results from optimizing and improving the original data model. Existing edge devices have poor data processing efficiency; to improve it, the original data model built into the edge device is optimized and improved so as to increase its computation speed.
In one application scenario, a binary script file implementing the model quantization operation and the activation function operator replacement operation is generated and preset in the edge device. When the edge device starts data processing, the binary script file is run synchronously to optimize and improve the original data model. Alternatively, a binary script file implementing the model pruning operation, the model quantization operation, and the activation function operator replacement operation may be generated and preset in the edge device; this is not described further.
Data processing is then performed based on the optimized and improved target data model. In the embodiment of the present application, the data processing takes place in the edge device: internet-of-things devices, gateways, and computing infrastructure are placed as close as possible to the data sources, and as close as possible to the systems and personnel that need to make data-driven decisions. In this way, the edge device can automate tasks and provide a better user experience.
By applying the data processing method provided by the embodiment of the present application, a model quantization operation is performed on the weight parameters in the original data model of the edge device to obtain a quantized data model; target parameter values corresponding to the activation function operators in the quantized data model are then determined according to the lookup table, and the activation function operator replacement operation is performed on those operators to obtain the target data model; data processing is then performed based on the target data model.
According to the embodiment of the application, the original data model built in the edge device is improved, the target data model obtained through improvement is adopted for data processing, the calculation complexity of the data model in the edge device is effectively reduced, the calculation speed of the data model is increased, and the data processing speed of the edge device is further increased.
An embodiment of the present application provides a data processing apparatus, as shown in fig. 7, including:
the acceleration module 701 is used for performing model acceleration processing on an original data model of the edge device to obtain a target data model; the model acceleration processing comprises a model quantization operation and an activation function operator replacement operation; the activation function operator replacement operation comprises replacing the activation function operator, according to a preset lookup table, with a target parameter value recorded in the lookup table;
and a processing module 702, configured to perform data processing based on the target data model.
In a preferred embodiment of the present application, the acceleration module 701 includes:
the model quantization module is used for performing model quantization operation on the weight parameters in the original data model of the edge device to obtain a quantized data model;
and the replacing module is used for determining a target parameter value corresponding to an activation function operator in the quantized data model according to the lookup table, and executing activation function operator replacing operation on the activation function operator to obtain the target data model.
In a preferred embodiment of the present application, the apparatus further comprises:
the model pruning module is used for carrying out model pruning operation on the original data model to obtain a compressed original data model;
the model quantization module is specifically configured to:
and carrying out model quantization operation on the weight parameters in the compressed original data model to obtain a quantized data model.
In a preferred embodiment of the present application, the model pruning module includes:
the training module is used for carrying out sparse training on the original data model and cutting channels lower than a preset threshold value in the original data model according to the result of the sparse training;
and the recovery module is used for recovering the precision value of the cut original data model to obtain a compressed original data model.
In a preferred embodiment of the present application, the training module includes:
the training submodule is used for introducing a scale factor into each channel of the Batch Normalization (BN) layer of the original data model, applying an L1 regularization constraint to the scale factors, and performing sparse training; wherein the scale factor comprises the weight of the channel;
and the clipping module is used for sorting the scale factors after sparse training and cutting the channels corresponding to the scale factors below a preset threshold.
In a preferred embodiment of the present application, the model quantization module includes:
the conversion module is used for converting the precision of the weight parameters in the original data model into target precision and carrying out model quantization operation;
and the correction module is used for determining the quantization error and the truncation error corresponding to the model quantization operation, and performing error correction operation based on the quantization error and the truncation error to obtain a quantized data model.
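The precision-conversion step handled by the conversion module can be sketched as symmetric per-tensor quantization of float32 weights to int8 (the int8 target and the example values are assumptions for illustration, not requirements of the disclosure). The dequantized copy exposes the quantization error that the correction module then targets:

```python
import numpy as np

def quantize_int8(w):
    scale = np.abs(w).max() / 127.0                        # one scale per tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.array([0.5, -1.27, 0.031, 1.27], dtype=np.float32)  # toy weights
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale                       # dequantize
print(q, np.abs(w - w_hat).max())                          # rounding error <= scale / 2
```

Because no value exceeds the chosen range here, the only error is rounding error; values beyond the clipping range would additionally incur the truncation error discussed below.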
In a preferred embodiment of the present application, the correction module includes:
the first calculation module is used for calculating the difference of the means and the quotient of the variances of the weight parameters before and after the model quantization operation;
and the first correction submodule is used for determining the weight of the weight parameter according to the difference and the quotient, and performing the error correction operation based on that weight.
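One straightforward reading of this mean/variance correction (a sketch only; the disclosure does not give the exact correction formula) is to shift and rescale the dequantized weights so that the mean difference goes to zero and the variance quotient goes to one:

```python
import numpy as np

def moment_correct(w_float, w_dequant):
    mean_diff = w_float.mean() - w_dequant.mean()   # difference of the means
    std_ratio = w_float.std() / w_dequant.std()     # quotient of the spreads
    # re-center, rescale, then restore the original mean
    return (w_dequant - w_dequant.mean()) * std_ratio + w_dequant.mean() + mean_diff

rng = np.random.default_rng(1)
w = rng.normal(0.0, 0.2, size=1000)                 # toy float weights
w_q = np.round(w / 0.05) * 0.05                     # coarse quantize + dequantize
w_c = moment_correct(w, w_q)
print(w_c.mean() - w.mean(), w_c.std() / w.std())   # ~0 and ~1 after correction
```

After correction, the first two moments of the quantized weights match the original tensor exactly, which removes the systematic bias that quantization introduces.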
In a preferred embodiment of the present application, the correction module further includes:
the second calculation module is used for counting tensor distribution of each layer in the original data model and calculating quantization parameters of each layer;
and the second correction submodule is used for determining a truncation error based on the tensor distribution and the quantization parameter and performing error correction operation based on the truncation error.
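The distribution-based truncation handled by the second calculation and correction submodules can be sketched as percentile calibration (the 99.9th-percentile rule below is an assumed calibration choice): each layer's clipping threshold is taken from the observed tensor statistics rather than the raw maximum, so rare outliers are truncated and the quantization step stays fine for the bulk of values.

```python
import numpy as np

def calibrate(tensor, pct=99.9):
    t = np.percentile(np.abs(tensor), pct)          # clipping threshold from stats
    scale = t / 127.0                               # resulting int8 quantization step
    trunc_err = np.abs(tensor - np.clip(tensor, -t, t)).max()
    return scale, trunc_err

rng = np.random.default_rng(2)
acts = rng.normal(size=10000)                       # toy per-layer tensor distribution
acts[0] = 50.0                                      # a single extreme outlier
scale, err = calibrate(acts)
print(scale, err)  # the step stays small; only the outlier is truncated
```

Had the raw maximum (50.0) set the threshold, the quantization step would be roughly fifteen times coarser for every value in the tensor; accepting a bounded truncation error on the outlier is the trade the correction operation then compensates for.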
The data processing apparatus provided by the embodiment of the present application performs model acceleration processing on an original data model of the edge device to obtain a target data model; the model acceleration processing comprises a model quantization operation and an activation function operator replacement operation; the activation function operator replacement operation comprises replacing the activation function operator, according to a preset lookup table, with a target parameter value recorded in the lookup table; data is then processed based on the target data model.
According to the embodiment of the application, the original data model built in the edge device is improved, the target data model obtained through improvement is adopted for data processing, the calculation complexity of the data model in the edge device is effectively reduced, the calculation speed of the data model is increased, and the data processing speed of the edge device is further increased.
The embodiment of the present application provides an electronic device (computer apparatus/device/system), which includes a memory, a processor, and a computer program stored on the memory; the processor executes the computer program to implement the steps of the data processing method described above. Compared with the related art, this can achieve the following: the original data model built into the edge device is improved and the improved target data model is used for data processing, which effectively reduces the computational complexity of the data model in the edge device, increases its computation speed, and further increases the data processing speed of the edge device.
In an alternative embodiment, an electronic device is provided. As shown in FIG. 8, the electronic device 8000 includes a processor 8001 and a memory 8003. The processor 8001 is coupled to the memory 8003, for example via a bus 8002. Optionally, the electronic device 8000 may further include a transceiver 8004, which may be used for data interaction between this electronic device and other electronic devices, such as transmitting and/or receiving data. In addition, the number of transceivers 8004 is not limited to one in practical applications, and the structure of the electronic device 8000 does not limit the embodiments of the present application.
The processor 8001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor 8001 may also be a combination of devices implementing computing functionality, e.g., a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 8002 may include a path to transfer information between the aforementioned components. The bus 8002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 8002 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 8, but this is not intended to represent only one bus or type of bus.
The memory 8003 may be a ROM (Read-Only Memory) or another type of static storage device capable of storing static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device capable of storing information and instructions, an EEPROM (Electrically Erasable Programmable Read-Only Memory), a CD-ROM (Compact Disc Read-Only Memory) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store a computer program and can be read by a computer, without limitation.
The memory 8003 is used to store computer programs for executing the embodiments of the present application, and is controlled by the processor 8001 to execute the programs. The processor 8001 is used to execute computer programs stored in the memory 8003 to implement the steps shown in the foregoing method embodiments.
Embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, and when being executed by a processor, the computer program may implement the steps and corresponding contents of the foregoing method embodiments.
Embodiments of the present application further provide a computer program product, which includes a computer program, and when the computer program is executed by a processor, the steps and corresponding contents of the foregoing method embodiments can be implemented.
The terms "first," "second," "third," "fourth," "1," "2," and the like in the description and in the claims of the present application and in the above-described drawings (if any) are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used are interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in other sequences than illustrated or otherwise described herein.
It should be understood that, although each operation step is indicated by an arrow in the flowchart of the embodiment of the present application, the implementation order of the steps is not limited to the order indicated by the arrow. In some implementation scenarios of the embodiments of the present application, the implementation steps in the flowcharts may be performed in other sequences as desired, unless explicitly stated otherwise herein. In addition, some or all of the steps in each flowchart may include multiple sub-steps or multiple stages based on an actual implementation scenario. Some or all of these sub-steps or stages may be performed at the same time, or each of these sub-steps or stages may be performed at different times, respectively. In a scenario where execution times are different, an execution sequence of the sub-steps or the phases may be flexibly configured according to requirements, which is not limited in the embodiment of the present application.
The foregoing is only an optional implementation manner of a part of implementation scenarios in this application, and it should be noted that, for those skilled in the art, other similar implementation means based on the technical idea of this application are also within the protection scope of the embodiments of this application without departing from the technical idea of this application.

Claims (11)

1. A data processing method, comprising:
carrying out model acceleration processing on an original data model of the edge equipment to obtain a target data model; wherein the model acceleration processing comprises a model quantization operation and an activation function operator replacement operation; the replacement operation of the activation function operator comprises replacing the activation function operator with a target parameter value recorded in a preset lookup table according to the preset lookup table;
and processing data based on the target data model.
2. The data processing method of claim 1, wherein performing model acceleration on the raw data model of the edge device to obtain the target data model comprises:
performing model quantization operation on the weight parameters in the original data model of the edge device to obtain a quantized data model;
and determining a target parameter value corresponding to the activation function operator in the quantized data model according to the lookup table, and executing the replacement operation of the activation function operator on the activation function operator to obtain a target data model.
3. The data processing method according to claim 2, wherein before performing model quantization operation on the weight parameters in the original data model of the edge device to obtain a quantized data model, the method further comprises:
carrying out model pruning operation on the original data model to obtain a compressed original data model;
the performing model quantization operation on the weight parameters in the original data model of the edge device to obtain a quantized data model includes:
and carrying out model quantization operation on the weight parameters in the compressed original data model to obtain a quantized data model.
4. The data processing method of claim 3, wherein performing model pruning operations on the original data model to obtain a compressed original data model comprises:
carrying out sparse training on an original data model, and cutting channels lower than a preset threshold value in the original data model according to the result of the sparse training;
and recovering the precision value of the cut original data model to obtain a compressed original data model.
5. The data processing method of claim 4, wherein the sparsely training the original data model and clipping channels of the original data model that are lower than a preset threshold according to a result of the sparsely training comprises:
introducing a scale factor into each channel of a Batch Normalization (BN) layer of the original data model, applying an L1 regularization constraint to the scale factor, and performing sparse training; wherein the scale factor comprises a weight of the channel;
and sorting the scale factors after the sparse training, and cutting channels corresponding to the scale factors below a preset threshold.
6. The data processing method according to claim 2 or 3, wherein performing model quantization operation on the weight parameters in the original data model of the edge device to obtain a quantized data model comprises:
converting the precision of the weight parameters in the original data model into target precision, and carrying out model quantization operation;
and determining a quantization error and a truncation error corresponding to the model quantization operation, and performing error correction operation based on the quantization error and the truncation error to obtain a quantized data model.
7. The data processing method of claim 6, wherein the determining quantization error and truncation error corresponding to the model quantization operation, and performing an error correction operation based on the quantization error and the truncation error comprises:
calculating a difference of the means and a quotient of the variances of the weight parameters before and after the model quantization operation;
and determining the weight of the weight parameter according to the difference and the quotient, and performing an error correction operation based on the weight.
8. The data processing method of claim 6, wherein the determining quantization error and truncation error corresponding to the model quantization operation, and performing an error correction operation based on the quantization error and the truncation error further comprises:
counting tensor distribution of each layer in the original data model, and calculating quantization parameters of each layer;
and determining the truncation error based on the tensor distribution and the quantization parameter, and performing an error correction operation based on the truncation error.
9. A data processing apparatus, comprising:
the acceleration module is used for carrying out model acceleration processing on an original data model of the edge equipment to obtain a target data model; wherein the model acceleration processing comprises a model quantization operation and an activation function operator replacement operation; the replacement operation of the activation function operator comprises replacing the activation function operator with a target parameter value recorded in a preset lookup table according to the preset lookup table;
and the processing module is used for processing data based on the target data model.
10. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to: perform the data processing method according to any one of claims 1 to 8.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the data processing method of any one of claims 1 to 8.
CN202111642847.6A 2021-12-29 2021-12-29 Data processing method and device, electronic equipment and computer readable storage medium Pending CN114358280A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111642847.6A CN114358280A (en) 2021-12-29 2021-12-29 Data processing method and device, electronic equipment and computer readable storage medium


Publications (1)

Publication Number Publication Date
CN114358280A true CN114358280A (en) 2022-04-15

Family

ID=81103560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111642847.6A Pending CN114358280A (en) 2021-12-29 2021-12-29 Data processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114358280A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115146775A (en) * 2022-07-04 2022-10-04 同方威视技术股份有限公司 Edge device reasoning acceleration method and device and data processing system


Similar Documents

Publication Publication Date Title
CN107451658B (en) Fixed-point method and system for floating-point operation
CN110347873B (en) Video classification method and device, electronic equipment and storage medium
US20220164666A1 (en) Efficient mixed-precision search for quantizers in artificial neural networks
US11450096B2 (en) Systems and methods for progressive learning for machine-learned models to optimize training speed
CN114358280A (en) Data processing method and device, electronic equipment and computer readable storage medium
US20240071070A1 (en) Algorithm and method for dynamically changing quantization precision of deep-learning network
CN116957024A (en) Method and device for reasoning by using neural network model
US20220405561A1 (en) Electronic device and controlling method of electronic device
CN116403569A (en) Speech recognition method, device, computer equipment and medium based on artificial intelligence
CN112288032B (en) Method and device for quantitative model training based on generation of confrontation network
CN111614358B (en) Feature extraction method, system, equipment and storage medium based on multichannel quantization
CN114842838A (en) Audio recognition method, device, electronic apparatus, medium, and program product
US11410036B2 (en) Arithmetic processing apparatus, control method, and non-transitory computer-readable recording medium having stored therein control program
US11689693B2 (en) Video frame interpolation method and device, computer readable storage medium
EP3742283A1 (en) Arithmetic processing device, method for controlling arithmetic processing device, and program for controlling arithmetic processing device
CN111767204B (en) Spill risk detection method, device and equipment
CN114065913A (en) Model quantization method and device and terminal equipment
Du et al. Model quantization and hardware acceleration for vision transformers: A comprehensive survey
US20210110213A1 (en) Deep learning model embodiments and training embodiments for faster training
WO2024060727A1 (en) Method and apparatus for training neural network model, and device and system
CN104320659A (en) Background modeling method, device and apparatus
CN111767980B (en) Model optimization method, device and equipment
CN113160795B (en) Language feature extraction model training method, device, equipment and storage medium
CN115658307B (en) Intelligent load processing method and system based on compressed data direct calculation
EP4336412A1 (en) Method and apparatus for quantizing neural network model, and computing device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100176 Room 101, 1f, building 3, yard 18, Kechuang 10th Street, Beijing Economic and Technological Development Zone, Beijing

Applicant after: Beijing yisiwei Computing Technology Co.,Ltd.

Address before: 100176 Room 101, 1f, building 3, yard 18, Kechuang 10th Street, Beijing Economic and Technological Development Zone, Beijing

Applicant before: Beijing yisiwei Computing Technology Co.,Ltd.