CN116776924A - Model quantization method, device, equipment, storage medium and product - Google Patents

Model quantization method, device, equipment, storage medium and product

Info

Publication number
CN116776924A
Authority
CN
China
Prior art keywords: data, deep learning, proxy, learning model, model
Prior art date
Legal status
Pending
Application number
CN202310745288.4A
Other languages
Chinese (zh)
Inventor
张光艳
Current Assignee
Shenzhen Huantai Technology Co Ltd
Original Assignee
Shenzhen Huantai Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Huantai Technology Co Ltd
Priority to CN202310745288.4A
Publication of CN116776924A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/08: Learning methods


Abstract

The application provides a model quantization method, device, equipment, storage medium and product, belonging to the technical field of deep learning. The method comprises the following steps: based on a target tag corresponding to a deep learning model and Gaussian noise, generating fake data through a generator, wherein the tag of the fake data is the target tag; determining proxy data from a proxy data set based on a plurality of first batch normalization (BN) layers in the deep learning model, wherein the proxy data set comprises a plurality of proxy data, and the proxy data is open-source real data; mixing the fake data and the proxy data to obtain first sample data; and quantizing the deep learning model based on the first sample data. The method and the device can improve the precision of the quantized deep learning model.

Description

Model quantization method, device, equipment, storage medium and product
Technical Field
The application relates to the technical field of deep learning, in particular to a model quantization method, a device, equipment, a storage medium and a product.
Background
At present, deep learning models have come to dominate applications across various industries. However, to improve precision, deep learning models are often relatively large, which makes them difficult to deploy on resource-constrained edge devices (such as mobile phones or smart watches); it is therefore necessary to quantize the deep learning model.
Disclosure of Invention
The embodiment of the application provides a model quantization method, a device, equipment, a storage medium and a product, which can improve the precision of a quantized deep learning model. The technical scheme is as follows:
in one aspect, a method for model quantization is provided, the method comprising:
based on a target label corresponding to the deep learning model and Gaussian noise, generating fake data through a generator, wherein the label of the fake data is the target label;
determining proxy data from a proxy data set based on a plurality of first batch normalization (BN) layers in the deep learning model, wherein the proxy data set comprises a plurality of proxy data, and the proxy data is open-source real data;
mixing the fake data and the proxy data to obtain first sample data;
the deep learning model is quantized based on the first sample data.
In another aspect, there is provided a model quantization apparatus, the apparatus comprising:
the generation module is used for generating fake data through a generator based on a target label corresponding to the deep learning model and Gaussian noise, wherein the label of the fake data is the target label;
The first determining module is used for determining proxy data from a proxy data set based on a plurality of first batch normalization (BN) layers in the deep learning model, wherein the proxy data set comprises a plurality of proxy data, and the proxy data is open-source real data;
the mixing module is used for mixing the fake data and the proxy data to obtain first sample data;
and the quantization module is used for quantizing the deep learning model based on the first sample data.
In another aspect, a computer device is provided, the computer device including a processor and a memory, the memory storing at least one program code, the at least one program code loaded and executed by the processor to implement the model quantization method described above.
In another aspect, a computer readable storage medium is provided, having stored therein at least one program code, the at least one program code being loaded and executed by a processor to implement the model quantization method described above.
In another aspect, a computer program product is provided, the computer program product storing at least one program code for execution by a processor to implement the model quantization method described above.
In the embodiment of the application, the first sample data comprises fake data and proxy data; the fake data is generated based on the tag corresponding to the deep learning model, so the fake data conforms to the tag requirements; the proxy data is real open-source data, and open-source data has rich characteristics, so the proxy data has an advantage in terms of data characteristics; therefore, the first sample data combines the advantages of the fake data and the proxy data, and can improve the precision of the quantized deep learning model.
Drawings
FIG. 1 illustrates a schematic diagram of an implementation environment of a model quantization method according to an exemplary embodiment of the present application;
FIG. 2 illustrates a flow chart of a model quantization method according to an exemplary embodiment of the present application;
FIG. 3 illustrates a schematic diagram of a model quantization method according to an exemplary embodiment of the present application;
FIG. 4 illustrates a flow chart of a model quantization method according to an exemplary embodiment of the present application;
FIG. 5 illustrates a flow chart of a model quantization method according to an exemplary embodiment of the present application;
FIG. 6 illustrates a flow chart of a model quantization method according to an exemplary embodiment of the present application;
FIG. 7 illustrates a flow chart of a model quantization method according to an exemplary embodiment of the present application;
FIG. 8 illustrates a block diagram of a model quantization apparatus according to an exemplary embodiment of the present application;
FIG. 9 illustrates a block diagram of a terminal shown in accordance with an exemplary embodiment of the present application;
FIG. 10 illustrates a block diagram of a server in accordance with an exemplary embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
References herein to "a plurality" mean two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates that the associated objects are in an "or" relationship.
It should be noted that, the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals related to the present application are all authorized by the user or fully authorized by the parties, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the fake data, the proxy data, and the like referred to in the present application are all acquired with sufficient authorization.
Referring to fig. 1, a schematic diagram of an implementation environment of a model quantization method according to an exemplary embodiment of the present application is shown. Referring to fig. 1, the implementation environment includes: a computer device 101 and a terminal 102. The computer device 101 is configured to quantize the deep learning model and deploy the quantized deep learning model into the terminal 102. The embodiment of the application provides a data-free quantization method, that is, a method of quantizing the deep learning model without relying on the original sample data used to train it. The computer device 101 may be a terminal or a server; in fig. 1, the computer device 101 is illustrated as a server. The terminal 102 may be a resource-constrained edge terminal; for example, the terminal 102 may be a cell phone or a wearable device, etc.
In some embodiments, the deep learning model may be an image recognition model, that is, the computer device 101 quantizes the image recognition model, and deploys the quantized image recognition model into the terminal 102, so that the terminal 102 performs image recognition through the quantized image recognition model. In some embodiments, the deep learning model may be a speech recognition model, that is, the computer device 101 quantizes the speech recognition model, and deploys the quantized speech recognition model into the terminal 102, so that the terminal 102 performs speech recognition through the quantized speech recognition model; for example, the terminal 102 recognizes a voice control instruction of a voice signal, thereby performing an operation corresponding to the voice control instruction, and further achieving the purpose of controlling the terminal 102 through the voice signal. In some embodiments, the deep learning model may be an image classification model, i.e., the computer device 101 quantizes the image classification model, and deploys the quantized image classification model into the terminal 102, such that the terminal 102 performs image classification by the quantized image classification model. In addition, the model quantization method provided by the embodiment of the application can quantize any deep learning model, and the above is just an example of the deep learning model and does not limit the deep learning model.
Referring to fig. 2, a flow chart of a model quantization method according to an exemplary embodiment of the present application is shown. Referring to fig. 2, the method includes:
step 201: based on the target label corresponding to the deep learning model and Gaussian noise, generating fake data through a generator, wherein the label of the fake data is the target label.
The fake data is training data corresponding to the deep learning model; in some embodiments, where the deep learning model is an image recognition model, the fake data may be a fake image; where the deep learning model is a speech recognition model, the fake data may be a fake speech signal; and where the deep learning model is an image classification model, the fake data may be a fake image.
In some embodiments, this step may be achieved by the following steps (1) to (3), comprising:
(1) And determining a target label from a label set, wherein labels of second sample data are stored in the label set, and the second sample data are sample data for training a deep learning model.
For example, from the tag set {0,1, …, N-1}, one tag y is sampled as the target tag, where N is the number of tags. The elements {0,1, …, N-1} of the tag set are digital identifiers corresponding to the tags of the second sample data; for example, if the deep learning model is an image recognition model whose tags are "dog", "cat", "rabbit" and "mouse", the tag set is {0,1,2,3}, where 0 corresponds to the tag "dog", 1 corresponds to the tag "cat", 2 corresponds to the tag "rabbit", and 3 corresponds to the tag "mouse".
(2) Gaussian noise is determined.
The computer device randomly samples one Gaussian noise from a standard normal distribution; for example, the computer device samples a random vector z ∈ R^(1×N) from N(0, 1), where R represents the set of real numbers.
(3) Based on the target tag and the Gaussian noise, fake data is generated by the generator, and the tag of the fake data is the target tag.
The computer device inputs the target tag and the Gaussian noise into the generator to output the fake data; the Gaussian noise is used to increase the degree of difference among the generated fake data, that is, a plurality of fake data are generated, the tags of the plurality of fake data are all the target tag, and the plurality of fake data differ from one another. The generator generates the fake data based on the target tag and the Gaussian noise according to the following formula one:

Formula one: I_FD = G(z|y), z ~ N(0, 1)

where I_FD represents the fake data, G(·) represents the generator, y represents the target tag, and z represents the Gaussian noise. For example, referring to FIG. 3, the target tag and Gaussian noise selected by the computer device are y and z, respectively; y and z are input into the generator G to output the fake data I_FD ∈ R^(c×h×w), where c represents the number of channels of I_FD, h represents the height, and w represents the width.
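To make this step concrete, the following PyTorch-style sketch shows one way the sampling and formula one could be implemented; the conditional generator interface generator(z, y) and all names are illustrative assumptions, not the patent's implementation:

```python
import torch

def generate_fake_data(generator, num_tags, batch_size, noise_dim):
    # Sample target tags y from the tag set {0, 1, ..., N-1}.
    y = torch.randint(0, num_tags, (batch_size,))
    # Sample Gaussian noise z ~ N(0, 1); distinct z increases the
    # difference among the generated fake data.
    z = torch.randn(batch_size, noise_dim)
    # Formula one: I_FD = G(z | y), with shape (batch_size, c, h, w).
    fake = generator(z, y)
    return fake, y
```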
Step 202: determining proxy data from a proxy data set based on a plurality of first batch normalization (BN) layers in the deep learning model, wherein the proxy data set comprises a plurality of proxy data, and the proxy data is open-source real data.
The proxy data is training data corresponding to the deep learning model; in some embodiments, where the deep learning model is an image recognition model, the proxy data may be an open-source real image; where the deep learning model is a speech recognition model, the proxy data may be an open-source real speech signal; and where the deep learning model is an image classification model, the proxy data may be an open-source real image.
Not all open-source data is suitable for use as proxy data, and selecting suitable proxy data can ultimately improve the final precision of the data-free quantization method; therefore, how to pick appropriate proxy data is a key point. In the embodiment of the application, the proxy data is selected by means of the BN layers, thereby improving the final precision of the data-free quantization method.
Step 203: mixing the fake data and the proxy data to obtain first sample data.
The number of the fake data and the proxy data is multiple; in some embodiments, the computer device may directly mix the fake data determined in step 201 and the proxy data determined in step 202 to obtain the first sample data. In other embodiments, the computer device may control the mixing ratio of the fake data and the proxy data; accordingly, step 203 may be: determining a first hyperparameter, wherein the first hyperparameter is used to constrain the mixing ratio of the fake data and the proxy data; and mixing the fake data and the proxy data based on the first hyperparameter to obtain the first sample data.
In some embodiments, the first hyperparameter is the weight of the fake data; the step of mixing the fake data and the proxy data based on the first hyperparameter to obtain the first sample data may be implemented by the following formula two:

Formula two: Î = Concat(γ·I_FD, (1-γ)·I_PD)

where Î represents the first sample data, γ represents the first hyperparameter, I_FD represents the fake data, I_PD represents the proxy data, and Concat represents a mixing function. The first hyperparameter may be configured into the deep learning model in advance; the first hyperparameter may be set and changed as needed, and is not specifically limited in the embodiment of the present application.
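A minimal sketch of the mixing in formula two, under the assumption that the first hyperparameter γ controls the fraction of fake samples in each mixed batch (the exact batch-level form of Concat is not fixed by the text):

```python
import torch

def mix_samples(fake_batch, proxy_batch, gamma=0.5):
    # Keep a gamma-fraction of the fake data and a (1 - gamma)-fraction
    # of the proxy data, then concatenate them into first sample data.
    n_fake = int(gamma * len(fake_batch))
    n_proxy = int((1.0 - gamma) * len(proxy_batch))
    return torch.cat([fake_batch[:n_fake], proxy_batch[:n_proxy]], dim=0)
```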
In the embodiment of the application, the proxy data can be inserted into the data-free quantization method at minimal cost, so the embodiment of the application can reduce the cost of model quantization.
Step 204: based on the first sample data, the deep learning model is quantized.
The computer equipment determines a quantization model based on the deep learning model, wherein the quantization model is a model obtained by quantizing the deep learning model, and performs multi-round optimization on the quantization model based on the first sample data until the iteration stop condition is met, so that the quantization process of the deep learning model is finally completed.
The first sample data includes fake data and proxy data: the fake data is generated based on the tag corresponding to the deep learning model, so it conforms to the tag requirements; the proxy data is real open-source data, and open-source data has rich characteristics, so the proxy data has an advantage in terms of data characteristics. Therefore, the first sample data combines the advantages of the fake data and the proxy data, and can improve the precision of the quantized deep learning model.
Referring to fig. 4, a flow chart of a model quantization method according to an exemplary embodiment of the present application is shown. Referring to fig. 4, the method includes:
step 401: the computer equipment generates fake data through a generator based on the target label corresponding to the deep learning model and Gaussian noise, and the label of the fake data is the target label.
In some embodiments, this step is the same as step 201, and will not be described here again.
Step 402: the computer device determines a proxy data set that includes a plurality of candidate proxy data.
Candidate proxy data is open-source data; accordingly, the computer device determines the proxy data set from an open-source data set. In some embodiments, the computer device determines, from the open-source data set, open-source data of the data type corresponding to the deep learning model, and builds the determined open-source data into the proxy data set. For example, if the deep learning model is an image recognition model, the computer device determines a plurality of images from the open-source data set to form the proxy data set; for another example, if the deep learning model is a speech recognition model, the computer device determines a plurality of speech signals from the open-source data set to form the proxy data set; for another example, if the deep learning model is an image classification model, the computer device determines a plurality of images from the open-source data set to form the proxy data set.
In some embodiments, the computer device randomly determines a plurality of open source data from the open source data set, and forms the determined plurality of open source data into the proxy data set, so that randomness of the data can be ensured, and further, accuracy of model quantization can be improved.
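As an illustrative sketch (the names are placeholders), the random construction of the candidate proxy data set could look like:

```python
import random

def build_candidate_set(open_source_data, num_candidates):
    # Randomly draw candidates from the open-source pool; the randomness
    # of the draw is what helps model-quantization accuracy.
    return random.sample(open_source_data, num_candidates)
```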
Step 403: the computer device determines normalized distances corresponding to a plurality of candidate agent data in the agent data set based on a plurality of first batch normalized BN layers in the deep learning model, the normalized distances corresponding to the candidate agent data being used to characterize a correlation between the candidate agent data and second sample data, the second sample data being sample data training the deep learning model.
A larger normalized distance corresponding to candidate proxy data indicates a smaller correlation between the candidate proxy data and the second sample data, and a smaller normalized distance indicates a larger correlation. It should be noted that, in quantizing the deep learning model, the second sample data is not actually acquired; the normalized distance merely characterizes the correlation with the second sample data.
In some embodiments, this step may be achieved by the following steps (1) to (3), comprising:
(1) For any candidate proxy data, the computer device inputs the candidate proxy data into the deep learning model.
The computer device may input a plurality of candidate proxy data in the proxy data set into the deep learning model at once, and the deep learning model performs step (2) on each of the plurality of candidate proxy data in sequence.
(2) The computer device determines, through the plurality of first BN layers in the deep learning model, a first mean and a first variance of the candidate proxy data at the plurality of first BN layers.
The deep learning model comprises a plurality of first BN layers, and these BN layers are used for determining the normalized distance.
(3) The computer device determines the normalized distance corresponding to the candidate proxy data based on the first mean and first variance of the candidate proxy data at the plurality of first BN layers and the second mean and second variance pre-stored at the plurality of first BN layers.
This step can be achieved by the following steps (3-1) to (3-2), comprising:
(3-1) For any first BN layer, the computer device determines a first normalized distance based on the first mean of the candidate proxy data at the first BN layer and the second mean pre-stored at the first BN layer, determines a second normalized distance based on the first variance of the candidate proxy data at the first BN layer and the second variance pre-stored at the first BN layer, and determines the sum of the first normalized distance and the second normalized distance to obtain a third normalized distance.
(3-2) The computer device determines the average value of the third normalized distances corresponding to the plurality of first BN layers to obtain the normalized distance corresponding to the candidate proxy data.
In some embodiments, step (3-1) and step (3-2) may be implemented by the following formula three:

Formula three: d_BN = (1/L1) · Σ_{i=1}^{L1} ( ||μ_i^PD - μ_1^i||^2 + ||σ_i^PD - σ_1^i||^2 )

where d_BN represents the normalized distance corresponding to the candidate proxy data; μ_i^PD and σ_i^PD are the mean and variance computed for the candidate proxy data at the i-th first BN layer, respectively; μ_1^i and σ_1^i are the mean and variance stored by the deep learning model at the i-th first BN layer, respectively; and L1 represents the total number of first BN layers in the deep learning model.
Step 404: the computer device determines, from the proxy data set, proxy data for which the corresponding normalized distances satisfy a condition, based on the normalized distances for the plurality of candidate proxy data in the proxy data set.
A larger normalized distance corresponding to candidate proxy data indicates a smaller correlation between the candidate proxy data and the second sample data, and a smaller normalized distance indicates a larger correlation. Accordingly, the computer device determines, from the proxy data set, the plurality of proxy data with the smallest corresponding normalized distances based on the normalized distances corresponding to the plurality of candidate proxy data; or determines, from the proxy data set, a plurality of proxy data whose corresponding normalized distances are smaller than a preset distance.
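The distance computation of formula three and the selection in step 404 could be sketched as follows; reading per-candidate activation statistics with forward hooks is an implementation assumption, and all names are hypothetical:

```python
import torch
import torch.nn as nn

def bn_distance(model, candidate):
    # Compare the candidate's activation statistics at each first BN layer
    # against the mean/variance stored in that layer during training.
    per_layer = []

    def hook(module, inputs, output):
        x = inputs[0]
        mu = x.mean(dim=(0, 2, 3))                      # first mean
        var = x.var(dim=(0, 2, 3), unbiased=False)      # first variance
        d_mu = (mu - module.running_mean).pow(2).sum()   # first distance
        d_var = (var - module.running_var).pow(2).sum()  # second distance
        per_layer.append(d_mu + d_var)                   # third distance

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.BatchNorm2d)]
    with torch.no_grad():
        model(candidate.unsqueeze(0))   # candidate is a CHW tensor
    for h in handles:
        h.remove()
    # Average the per-layer distances over the L1 first BN layers.
    return torch.stack(per_layer).mean()

def select_proxy_data(model, candidates, k):
    # Keep the k candidates whose normalized distances are smallest.
    scored = sorted(candidates, key=lambda c: bn_distance(model, c).item())
    return scored[:k]
```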
Step 405: The computer device mixes the fake data and the proxy data to obtain first sample data.
In some embodiments, this step is the same as step 203, and will not be described here again.
Step 406: the computer device quantizes the deep learning model based on the first sample data.
In some embodiments, this step is the same as step 204, and will not be described here again.
In the embodiment of the application, since not all open-source data is used as proxy data, the final precision of the data-free quantization method can ultimately be improved; therefore, how to pick appropriate proxy data is a key point. In the embodiment of the application, the proxy data is selected by means of the BN layers, thereby improving the final precision of the data-free quantization method.
Referring to fig. 5, a flowchart of a model quantization method according to an exemplary embodiment of the present application is shown. Referring to fig. 5, the method includes:
step 501: the computer equipment generates fake data through a generator based on the target label corresponding to the deep learning model and Gaussian noise, and the label of the fake data is the target label.
In some embodiments, this step is the same as step 201, and will not be described here again.
Step 502: The computer device determines proxy data from a proxy data set based on a plurality of first batch normalization (BN) layers in the deep learning model, wherein the proxy data set comprises a plurality of proxy data, and the proxy data is open-source real data.
In some embodiments, this step may be implemented by steps 402-404, which are not described in detail herein.
Step 503: the computer device marks the proxy data with a tag.
Since the fake data is generated based on the target tag of the deep learning model, the fake data has the same tags as the original data (the second sample data used to train the deep learning model), and the corresponding tag is already marked on the fake data; the proxy data, however, is open-source data obtained from the real world, and such open-source data does not necessarily have the same tags as the original data; it is therefore necessary to construct a tag for the proxy data and then mark the corresponding tag on the proxy data.
In some embodiments, the computer device inputs the proxy data into the deep learning model, outputs the tag of the proxy data, and marks the tag on the proxy data. The deep learning model is quantized through the proxy data; the deep learning model can therefore determine one tag from its corresponding plurality of tags as the tag of the proxy data, and the determined tag belongs to the same tag set as the original data (the second sample data used to train the deep learning model), which can improve the accuracy of subsequent quantization.
In some embodiments, the deep learning model generating the tag of the proxy data may be implemented by the following formula four:

Formula four: ŷ_PD = F(I_PD)

where ŷ_PD represents the tag of the proxy data, F(·) represents the original deep learning model, and I_PD represents the proxy data. For example, with continued reference to FIG. 3, the computer device inputs I_PD into the original deep learning model, which outputs the tag ŷ_PD of the proxy data.
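A sketch of the labeling in formula four; taking the arg-max of the model's outputs as the tag is an assumption, since the text only states that the model outputs the tag:

```python
import torch

def label_proxy_data(model, proxy_batch):
    # Formula four: the original deep learning model F produces the tag
    # of the proxy data; here the most probable class is used as that tag.
    with torch.no_grad():
        logits = model(proxy_batch)
    return logits.argmax(dim=1)
```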
Step 504: The computer device mixes the fake data and the proxy data to obtain first sample data.
In some embodiments, this step is the same as step 203, and will not be described here again. The process by which the computer device quantizes the deep learning model is: the computer device determines a quantization model, wherein the quantization model is obtained by quantizing the deep learning model, and then optimizes the quantization model through a multi-round iterative optimization process so as to finally complete the quantization of the deep learning model; each round of iterative optimization is implemented by steps 505-508. In some embodiments, the process by which the computer device determines the quantization model may be: the computer device quantizes floating-point data in the deep learning model into fixed-point data, resulting in the quantization model. The computer device may also determine the quantization model by other quantization methods; in the embodiment of the present application, the process of determining the quantization model is not specifically limited.
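The patent does not fix a particular quantization scheme; as one common possibility, a sketch of symmetric uniform quantization of floating-point values to fixed-point values is given below:

```python
import torch

def quantize_tensor(w, num_bits=8):
    # Map floating-point values onto signed fixed-point integers, then
    # de-quantize so the quantization model can run simulated quantization.
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), min=-qmax - 1, max=qmax)
    return q * scale
```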
Step 505: for either round of iterative optimization, the computer device determines a first loss function through the quantization model based on the first sample data and the labels of the first sample data.
The quantization model is a model obtained by quantizing the deep learning model; the computer device determines the first loss function from a cross-entropy loss function through the quantization model based on the first sample data and the labels of the first sample data. For example, the computer device inputs the labeled first sample data into the quantization model and determines the first loss function by the cross-entropy loss function, which may be implemented by formula five:

Formula five: L_CE(Q) = CE(Q(Î), ŷ)

where L_CE(Q) represents the first loss function, Q(·) represents the quantization model, Î represents the first sample data, ŷ represents the label of the first sample data, and CE(·,·) represents the cross-entropy loss function.
Step 506: the computer device determines a second loss function based on the first sample data through a quantization model and a deep learning model.
The computer device determines the second loss function from a relative entropy (KL divergence) function through the quantization model and the deep learning model based on the first sample data. For example, the computer device inputs the first sample data into the quantization model and the deep learning model, respectively, and determines the second loss function through the relative entropy function, which may be implemented by formula six:

Formula six: L_KD(Q) = KL(F(Î), Q(Î))

where L_KD(Q) represents the second loss function, Q(·) represents the quantization model, Î represents the first sample data, and F(·) represents the deep learning model.
Step 507: The computer device performs a weighted summation of the first loss function and the second loss function based on a second hyperparameter to obtain a first total loss function, where the second hyperparameter is used to constrain the weight of the second loss function.
In some embodiments, the computer device performs a weighted summation of the first loss function and the second loss function based on the second hyperparameter by the following formula seven to obtain the first total loss function:

Formula seven: L_Q = L_CE(Q) + β·L_KD(Q)

where L_Q represents the first total loss function, β represents the second hyperparameter, L_KD(Q) represents the second loss function, and L_CE(Q) represents the first loss function.
In some embodiments, the computer device performs a weighted summation of the first loss function and the second loss function based on the second hyperparameter and a fourth hyperparameter by the following formula eight to obtain the first total loss function, where the second hyperparameter is used to constrain the weight of the second loss function and the fourth hyperparameter is used to constrain the weight of the first loss function:

Formula eight: L_Q = ε·L_CE(Q) + β·L_KD(Q)

where L_Q represents the first total loss function, β represents the second hyperparameter, ε represents the fourth hyperparameter, L_KD(Q) represents the second loss function, and L_CE(Q) represents the first loss function.
In the embodiment of the application, the total loss function is obtained by weighted summation of the loss functions obtained in two ways, which balances the influence of the two loss functions on the overall loss function, thereby improving the accuracy of model quantization based on the total loss function.
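Formulas five through seven could be combined as in the sketch below; treating the relative entropy as a KL divergence over softened outputs is an assumption about the exact form:

```python
import torch
import torch.nn.functional as F

def quantization_loss(quant_model, full_model, samples, labels, beta=1.0):
    q_out = quant_model(samples)
    # Formula five: cross-entropy between Q's predictions and the labels.
    l_ce = F.cross_entropy(q_out, labels)
    # Formula six: relative entropy between F's and Q's output distributions.
    with torch.no_grad():
        f_out = full_model(samples)
    l_kd = F.kl_div(F.log_softmax(q_out, dim=1),
                    F.softmax(f_out, dim=1), reduction="batchmean")
    # Formula seven: weighted sum, with beta constraining the KD term.
    return l_ce + beta * l_kd
```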
Step 508: the computer device performs a round of optimization of the quantization model based on the first total loss function.
When the value corresponding to the first total loss function is smaller than a first preset threshold, the optimization of the quantization model ends; when the value corresponding to the first total loss function is not smaller than the first preset threshold, the parameter values of the quantization model are adjusted and the next round of optimization is performed. Alternatively, if the difference between the values corresponding to the first total loss function and the first total loss function of the previous round is smaller than a second preset threshold, the optimization of the quantization model ends; when that difference is not smaller than the second preset threshold, the next round of optimization is performed, until the optimization of the quantization model ends. For example, with continued reference to FIG. 3, the computer device inputs the proxy data I_PD and the fake data I_FD into the quantization model to optimize the quantization model.
In the embodiment of the application, when the first loss function and the second loss function are determined, the first sample data combining the proxy data and the fake data is used directly, so the method of the application can be incorporated into other data-free quantization methods at zero cost, which in turn can help those methods improve their quantization precision. In addition, the proxy-data-based data-free quantization method can alleviate the further degradation of model quantization precision caused by the lack of the original data set in data-free quantization. At the same time, the embodiment of the application verifies that it is unnecessary to rely entirely on fake data sets, and also verifies that proxy data can be cost-effectively incorporated into other data-free quantization methods.
Referring to fig. 6, a flow chart of a model quantization method according to an exemplary embodiment of the present application is shown. Referring to fig. 6, the method includes:
step 601: the computer equipment generates fake data through a generator based on the target label corresponding to the deep learning model and Gaussian noise, and the label of the fake data is the target label.
In some embodiments, this step is the same as step 201, and will not be described here again.
Step 602: The computer device determines proxy data from a proxy data set based on a plurality of first batch normalization (BN) layers in the deep learning model, wherein the proxy data set comprises a plurality of proxy data, and the proxy data is open-source real data.
In some embodiments, this step may be implemented by steps 402-404, which are not described in detail herein.
Step 603: The computer device mixes the fake data and the proxy data to obtain first sample data.
In some embodiments, this step is the same as step 203, and will not be described here again.
Step 604: the computer device quantizes the deep learning model based on the first sample data.
In some embodiments, this step may be implemented by steps 505-508, which are not described in detail herein.
Step 605: The computer device determines a third loss function of the generator through the deep learning model based on the fake data and the target tag.
The computer device determines the third loss function from a cross-entropy loss function through the deep learning model based on the fake data and the target tag. For example, the computer device inputs the fake data labeled with the target tag into the deep learning model and determines the third loss function by the cross-entropy loss function, which may be implemented by formula nine:

Formula nine: L_CE(G) = CE(F(G(z|y)), y)

where L_CE(G) represents the third loss function, F(·) represents the deep learning model, G(z|y) represents the fake data, y represents the tag of the fake data, and CE(·,·) represents the cross-entropy loss function.
Step 606: The computer device determines a fourth loss function of the generator through the plurality of second BN layers in the deep learning model based on the fake data.
The computer device introduces a BN loss function to constrain the generator so that the fake data comes closer to the real distribution, thereby improving the accuracy of data-free quantization. Accordingly, the computer device inputs the fake data into the deep learning model and determines the fourth loss function of the generator through the plurality of second BN layers in the deep learning model, which may be implemented by formula ten:

Formula ten: L_BN(G) = (1/L2) · Σ_{i=1}^{L2} ( ||μ_i^FD - μ_2^i||^2 + ||σ_i^FD - σ_2^i||^2 )

where μ_i^FD and σ_i^FD are the mean and variance computed for the fake data at the i-th second BN layer, respectively; μ_2^i and σ_2^i are the mean and variance stored by the deep learning model at the i-th second BN layer, respectively; and L2 represents the total number of second BN layers in the deep learning model. In some embodiments, the second BN layers and the first BN layers may be the same BN layers, or may be different BN layers in the deep learning model.
Step 607: The computer device performs a weighted summation of the third loss function and the fourth loss function based on a third hyperparameter to obtain a second total loss function, where the third hyperparameter is used to constrain the weight of the fourth loss function.
In some embodiments, the computer device performs a weighted summation of the third loss function and the fourth loss function based on the third hyperparameter by the following formula eleven to obtain the second total loss function:

Formula eleven: L(G) = L_CE(G) + α·L_BN(G)

where L(G) represents the second total loss function, α represents the third hyperparameter, L_CE(G) represents the third loss function, and L_BN(G) represents the fourth loss function.
In some embodiments, the computer device performs a weighted summation of the third loss function and the fourth loss function based on the third hyperparameter by the following formula twelve to obtain the second total loss function:

Formula twelve: L(G) = (1-α)·L_CE(G) + α·L_BN(G)

where L(G) represents the second total loss function, α represents the third hyperparameter, L_CE(G) represents the third loss function, and L_BN(G) represents the fourth loss function.
In the embodiment of the application, the total loss function is obtained by weighted summation of the loss functions obtained in two ways, which balances the influence of the two loss functions on the overall loss function, thereby improving the accuracy of model quantization based on the total loss function.
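Formulas nine through eleven can be sketched similarly; bn_statistics_loss is a hypothetical helper standing for the BN term of formula ten, computable with forward hooks as in the earlier proxy-selection sketch:

```python
import torch.nn.functional as F

def generator_loss(full_model, generator, z, y, alpha=1.0):
    fake = generator(z, y)
    # Formula nine: the model should classify the fake data as its target tag.
    l_ce = F.cross_entropy(full_model(fake), y)
    # Formula ten: match the fake data's BN statistics to the stored ones
    # (bn_statistics_loss is a hypothetical helper, see the lead-in).
    l_bn = bn_statistics_loss(full_model, fake)
    # Formula eleven: weighted sum, with alpha constraining the BN term.
    return l_ce + alpha * l_bn
```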
Step 608: the computer device updates the generator based on the second total loss function.
When the value corresponding to the second total loss function is smaller than a third preset threshold, the updating of the generator ends; when the value corresponding to the second total loss function is not smaller than the third preset threshold, the parameter values of the generator are adjusted and the next round of updating is performed. Alternatively, when the difference between the values corresponding to the second total loss function and the second total loss function of the previous round is smaller than a fourth preset threshold, the updating of the generator ends; when that difference is not smaller than the fourth preset threshold, the next round of updating is performed, until the updating of the generator ends.
In some embodiments, after the computer device updates the generator, a new target tag is selected from the tag set corresponding to the deep learning model, new fake data is generated by the updated generator based on the new target tag and redetermined Gaussian noise, and the subsequent model quantization process is then performed. In the embodiment of the application, the generator is updated once after each batch of fake data is generated, so that the generator can be constrained and the fake data it generates comes closer to the real distribution, further improving the accuracy of data-free quantization.
Referring to fig. 7, a flow chart of a model quantization method according to an exemplary embodiment of the present application is shown. Referring to fig. 7, the method includes:
step 701: the computer equipment generates fake data through a generator based on the target label corresponding to the deep learning model and Gaussian noise, and the label of the fake data is the target label.
In some embodiments, this step is the same as step 201, and will not be described here again.
Step 702: The computer device determines proxy data from a proxy data set based on a plurality of first batch normalization (BN) layers in the deep learning model, wherein the proxy data set comprises a plurality of proxy data, and the proxy data is open-source real data.
In some embodiments, this step may be implemented by steps 402-404, which are not described in detail herein.
Step 703: The computer device mixes the fake data and the proxy data to obtain first sample data.
In some embodiments, this step is the same as step 203, and will not be described here again.
Step 704: the computer device performs a round of optimization of the quantization model based on the first sample data.
In some embodiments, this step may be implemented by steps 505-508, which are not described in detail herein.
Step 705: The computer device performs one round of updating on the generator, and generates new fake data through the updated generator.
In some embodiments, the process of updating the generator may be implemented by steps 605-608, and the process by which the computer device generates new fake data through the updated generator is similar to step 201, except that the obtained target tag is different; of course, the Gaussian noise may also be different. The specific processes are not repeated here.
Step 706: The computer device mixes the new fake data with the proxy data to obtain new first sample data.
In some embodiments, this step is the same as step 203, and will not be described here again.
Step 707: the computer device performs a further round of optimization of the quantization model based on the new first sample data.
In some embodiments, this step may be implemented by steps 505-508, which are not described in detail herein.
Step 708: the computer device updates the generator again.
In some embodiments, this step may be implemented by steps 605-608, which are not described in detail herein. After the computer device updates the generator again, new fake data is generated based on the updated generator, and the new fake data and the proxy data are then mixed again; that is, steps 706-708 are executed until the optimization of the quantization model is completed.
In the embodiment of the application, the generator and the quantization model are updated alternately in each training period, so that the fake data used to optimize the quantization model is always generated by the latest generator; this improves the accuracy of the generated fake data and further improves the accuracy of the quantization model.
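Putting steps 704-708 together, the alternation between quantization-model optimization and generator updating might look like the loop below, reusing the quantization_loss and generator_loss helpers sketched earlier; the optimizer choice and stopping criterion are simplified assumptions:

```python
import torch

def alternating_optimization(quant_model, full_model, generator,
                             proxy_data, proxy_labels, rounds,
                             num_tags, noise_dim, batch_size):
    opt_q = torch.optim.Adam(quant_model.parameters())
    opt_g = torch.optim.Adam(generator.parameters())
    for _ in range(rounds):
        # Generate fresh fake data with the latest generator.
        z = torch.randn(batch_size, noise_dim)
        y = torch.randint(0, num_tags, (batch_size,))
        fake = generator(z, y).detach()
        # Mix fake and proxy data into first sample data (steps 703/706).
        samples = torch.cat([fake, proxy_data], dim=0)
        labels = torch.cat([y, proxy_labels], dim=0)
        # One round of quantization-model optimization (steps 704/707).
        opt_q.zero_grad()
        quantization_loss(quant_model, full_model, samples, labels).backward()
        opt_q.step()
        # One round of generator updating (steps 705/708).
        opt_g.zero_grad()
        generator_loss(full_model, generator, z, y).backward()
        opt_g.step()
```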
Referring to fig. 8, which illustrates a model quantization apparatus according to an exemplary embodiment of the present application, the apparatus includes:
the generating module 801 is configured to generate, based on the target tag and gaussian noise corresponding to the deep learning model, counterfeit data by using a generator, where the label of the counterfeit data is the target tag;
a first determining module 802, configured to determine proxy data from a proxy data set based on a plurality of first batch normalization (BN) layers in the deep learning model, where the proxy data set includes a plurality of proxy data, and the proxy data is open-source real data;
a mixing module 803, configured to mix the fake data and the proxy data to obtain first sample data;
a quantization module 804, configured to quantize the deep learning model based on the first sample data.
In some embodiments, the first determining module 802 is configured to determine a proxy data set, the proxy data set comprising a plurality of candidate proxy data; determine, based on a plurality of first batch normalization (BN) layers in the deep learning model, normalized distances corresponding to the plurality of candidate proxy data in the proxy data set, where the normalized distance corresponding to a candidate proxy data is used to characterize the correlation between the candidate proxy data and second sample data, the second sample data being the sample data used to train the deep learning model; and determine, from the proxy data set, proxy data whose corresponding normalized distances satisfy a condition based on the normalized distances corresponding to the plurality of candidate proxy data.
In some embodiments, the first determining module 802 is configured to, for any candidate proxy data, input the candidate proxy data into the deep learning model; determine, through the plurality of first BN layers in the deep learning model, a first mean and a first variance of the candidate proxy data at the plurality of first BN layers; and determine the normalized distance corresponding to the candidate proxy data based on the first mean and first variance of the candidate proxy data at the plurality of first BN layers and the second mean and second variance pre-stored at the plurality of first BN layers.
In some embodiments, the first determining module 802 is configured to, for any first BN layer, determine a first normalized distance based on the first mean of the candidate proxy data at the first BN layer and the second mean pre-stored at the first BN layer, determine a second normalized distance based on the first variance of the candidate proxy data at the first BN layer and the second variance pre-stored at the first BN layer, and determine the sum of the first normalized distance and the second normalized distance to obtain a third normalized distance; and determine the average value of the third normalized distances corresponding to the plurality of first BN layers to obtain the normalized distance corresponding to the candidate proxy data.
In some embodiments, the mixing module 803 is configured to determine a first hyperparameter, where the first hyperparameter is used to constrain the mixing ratio of the fake data and the proxy data; and mix the fake data and the proxy data based on the first hyperparameter to obtain the first sample data.
In some embodiments, the quantization module 804 is configured to, for any round of iterative optimization, determine a first loss function through a quantization model based on the first sample data and the labels of the first sample data, where the quantization model is a model obtained by quantizing the deep learning model; determine a second loss function through the quantization model and the deep learning model based on the first sample data; perform a weighted summation of the first loss function and the second loss function based on a second hyperparameter to obtain a first total loss function, where the second hyperparameter is used to constrain the weight of the second loss function; and optimize the quantization model based on the first total loss function.
In some embodiments, the apparatus further comprises:
the labeling module is used for inputting the proxy data into the deep learning model and outputting the tag of the proxy data, and marking the tag of the proxy data on the proxy data.
In some embodiments, the apparatus further comprises:
a second determining module, configured to determine a third loss function of the generator through the deep learning model based on the fake data and the target tag;
a third determining module, configured to determine a fourth loss function of the generator through the plurality of second BN layers in the deep learning model based on the fake data;
the weighting module is configured to perform a weighted summation of the third loss function and the fourth loss function based on a third hyperparameter to obtain a second total loss function, where the third hyperparameter is used to constrain the weight of the fourth loss function;
and the updating module is used for updating the generator based on the second total loss function.
In the embodiment of the application, the first sample data comprises fake data and proxy data; the fake data is generated based on the tag corresponding to the deep learning model, so the fake data conforms to the tag requirements; the proxy data is real open-source data, and open-source data has rich characteristics, so the proxy data has an advantage in terms of data characteristics; therefore, the first sample data combines the advantages of the fake data and the proxy data, and can improve the precision of the quantized deep learning model.
It should be noted that, in the model quantization apparatus provided in the foregoing embodiment, only the division of the functional modules is used for illustration, and in practical application, the foregoing functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the terminal is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the model quantization apparatus and the model quantization method embodiment provided in the foregoing embodiments belong to the same concept, and detailed implementation processes of the model quantization apparatus and the model quantization method embodiment are shown in the method embodiment, and are not repeated here.
In some embodiments, the computer device may be a terminal; referring to fig. 9, a block diagram of a terminal 900 according to an exemplary embodiment of the present application is shown. The terminal 900 of the present application may include one or more of the following components: a processor 910, a memory 920, a display 930, and a computer device communication module 940.
The processor 910 is electrically connected to the computer device communication module 940 through a bus, and the processor 910 communicates with the computer device through the computer device communication module 940 to obtain the indication information.
Processor 910 may include one or more processing cores. The processor 910 connects various parts within the overall terminal 900 using various interfaces and lines, and performs the various functions of the terminal 900 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 920 and invoking data stored in the memory 920. Optionally, the processor 910 may be implemented in at least one hardware form of digital signal processing (Digital Signal Processing, DSP), field-programmable gate array (Field-Programmable Gate Array, FPGA), or programmable logic array (Programmable Logic Array, PLA). The processor 910 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a neural network processor (Neural-network Processing Unit, NPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is used for rendering and drawing the content to be displayed by the display 930; the NPU is used to implement artificial intelligence (Artificial Intelligence, AI) functionality; and the modem is used to handle wireless communications. It will be appreciated that the modem may also not be integrated into the processor 910, and may be implemented by a single chip.
The Memory 920 may include a random access Memory (Random Access Memory, RAM) or a Read-Only Memory (Read-Only Memory). Optionally, the memory 920 includes a non-transitory computer readable medium (non-transitory computer-readable storage medium). Memory 920 may be used to store instructions, programs, code, sets of codes, or instruction sets. The memory 920 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described below, etc.; the storage data area may store data (e.g., audio data, phonebook) created according to the use of the terminal 900, etc.
The display 930 is a display component for displaying a user interface. Alternatively, the display 930 is a display with a touch function, through which a user may perform a touch operation on the display 930 using any suitable object such as a finger, a stylus, or the like.
The display 930 is typically provided at the front panel of the terminal 900. The display screen 930 may be designed as a full screen, a curved screen, a contoured screen, a double-sided screen, or a folded screen. The display 930 may also be designed as a combination of a full screen and a curved screen, a combination of a special-shaped screen and a curved screen, etc., which is not limited in this embodiment.
In addition, those skilled in the art will appreciate that the structure of terminal 900 illustrated in the above-described figures does not constitute a limitation of terminal 900, and terminal 900 may include more or less components than illustrated, or may combine certain components, or may have a different arrangement of components. For example, the terminal 900 further includes an audio acquisition device, a speaker, a radio frequency circuit, an input unit, a sensor, an audio circuit, a wireless fidelity (Wireless Fidelity, wi-Fi) module, a power supply, a bluetooth module, and the like, which are not described herein.
In some embodiments, the computer device may be a server; referring to fig. 10, a block diagram of a server 1010 according to an exemplary embodiment of the present application is shown. The server 1010 may vary considerably in configuration or performance, and may include a processor (central processing unit, CPU) 1001 and a memory 1002, where the memory 1002 stores at least one program code, and the at least one program code is loaded and executed by the processor 1001 to implement the methods provided in the above method embodiments. Of course, the server 1010 may also have a wired or wireless network interface, a keyboard, an input/output interface, and other components for implementing the functions of the device, which are not described herein.
Embodiments of the present application also provide a computer-readable storage medium storing at least one program code, where the at least one program code is loaded and executed by a processor to implement the model quantization method shown in the above embodiments.
Embodiments of the present application also provide a computer program product storing at least one program code, where the at least one program code is loaded and executed by a processor to implement the model quantization method shown in the above embodiments.
In some embodiments, the computer program product according to the embodiments of the present application may be deployed and executed on one computer device, on multiple computer devices at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network; the multiple computer devices distributed across multiple sites and interconnected by a communication network may constitute a blockchain system.
Those skilled in the art will appreciate that, in one or more of the examples described above, the functions described in the embodiments of the present application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on a computer-readable medium, or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates the transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
The foregoing description of preferred embodiments of the present application is not intended to limit the application; the scope of protection of the application is defined by the appended claims.

Claims (12)

1. A method of model quantization, the method comprising:
generating, through a generator, fake data based on a target label corresponding to a deep learning model and Gaussian noise, wherein a label of the fake data is the target label;
determining proxy data from a proxy data set based on a plurality of first batch normalization (BN) layers in the deep learning model, wherein the proxy data set comprises a plurality of proxy data, and the proxy data are open-source real data;
mixing the fake data and the proxy data to obtain first sample data; and
quantizing the deep learning model based on the first sample data.
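For illustration, a minimal PyTorch sketch of the generation step in claim 1 follows. The label-conditional generator architecture, its dimensions, and all identifiers (ConditionalGenerator, noise_dim, and so on) are assumptions made for the sketch and are not specified by the claim.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Illustrative label-conditional generator: Gaussian noise plus a
    target label produce fake data carrying that label (claim 1)."""
    def __init__(self, noise_dim=100, num_classes=10, img_shape=(3, 32, 32)):
        super().__init__()
        self.img_shape = img_shape
        self.label_embed = nn.Embedding(num_classes, noise_dim)
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, img_shape[0] * img_shape[1] * img_shape[2]),
            nn.Tanh(),
        )

    def forward(self, z, target_labels):
        # Condition the noise on the target label so the generated
        # sample is tied to that label.
        h = z * self.label_embed(target_labels)
        return self.net(h).view(z.size(0), *self.img_shape)

# Usage: sample Gaussian noise and target labels, then generate fake data.
generator = ConditionalGenerator()
z = torch.randn(16, 100)               # Gaussian noise
targets = torch.randint(0, 10, (16,))  # target labels
fake_data = generator(z, targets)      # fake data labeled with `targets`
```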
2. The method of claim 1, wherein the determining proxy data from a proxy data set based on a plurality of first BN layers in the deep learning model comprises:
determining a proxy data set, the proxy data set comprising a plurality of candidate proxy data;
determining normalized distances corresponding to the plurality of candidate proxy data in the proxy data set based on the plurality of first BN layers in the deep learning model, wherein the normalized distance corresponding to candidate proxy data is used to characterize a correlation between the candidate proxy data and second sample data, the second sample data being the sample data used to train the deep learning model;
and determining, from the proxy data set, proxy data whose corresponding normalized distance satisfies a condition, based on the normalized distances corresponding to the candidate proxy data in the proxy data set.
3. The method of claim 2, wherein the determining normalized distances corresponding to the plurality of candidate proxy data in the proxy data set based on the plurality of first BN layers in the deep learning model comprises:
for any candidate proxy data, inputting the candidate proxy data into the deep learning model;
determining, through the plurality of first BN layers in the deep learning model, a first mean and a first variance of the candidate proxy data at the plurality of first BN layers;
and determining the normalized distance corresponding to the candidate proxy data based on the first mean and the first variance of the candidate proxy data at the plurality of first BN layers and a second mean and a second variance pre-stored in the plurality of first BN layers.
4. The method of claim 3, wherein the determining the normalized distance corresponding to the candidate proxy data based on the first mean and the first variance of the candidate proxy data at the plurality of first BN layers and the second mean and the second variance pre-stored in the plurality of first BN layers comprises:
for any first BN layer, determining a first normalized distance based on the first mean of the candidate proxy data at the first BN layer and the second mean pre-stored in the first BN layer, determining a second normalized distance based on the first variance of the candidate proxy data at the first BN layer and the second variance pre-stored in the first BN layer, and determining a sum of the first normalized distance and the second normalized distance to obtain a third normalized distance;
and determining a mean of the third normalized distances corresponding to the plurality of first BN layers to obtain the normalized distance corresponding to the candidate proxy data.
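Claims 2 to 4 together specify how the normalized distance is computed from the first BN layers. A minimal PyTorch sketch follows, assuming 4-D convolutional activations (BatchNorm2d layers) and squared-L2 distances between the batch statistics and the pre-stored running statistics; the distance form and all names here are assumptions.

```python
import torch
import torch.nn as nn

def bn_normalized_distance(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Normalized distance of one candidate batch (claims 3-4): per first BN
    layer, distances between the batch's first mean/variance and the layer's
    pre-stored second mean/variance, summed per layer (third normalized
    distance) and averaged over layers."""
    per_layer = []

    def hook(module, inputs, output):
        inp = inputs[0]                               # 4-D conv activation
        mean = inp.mean(dim=(0, 2, 3))                # first mean
        var = inp.var(dim=(0, 2, 3), unbiased=False)  # first variance
        d_mean = torch.norm(mean - module.running_mean) ** 2  # first distance
        d_var = torch.norm(var - module.running_var) ** 2     # second distance
        per_layer.append(d_mean + d_var)              # third distance

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.BatchNorm2d)]
    model.eval()                         # keep the pre-stored statistics frozen
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return torch.stack(per_layer).mean()  # mean over all first BN layers
```

Proxy data whose normalized distance "satisfies a condition" (claim 2) could then be selected, for example, by keeping the candidates with the smallest distances; the claims do not fix the selection rule, so that reading is an assumption.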
5. The method of claim 1, wherein the mixing the fake data and the proxy data to obtain first sample data comprises:
determining a first hyperparameter, wherein the first hyperparameter is used to constrain a mixing proportion of the fake data and the proxy data;
and mixing the fake data and the proxy data based on the first hyperparameter to obtain the first sample data.
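Claim 5 leaves the mixing operator open; one plausible reading, sketched below, is that the first hyperparameter fixes the fraction of the mixed batch drawn from the fake data. The function and parameter names are illustrative.

```python
import torch

def mix_batches(fake: torch.Tensor, proxy: torch.Tensor, ratio: float = 0.5):
    """Mix fake and proxy data into one batch of first sample data, with
    `ratio` (the first hyperparameter) fixing the fake-data fraction.
    Assumes `proxy` holds at least (1 - ratio) * n samples."""
    n = fake.size(0)
    n_fake = int(ratio * n)             # samples taken from the fake data
    mixed = torch.cat([fake[:n_fake], proxy[:n - n_fake]], dim=0)
    return mixed[torch.randperm(n)]     # shuffle the two sources together
```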
6. The method of claim 1, wherein the quantizing the deep learning model based on the first sample data comprises:
for any round of iterative optimization, determining a first loss function through a quantized model based on the first sample data and the label of the first sample data, wherein the quantized model is a model obtained by quantizing the deep learning model;
determining a second loss function through the quantized model and the deep learning model based on the first sample data;
performing weighted summation on the first loss function and the second loss function based on a second hyperparameter to obtain a first total loss function, wherein the second hyperparameter is used to constrain a weight of the second loss function;
and optimizing the quantized model based on the first total loss function.
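A minimal sketch of the per-iteration losses in claim 6 follows, assuming the first loss is a cross-entropy task loss and the second loss is a KL divergence between the quantized and full-precision outputs; the claim does not name the loss forms, so both are assumptions, as is the default value of `beta`.

```python
import torch
import torch.nn.functional as F

def first_total_loss(q_model, fp_model, x, y, beta: float = 1.0):
    """One iteration's losses per claim 6: a task loss of the quantized
    model on the labeled first sample data, plus a distillation-style loss
    against the full-precision deep learning model, weighted by the second
    hyperparameter `beta`."""
    q_logits = q_model(x)
    loss1 = F.cross_entropy(q_logits, y)         # first loss function
    with torch.no_grad():
        fp_logits = fp_model(x)                  # full-precision teacher stays fixed
    loss2 = F.kl_div(F.log_softmax(q_logits, dim=1),
                     F.softmax(fp_logits, dim=1),
                     reduction="batchmean")      # second loss function
    return loss1 + beta * loss2                  # first total loss function
```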
7. The method of claim 1, wherein before the mixing the fake data and the proxy data to obtain first sample data, the method further comprises:
inputting the proxy data into the deep learning model and outputting a label of the proxy data;
and annotating the proxy data with the label of the proxy data.
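Claim 7 amounts to pseudo-labeling the selected proxy data with the full-precision model before mixing. A short sketch follows; taking the argmax of the model's output as the label is an assumption, since the claim only says the model outputs the label.

```python
import torch

def pseudo_label(model, proxy_data):
    """Claim 7 sketch: run the proxy data through the deep learning model
    and use the predicted class as the proxy data's label."""
    model.eval()
    with torch.no_grad():
        return model(proxy_data).argmax(dim=1)
```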
8. The method according to any one of claims 1 to 7, further comprising:
determining a third loss function of the generator through the deep learning model based on the fake data and the target label;
determining a fourth loss function of the generator through a plurality of second BN layers in the deep learning model based on the fake data;
performing weighted summation on the third loss function and the fourth loss function based on a third hyperparameter to obtain a second total loss function, wherein the third hyperparameter is used to constrain a weight of the fourth loss function;
and updating the generator based on the second total loss function.
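A sketch of the generator update in claim 8 follows, assuming the third loss is a cross-entropy of the (frozen) deep learning model on the fake data against the target labels, and the fourth loss is the same BN-statistics matching term as in the earlier sketch, here computed with gradients enabled so it can update the generator. Names and the default `gamma` are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def generator_step(generator, model, g_optim,
                   noise_dim=100, num_classes=10, batch=16, gamma=0.1):
    """One generator update per claim 8. `gamma` is the third
    hyperparameter weighting the fourth (BN-statistics) loss."""
    model.eval()                                   # freeze running statistics
    z = torch.randn(batch, noise_dim)              # Gaussian noise
    target = torch.randint(0, num_classes, (batch,))
    fake = generator(z, target)

    bn_terms = []
    def hook(module, inputs, output):
        inp = inputs[0]
        mean = inp.mean(dim=(0, 2, 3))
        var = inp.var(dim=(0, 2, 3), unbiased=False)
        bn_terms.append(torch.norm(mean - module.running_mean) ** 2
                        + torch.norm(var - module.running_var) ** 2)

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.BatchNorm2d)]
    logits = model(fake)                           # gradients flow to generator
    for h in handles:
        h.remove()

    loss3 = F.cross_entropy(logits, target)        # third loss function
    loss4 = torch.stack(bn_terms).mean()           # fourth loss function
    total = loss3 + gamma * loss4                  # second total loss function
    g_optim.zero_grad()                            # only the generator is stepped,
    total.backward()                               # so the model stays unchanged
    g_optim.step()
    return total.detach()
```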
9. A model quantization apparatus, the apparatus comprising:
a generation module, configured to generate fake data through a generator based on a target label corresponding to a deep learning model and Gaussian noise, wherein a label of the fake data is the target label;
a first determining module, configured to determine proxy data from a proxy data set based on a plurality of first batch normalization (BN) layers in the deep learning model, wherein the proxy data set comprises a plurality of proxy data, and the proxy data are open-source real data;
a mixing module, configured to mix the fake data and the proxy data to obtain first sample data;
and a quantization module, configured to quantize the deep learning model based on the first sample data.
10. A terminal comprising a processor and a memory, wherein the memory has stored therein at least one program code that is loaded and executed by the processor to implement the model quantization method of any one of claims 1 to 8.
11. A computer readable storage medium having stored therein at least one program code, the at least one program code being loaded and executed by a processor to implement the model quantization method of any one of claims 1 to 8.
12. A computer program product, wherein the computer program product stores at least one program code, the at least one program code being loaded and executed by a processor to implement the model quantization method according to any one of claims 1 to 8.
CN202310745288.4A 2023-06-21 2023-06-21 Model quantization method, device, equipment, storage medium and product Pending CN116776924A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310745288.4A CN116776924A (en) 2023-06-21 2023-06-21 Model quantization method, device, equipment, storage medium and product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310745288.4A CN116776924A (en) 2023-06-21 2023-06-21 Model quantization method, device, equipment, storage medium and product

Publications (1)

Publication Number Publication Date
CN116776924A true CN116776924A (en) 2023-09-19

Family

ID=88007606

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310745288.4A Pending CN116776924A (en) 2023-06-21 2023-06-21 Model quantization method, device, equipment, storage medium and product

Country Status (1)

Country Link
CN (1) CN116776924A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117331339A (en) * 2023-12-01 2024-01-02 南京华视智能科技股份有限公司 Coating machine die head motor control method and device based on time sequence neural network model
CN117331339B (en) * 2023-12-01 2024-02-06 南京华视智能科技股份有限公司 Coating machine die head motor control method and device based on time sequence neural network model

Similar Documents

Publication Publication Date Title
CN108197652B (en) Method and apparatus for generating information
CN116776924A (en) Model quantization method, device, equipment, storage medium and product
CN113255328B (en) Training method and application method of language model
CN111126226A (en) Radiation source individual identification method based on small sample learning and feature enhancement
CN112418320A (en) Enterprise association relation identification method and device and storage medium
CN114386580A (en) Decision model training method and device, decision method and device, electronic equipment and storage medium
CN113591472A (en) Lyric generation method, lyric generation model training method and device and electronic equipment
CN115116458B (en) Voice data conversion method, device, computer equipment and storage medium
US20230058500A1 (en) Method and machine learning system to perform quantization of neural network
CN111222558A (en) Image processing method and storage medium
CN111354374A (en) Voice processing method, model training method and electronic equipment
CN116957006A (en) Training method, device, equipment, medium and program product of prediction model
CN116168403A (en) Medical data classification model training method, classification method, device and related medium
CN115525743A (en) Self-learning whitening network-based man-machine interaction method and electronic equipment
CN111931852B (en) Target pricing method and device
CN114400018A (en) Voice noise reduction method and device, electronic equipment and computer readable storage medium
CN110889635B (en) Method for performing emergency drilling on food safety event processing
CN113990347A (en) Signal processing method, computer equipment and storage medium
CN114913513A (en) Method and device for calculating similarity of official seal images, electronic equipment and medium
CN113570044A (en) Customer loss analysis model training method and device
CN108009393B (en) Data processing method, device and computer readable storage medium
CN115249058A (en) Quantification method and device of neural network model, terminal and storage medium
CN113408571A (en) Image classification method and device based on model distillation, storage medium and terminal
CN116861998A (en) Data normalization method and device, electronic equipment and storage medium
CN110471961A (en) A kind of product demand acquisition methods, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination