CN117974845A - Data processing method, device, medium and electronic equipment

Info

Publication number: CN117974845A
Application number: CN202410154029.9A
Authority: CN
Inventors: 李慧霞; 杨潇; 郑侠武; 凌峰; 吴捷; 肖学锋; 晁飞; 纪荣嵘
Original assignee: Beijing Zitiao Network Technology Co Ltd
Current assignee: Beijing Zitiao Network Technology Co Ltd
Priority date: 2024-02-02
Filing date: 2024-02-02
Publication date: 2024-05-03

Abstract

The disclosure relates to a data processing method, a device, a medium and an electronic device, wherein the method comprises the following steps: acquiring a sampling time step sequence corresponding to a trained diffusion model, wherein the time step sequence comprises a plurality of time steps for data processing of the diffusion model; determining a target subsequence corresponding to the sampling time step sequence, wherein the target subsequence comprises part of time steps in the plurality of time steps; acquiring a calibration data set corresponding to the diffusion model based on the target subsequence; and carrying out quantization processing on the diffusion model based on the calibration data set to obtain a quantization model corresponding to the diffusion model. Therefore, corresponding calibration data sets can be dynamically constructed aiming at different quantization scenes to meet different tasks in a diffusion model or quantization requirements corresponding to different quantization configurations, the image generation precision of the obtained quantization model is guaranteed, quantization processing is carried out based on the calibration data sets, and therefore the image generation efficiency is improved.

Description

Data processing method, device, medium and electronic equipment

Technical Field

The disclosure relates to the technical field of computers, and in particular relates to a data processing method, a data processing device, a medium and electronic equipment.

Background

The Diffusion Model (DM) is a generative model based on a Markov chain. It gradually adds noise to data through a forward process, predicts the noise added at each step through a reverse process, and gradually removes that noise to recover a noise-free image, thereby realizing image generation. However, a diffusion model typically requires hundreds or even thousands of denoising steps to generate an image from Gaussian noise, so the computational burden is excessive.

Disclosure of Invention

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

In a first aspect, the present disclosure provides a data processing method, the method comprising:

Acquiring a sampling time step sequence corresponding to a trained diffusion model, wherein the time step sequence comprises a plurality of time steps for data processing of the diffusion model;

determining a target subsequence corresponding to the sampling time step sequence, wherein the target subsequence comprises part of time steps in the plurality of time steps;

Acquiring a calibration data set corresponding to the diffusion model based on the target subsequence;

And carrying out quantization processing on the diffusion model based on the calibration data set to obtain a quantization model corresponding to the diffusion model.

In a second aspect, the present disclosure provides a data processing apparatus, the apparatus comprising:

The first acquisition module is used for acquiring a sampling time step sequence corresponding to the diffusion model after training, wherein the time step sequence comprises a plurality of time steps for data processing of the diffusion model;

A determining module, configured to determine a target subsequence corresponding to the sampling time step sequence, where the target subsequence includes a portion of the time steps in the plurality of time steps;

The second acquisition module is used for acquiring a calibration data set corresponding to the diffusion model based on the target subsequence;

And the processing module is used for carrying out quantization processing on the diffusion model based on the calibration data set to obtain a quantization model corresponding to the diffusion model.

In a third aspect, the present disclosure provides a computer readable medium having stored thereon a computer program which when executed by a processing device performs the steps of the method of the first aspect.

In a fourth aspect, the present disclosure provides an electronic device comprising:

A storage device having a computer program stored thereon;

Processing means for executing said computer program in said storage means to carry out the steps of the method of the first aspect.

In the technical scheme, the time step is taken as a search object in the sampling time step sequence corresponding to the diffusion model to determine the target subsequence used for determining the calibration data set, and the corresponding calibration data set is further determined based on the target subsequence, so that the corresponding calibration data set can be dynamically constructed for different quantization scenes to meet different tasks in the diffusion model or quantization requirements corresponding to different quantization configurations, and the image generation precision of the obtained quantization model is ensured. The trained diffusion model is a floating point model, and the floating point model can be converted into a fixed point model with low bit width through quantization processing, namely the quantization model, so that the size of the quantization model corresponding to the diffusion model can be effectively reduced, the reasoning speed of each time step when image generation is carried out based on the quantization model can be effectively improved, the image generation precision of the quantization model can be ensured, and the image generation efficiency can be effectively improved.

Additional features and advantages of the present disclosure will be set forth in the detailed description which follows.

Drawings

The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale. In the drawings:

Fig. 1 is a flow chart of a data processing method provided in accordance with one embodiment of the present disclosure.

Fig. 2 is a flowchart of an example implementation of determining a target sub-sequence corresponding to the sequence of sampling time steps provided in accordance with one embodiment of the present disclosure.

Fig. 3A, 3B, and 3C are sequentially generated images obtained based on the original diffusion model, the quantization model obtained by the PTQ4DM, and the quantization model obtained by the scheme of the present disclosure.

Fig. 4 is a block diagram of a data processing apparatus provided in accordance with one embodiment of the present disclosure.

Fig. 5 shows a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.

Detailed Description

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.

It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.

The term "including" and variations thereof as used herein are intended to be open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.

It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.

It should be noted that references to "a", "an" and "a plurality" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.

The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.

It will be appreciated that prior to using the technical solutions disclosed in the embodiments of the present disclosure, the user should be informed and authorized of the type, usage range, usage scenario, etc. of the personal information related to the present disclosure in an appropriate manner according to the relevant legal regulations.

For example, in response to receiving an active request from a user, prompt information is sent to the user to explicitly inform the user that the requested operation will require the user's personal information to be obtained and used. Thus, the user can autonomously choose, according to the prompt information, whether to provide personal information to the software or hardware, such as an electronic device, an application program, a server or a storage medium, that executes the operations of the technical scheme of the present disclosure.

As an alternative but non-limiting implementation, in response to receiving an active request from a user, the manner in which the prompt information is sent to the user may be, for example, a popup, in which the prompt information may be presented in a text manner. In addition, a selection control for the user to select to provide personal information to the electronic device in a 'consent' or 'disagreement' manner can be carried in the popup window.

It will be appreciated that the above-described notification and user authorization process is merely illustrative and not limiting of the implementations of the present disclosure, and that other ways of satisfying relevant legal regulations may be applied to the implementations of the present disclosure.

Meanwhile, it can be understood that the data (including but not limited to the data itself, the acquisition or the use of the data) related to the technical scheme should conform to the requirements of the corresponding laws and regulations and related regulations.

Fig. 1 is a flowchart of a data processing method according to an embodiment of the disclosure, where the method may include:

In step 11, a sampling time step sequence corresponding to the diffusion model after training is obtained, wherein the time step sequence comprises a plurality of time steps for carrying out data processing on the diffusion model.

The embodiments of the present disclosure are directed to processing performed after training of the diffusion model is complete, i.e., PTQ (post-training quantization). The diffusion model may be trained in any manner common in the art. After training is complete, the sampling time step sequence corresponding to the diffusion model is determined, where the sampling time step sequence represents the plurality of time steps used for image generation based on the diffusion model.

For example, if the diffusion model generates an image through T time steps, i.e., the noise prediction network in the diffusion model must be invoked T times to generate an image, then the sampling time step sequence may be represented as [t₁, t₂, ..., t_T], where t₁ represents the first time step in the data processing based on the diffusion model.

In step 12, a target sub-sequence corresponding to the sampling time step sequence is determined, wherein the target sub-sequence includes a part of time steps in the plurality of time steps.

The distribution of activation in the diffusion model not only changes with different samples, but also is dynamically updated with different denoising time steps in the image generation process. The state of the model can change along with the change of time steps, so that samples with different time steps are collected to manufacture a calibration data set, and the precision of the follow-up quantification model can be further improved.

In this step, the time steps in the sampling time step sequence may be searched to determine an optimal time step sequence, that is, the target subsequence, from the sampling time step sequence. The number of time steps included in the target subsequence may be set based on the actual application scenario, which is not limited in the present disclosure.

As an example, the target subsequence S may be represented as follows:

S = [t'_1, t'_2, ..., t'_K]

0 ≤ t'_{i+1} - t'_i < t_T - t_1

t'_i ∈ [t_1, t_2, ..., t_T] (i = 1, 2, ..., K)

Here t'_i represents the i-th time step in the target subsequence. Each time step in the target subsequence belongs to the sampling time step sequence, and the order of the time steps in the target subsequence is the same as their order in the sampling time step sequence. K represents the number of time steps in the target subsequence.
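As a non-limiting illustration, the conditions on the target subsequence stated above can be checked as in the following minimal sketch. The function name `is_valid_subsequence` is hypothetical, and the sketch assumes the sampling time steps are given as integers in ascending order.

```python
from typing import Sequence


def is_valid_subsequence(sub: Sequence[int], steps: Sequence[int]) -> bool:
    """Check the stated conditions for a candidate subsequence.

    Assumption: `steps` is the sampling time step sequence [t_1, ..., t_T]
    listed in ascending order.
    """
    if not sub:
        return False
    allowed = set(steps)
    # Every element must belong to the sampling time step sequence.
    if any(t not in allowed for t in sub):
        return False
    span = steps[-1] - steps[0]  # t_T - t_1
    # Consecutive gaps must satisfy 0 <= t'_{i+1} - t'_i < t_T - t_1.
    return all(0 <= b - a < span for a, b in zip(sub, sub[1:]))


steps = list(range(1, 21))
print(is_valid_subsequence([1, 3, 5, 9, 13], steps))
print(is_valid_subsequence([5, 3], steps))
```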

In step 13, a calibration data set corresponding to the diffusion model is acquired based on the target subsequence.

After the target subsequence is determined, the same number of activations may be sampled at each time step in the target subsequence to form the calibration data set. The activation at a time step is the output of an activation layer of the diffusion model at that time step, and it varies from time step to time step. Accordingly, in this embodiment, searching over the time steps in step 12 yields a plurality of time steps in the sampling time step sequence that carry more information, so that the calibration data set built from the time steps in the target subsequence represents the data distribution more accurately and provides accurate data support for the subsequent quantization of the diffusion model.

As an example, M activations may be sampled at each of the K time steps t'_1, t'_2, ..., t'_K, so that the obtained calibration data set contains M×K calibration data.
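The construction of the calibration data set can be sketched as follows. This is only an illustration under stated assumptions: `sample_activation` is a hypothetical callback standing in for running the diffusion model to a given time step and recording an activation, and `toy_activation` is a synthetic placeholder rather than a real model.

```python
import random
from typing import Callable, List, Sequence, Tuple

Activation = List[float]


def build_calibration_set(
    subsequence: Sequence[int],
    sample_activation: Callable[[int, random.Random], Activation],
    m: int,
    seed: int = 0,
) -> List[Tuple[int, Activation]]:
    """Sample the same number (m) of activations at every time step."""
    rng = random.Random(seed)
    calibration = []
    for t in subsequence:
        for _ in range(m):
            calibration.append((t, sample_activation(t, rng)))
    return calibration


def toy_activation(t: int, rng: random.Random) -> Activation:
    # Hypothetical stand-in: the spread of the values depends on the time step.
    return [rng.gauss(0.0, 1.0 + 0.1 * t) for _ in range(4)]


calib = build_calibration_set([1, 4, 9], toy_activation, m=5)
print(len(calib))  # M x K entries
```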

In step 14, the diffusion model is quantized based on the calibration data set, and a quantization model corresponding to the diffusion model is obtained.

As an example, after the calibration data set is determined, model parameters of the diffusion model may be quantized based on the calibration data set; for example, the weights and activations of each layer in the network of the diffusion model may be quantized, so as to obtain the quantization model corresponding to the diffusion model. Quantization maps the weights and activations of the diffusion model to a low-precision domain, which reduces memory footprint and computational complexity. For example, after the calibration data set is determined, quantization may be performed using the quantization procedure of PTQ4DM to obtain the corresponding quantization model. By deploying the quantization model corresponding to the diffusion model, image generation can be performed based on the quantization model, which effectively reduces memory usage while improving inference efficiency on the device hardware.

In this technical scheme, the trained diffusion model can be converted into a fixed-point model, namely the finally obtained quantization model, without retraining the original model. Because the parameters of the quantization model are represented with low bit widths, the bandwidth required to move data is reduced, and general-purpose hardware offers higher nominal compute throughput on low-bit integer data, so processing based on the quantization model achieves faster inference.

In the technical scheme, the time step is taken as a search object in the sampling time step sequence corresponding to the diffusion model to determine the target subsequence for determining the calibration data set, and the corresponding calibration data set is further determined based on the target subsequence, so that the corresponding calibration data set can be dynamically constructed for different quantization scenes to meet quantization requirements corresponding to different tasks or different quantization configurations in the diffusion model, and the image generation precision of the obtained quantization model is ensured. The trained diffusion model is a floating point model, and the floating point model can be converted into a fixed point model with low bit width through quantization processing, namely the quantization model, so that the size of the quantization model corresponding to the diffusion model can be effectively reduced, the reasoning speed of each time step when image generation is carried out based on the quantization model can be effectively improved, the image generation precision of the quantization model can be ensured, and the image generation efficiency can be effectively improved.

In a possible embodiment, an example implementation manner of determining the target subsequence corresponding to the sampling time step sequence is as follows, and as shown in fig. 2, the steps may include:

In step 21, a sub-sequence set corresponding to the sampling time step sequence is determined, where the sub-sequence set initially includes a plurality of initial sub-sequences obtained by initializing the sampling time step sequence.

The initial subsequence can be formed by randomly sampling candidates from the sampling time step sequence, namely K time steps can be randomly sampled from the sampling time step sequence, and each initial subsequence can be formed according to the sequence of the K time steps in the sampling time step sequence, so that the initialization processing of the sampling time step sequence is realized.
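The initialization described above can be sketched as follows. The function name `init_population` and the parameter `n` (the number of initial subsequences) are illustrative assumptions, and the sketch assumes the sampling time steps are distinct.

```python
import random
from typing import List, Sequence


def init_population(steps: Sequence[int], k: int, n: int, seed: int = 0) -> List[List[int]]:
    """Randomly sample n initial subsequences of k time steps each."""
    rng = random.Random(seed)
    # Position of each time step in the sampling sequence, used to restore order.
    position = {t: i for i, t in enumerate(steps)}
    population = []
    for _ in range(n):
        chosen = rng.sample(list(steps), k)
        # Keep the chosen steps in the same order as in the sampling sequence.
        population.append(sorted(chosen, key=position.__getitem__))
    return population


population = init_population(list(range(1, 21)), k=5, n=8)
print(population[0])
```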

Thereafter, in step 22, a sequence score corresponding to each sub-sequence in the set of sub-sequences is determined. The sequence score represents the distribution difference between the generated image obtained based on the sub-sequence and the target image, wherein the smaller the sequence score corresponding to a sub-sequence is, the smaller the distribution difference corresponding to that sub-sequence is.

After each sub-sequence is obtained, the sub-sequence may be further evaluated to determine the sequence score. As an example, when evaluating a sub-sequence, the accuracy of quantizing the diffusion model based on that sub-sequence may be measured by comparing the distribution of images generated by the quantization model obtained with that sub-sequence against the distribution of real images.

As an example, the determining a sequence score corresponding to each sub-sequence in the set of sub-sequences may include:

And for each subsequence in the subsequence set, acquiring candidate activation in each time step indicated by the subsequence, and acquiring a candidate calibration set corresponding to the subsequence.

For each sub-sequence, the same number of activations may be sampled at each time step indicated by the sub-sequence as the candidate activations, thereby obtaining a candidate calibration set. The number may be set based on the actual scenario, and may be the same as or different from the number of activations sampled for the target subsequence, which is not limited.

And carrying out quantization processing on the diffusion model based on the candidate calibration set to obtain a candidate quantization model.

As an example, the diffusion model may be quantized based on the candidate calibration set in this step, e.g., reconstructed based on a PTQ reconstruction algorithm, to obtain a corresponding candidate quantization model.

As another example, quantizing the diffusion model based on the candidate calibration set may include: quantizing the activations of the diffusion model based on the candidate calibration set to obtain the candidate quantization model. Specifically, to make determining the sequence score more efficient during evaluation, the weights of the diffusion model may be kept in floating point and only the activations quantized, where the bit width used for the quantization may be configured based on actual requirements.

Further, the quantization parameters of each block may be optimized in a portion-by-portion reconstruction manner during the quantization process. As an example, a component of the diffusion model containing residual connections may be defined as a block, and other parts of the diffusion model that do not meet this condition are quantized and calibrated on a per-layer basis, where the defined layers and blocks are parts of the part-by-part reconstruction. For example, the diffusion model sequentially includes layer 1, layer 2, block 1, block 2 and layer 3, and calibration quantization can be performed on the activation of layer 1 based on the candidate calibration set in the quantization process, so as to determine the activation after the quantization of layer 1. And then, carrying out calibration quantization on the activation of the layer 2 based on the candidate calibration set, determining the activation after the quantization of the layer 2, then, carrying out calibration quantization on the activation of the block 1 based on the candidate calibration set, determining the activation after the quantization of the block 1, and carrying out quantization calibration in sequence until all parts in the diffusion model are completely reconstructed, so as to obtain the candidate quantization model.
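The part-by-part order of calibration can be sketched as follows. This is a deliberately simplified illustration: it replaces the reconstruction-based optimization of quantization parameters with plain min-max range calibration, treats each layer or block as a hypothetical Python callable on a list of floats, and keeps weights in floating point while quantizing only activations.

```python
from typing import Callable, List, Sequence, Tuple

Part = Callable[[List[float]], List[float]]  # a layer or a residual block


def fake_quant(x: float, scale: float, zero_point: int, bits: int) -> float:
    """Quantize to an unsigned integer grid and map back to floating point."""
    q = min(max(round(x / scale) + zero_point, 0), 2 ** bits - 1)
    return (q - zero_point) * scale


def calibrate_parts(
    parts: Sequence[Part], calib_inputs: Sequence[List[float]], bits: int = 8
) -> List[Tuple[float, int]]:
    """Calibrate activation quantizers one part at a time, in model order."""
    params = []
    current = [list(x) for x in calib_inputs]
    for part in parts:
        outputs = [part(x) for x in current]
        # Assumption: min-max calibration; the range always includes zero.
        lo = min(min(min(o) for o in outputs), 0.0)
        hi = max(max(max(o) for o in outputs), 0.0)
        scale = (hi - lo) / (2 ** bits - 1) or 1.0
        zero_point = round(-lo / scale)
        params.append((scale, zero_point))
        # The next part is calibrated on the already-quantized activations.
        current = [[fake_quant(v, scale, zero_point, bits) for v in o] for o in outputs]
    return params


parts = [lambda x: [2 * v for v in x], lambda x: [v + 1 for v in x]]
print(calibrate_parts(parts, [[-1.0, 0.5], [0.25, 1.0]]))
```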

Therefore, through the technical scheme, the simplified PTQ reconstruction can be performed in the process of determining the candidate quantization model, so that the determination efficiency of the sequence score of the subsequence can be improved, and the search efficiency of the target subsequence can be further improved effectively.

The generated image is then obtained based on the candidate quantization model, and the sequence score is determined based on the generated image and a target image.

As an example, the generated image may be obtained by performing image generation based on test data corresponding to the diffusion model and the candidate quantization model, and the target image may be an image corresponding to the generated image determined from the test data. Further, in this embodiment, the sequence score may be determined by determining a distribution distance between the statistical data of the generated image and the real samples in the test data.

As an example, the sequence score may be an FID score. FID (Fréchet Inception Distance) is a metric that measures the distance between the distribution of real images and the distribution of generated images. The FID score is computed as the Fréchet distance between the two distributions; the smaller the distance, the smaller the difference between the distribution of the generated images and the distribution of the real images.
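For reference, FID is commonly computed by fitting a Gaussian to the Inception-network features of each image set and taking the Fréchet distance between the two Gaussians. The symbols below are notation introduced here for illustration and do not appear in the original text: μ_r and Σ_r are the feature mean and covariance of the real images, and μ_g and Σ_g are those of the generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

The value is zero when the two fitted Gaussians coincide and grows as their means or covariances diverge.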

Therefore, through the technical scheme, the sequence score corresponding to each sub-sequence is determined, so that the process of quantifying the diffusion model by using the candidate calibration set generated based on the sub-sequence can be evaluated based on the sequence score, and reliable data support is provided for determining the target sub-sequence in the sampling time step sequence.

Turning back to fig. 2, in step 23, candidate subsequences are determined based on a sequence score corresponding to each subsequence in the set of subsequences.

Accordingly, the top W subsequences may be determined as the candidate subsequences in order of the sequence score from small to large, where W may be set according to the actual application scenario. In the step, a plurality of subsequences with more similar distribution between the generated image and the real image corresponding to the quantized candidate quantization model are selected as candidate subsequences. The generated images corresponding to the quantized candidate quantization models are distributed more similarly to the real images, so that the corresponding candidate subsequences can be characterized to contain more information in the image generation process, and the accuracy loss in the quantization process can be effectively reduced when quantization calibration is further performed based on the candidate subsequences.
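Selecting the candidate subsequences can be sketched as follows. The function name `select_candidates` and the example scores are illustrative only and are not results from the disclosure.

```python
from typing import List, Sequence, Tuple


def select_candidates(scored: Sequence[Tuple[List[int], float]], w: int) -> List[List[int]]:
    """Return the W subsequences with the smallest sequence scores."""
    ranked = sorted(scored, key=lambda item: item[1])  # ascending score
    return [sub for sub, _ in ranked[:w]]


# Made-up scores for illustration.
scored = [([1, 3, 5], 12.5), ([2, 4, 6], 7.1), ([1, 4, 9], 9.8)]
print(select_candidates(scored, w=2))
```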

In step 24, a generated sub-sequence is obtained based on the candidate sub-sequence, and the generated sub-sequence is added to the sub-sequence set, and the step 21 of determining the sub-sequence set corresponding to the sampling time step is repeated.

As an example, in the initial case, the sequences in the subsequence set are all obtained by random sampling. After the candidate subsequences are determined, new subsequences are generated from them, which increases the diversity of the subsequences while reducing, to a certain extent, the randomness of the generated subsequences, thereby improving the effectiveness of the newly generated subsequences.

As an example, the obtaining the generated subsequence based on the candidate subsequence may include:

and performing cross processing on time steps in any two candidate subsequences to obtain a first subsequence.

As an example, after the candidate subsequences are obtained, they may be subjected to cross processing based on an evolutionary algorithm. For candidate subsequences Q1 and Q2, if Q1 is {1,3,5,9,13} and Q2 is {1,4,7,10,15}, time steps may be randomly exchanged between Q1 and Q2 to implement the cross processing. For example, 3 in Q1 and 4 in Q2 may be exchanged, giving the first subsequences {1,4,5,9,13} and {1,3,7,10,15}. In the cross processing, one or more time steps in each candidate subsequence may be randomly exchanged, which may be configured based on the specific scenario and is not limited in this disclosure.
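The cross processing can be sketched as follows. The function name `crossover` is hypothetical. The sketch assumes both subsequences have the same length and that time steps are ascending integers, exchanges steps at the same randomly chosen positions, and re-sorts the results so that they keep the order of the sampling sequence.

```python
import random
from typing import List, Sequence, Tuple


def crossover(
    q1: Sequence[int], q2: Sequence[int], rng: random.Random, n_swaps: int = 1
) -> Tuple[List[int], List[int]]:
    """Exchange time steps at n_swaps randomly chosen positions."""
    c1, c2 = list(q1), list(q2)
    for i in rng.sample(range(len(c1)), n_swaps):
        c1[i], c2[i] = c2[i], c1[i]
    # Re-sort so that each child keeps the order of the sampling sequence.
    return sorted(c1), sorted(c2)


rng = random.Random(0)
print(crossover([1, 3, 5, 9, 13], [1, 4, 7, 10, 15], rng))
```

Exchanging at position 1 (zero-based) reproduces the example above, turning {1,3,5,9,13} and {1,4,7,10,15} into {1,4,5,9,13} and {1,3,7,10,15}.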

And carrying out mutation processing on at least one candidate subsequence based on the target probability to obtain a second subsequence, and taking the obtained first subsequence and second subsequence as the generated subsequence.

As another example, a candidate subsequence may be randomly selected, e.g., Q1 {1,3,5,9,13}, and the target probability, which serves as the mutation probability in the evolutionary algorithm, may be set according to the actual application scenario. Accordingly, in this step, mutation processing is performed based on the target probability. For example, if it is determined that 13 in Q1 is to be mutated and it is mutated to 14, the second subsequence may be represented as {1,3,5,9,14}.
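The mutation processing can be sketched as follows. The function name `mutate` is hypothetical, and drawing the replacement uniformly from the sampling time step sequence is an assumption; the text above does not specify how the mutated value is chosen.

```python
import random
from typing import List, Sequence


def mutate(sub: Sequence[int], steps: Sequence[int], rng: random.Random, p: float) -> List[int]:
    """Replace each time step with probability p (the target probability)."""
    child = list(sub)
    for i in range(len(child)):
        if rng.random() < p:
            # Assumption: the new value is drawn uniformly from the sampling sequence.
            child[i] = rng.choice(list(steps))
    return sorted(child)  # assumes ascending integer time steps


rng = random.Random(0)
print(mutate([1, 3, 5, 9, 13], list(range(1, 21)), rng, p=0.2))
```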

The generated first subsequence and second subsequence can be further added to the subsequence set, providing more possible subsequences for determining the target subsequence. Generating new subsequences with the crossover and mutation operations of an evolutionary algorithm allows the optimal time step sequence for constructing the calibration data set to be searched from the search space, while ensuring, to a certain extent, that each newly generated subsequence remains related to the candidate subsequences. Because the difference between the generated image corresponding to a candidate subsequence and the real image is small, an excessively large difference between the generated image corresponding to a newly generated subsequence and the real image can be avoided to a certain extent, which improves the effectiveness and stability of the newly generated subsequences.

In step 25, in case the number of iterations reaches a first preset number, a target sub-sequence is determined from the sub-sequences in the set of sub-sequences.

The first preset number of times may be preset, where the process of generating a sub-sequence may be considered as iterating once. The candidate subsequence is determined and the subsequence is generated through multiple iterations based on an evolutionary algorithm, so that multiple subsequences can be generated in the searching process of the target subsequence, and accurate and rapid performance estimation can be performed on the searching process based on the sequence score.

As an example, in a case where the number of iterations reaches the first preset number of times, a sub-sequence having the smallest corresponding sequence score among the sub-sequences in the sub-sequence set may be determined as the target sub-sequence.
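Putting steps 21 to 25 together, the search can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `score` is a hypothetical callback standing in for "quantize with the candidate calibration set, generate images, compute the sequence score", all parameter names and defaults are illustrative, and time steps are assumed to be distinct ascending integers.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Sub = Tuple[int, ...]


def search_subsequence(
    steps: Sequence[int],
    k: int,
    score: Callable[[Sub], float],
    pop_size: int = 20,
    top_w: int = 5,
    iterations: int = 10,
    p_mutate: float = 0.2,
    seed: int = 0,
) -> Tuple[List[int], float]:
    rng = random.Random(seed)
    steps = list(steps)
    # Step 21: initial subsequences by random sampling.
    population = {tuple(sorted(rng.sample(steps, k))) for _ in range(pop_size)}
    cache: Dict[Sub, float] = {}

    def evaluate(sub: Sub) -> float:
        if sub not in cache:  # each subsequence is scored once
            cache[sub] = score(sub)
        return cache[sub]

    for _ in range(iterations):  # step 25: fixed number of iterations
        # Steps 22 and 23: score every subsequence, keep the W smallest scores.
        ranked = sorted(population, key=lambda s: (evaluate(s), s))
        candidates = ranked[:top_w]
        children = []
        # Step 24a: crossover between two candidates at one random position.
        if len(candidates) >= 2:
            a, b = rng.sample(candidates, 2)
            i = rng.randrange(k)
            c1, c2 = list(a), list(b)
            c1[i], c2[i] = c2[i], c1[i]
            children += [tuple(sorted(c1)), tuple(sorted(c2))]
        # Step 24b: mutation of one randomly chosen candidate.
        parent = rng.choice(candidates)
        child = [rng.choice(steps) if rng.random() < p_mutate else t for t in parent]
        children.append(tuple(sorted(child)))
        population.update(children)

    best = min(population, key=lambda s: (evaluate(s), s))
    return list(best), evaluate(best)


# Toy score for illustration only: distance to an arbitrary reference subsequence.
reference = (2, 6, 10, 14, 18)
toy_score = lambda sub: float(sum(abs(a - b) for a, b in zip(sub, reference)))
print(search_subsequence(list(range(1, 21)), k=5, score=toy_score))
```

Because the subsequence set only grows, the returned subsequence is never worse than any initial subsequence under the given score.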

For diffusion models, the model parameters (e.g., activation ranges) of the different layers generally vary drastically over time steps, e.g., the ratio of the maximum to minimum activation ranges over different time steps may be up to 2-fold or even 8-fold. The uniform selection of time step intervals to generate the calibration data set does not provide adequate guidance for learning the quantization function. Therefore, in this embodiment, in determining the calibration data set, the time step may be first used as a search object to search, so as to determine an optimal target subsequence from the sampling time step sequence, and multiple iterative computations are performed to obtain a subsequence with the smallest distance between the generated image and the real image of the corresponding candidate quantization model as the target subsequence, so that the matching degree between the calibration data set and the diffusion model determined based on the target subsequence may be effectively improved, and accurate data reference and support are provided for the subsequent determination of the quantization model, so that the quantization model obtained based on the calibration data set corresponding to the target subsequence may obtain higher generated image quality.

In a possible embodiment, an exemplary implementation of performing quantization processing on the diffusion model based on the calibration data set to obtain the quantization model corresponding to the diffusion model is as follows, and may include:

And carrying out quantization processing on model parameters of the diffusion model based on quantization parameters to obtain an intermediate quantization model corresponding to the diffusion model, wherein the quantization parameters comprise quantization step length parameters and zero point parameters.

Wherein the zero-point parameter is used to represent a corresponding value of the zero value of the model parameter in the diffusion model in the model parameter in the quantization model. The quantization step size parameter is used to represent the ratio of the range of model parameters in the diffusion model to the range of model parameters in the quantization model, i.e. the quantization step size may be used to represent the mapping of the range of model parameters in the diffusion model to the range of model parameters in the quantization model. The quantization parameter may further include a bit width, and the representation range of the model parameter in the quantization model depends on the setting of the bit width. The model parameters in the diffusion model are floating point representations, and the model parameters in the quantization model are integer representations, so that the floating point representations of the model parameters can be mapped to the integer representations by combining the quantization parameters, and an intermediate quantization model is obtained.
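The mapping between floating-point and integer representations can be illustrated with a small sketch. It assumes a uniform affine quantizer with an unsigned integer range `[0, 2**bits - 1]`; the function names and the rounding and clamping details are assumptions for illustration and are not specified in the source.

```python
from typing import List


def quantize(values: List[float], step: float, zero_point: int, bits: int) -> List[int]:
    """Map floating-point values to integers using a step size and a zero point."""
    qmin, qmax = 0, 2 ** bits - 1  # representable range depends on the bit width
    out = []
    for v in values:
        q = round(v / step) + zero_point
        out.append(max(qmin, min(qmax, q)))  # clamp to the integer range
    return out


def dequantize(q_values: List[int], step: float, zero_point: int) -> List[float]:
    """Map integers back to (approximate) floating-point values."""
    return [(q - zero_point) * step for q in q_values]


# Toy example: a floating-point range [-0.25, 1.0] mapped onto 8 bits.
lo, hi, bits = -0.25, 1.0, 8
step = (hi - lo) / (2 ** bits - 1)   # ratio of the float range to the integer range
zero_point = round(-lo / step)       # integer that represents the float value 0.0
values = [-0.25, 0.0, 0.5, 1.0]
q = quantize(values, step, zero_point, bits)
restored = dequantize(q, step, zero_point)
```

In this example the floating-point zero maps to the integer `zero_point`, and each restored value differs from the original by at most half a quantization step.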

And determining a target loss corresponding to the intermediate quantization model based on the calibration data set.

Wherein the target loss may be used to measure accuracy of the quantization process, and the determining the target loss corresponding to the intermediate quantization model based on the calibration data set may include, as an example:

The target loss is determined based on an output obtained by inputting calibration data in the calibration data set into the diffusion model and an output obtained by inputting the calibration data into the intermediate quantization model.

For example, a first output may be obtained by inputting the calibration data into the trained diffusion model, and a second output may be obtained by inputting the calibration data into the intermediate quantization model. As an example, the squared-difference loss between the first output and the second output may be used as the target loss, so that the output of the quantized diffusion model is compared with the output of the unquantized model in order to adjust the quantization process and improve its accuracy.
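A minimal sketch of such a target loss is shown below. The name `target_loss` and the choice to average the squared differences over all output elements are assumptions; the source only states that a squared-difference loss between the two outputs may be used. The two "models" are toy callables standing in for the diffusion model and the intermediate quantization model.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def target_loss(
    calibration_data: Sequence[Vector],
    full_precision_model: Callable[[Vector], Vector],
    quantized_model: Callable[[Vector], Vector],
) -> float:
    """Mean squared difference between the outputs of the two models."""
    total, count = 0.0, 0
    for x in calibration_data:
        first_output = full_precision_model(x)   # output of the unquantized model
        second_output = quantized_model(x)       # output of the intermediate quantization model
        if len(first_output) != len(second_output):
            raise ValueError("model outputs must have the same length")
        total += sum((a - b) ** 2 for a, b in zip(first_output, second_output))
        count += len(first_output)
    return total / count


# Toy models: the "quantized" model has a constant error of 0.5 on every element.
def fp_model(x: Vector) -> Vector:
    return [2.0 * v for v in x]


def q_model(x: Vector) -> Vector:
    return [2.0 * v + 0.5 for v in x]


calibration = [[0.25, 0.5], [1.0, 1.25]]
loss = target_loss(calibration, fp_model, q_model)
zero_loss = target_loss(calibration, fp_model, fp_model)
```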

And updating the quantization step length parameter and the zero point parameter based on the target loss, taking the updated quantization parameter as a new quantization parameter, and returning to the step of carrying out quantization processing on the model parameter of the diffusion model based on the quantization parameter to obtain an intermediate quantization model corresponding to the diffusion model.

As an example, the quantization step size parameter and the zero point parameter may be updated based on the following:

$$\Delta' = \Delta - \eta \, \nabla L(\Delta)$$

$$zp' = -\frac{l}{\Delta'}$$

where Δ' represents the updated quantization step size parameter, Δ represents the current quantization step size parameter, η represents the learning rate, and ∇L(Δ) represents the gradient of the target loss with respect to the quantization step size parameter; l is the minimum value of the floating-point range being quantized, and zp' represents the updated zero point parameter.

Diffusion models typically use the SiLU activation function, which causes the output of the corresponding layer to no longer follow a normal distribution, so that a general symmetric quantization process is difficult to adapt to such activations. The applicant's research found that, owing to the SiLU activation function, the minimum activation value is usually fixed. Therefore, in this embodiment, after the quantization step size parameter is updated based on the target loss during learning, the zero point parameter is updated accordingly, further ensuring an accurate mapping from the floating-point representation to the integer representation. With this technical scheme, when the model parameters of the diffusion model are quantized, the zero point changes dynamically with each update of the quantization step size, so that the minimum of the floating-point range stays near the minimum value that the SiLU activation function can produce. This improves the optimization efficiency of the quantization step size and the accuracy and efficiency of the model quantization process.
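One update step can be sketched as follows, assuming a plain gradient-descent update of the step size followed by recomputing the zero point as zp' = -l/Δ'. The function names, the grid search used to locate the SiLU minimum, and the example gradient value are illustrative assumptions; how the gradient is obtained in practice is not specified here.

```python
import math
from typing import Tuple


def silu(x: float) -> float:
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def update_quantization_params(
    step: float, grad: float, learning_rate: float, fp_min: float
) -> Tuple[float, float]:
    """Update the step size by gradient descent, then derive the zero point from it."""
    new_step = step - learning_rate * grad
    if new_step <= 0.0:
        raise ValueError("step size must stay positive; reduce the learning rate")
    new_zero_point = -fp_min / new_step  # zp' = -l / step'
    return new_step, new_zero_point


# The minimum of SiLU is fixed (roughly -0.278); locate it on a coarse grid.
silu_min = min(silu(i / 1000.0) for i in range(-5000, 1))

# One update with an assumed gradient value of 2.0.
new_step, new_zp = update_quantization_params(
    step=0.1, grad=2.0, learning_rate=0.01, fp_min=silu_min
)

# With the updated zero point, the floating-point minimum maps to integer 0.
q_of_min = round(silu_min / new_step + new_zp)
```

Tying the zero point to the step size in this way keeps the lower end of the floating-point range anchored at the activation minimum while only the step size is learned.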

And then, under the condition that the iteration times reach the second preset times, taking the latest intermediate quantization model as the quantization model corresponding to the diffusion model.

The second preset number of times may be set based on the actual application scenario, which is not limited in this disclosure. When the number of iterations reaches the second preset number of times, the quantization parameters can be considered to have converged sufficiently on the calibration data set; the quantization model is then obtained, and image generation can be performed based on it.

Therefore, with the above technical scheme, the quantization step size parameter and the zero point parameter can be dynamically updated during quantization. On the one hand, this improves the accuracy of the quantization step size parameter, so that the floating-point and integer representations of the model parameters match more closely. On the other hand, the zero point is updated based on the updated quantization step size parameter, which to some extent avoids the quantization loss caused by quantizing activations with a skewed distribution, and also avoids the situation in which a statically fixed zero point allows the quantization model to converge only to a local optimum. This improves the accuracy of model quantization and of the images generated by the resulting quantization model. Meanwhile, generating images with the quantization model effectively improves image generation efficiency, reduces memory and compute usage, and broadens the deployment scenarios of the quantization model.

As an example, the following compares a quantization model obtained with the data processing method provided by the present disclosure against a quantization model obtained with the PTQ4DM method. The generated images are compared in fig. 3A-3C, where fig. 3A, 3B, and 3C show, in order, images generated by the original diffusion model, by the quantization model obtained with PTQ4DM, and by the quantization model obtained with the scheme of the present disclosure. The performance of the quantization model is evaluated using the FID, sFID, and IS (Inception Score) scores: lower FID and sFID scores and higher IS scores indicate better quality of the images generated by the quantization model corresponding to the diffusion model. W/A denotes the quantization bit widths of the weights and activations, respectively. The specific comparison is given in the following table:

Method	Bits (W/A)	FID	sFID	IS
The scheme of the present disclosure	8/8	5.25	38.95	2.24
PTQ4DM	8/8	5.34	38.90	2.23
The scheme of the present disclosure	4/8	5.68	39.75	2.27
PTQ4DM	4/8	14.00	44.58	2.45
The scheme of the present disclosure	4/6	6.22	40.39	2.21
PTQ4DM	4/6	16.01	53.83	2.48

Based on the table data and fig. 3A to 3C, the quantization model obtained with the data processing method provided by the present disclosure effectively preserves image generation quality and outperforms the existing scheme, and the advantage of the technical scheme of the present disclosure becomes more pronounced as the quantization bit width decreases. At the same time, the quantization model effectively improves the single-step inference speed of the image generation process.
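The trend in the FID column can be checked with a few lines of arithmetic. The snippet below only restates the FID values from the table above and computes, for each bit-width setting, how much higher the PTQ4DM FID is; the variable names are illustrative.

```python
# FID values copied from the comparison table (lower is better).
fid = {
    "8/8": {"disclosure": 5.25, "PTQ4DM": 5.34},
    "4/8": {"disclosure": 5.68, "PTQ4DM": 14.00},
    "4/6": {"disclosure": 6.22, "PTQ4DM": 16.01},
}

# FID gap between PTQ4DM and the scheme of the present disclosure per setting.
fid_gap = {
    bits: round(row["PTQ4DM"] - row["disclosure"], 2) for bits, row in fid.items()
}
```

The gap grows from 0.09 at 8/8 to 8.32 at 4/8 and 9.79 at 4/6, which is the sense in which the advantage is larger at lower bit widths.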

Based on the same inventive concept, the present disclosure further provides a data processing apparatus, as shown in fig. 4, the apparatus 10 includes:

A first obtaining module 100, configured to obtain a sampling time step sequence corresponding to a diffusion model after training, where the time step sequence includes a plurality of time steps for performing data processing on the diffusion model;

a determining module 200, configured to determine a target subsequence corresponding to the sequence of sampling time steps, where the target subsequence includes a portion of the time steps in the plurality of time steps;

a second obtaining module 300, configured to obtain a calibration data set corresponding to the diffusion model based on the target subsequence;

And the processing module 400 is configured to perform quantization processing on the diffusion model based on the calibration data set, so as to obtain a quantization model corresponding to the diffusion model.
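How the four modules cooperate can be sketched as a simple pipeline. All names below (`build_quantized_model` and the three callables passed to it) are hypothetical placeholders for the determining, second obtaining, and processing modules; the toy implementations only demonstrate the data flow and do not perform real quantization.

```python
from typing import Callable, List, Sequence, Tuple

Calibration = List[Tuple[int, str]]


def build_quantized_model(
    model: str,
    sampling_steps: Sequence[int],
    search_target_subsequence: Callable[[Sequence[int]], List[int]],
    collect_calibration_data: Callable[[str, List[int]], Calibration],
    quantize_model: Callable[[str, Calibration], dict],
) -> dict:
    """Chain the steps: time steps -> target subsequence -> calibration set -> quantized model."""
    target = search_target_subsequence(sampling_steps)
    # The target subsequence must consist of time steps from the sampling sequence.
    if not set(target) <= set(sampling_steps):
        raise ValueError("target subsequence must be drawn from the sampling time steps")
    calibration = collect_calibration_data(model, target)
    return quantize_model(model, calibration)


# Toy stand-ins for the three stages.
steps = list(range(0, 1000, 100))
result = build_quantized_model(
    model="toy-diffusion-model",
    sampling_steps=steps,
    search_target_subsequence=lambda s: list(s[::3]),
    collect_calibration_data=lambda m, t: [(step, f"activation@{step}") for step in t],
    quantize_model=lambda m, c: {"base": m, "calibrated_on": [step for step, _ in c]},
)
```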

Optionally, the determining module includes:

The first determining sub-module is used for determining a sub-sequence set corresponding to the sampling time step, and the sub-sequence set initially comprises a plurality of initial sub-sequences obtained by initializing the sampling time step sequence;

A second determining sub-module, configured to determine a sequence score corresponding to each sub-sequence in the sub-sequence set, where the sequence score is used to represent a distribution difference between a generated image obtained based on the sub-sequence and a target image, and a smaller sequence score represents a smaller distribution difference corresponding to the sub-sequence;

A third determining sub-module, configured to determine a candidate sub-sequence based on a sequence score corresponding to each sub-sequence in the sub-sequence set;

The generation sub-module is used for obtaining a generation sub-sequence based on the candidate sub-sequence, adding the generation sub-sequence into the sub-sequence set, and triggering the first determination sub-module to determine the sub-sequence set corresponding to the sampling time step;

And a fourth determining sub-module, configured to determine the target sub-sequence from the sub-sequences in the sub-sequence set when the iteration number reaches the first preset number.

Optionally, the second determining submodule includes:

The acquisition sub-module is used for acquiring candidate activation in each time step indicated by the subsequence aiming at each subsequence in the subsequence set to acquire a candidate calibration set corresponding to the subsequence;

the first processing submodule is used for carrying out quantization processing on the diffusion model based on the candidate calibration set to obtain a candidate quantization model;

a fifth determination sub-module for obtaining the generated image based on the candidate quantization model and determining the sequence score based on the generated image and a target image.

Optionally, the first processing sub-module is further configured to:

Quantize the activations of the diffusion model based on the candidate calibration set to obtain the candidate quantization model.

Optionally, the generating submodule includes:

The second processing sub-module is used for performing cross processing on time steps in any two candidate sub-sequences to obtain a first sub-sequence;

and the third processing sub-module is used for carrying out mutation processing on at least one candidate sub-sequence based on the target probability to obtain a second sub-sequence, and the obtained first sub-sequence and the obtained second sub-sequence are used as the generation sub-sequence.
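The cross and mutation operations can be sketched as follows. The concrete operators are assumptions chosen for illustration: the crossover samples a child from the union of the two parents' time steps, and the mutation replaces each time step, with the target probability, by an unused step from the sampling time step sequence. Other operator definitions would fit the description equally well.

```python
import random
from typing import Sequence, Tuple

Subsequence = Tuple[int, ...]


def crossover(a: Subsequence, b: Subsequence, rng: random.Random) -> Subsequence:
    """Build a child of the same length from the time steps of two parents."""
    if len(a) != len(b):
        raise ValueError("parents must have the same length")
    pool = sorted(set(a) | set(b))
    return tuple(sorted(rng.sample(pool, len(a))))


def mutate(
    sub: Subsequence, all_steps: Sequence[int], prob: float, rng: random.Random
) -> Subsequence:
    """Replace each time step, with probability `prob`, by an unused sampling step."""
    current = set(sub)
    result = []
    for t in sub:
        if rng.random() < prob:
            unused = [s for s in all_steps if s not in current]
            if unused:
                new_t = rng.choice(unused)
                current.discard(t)
                current.add(new_t)
                t = new_t
        result.append(t)
    return tuple(sorted(result))


rng = random.Random(0)
all_steps = list(range(0, 1000, 50))
parent_a, parent_b = (0, 250, 500, 950), (100, 250, 600, 900)
first = crossover(parent_a, parent_b, rng)             # "first sub-sequence"
second = mutate(parent_a, all_steps, prob=0.3, rng=rng)  # "second sub-sequence"
unchanged = mutate(parent_a, all_steps, prob=0.0, rng=rng)
```

Both operators keep the subsequence length fixed and avoid duplicate time steps, so every generated subsequence remains a valid selection from the sampling time step sequence.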

Optionally, the sequence score is an FID score.
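For reference, the FID (Fréchet Inception Distance) is commonly defined as shown below. This definition comes from the general literature rather than from the present disclosure; the symbols are introduced here only for illustration, with μ_r, Σ_r denoting the mean and covariance of Inception features of the target images and μ_g, Σ_g those of the generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

A lower value indicates that the feature distribution of the generated images is closer to that of the target images.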

Optionally, the processing module includes:

A fourth processing sub-module, configured to perform quantization processing on model parameters of the diffusion model based on quantization parameters, to obtain an intermediate quantization model corresponding to the diffusion model, where the quantization parameters include quantization step parameters and zero parameters;

A sixth determining submodule, configured to determine a target loss corresponding to the intermediate quantization model based on the calibration data set;

the updating sub-module is used for updating the quantization step length parameter and the zero point parameter based on the target loss, taking the updated quantization parameter as a new quantization parameter, triggering the fourth processing sub-module to quantize the model parameter of the diffusion model based on the quantization parameter, and obtaining an intermediate quantization model corresponding to the diffusion model;

And a seventh determining submodule, configured to use the latest intermediate quantization model as a quantization model corresponding to the diffusion model when the iteration number reaches a second preset number.

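Optionally, the sixth determination submodule is further configured to:

Determine the target loss based on an output obtained by inputting calibration data in the calibration data set into the diffusion model and an output obtained by inputting the calibration data into the intermediate quantization model.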

Referring now to fig. 5, a schematic diagram of an electronic device (e.g., a terminal device or server) 600 suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 5 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.

As shown in fig. 5, the electronic device 600 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the electronic apparatus 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

In general, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, magnetic tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 shows an electronic device 600 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via communication means 609, or from storage means 608, or from ROM 602. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 601.

It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.

In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.

The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.

The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring a sampling time step sequence corresponding to a trained diffusion model, wherein the time step sequence comprises a plurality of time steps for data processing of the diffusion model; determining a target subsequence corresponding to the sampling time step sequence, wherein the target subsequence comprises part of time steps in the plurality of time steps; acquiring a calibration data set corresponding to the diffusion model based on the target subsequence; and carrying out quantization processing on the diffusion model based on the calibration data set to obtain a quantization model corresponding to the diffusion model.

Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules described in the embodiments of the present disclosure may be implemented in software or hardware. The name of a module does not, in some cases, constitute a limitation on the module itself; for example, the first obtaining module may also be described as "a module that acquires a sampling time step sequence corresponding to the trained diffusion model".

The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

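According to one or more embodiments of the present disclosure, example 1 provides a data processing method, the method comprising:

Acquiring a sampling time step sequence corresponding to a trained diffusion model, wherein the time step sequence comprises a plurality of time steps for data processing of the diffusion model;

Determining a target subsequence corresponding to the sampling time step sequence, wherein the target subsequence comprises part of time steps in the plurality of time steps;

Acquiring a calibration data set corresponding to the diffusion model based on the target subsequence;

And carrying out quantization processing on the diffusion model based on the calibration data set to obtain a quantization model corresponding to the diffusion model.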

According to one or more embodiments of the present disclosure, example 2 provides the method of example 1, wherein the determining the target subsequence corresponding to the sequence of sampling time steps includes:

Determining a subsequence set corresponding to the sampling time step, wherein the subsequence set initially comprises a plurality of initial subsequences obtained by initializing the sampling time step sequence;

Determining a sequence score corresponding to each subsequence in the subsequence set, wherein the sequence score is used for representing a distribution difference between a generated image obtained based on the subsequence and a target image, and the smaller the sequence score is, the smaller the distribution difference corresponding to the subsequence is;

determining candidate subsequences based on a sequence score corresponding to each subsequence in the set of subsequences;

Obtaining a generated subsequence based on the candidate subsequence, adding the generated subsequence to the subsequence set, and returning to the step of determining the subsequence set corresponding to the sampling time step;

And under the condition that the iteration times reach the first preset times, determining the target subsequence from the subsequences in the subsequence set.

According to one or more embodiments of the present disclosure, example 3 provides the method of example 2, wherein the determining a sequence score corresponding to each sub-sequence in the set of sub-sequences comprises:

for each subsequence in the subsequence set, candidate activation is obtained in each time step indicated by the subsequence, and a candidate calibration set corresponding to the subsequence is obtained;

performing quantization processing on the diffusion model based on the candidate calibration set to obtain a candidate quantization model;

The generated image is obtained based on the candidate quantization model, and the sequence score is determined based on the generated image and a target image.

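According to one or more embodiments of the present disclosure, example 4 provides the method of example 3, wherein the quantizing the diffusion model based on the candidate calibration set, obtaining a candidate quantization model, comprises:

Quantizing the activations of the diffusion model based on the candidate calibration set to obtain the candidate quantization model.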

According to one or more embodiments of the present disclosure, example 5 provides the method of example 2, wherein the obtaining the generated subsequence based on the candidate subsequence comprises:

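Performing cross processing on time steps in any two candidate subsequences to obtain a first subsequence;

Performing mutation processing on at least one candidate subsequence based on a target probability to obtain a second subsequence, and taking the obtained first subsequence and second subsequence as the generated subsequence.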

According to one or more embodiments of the present disclosure, example 6 provides the method of example 2, wherein the sequence score is an FID score.

According to one or more embodiments of the present disclosure, example 7 provides the method of example 1, wherein the performing quantization processing on the diffusion model based on the calibration data set to obtain a quantization model corresponding to the diffusion model includes:

Carrying out quantization processing on model parameters of the diffusion model based on quantization parameters to obtain an intermediate quantization model corresponding to the diffusion model, wherein the quantization parameters comprise quantization step length parameters and zero point parameters;

determining a target loss corresponding to the intermediate quantization model based on the calibration dataset;

Updating the quantization step length parameter and the zero point parameter based on the target loss, taking the updated quantization parameter as a new quantization parameter, and returning to the step of carrying out quantization processing on the model parameter of the diffusion model based on the quantization parameter to obtain an intermediate quantization model corresponding to the diffusion model;

And under the condition that the iteration times reach the second preset times, taking the latest intermediate quantization model as the quantization model corresponding to the diffusion model.

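According to one or more embodiments of the present disclosure, example 8 provides the method of example 7, wherein the determining, based on the calibration dataset, a target loss corresponding to the intermediate quantization model comprises:

Determining the target loss based on an output obtained by inputting calibration data in the calibration data set into the diffusion model and an output obtained by inputting the calibration data into the intermediate quantization model.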

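According to one or more embodiments of the present disclosure, example 9 provides a data processing apparatus, the apparatus comprising:

A first obtaining module, configured to obtain a sampling time step sequence corresponding to a trained diffusion model, where the time step sequence includes a plurality of time steps for performing data processing on the diffusion model;

a determining module, configured to determine a target subsequence corresponding to the sequence of sampling time steps, where the target subsequence includes a portion of the time steps in the plurality of time steps;

a second obtaining module, configured to obtain a calibration data set corresponding to the diffusion model based on the target subsequence;

And a processing module, configured to perform quantization processing on the diffusion model based on the calibration data set, so as to obtain a quantization model corresponding to the diffusion model.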

According to one or more embodiments of the present disclosure, example 10 provides a computer-readable medium having stored thereon a computer program which, when executed by a processing device, implements the steps of the method of any of examples 1-8.

Example 11 provides an electronic device according to one or more embodiments of the present disclosure, comprising:

A storage device having a computer program stored thereon;

processing means for executing the computer program in the storage means to implement the steps of the method of any one of examples 1-8.

The foregoing description is only of the preferred embodiments of the present disclosure and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the disclosure referred to herein is not limited to the specific combinations of features described above, but also covers other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure, for example, embodiments formed by mutually substituting the features described above with technical features having similar functions disclosed in (but not limited to) the present disclosure.

Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims. The specific manner in which the various modules perform the operations in the apparatus of the above embodiments have been described in detail in connection with the embodiments of the method, and will not be described in detail herein.

Claims

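1. A method of data processing, the method comprising:

acquiring a sampling time step sequence corresponding to a trained diffusion model, wherein the time step sequence comprises a plurality of time steps for data processing of the diffusion model;

determining a target subsequence corresponding to the sampling time step sequence, wherein the target subsequence comprises part of time steps in the plurality of time steps;

acquiring a calibration data set corresponding to the diffusion model based on the target subsequence;

and carrying out quantization processing on the diffusion model based on the calibration data set to obtain a quantization model corresponding to the diffusion model.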

2. The method of claim 1, wherein the determining the target subsequence corresponding to the sequence of sample time steps comprises:

3. The method of claim 2, wherein the determining a sequence score for each subsequence in the set of subsequences comprises:

4. A method according to claim 3, wherein said quantizing said diffusion model based on said candidate calibration set to obtain candidate quantized models comprises:

5. The method of claim 2, wherein the obtaining the generated subsequence based on the candidate subsequence comprises:

6. The method of claim 2, wherein the sequence score is a FID score.

7. The method according to claim 1, wherein the quantizing the diffusion model based on the calibration data set to obtain a quantized model corresponding to the diffusion model includes:

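8. The method of claim 7, wherein the determining a target loss for the intermediate quantization model based on the calibration dataset comprises:

determining the target loss based on an output obtained by inputting calibration data in the calibration data set into the diffusion model and an output obtained by inputting the calibration data into the intermediate quantization model.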

9. A data processing apparatus, the apparatus comprising:

10. A computer readable medium on which a computer program is stored, characterized in that the program, when being executed by a processing device, carries out the steps of the method according to any one of claims 1-8.

11. An electronic device, comprising:

A storage device having a computer program stored thereon;

processing means for executing said computer program in said storage means to carry out the steps of the method according to any one of claims 1-8.