CN115620074A - Image data classification method, device and medium - Google Patents

Image data classification method, device and medium Download PDF

Info

Publication number
CN115620074A
CN115620074A (application CN202211411839.5A)
Authority
CN
China
Prior art keywords
model
image
training
image data
target model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211411839.5A
Other languages
Chinese (zh)
Inventor
朱克峰
阚宏伟
王彦伟
黄伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Beijing Electronic Information Industry Co Ltd filed Critical Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN202211411839.5A priority Critical patent/CN115620074A/en
Publication of CN115620074A publication Critical patent/CN115620074A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image data classification method, device and medium, applied to the technical field of model distillation. The method constructs a diffusion model in advance and trains it with the original data set to obtain an image generator for generating image data, and constructs a model distillation framework to obtain the image classification model to be distilled; finally, the image classification model is trained with the image data generated by the image generator as training input to obtain the distilled target model. In this scheme, the generated data replace the original data set during model distillation training, so that the size of the model is effectively compressed while high precision is kept, and the normal operation of the model distillation process is ensured. Applying the diffusion model to the model distillation mechanism in place of the original data set required by traditional model distillation greatly improves the quality of the generated images, and further reduces the number of generated images required for model distillation while preserving distillation accuracy.

Description

Image data classification method, device and medium
Technical Field
The present application relates to the field of model distillation technologies, and in particular, to a method, an apparatus, and a medium for classifying image data.
Background
With the rapid development and application of artificial intelligence and deep neural network models, deploying artificial intelligence classification models to edge devices (such as vehicle-mounted systems, mobile terminals, etc.) often requires smaller and more efficient models, so how to effectively simplify and compress a model while keeping its performance has become an important issue. Model distillation is an important means of compressing a classification model: an existing pre-trained model (Teacher) and a randomly initialized small target model (Student) are trained and optimized with a designed joint loss function, so as to obtain a small target model (Student) whose performance is comparable to that of the existing pre-trained model (Teacher). For example, in an image classification scenario, the original model may be too large to be deployed in some special scenarios, and the model needs to be distilled.
However, the training process of standard model distillation requires the participation of the original data set used to train the existing model, which is often difficult to obtain in real-world scenarios, for example because of data privacy or because the original data set is too large. Without the original data set, the model distillation effect is poor, the resulting target model does not meet the requirements, and the image classification effect cannot be guaranteed.
Therefore, how to guarantee the effect of image classification is an urgent problem to be solved by those skilled in the art.
Disclosure of Invention
The application aims to provide a method, a device and a medium for classifying image data so as to ensure the effect of image classification.
In order to solve the above technical problem, the present application provides a method for classifying image data, including:
acquiring image data to be classified;
calling the distilled target model; wherein the obtaining of the target model comprises: constructing a diffusion model in advance, and training through an original data set to obtain an image generator; generating image training data with the image generator; training an image classification model by taking the image training data generated by the image generator as parameters to obtain the distilled target model;
and carrying out image classification on the image data to be classified by adopting the distilled target model.
Preferably, the training an image classification model by using the image training data generated by the image generator as parameters to obtain the distilled target model comprises:
and training the image classification model according to a preset iteration number to obtain the distilled target model.
Preferably, after training an image classification model by using the image training data generated by the image generator as a parameter to obtain the distilled target model, the method further comprises:
and verifying the target model obtained by distillation.
Preferably, the verifying the target model obtained by distillation comprises:
acquiring the accuracy of the target model;
judging whether the accuracy rate reduction range of the target model is within an acceptable range;
and if so, judging that the target model is distilled successfully.
Preferably, the pre-constructing a diffusion model and then training through a raw data set to obtain an image generator includes:
training the pre-constructed diffusion model by taking ImageNet data set as a training set to obtain the image generator.
Preferably, the obtaining the accuracy of the target model comprises:
and obtaining the accuracy of the target model by utilizing an ImageNet verification set.
Preferably, the weights of the object model are initialized with random numbers.
In order to solve the above technical problem, the present application further provides an image data classification apparatus, including:
the acquisition module is used for acquiring image data to be classified;
the calling module is used for calling the distilled target model; wherein the obtaining of the target model comprises: constructing a diffusion model in advance, and training through an original data set to obtain an image generator; generating image training data with the image generator; training an image classification model by taking the image training data generated by the image generator as parameters to obtain the distilled target model;
and the classification module is used for carrying out image classification on the image data to be classified by adopting the distilled target model.
Preferably, the image data classification apparatus further includes: a verification module, configured to verify the distilled target model after the image classification model has been trained with the image data generated by the image generator as training input to obtain the distilled target model.
In order to solve the above technical problem, the present application further provides an image data classification apparatus, including: a memory for storing a computer program;
and the processor is used for realizing the steps of the image data classification method when executing the computer program.
In order to solve the above technical problem, the present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the image data classification method.
In the image data classification method, a diffusion model is constructed in advance and trained with the original data set to obtain an image generator; image data are then generated with the image generator constructed on the basis of the diffusion model, and a model distillation framework is constructed to obtain the image classification model to be distilled. After the models are constructed and the parameters required for training are obtained, the image classification model is trained with the image data generated by the image generator as training input to obtain the distilled target model. After the target model is obtained, it is called to classify the image data to be classified, which guarantees the image classification effect. In this scheme, a diffusion model is first constructed and trained on the basis of the original data set to serve as an image generation module, which is responsible for generating a high-quality image data set of the same size as the original training set. Then, during the training of model distillation, the generated data replace the original data set, so that distillation of the model from the Teacher to the Student is realized, that is, the size of the model is effectively compressed while high precision is kept, and the normal operation of the model distillation process is ensured. Applying the diffusion model to the model distillation mechanism in place of the original data set required by traditional model distillation greatly improves the quality of the generated images, and further reduces the number of generated images required for model distillation while preserving the accuracy of model distillation.
The application also provides a device and a medium for classifying image data, which correspond to the method, so that the method has the same beneficial effects as the method.
Drawings
In order to more clearly illustrate the embodiments of the present application, the drawings needed for the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
Fig. 1 is a flowchart of a method for classifying image data according to an embodiment of the present disclosure;
FIG. 2 is a general idea diagram of an image classification model distillation method based on a diffusion model according to an embodiment of the present application;
fig. 3 is a block diagram of an apparatus for classifying image data according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any creative effort belong to the protection scope of the present application.
The core of the application is to provide a method, a device and a medium for classifying image data so as to ensure the effect of image classification.
In order that those skilled in the art will better understand the disclosure, the following detailed description is given with reference to the accompanying drawings.
Model distillation is an important means of compressing a classification model: an existing pre-trained model (Teacher) and a randomly initialized small target model (Student) are trained and optimized with a designed joint loss function, so as to obtain a small target model (Student) whose performance is comparable to that of the existing pre-trained model (Teacher). Given a trained model $p_T$ and a data set $\mathcal{D}$, the goal is to find the weight parameters $W_S$ of the small target model (i.e., the target model) that satisfy:

$$W_S = \arg\min_{W_S} \sum_{x \in \mathcal{D}} \mathrm{KL}\big(p_T(x)\,\|\,p_S(x; W_S)\big)$$

Here $\mathrm{KL}(\cdot)$ denotes the Kullback-Leibler divergence (which can be understood as a loss function), and $p_T(\cdot)$ and $p_S(\cdot)$ denote the distribution functions of the output results of the Teacher model and the Student model, respectively. The training process of standard model distillation requires the participation of the original data set used to train the existing model, which is often difficult to obtain in real-world scenarios, for example because of data privacy or because the original data set is too large. Therefore, how to remove the requirement that the original data set must participate in model distillation has become one of the research focuses in this field, and the present application is a new proposal aimed at solving this problem.
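As a concrete illustration of this objective, a minimal PyTorch sketch of a temperature-scaled joint distillation loss is given below. The function name, the temperature T and the weighting alpha are illustrative assumptions rather than values fixed by the application.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Joint distillation loss: KL divergence between softened Teacher and Student
    outputs plus an ordinary cross-entropy term; T and alpha are illustrative."""
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```

Minimizing such a loss over a data set approximates the argmin over $W_S$ above; with alpha = 1 it reduces to the pure KL objective between the Teacher and Student outputs.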
Fig. 1 is a flowchart of a method for classifying image data according to an embodiment of the present disclosure; as shown in fig. 1, the method for classifying image data provided by this embodiment includes the following steps:
s10: image data to be classified is acquired.
S11: The distilled target model is called.
S12: Image classification is performed on the image data to be classified with the distilled target model.
Wherein the obtaining of the target model comprises: constructing a diffusion model in advance, and training through an original data set to obtain an image generator; generating image training data with an image generator; and training an image classification model by taking the image training data generated by the image generator as parameters to obtain a distilled target model. The new method steps provided in the embodiments of the present application are described by taking a distillation process based on the ResNet50 model trained by ImageNet as an example, but the practical application is not limited to the steps provided in the embodiments.
Step 1: Construct a diffusion model for ImageNet. The model input and output are defined as 224×224 images of any category defined by the ImageNet data set, where the input image is a noise image and the output image is the target generated image.
Step 2: Following the diffusion model training process described below, train the diffusion model defined in step 1 with the ImageNet data set as the training set; after training is finished, the high-quality image generator is obtained.
Step 3: Start generating image data with the image generator constructed on the basis of the diffusion model to prepare for model distillation: initialize a noise image, randomly draw a category from the ImageNet categories, and input both into the image generator, thereby generating a high-quality image sample similar to the original ImageNet images.
Step 4: Repeat step 3 to obtain a considerable number of generated images, e.g. 50000, constituting a high-quality generated data set.
Step 5: Construct the model distillation framework. Obtain the high-accuracy image classification model to be distilled, i.e. the Teacher model (a pre-trained image classification model trained on the ImageNet data set, such as a ResNet50 classification model, whose input is a 224×224 image and whose output is a class index). Initialize the distillation target model, i.e. the Student model (also a model for ImageNet image classification, whose input is a 224×224 image and whose output is a class index, with model weights initialized by random numbers). Construct the joint loss function of model distillation and initialize the model distillation parameters, such as the temperature T.
Step 6: Take the high-quality generated image data set constructed in step 4 as the training input of the model distillation framework initialized in step 5, and start the model distillation training (a minimal code sketch of steps 3-7 is given after step 8).
Step 7: Repeat step 6 until training finishes after the set number of iterations (e.g. epoch = 200), finally obtaining the distilled Student model.
Step 8: Verify the Student model obtained by distillation: obtain the Student model accuracy with the ImageNet verification set, and judge whether the accuracy drop is within an acceptable range (for example, within 3%).
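The following is a minimal sketch of steps 3-7 in PyTorch. The generator(noise, labels) interface, the optimizer settings and the batch sizes are assumptions made for illustration, not specifics of the application.

```python
import torch
import torch.nn.functional as F

def distill_with_generated_data(generator, teacher, student, num_classes=1000,
                                epochs=200, batches_per_epoch=100, batch_size=64,
                                device="cuda"):
    """Sketch of steps 3-7: images sampled from the diffusion-based generator
    replace the original data set during model distillation training.
    The generator(noise, labels) interface is an assumption for illustration."""
    teacher.eval().to(device)
    student.train().to(device)
    opt = torch.optim.SGD(student.parameters(), lr=0.01, momentum=0.9)
    for _ in range(epochs):
        for _ in range(batches_per_epoch):
            # Step 3: draw random categories and noise, generate image samples
            labels = torch.randint(0, num_classes, (batch_size,), device=device)
            noise = torch.randn(batch_size, 3, 224, 224, device=device)
            images = generator(noise, labels)
            # Step 6: distillation training on the generated data
            with torch.no_grad():
                t_logits = teacher(images)
            s_logits = student(images)
            loss = F.kl_div(F.log_softmax(s_logits, dim=1),
                            F.softmax(t_logits, dim=1),
                            reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student  # Step 7: distilled Student model after the set number of epochs
```

In practice the generated data set of step 4 would typically be produced once and cached rather than regenerated every batch; sampling on the fly is used here only to keep the sketch short.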
FIG. 2 is a general idea diagram of an image classification model distillation method based on a diffusion model according to an embodiment of the present application; as shown in fig. 2, the basic idea of the present application is to first construct and train a diffusion model based on an original data set as an image generation module, which is responsible for generating a high-quality image data set with the same size as the original training set. Then, in the training process of model distillation, the generated data is used for replacing an original data set, and the distillation of the model from Teacher (Teacher model) to Student (Student model) is realized, namely the size of the model is effectively compressed on the premise of keeping high precision.
Aiming at the problem of dependence on an original data set in the distillation training process of a traditional image classification model, the embodiment of the application provides a classification model distillation method using a diffusion model as a sample image generator. The diffusion model has more excellent image generation quality as a generation model at the leading edge of the industry, and the model distillation method combined with the technology can effectively improve the quality and efficiency of model distillation. On the basis of the embodiment, the diffusion model can be further analyzed and disassembled, and the diffusion model can be more efficiently integrated into a framework of model distillation.
The image data classification method provided by the embodiment of the application comprises the steps of constructing a diffusion model in advance, training an original data set to obtain an image generator, generating image data by adopting the image generator constructed based on the diffusion model, and constructing a model distillation framework to obtain an image classification model to be distilled; and after the model construction and the acquisition of parameters required by training are completed, training an image classification model by taking the image data generated by the image generator as parameters to obtain a distilled target model. After the target model is obtained, the image data to be classified is classified by adopting the target model, and the image classification effect is ensured. According to the scheme, firstly, a diffusion model is constructed and trained on the basis of an original data set to serve as an image generation module, and the module is responsible for generating a high-quality image data set with the same size as the original training set. Then, in the training process of model distillation, the generated data is used for replacing an original data set, so that the distillation of the model from the Teacher to the Student is realized, namely the size of the model is effectively compressed on the premise of keeping high precision, and the normal operation of the model distillation process is ensured. The diffusion model is applied to a model distillation mechanism to replace an original data set required in the traditional model distillation, so that the quality of the generated image is greatly improved, and the scale of the generated image required by the participation of the model distillation is further reduced on the premise of keeping the accuracy of the model distillation.
This embodiment explains the diffusion model, a novel model in the field of deep learning image generation. The diffusion model differs in structure from the generative adversarial network (GAN), the variational autoencoder (VAE) and the flow-based model, and effectively avoids problems of those three generative models such as unstable training, poor diversity and the need to design surrogate loss functions.
The diffusion model is inspired by non-equilibrium thermodynamics, firstly a Markov chain of diffusion steps is defined, random noise is gradually added to data, then the inverse diffusion process is learned, and required data samples are constructed from the noise. Unlike VAE or flow models, diffusion models are learned with a fixed program and hidden variables have high dimensionality (same as the original data).
The diffusion model first defines a forward process: a data point $x_0$ is sampled from the true data distribution $q(x)$, and as the step $t$ increases from $1$ to $T$, a small amount of Gaussian noise is added to the sample at each step, thereby generating a series of noisy samples $x_1, \ldots, x_T$. The diffusion strength of each step is controlled by the variance schedule $\{\beta_t \in (0,1)\}_{t=1}^{T}$, and the distribution of each step satisfies the following formula:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\big)$$

As the step $t$ gradually increases, the sample gradually loses its recognizable features; eventually, as $T$ approaches infinity, $x_T$ becomes an ordinary Gaussian distribution (random noise).
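As an illustration, the forward noising process can be implemented compactly using the standard closed-form property that $x_t$ can be sampled directly from $x_0$ via the cumulative products of the per-step coefficients. The schedule values and function names below are illustrative assumptions, not values specified by the application.

```python
import torch

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule beta_1..beta_T; the concrete values are illustrative."""
    betas = torch.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)  # cumulative products of (1 - beta_t)
    return betas, alpha_bars

def q_sample(x0, t, alpha_bars):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = torch.randn_like(x0)
    ab = alpha_bars.to(x0.device)[t].view(-1, 1, 1, 1)
    return ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise, noise
```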
With the forward diffusion process constructed, the reverse diffusion process goes from random noise back to real image data. The conditional probability distribution of each reverse step, $q(x_{t-1} \mid x_t)$, is generally difficult to estimate, since it would require the entire data set; we therefore need to learn a model $p_\theta$ to approximate these conditional probabilities in order to run the reverse diffusion process.
The joint probability distribution of $p_\theta$ from $T$ down to $0$ is:

$$p_\theta(x_{0:T}) = p(x_T)\prod_{t=1}^{T} p_\theta(x_{t-1}\mid x_t)$$
$$p_\theta(x_{t-1}\mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t,t),\ \Sigma_\theta(x_t,t)\big)$$

where $\mu_\theta$ and $\Sigma_\theta$ are the parameterized mean and variance. When additionally conditioned on $x_0$, the reverse conditional probability is readily available:

$$q(x_{t-1}\mid x_t, x_0) = \mathcal{N}\big(x_{t-1};\ \tilde{\mu}_t(x_t, x_0),\ \tilde{\beta}_t I\big)$$

which is obtained by derivation and simplification via the Bayes formula and the like:

$$\tilde{\mu}_t(x_t, x_0) = \frac{\sqrt{\bar{\alpha}_{t-1}}\,\beta_t}{1-\bar{\alpha}_t}\,x_0 + \frac{\sqrt{\alpha_t}\,(1-\bar{\alpha}_{t-1})}{1-\bar{\alpha}_t}\,x_t,\qquad \tilde{\beta}_t = \frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\,\beta_t$$

where $\alpha_t = 1-\beta_t$ and $\bar{\alpha}_t = \prod_{i=1}^{t}\alpha_i$.
To summarize, the inference process of the diffusion model is as follows: 1) at each time step, $x_t$ and $t$ are passed to the model to predict the Gaussian noise $z_\theta(x_t, t)$, from which the mean is obtained according to the formula above:

$$\mu_\theta(x_t, t) = \frac{1}{\sqrt{\alpha_t}}\Big(x_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,z_\theta(x_t, t)\Big)$$

2) the approximate variance $\Sigma_\theta(x_t, t)$ is obtained; 3) $p_\theta(x_{t-1}\mid x_t)$ is obtained from the formulas above, and $x_{t-1}$ is then sampled from it.
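A minimal sampling-loop sketch of this inference procedure follows. It assumes a noise_model(x, t) that predicts $z_\theta(x_t, t)$ and, for simplicity, uses the fixed variance $\beta_t$ instead of a learned $\Sigma_\theta$; both choices are assumptions for illustration.

```python
import torch

@torch.no_grad()
def p_sample_loop(noise_model, shape, betas, alpha_bars, device="cuda"):
    """Reverse diffusion following steps 1)-3): start from x_T ~ N(0, I) and
    iteratively denoise down to x_0. Uses the fixed variance beta_t for simplicity."""
    betas = betas.to(device)
    alpha_bars = alpha_bars.to(device)
    alphas = 1.0 - betas
    x = torch.randn(shape, device=device)  # x_T: pure Gaussian noise
    for t in reversed(range(len(betas))):
        t_batch = torch.full((shape[0],), t, device=device, dtype=torch.long)
        z = noise_model(x, t_batch)                             # predicted noise z_theta(x_t, t)
        mean = (x - betas[t] / (1.0 - alpha_bars[t]).sqrt() * z) / alphas[t].sqrt()
        if t > 0:
            x = mean + betas[t].sqrt() * torch.randn_like(x)    # sample x_{t-1}
        else:
            x = mean                                            # final x_0
    return x
```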
The training process of the diffusion model comprises the following steps:
the training process seeks to obtain a μ that represents a training data set θ (x t T) and ∑ θ (x t T). For this purpose, a reasonable loss function needs to be designed for iterative optimization. And (3) obtaining a target loss function by integrating the cross entropy and the accumulation of a plurality of KL divergence and by approximation and derivation of a variational lower limit method and a Jensen inequality:
Figure BDA0003938966650000087
L T =D KL (q(x T |x 0 )||p θ (x T ))
L t =D KL (q(x t |x t-1 ,x 0 )||p θ (x t |x t+1 ));1≤t≤T-1
L 0 =-log p θ (x 0 |x 1 ). (8)
wherein L is T Can be ignored as a constant, and L t The derivation can be finally simplified as:
Figure BDA0003938966650000088
based on the above loss function, the training process of the diffusion model can be regarded as: for each training sample picture x 0 1) after obtaining the input, randomly sampling a T from 1 … T; 2) Sampling a noise from a Gaussian distribution
Figure BDA0003938966650000089
3) And (3) minimizing:
Figure BDA00039389666500000810
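A minimal training-step sketch of this simplified objective follows; the noise_model(x, t) interface and the alpha_bars schedule (as produced by the schedule sketch above) are assumptions for illustration, not the application's exact training code.

```python
import torch
import torch.nn.functional as F

def diffusion_train_step(noise_model, x0, alpha_bars, optimizer):
    """One training step of the simplified loss: 1) sample t, 2) sample noise z and
    form x_t in closed form, 3) regress the predicted noise onto z with an MSE loss."""
    T = alpha_bars.shape[0]
    t = torch.randint(0, T, (x0.shape[0],), device=x0.device)
    z = torch.randn_like(x0)
    ab = alpha_bars.to(x0.device)[t].view(-1, 1, 1, 1)
    xt = ab.sqrt() * x0 + (1.0 - ab).sqrt() * z          # x_t from x_0 in closed form
    loss = F.mse_loss(noise_model(xt, t), z)             # || z - z_theta(x_t, t) ||^2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```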
by constructing and training the diffusion model through the process, a high-quality image generation model can be obtained.
The above embodiment mentions that training is completed according to a set number of iterations to finally obtain the distilled target model. That is, training the image classification model with the image data generated by the image generator as training input to obtain the distilled target model includes: finishing training of the image classification model according to a preset number of iterations to obtain the distilled target model. The specific number of iterations is not limited; it may be set to 200 and may be adjusted in practical application according to the distillation model and other parameters.
In addition, after model distillation is completed, it must be confirmed whether the target model meets the requirements, so the target model needs to be verified; that is, after the image classification model is trained with the image data generated by the image generator as training input to obtain the distilled target model, the distilled target model is verified. The specific way of verifying the distilled target model is not limited: the accuracy of the target model can be obtained, and it is then judged whether the accuracy drop of the target model is within an acceptable range; if so, the target model is judged to have been distilled successfully. Specifically, the pre-constructed diffusion model may be trained with the ImageNet data set as the training set to obtain the image generator, and the accuracy of the target model may then be obtained with the ImageNet verification set to determine whether the accuracy drop is within an acceptable range (e.g., within 3%).
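As an illustration of this verification step, a small sketch follows that compares Student and Teacher top-1 accuracy on a validation loader (e.g. the ImageNet verification set); the 3% threshold mirrors the example above, and the helper name is an assumption.

```python
import torch

@torch.no_grad()
def distillation_verified(student, teacher, val_loader, max_drop=0.03, device="cuda"):
    """Return True if the Student's top-1 accuracy drop relative to the Teacher
    on the validation set is within the acceptable range (e.g. 3%)."""
    def top1(model):
        model.eval().to(device)
        correct, total = 0, 0
        for images, labels in val_loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.numel()
        return correct / total
    return (top1(teacher) - top1(student)) <= max_drop
```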
For the problem that "the original data set must participate" in model distillation, there are two lines of thought. The first is to start only from the pre-trained model and attempt to generate high-quality samples directly by inversion, which then serve as a substitute for the original data set to complete the model distillation process. The advantage of this approach is that it can dispense with the original data set entirely: starting only from the pre-trained model, it can generate data samples and then complete model distillation. But the disadvantages are also obvious: 1. the generated data samples are essentially adversarial samples, their fidelity to the original data set is poor, and the quality of the generated sample images is low; 2. DeepInversion, one of the more effective methods of this type, places stricter requirements on the pre-trained model, which must be a convolutional neural network with Batch Normalization (BN) layers; 3. several times more generated samples than the original data set are required to participate in model distillation training. The second line of thought combines a generative model trained on the original data set with model distillation. Represented by GAN + KD, the basic idea of such methods is to generate a sample data set with a generative model (GAN) trained on the original data set, and then use the generated data set in the training of model distillation. The advantage of this type of method is that the generation quality is much better than in the first approach, and relatively few generated samples are needed in the training of model distillation. But it also has disadvantages: the nature of the GAN method itself limits the diversity of generation, and its performance on large-size images is not ideal. Building on this line of thought, the new method proposed in this application effectively solves this problem.
The scheme provided by the application applies the diffusion model to the mechanism of model distillation, replacing both the original data set required in traditional model distillation and the GAN generator used by current methods. Compared with GAN-based model distillation, the image data classification method provided by the application overcomes the diversity and related problems of GAN image generators, greatly improves the quality of the generated images, and further reduces the number of generated images required for model distillation while preserving distillation accuracy.
In the above embodiments, the method for classifying image data is described in detail, and the present application also provides embodiments corresponding to the image data classification apparatus. It should be noted that the present application describes the embodiments of the apparatus portion from two perspectives, one from the perspective of the function module and the other from the perspective of the hardware.
Based on the angle of the functional module, the present embodiment provides an apparatus for classifying image data, the apparatus including:
the acquisition module is used for acquiring image data to be classified;
the calling module is used for calling the distilled target model; wherein the obtaining of the target model comprises: constructing a diffusion model in advance, and training through an original data set to obtain an image generator; generating image training data with an image generator; training an image classification model by taking image training data generated by an image generator as parameters to obtain a distilled target model;
and the classification module is used for carrying out image classification on the image data to be classified by adopting the distilled target model.
Since the embodiments of the apparatus portion and the method portion correspond to each other, please refer to the description of the embodiments of the method portion for the embodiments of the apparatus portion, which is not repeated here.
As a preferred embodiment, the image data classification device further includes: a verification module, configured to verify the distilled target model after the image classification model has been trained with the image data generated by the image generator as training input to obtain the distilled target model.
The image data classification device provided in this embodiment pre-constructs a diffusion model, trains an image generator using an original data set, generates image data using the image generator constructed based on the diffusion model, and constructs a model distillation framework to obtain an image classification model to be distilled; and after the model construction and the acquisition of parameters required by training are completed, training an image classification model by taking the image data generated by the image generator as parameters to obtain a distilled target model. After the target model is obtained, the image data to be classified is classified by adopting the target model, and the image classification effect is ensured. According to the scheme, firstly, a diffusion model is constructed and trained on the basis of an original data set to serve as an image generation module, and the module is responsible for generating a high-quality image data set with the same size as the original training set. Then, in the training process of model distillation, the generated data is used for replacing an original data set, so that the distillation of the model from the Teacher to the Student is realized, namely the size of the model is effectively compressed on the premise of keeping high precision, and the normal operation of the model distillation process is ensured. The diffusion model is applied to a model distillation mechanism to replace an original data set required in the traditional model distillation, so that the quality of the generated image is greatly improved, and the scale of the generated image required by the participation of the model distillation is further reduced on the premise of keeping the model distillation accuracy.
Based on the hardware perspective, the present embodiment provides another image data classification apparatus, and fig. 3 is a structural diagram of the image data classification apparatus according to another embodiment of the present application, as shown in fig. 3, the image data classification apparatus includes: a memory 20 for storing a computer program;
a processor 21 for implementing the steps of the classification method of image data as mentioned in the above embodiments when executing the computer program.
The processor 21 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like. The Processor 21 may be implemented in hardware using at least one of a Digital Signal Processor (DSP), a Field-Programmable Gate Array (FPGA), and a Programmable Logic Array (PLA). The processor 21 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 21 may be integrated with a Graphics Processing Unit (GPU) which is responsible for rendering and drawing the content required to be displayed by the display screen. In some embodiments, the processor 21 may further include an Artificial Intelligence (AI) processor for processing computational operations related to machine learning.
The memory 20 may include one or more computer-readable storage media, which may be non-transitory. Memory 20 may also include high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In this embodiment, the memory 20 is at least used for storing the following computer program 201, wherein after being loaded and executed by the processor 21, the computer program can implement the relevant steps of the classification method of image data disclosed in any one of the foregoing embodiments. In addition, the resources stored in the memory 20 may also include an operating system 202, data 203, and the like, and the storage manner may be a transient storage manner or a permanent storage manner. Operating system 202 may include, among others, windows, unix, linux, and the like. The data 203 may include, but is not limited to, data related to a classification method of image data, and the like.
In some embodiments, the image data classifying device may further include a display 22, an input/output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.
It will be appreciated by those skilled in the art that the configurations shown in the figures do not constitute a limitation of the classification means of the image data and may comprise more or less components than those shown.
The image data classification device provided by the embodiment of the application comprises a memory and a processor, wherein when the processor executes a program stored in the memory, the following method can be realized: a method for classifying image data.
The image data classification device provided in this embodiment pre-constructs a diffusion model, trains an image generator using an original data set, generates image data using the image generator constructed based on the diffusion model, and constructs a model distillation framework to obtain an image classification model to be distilled; and after the model construction and the acquisition of parameters required by training are completed, training an image classification model by taking the image data generated by the image generator as parameters to obtain a distilled target model. After the target model is obtained, the image data to be classified is classified by adopting the target model, and the image classification effect is ensured. According to the scheme, firstly, a diffusion model is constructed and trained on the basis of an original data set to serve as an image generation module, and the module is responsible for generating a high-quality image data set with the same size as the original training set. Then, in the training process of model distillation, the generated data is used for replacing an original data set, so that the distillation of the model from the Teacher to the Student is realized, namely the size of the model is effectively compressed on the premise of keeping high precision, and the normal operation of the model distillation process is ensured. The diffusion model is applied to a model distillation mechanism to replace an original data set required in the traditional model distillation, so that the quality of the generated image is greatly improved, and the scale of the generated image required by the participation of the model distillation is further reduced on the premise of keeping the model distillation accuracy.
Finally, the application also provides a corresponding embodiment of the computer readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when being executed by a processor, carries out the steps as set forth in the above-mentioned method embodiments.
It is understood that, if the method in the above embodiments is implemented in the form of software functional units and sold or used as a stand-alone product, it can be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially or partially implemented in the form of a software product, which is stored in a storage medium and performs all or part of the steps of the methods described in the embodiments of the present application, or all or part of the technical solution. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The computer-readable storage medium provided by the embodiment corresponds to the method, and therefore has the same beneficial effects as the method.
The method, apparatus, and medium for classifying image data provided by the present application are described in detail above. The embodiments are described in a progressive mode in the specification, the emphasis of each embodiment is on the difference from the other embodiments, and the same and similar parts among the embodiments can be referred to each other. The device disclosed in the embodiment corresponds to the method disclosed in the embodiment, so that the description is simple, and the relevant points can be referred to the description of the method part. It should be noted that, for those skilled in the art, it is possible to make several improvements and modifications to the present application without departing from the principle of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.
It should also be noted that, in this specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the same element.

Claims (10)

1. A method of classifying image data, comprising:
acquiring image data to be classified;
calling a distilled target model; wherein the obtaining of the target model comprises: constructing a diffusion model in advance, and training through an original data set to obtain an image generator; generating image training data with the image generator; training an image classification model by taking the image training data generated by the image generator as parameters to obtain the distilled target model;
and carrying out image classification on the image data to be classified by adopting the distilled target model.
2. The method for classifying image data according to claim 1, wherein the training of an image classification model using the image training data generated by the image generator as a parameter to obtain the distilled target model comprises:
and finishing training on the image classification model according to a preset iteration number to obtain the distilled target model.
3. The method for classifying image data according to claim 1, wherein after training an image classification model using the image training data generated by the image generator as a parameter to obtain the distilled target model, the method further comprises:
and verifying the target model obtained by distillation.
4. The method for classifying image data according to claim 3, wherein the verifying the distilled object model includes:
acquiring the accuracy of the target model;
judging whether the accuracy rate reduction range of the target model is within an acceptable range;
and if so, judging that the target model is distilled successfully.
5. The method for classifying image data according to claim 4, wherein training the pre-constructed diffusion model through the original data set to obtain the image generator comprises:
training the pre-constructed diffusion model with ImageNet data set as training set to obtain the image generator.
6. The method for classifying image data according to claim 5, wherein the obtaining the accuracy of the target model comprises:
and acquiring the accuracy of the target model by using an ImageNet verification set.
7. The method according to claim 1, wherein the weight of the target model is initialized with a random number.
8. An apparatus for classifying image data, comprising:
the acquisition module is used for acquiring image data to be classified;
the calling module is used for calling the distilled target model; wherein the obtaining of the target model comprises: constructing a diffusion model in advance, and training through an original data set to obtain an image generator; generating image training data with the image generator; training an image classification model by taking the image training data generated by the image generator as parameters to obtain the distilled target model;
and the classification module is used for carrying out image classification on the image data to be classified by adopting the distilled target model.
9. An apparatus for classifying image data, comprising a memory for storing a computer program;
a processor for implementing the steps of the method of classifying image data according to any one of claims 1 to 7 when executing said computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, carries out the steps of the method of classification of image data according to one of claims 1 to 7.
CN202211411839.5A 2022-11-11 2022-11-11 Image data classification method, device and medium Pending CN115620074A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211411839.5A CN115620074A (en) 2022-11-11 2022-11-11 Image data classification method, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211411839.5A CN115620074A (en) 2022-11-11 2022-11-11 Image data classification method, device and medium

Publications (1)

Publication Number Publication Date
CN115620074A true CN115620074A (en) 2023-01-17

Family

ID=84878972

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211411839.5A Pending CN115620074A (en) 2022-11-11 2022-11-11 Image data classification method, device and medium

Country Status (1)

Country Link
CN (1) CN115620074A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116542321A (en) * 2023-07-06 2023-08-04 中科南京人工智能创新研究院 Image generation model compression and acceleration method and system based on diffusion model
CN116542321B (en) * 2023-07-06 2023-09-01 中科南京人工智能创新研究院 Image generation model compression and acceleration method and system based on diffusion model

Similar Documents

Publication Publication Date Title
US20220014807A1 (en) Method, apparatus, device and medium for generating captioning information of multimedia data
KR102593440B1 (en) Method, apparatus, devices and media for generating captioning information of multimedia data
CN109871781B (en) Dynamic gesture recognition method and system based on multi-mode 3D convolutional neural network
Liu et al. Learning converged propagations with deep prior ensemble for image enhancement
Kolesnikov et al. PixelCNN models with auxiliary variables for natural image modeling
Tieleman Optimizing neural networks that generate images
CN108021908B (en) Face age group identification method and device, computer device and readable storage medium
US20220156987A1 (en) Adaptive convolutions in neural networks
US20220101121A1 (en) Latent-variable generative model with a noise contrastive prior
CN116363261A (en) Training method of image editing model, image editing method and device
CN116564338B (en) Voice animation generation method, device, electronic equipment and medium
CN111563161B (en) Statement identification method, statement identification device and intelligent equipment
CN115620074A (en) Image data classification method, device and medium
Zablotskaia et al. Unsupervised video decomposition using spatio-temporal iterative inference
CN110633735B (en) Progressive depth convolution network image identification method and device based on wavelet transformation
CN116452706A (en) Image generation method and device for presentation file
Wistuba Bayesian optimization combined with incremental evaluation for neural network architecture optimization
Rosenbaum et al. The return of the gating network: Combining generative models and discriminative training in natural image priors
US20220101145A1 (en) Training energy-based variational autoencoders
CN114565080A (en) Neural network compression method and device, computer readable medium and electronic equipment
CN114004974A (en) Method and device for optimizing images shot in low-light environment
CN110659561A (en) Optimization method and device of internet riot and terrorist video identification model
US20240169500A1 (en) Image and object inpainting with diffusion models
CN112685558B (en) Training method and device for emotion classification model
CN113344181B (en) Neural network structure searching method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination