CN116385328A - Image data enhancement method and device based on noise addition to image - Google Patents


Info

Publication number
CN116385328A
Authority
CN
China
Prior art keywords
image
noise
diffusion
target image
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310364619.XA
Other languages
Chinese (zh)
Inventor
暴宇健
汪骞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Longzhi Digital Technology Service Co Ltd
Original Assignee
Beijing Longzhi Digital Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Longzhi Digital Technology Service Co Ltd
Priority to CN202310364619.XA
Publication of CN116385328A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The disclosure relates to the technical field of image processing and provides an image data enhancement method and device based on adding noise to an image. The method comprises: acquiring an image dataset to be enhanced; using the diffusion process of an image diffusion model to add noise to a target image in the image dataset multiple times in succession, obtaining a first noise image corresponding to the target image; using the inverse diffusion process of the image diffusion model to predict the noises added during the diffusion process and removing the predicted noises from the first noise image one after another, obtaining a restored first denoised image corresponding to the target image; and generating a data-enhanced image dataset from the target image and the first denoised image. These technical means address the prior-art problem that images produced by conventional data enhancement methods lack realism.

Description

Image data enhancement method and device based on noise addition to image
Technical Field
The disclosure relates to the technical field of image processing, and in particular relates to an image data enhancement method and device based on noise addition to an image.
Background
In the field of computer vision, image data enhancement (also called data augmentation) is commonly used to enrich training datasets and improve the generalization capability of models. Existing image data enhancement methods typically generate new image data by applying a series of affine transformations to the original image. Common transformations include random rotation, flipping, and cropping. For example, an existing method randomly selects a region of an original image and crops it, randomly rotates, slightly stretches, or flips the cropped image, and adds the transformed image to the training dataset. The main disadvantage of this approach is a lack of realism: random transformations do not accurately reproduce how images change in practice and do not effectively simulate real visual and environmental changes, such as changes in lighting or viewing angle. (Here, an image that lacks realism is one that does not correspond to how images vary in the real application scenario for which the data is being enhanced.)
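For illustration only, the conventional approach described above can be sketched in a few lines of NumPy. The function name `classical_augment`, the crop size, and the choice of 90-degree rotations are assumptions made for this sketch and do not come from the disclosure.

```python
import numpy as np

def classical_augment(image, crop_size, rng):
    """Random crop, then a random 90-degree rotation and an optional horizontal flip."""
    h, w = image.shape[:2]
    ch, cw = crop_size
    # Pick the top-left corner of the crop uniformly at random.
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    patch = image[top:top + ch, left:left + cw]
    # Rotate by 0, 90, 180 or 270 degrees in the image plane.
    patch = np.rot90(patch, k=int(rng.integers(0, 4)))
    # Flip horizontally with probability 0.5.
    if rng.random() < 0.5:
        patch = patch[:, ::-1]
    return patch

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))          # stand-in for a real photograph
augmented = classical_augment(image, (24, 24), rng)
```

Every output pixel is copied from the original image, which is why such transformations cannot introduce changes in lighting or viewing angle.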
In the process of implementing the disclosed concept, the inventors found that the related art has at least the following technical problem: images obtained by conventional data enhancement methods lack realism.
Disclosure of Invention
In view of the above, embodiments of the present disclosure provide an image data enhancement method, an apparatus, an electronic device, and a computer-readable storage medium based on adding noise to an image, so as to solve the prior-art problem that images obtained by conventional data enhancement methods lack realism.
In a first aspect of the embodiments of the present disclosure, there is provided an image data enhancement method based on adding noise to an image, including: acquiring an image data set to be data enhanced; continuously adding noise for a target image in an image data set for multiple times by utilizing a diffusion process of an image diffusion model to obtain a first noise image corresponding to the target image; predicting a plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted plurality of noises in the first noise image to obtain a restored first denoising image corresponding to the target image; a data-enhanced image dataset is generated using the target image and the first denoised image.
In a second aspect of the embodiments of the present disclosure, there is provided an image data enhancement apparatus based on adding noise to an image, including: an acquisition module configured to acquire an image dataset to be data enhanced; the diffusion module is configured to continuously add noise to the target image in the image data set for multiple times by utilizing the diffusion process of the image diffusion model to obtain a first noise image corresponding to the target image; the inverse diffusion module is configured to predict a plurality of noises added in the diffusion process by utilizing the inverse diffusion process of the image diffusion model, and sequentially remove the predicted plurality of noises in the first noise image to obtain a restored first denoising image corresponding to the target image; an enhancement module configured to generate a data-enhanced image dataset using the target image and the first denoised image.
In a third aspect of the disclosed embodiments, an electronic device is provided, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
In a fourth aspect of the disclosed embodiments, a computer-readable storage medium is provided, which stores a computer program which, when executed by a processor, implements the steps of the above-described method.
Compared with the prior art, the embodiments of the present disclosure have the following beneficial effects: an image dataset to be enhanced is acquired; noise is added to a target image in the image dataset multiple times in succession using the diffusion process of an image diffusion model, obtaining a first noise image corresponding to the target image; the noises added in the diffusion process are predicted using the inverse diffusion process of the image diffusion model and removed one after another from the first noise image, obtaining a restored first denoised image corresponding to the target image; and a data-enhanced image dataset is generated from the target image and the first denoised image. These technical means address the prior-art problem that images obtained by conventional data enhancement methods lack realism, so that the images obtained by data enhancement are consistent with how images vary in the actual application scenario.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are required for the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings may be obtained according to these drawings without inventive effort for a person of ordinary skill in the art.
Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure;
Fig. 2 is a flow chart of an image data enhancement method based on adding noise to an image provided by an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of an image data enhancement device based on adding noise to an image according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the disclosed embodiments. However, it will be apparent to one skilled in the art that the present disclosure may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.
An image data enhancing method and apparatus based on adding noise to an image according to embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure. The application scenario may include terminal devices 101, 102, and 103, a server 104, and a network 105.
The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, and 103 are hardware, they may be various electronic devices having a display screen and supporting communication with the server 104, including but not limited to smartphones, tablets, laptop and desktop computers, etc.; when the terminal devices 101, 102, and 103 are software, they may be installed in the electronic device as above. Terminal devices 101, 102, and 103 may be implemented as multiple software or software modules, or as a single software or software module, as embodiments of the present disclosure are not limited in this regard. Further, various applications, such as a data processing application, an instant messaging tool, social platform software, a search class application, a shopping class application, and the like, may be installed on the terminal devices 101, 102, and 103.
The server 104 may be a server that provides various services, for example, a background server that receives a request transmitted from a terminal device with which communication connection is established, and the background server may perform processing such as receiving and analyzing the request transmitted from the terminal device and generate a processing result. The server 104 may be a server, a server cluster formed by a plurality of servers, or a cloud computing service center, which is not limited in the embodiments of the present disclosure.
The server 104 may be hardware or software. When the server 104 is hardware, it may be various electronic devices that provide various services to the terminal devices 101, 102, and 103. When the server 104 is software, it may be a plurality of software or software modules providing various services to the terminal devices 101, 102, and 103, or may be a single software or software module providing various services to the terminal devices 101, 102, and 103, which is not limited by the embodiments of the present disclosure.
The network 105 may be a wired network using coaxial cable, twisted pair wire, and optical fiber connection, or may be a wireless network that can implement interconnection of various communication devices without wiring, for example, bluetooth (Bluetooth), near field communication (Near Field Communication, NFC), infrared (Infrared), etc., which are not limited by the embodiments of the present disclosure.
The user can establish a communication connection with the server 104 via the network 105 through the terminal devices 101, 102, and 103 to receive or transmit information or the like. It should be noted that the specific types, numbers and combinations of the terminal devices 101, 102 and 103, the server 104 and the network 105 may be adjusted according to the actual requirements of the application scenario, which is not limited by the embodiment of the present disclosure.
Fig. 2 is a flow chart of an image data enhancement method based on adding noise to an image according to an embodiment of the present disclosure. The method of Fig. 2 may be performed by the computer or server of Fig. 1, or by software on the computer or server. As shown in Fig. 2, the image data enhancement method based on adding noise to an image includes:
S201, acquiring an image dataset to be data enhanced;
S202, continuously adding noise to a target image in the image dataset multiple times by using the diffusion process of an image diffusion model to obtain a first noise image corresponding to the target image;
S203, predicting the plurality of noises added in the diffusion process by using the back diffusion process of the image diffusion model, and sequentially removing the predicted noises from the first noise image to obtain a restored first denoised image corresponding to the target image;
S204, generating a data-enhanced image dataset by using the target image and the first denoised image.
The image diffusion model has two processes, namely a diffusion process and a back diffusion process. The diffusion process is to continuously add noise for a target image in the image data set for a plurality of times (the added noise is obtained through calculation), so as to obtain a first noise image corresponding to the target image; the back diffusion process predicts a plurality of noises added in the diffusion process, and sequentially removes the predicted plurality of noises from the first noise image to obtain a restored first denoising image.
The image dataset comprises a plurality of target images; for ease of understanding, a single target image can be considered. The first noise image corresponding to the target image is obtained through the diffusion process, and the restored first denoised image corresponding to the target image is obtained through the back diffusion process. A first denoised image is obtained in this way for each target image, and the data-enhanced image dataset is then formed from all the target images and their corresponding first denoised images.
Structurally, the image diffusion model is mainly a denoising model. It may be a U-Net composed of a plurality of convolution layers and deconvolution layers; the input and output of the U-Net have the same shape, and the network is used to predict the noise of each step. The embodiments of the present disclosure perform image data enhancement using a trained image diffusion model.
According to the technical scheme provided by the embodiment of the disclosure, an image dataset to be enhanced by data is obtained; continuously adding noise for a target image in an image data set for multiple times by utilizing a diffusion process of an image diffusion model to obtain a first noise image corresponding to the target image; predicting a plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted plurality of noises in the first noise image to obtain a restored first denoising image corresponding to the target image; the target image and the first denoising image are utilized to generate the image dataset with enhanced data, so that the problem that an image obtained by a traditional data enhancement method in the prior art lacks reality can be solved by adopting the technical means, and the image obtained by data enhancement accords with the change of the image in an actual application scene.
Continuously adding noise to a target image in the image dataset multiple times by using the diffusion process of the image diffusion model, to obtain the first noise image corresponding to the target image, comprises calculating the target image after each noise addition as follows: based on the target image after the previous noise addition and the base noise obtained by the previous sampling, the target image after the current noise addition is calculated according to a noise calculation formula, where the base noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, and Gaussian noise is noise that follows a Gaussian distribution.
The noise calculation formula (formula image BDA0004166997250000061 in the original publication; rendered here from the symbol definitions that follow, assuming the standard diffusion-model noising step) is:
$$x_t = \sqrt{1-\beta_t}\,x_{t-1} + \sqrt{\beta_t}\,\epsilon_{t-1}$$
where $x_t$ is the target image after the $t$-th noise addition, $x_{t-1}$ is the target image after the $(t-1)$-th noise addition, $\beta_t$ is a constant with a value between 0 and 1, and $\epsilon_{t-1}$ is the base noise obtained by the $(t-1)$-th sampling. When $t$ equals 1, $x_{t-1} = x_0$ is the target image itself; when the number of noise additions is $N$, $x_N$ is the target image after the $N$-th noise addition, namely the first noise image corresponding to the target image.
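As a minimal sketch of the diffusion process just described, the loop below applies the noising step N times. It assumes the step has the standard form x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps_{t-1}, which matches the symbols defined above but is not reproduced in this text; the function name `diffuse`, the linear beta schedule, and the image size are illustrative assumptions.

```python
import numpy as np

def diffuse(x0, betas, rng):
    """Apply len(betas) noising steps to x0; return x_N and the sampled base noises."""
    x = np.asarray(x0, dtype=np.float64)
    noises = []
    for beta in betas:
        # Base noise sampled from a standard Gaussian distribution.
        eps = rng.standard_normal(x.shape)
        # Assumed noising step: scale the previous image and mix in the noise.
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps
        noises.append(eps)
    return x, noises

rng = np.random.default_rng(0)
x0 = rng.random((8, 8))                  # stand-in target image
betas = np.linspace(1e-4, 0.02, 50)      # assumed schedule, each value in (0, 1)
x_n, noises = diffuse(x0, betas, rng)    # x_n plays the role of the first noise image
```

The list `noises` holds one base noise per step, which is what a trained model would later be asked to predict.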
The back diffusion process is the reverse process of the diffusion process, the trained image diffusion model predicts a plurality of noises added in the diffusion process in the back diffusion process, then the predicted plurality of noises are sequentially removed from the first noise image, and finally the restored first denoising image corresponding to the target image is obtained. The image diffusion model learns and stores the corresponding relation between the noise added in the diffusion process and the noise predicted in the inverse diffusion process through training.
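The sequential removal of predicted noise can be sketched by algebraically inverting each noising step. This is an illustration under assumptions, not the disclosed implementation: it assumes the noising step x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps_{t-1}, and it replaces the trained model with a hypothetical `predict_noise(x, t)` callable. Here an "oracle" predictor that returns the true noise is used to show that perfect predictions restore the target image exactly.

```python
import numpy as np

def diffuse(x0, betas, rng):
    """Forward process under the assumed noising step; returns x_N and the noises."""
    x = np.asarray(x0, dtype=np.float64)
    noises = []
    for beta in betas:
        eps = rng.standard_normal(x.shape)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps
        noises.append(eps)
    return x, noises

def back_diffuse(x_n, betas, predict_noise):
    """Remove the predicted noises one by one, from the last step to the first."""
    x = x_n
    for t in reversed(range(len(betas))):
        eps_hat = predict_noise(x, t)    # hypothetical stand-in for the trained model
        # Invert the assumed noising step for index t.
        x = (x - np.sqrt(betas[t]) * eps_hat) / np.sqrt(1.0 - betas[t])
    return x

rng = np.random.default_rng(0)
x0 = rng.random((8, 8))
betas = np.linspace(1e-4, 0.02, 20)
x_n, noises = diffuse(x0, betas, rng)

# Oracle predictor: returns exactly the noise that was added at step t.
restored = back_diffuse(x_n, betas, lambda x, t: noises[t])
```

A trained model only approximates the added noise, so in practice the restored image is close to, but not identical to, the target image.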
In an alternative embodiment, the number of first denoised images corresponding to the target image is controlled by controlling the number of times the target image is processed with the image diffusion model, which in turn controls the size of the data-enhanced image dataset.
Each time the target image is processed with the image diffusion model, one first denoised image corresponding to the target image is obtained; processing the target image many times therefore yields a large number of first denoised images for that target image. Because the sampled base noise is random, the first denoised image obtained differs from one run to the next, and this is what allows the image diffusion model to be used for image data enhancement.
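The relationship between the number of processing passes and the size of the enhanced dataset can be sketched as follows. The function `toy_process` is a hypothetical stand-in for one pass through a trained image diffusion model: it assumes the noising step x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps_{t-1} and imitates an imperfect noise prediction by recovering only a fixed fraction of each noise. The names and the values of `betas` and `recovered` are illustrative assumptions.

```python
import numpy as np

def toy_process(target, rng, betas=(0.01, 0.02, 0.03), recovered=0.9):
    """One diffusion plus back-diffusion pass with an imperfect noise prediction."""
    x = np.asarray(target, dtype=np.float64)
    noises = []
    for beta in betas:
        eps = rng.standard_normal(x.shape)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps
        noises.append(eps)
    for beta, eps in zip(reversed(betas), reversed(noises)):
        # The stand-in "model" predicts only a fraction of the true noise.
        x = (x - np.sqrt(beta) * recovered * eps) / np.sqrt(1.0 - beta)
    return x

def enhance_dataset(targets, passes_per_image, rng):
    """Keep every target image and add passes_per_image denoised variants of each."""
    enhanced = list(targets)
    for target in targets:
        for _ in range(passes_per_image):
            enhanced.append(toy_process(target, rng))
    return enhanced

rng = np.random.default_rng(0)
targets = [rng.random((8, 8)) for _ in range(3)]
dataset = enhance_dataset(targets, passes_per_image=4, rng=rng)
```

With 3 target images and 4 passes each, the enhanced dataset holds 3 * (1 + 4) = 15 images, and the variants of one target differ because each pass samples fresh noise.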
Before continuously adding noise to the target image in the image dataset multiple times by using the diffusion process of the image diffusion model to obtain the first noise image corresponding to the target image, the method further comprises: acquiring a training dataset; continuously adding noise to a training image in the training dataset multiple times by using the diffusion process of the image diffusion model to obtain a second noise image corresponding to the training image; predicting the plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted noises from the second noise image to obtain a restored second denoised image corresponding to the training image; calculating a total loss between the plurality of noises added during diffusion and the plurality of noises predicted during inverse diffusion; and updating the model parameters of the image diffusion model according to the total loss to complete training of the image diffusion model.
The training image plays the same role as the target image, and the second noise image plays the same role as the first noise image; the different names serve only to distinguish model training from model use.
The loss between the noises added during diffusion and the noises predicted during inverse diffusion may be calculated as the similarity between each pair of corresponding noises (for example, the first added noise is paired with the prediction of the first added noise, and the second added noise with the prediction of the second added noise).
Continuously adding noise to the training image in the training dataset multiple times by using the diffusion process of the image diffusion model, to obtain the second noise image corresponding to the training image, comprises calculating the training image after each noise addition as follows: based on the training image after the previous noise addition and the base noise obtained by the previous sampling, the training image after the current noise addition is calculated according to the noise calculation formula, where the base noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, Gaussian noise is noise that follows a Gaussian distribution, and the training image after all of the noise additions is the second noise image corresponding to the training image.
Calculating a total loss between the plurality of noise added during diffusion and the plurality of noise predicted during back diffusion, comprising: calculating a loss between each noise added in the diffusion process and the noise predicted in the inverse diffusion process corresponding to the noise; the sum of all losses calculated is taken as the total loss.
Optionally, the loss function may be the mean squared error.
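The total loss described above, taken with the optional mean squared error, can be sketched as a sum of per-step losses between each added noise and its prediction. The function name `total_loss` and the toy noise arrays are assumptions for illustration; a real training loop would obtain the predictions from the image diffusion model and use the loss to update its parameters.

```python
import numpy as np

def total_loss(added_noises, predicted_noises):
    """Sum of the mean squared errors between corresponding added and predicted noises."""
    if len(added_noises) != len(predicted_noises):
        raise ValueError("each added noise needs exactly one predicted noise")
    return sum(
        float(np.mean((added - predicted) ** 2))
        for added, predicted in zip(added_noises, predicted_noises)
    )

rng = np.random.default_rng(0)
added = [rng.standard_normal((4, 4)) for _ in range(3)]   # noises from 3 diffusion steps
perfect = [a.copy() for a in added]                       # ideal predictions
offset = [a + 0.5 for a in added]                         # every prediction off by 0.5

loss_perfect = total_loss(added, perfect)
loss_offset = total_loss(added, offset)
```

Perfect predictions give a total loss of zero, while a constant error of 0.5 at each of the three steps contributes 0.25 per step.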
Optionally, image data enhancement is performed by:
acquiring a training dataset; continuously adding noise to a training image in the training dataset multiple times by using the diffusion process of the image diffusion model to obtain a second noise image corresponding to the training image; predicting the plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted noises from the second noise image to obtain a restored second denoised image corresponding to the training image; calculating a total loss between the plurality of noises added during diffusion and the plurality of noises predicted during inverse diffusion; updating the model parameters of the image diffusion model according to the total loss to complete training of the image diffusion model; acquiring an image dataset to be data enhanced; continuously adding noise to a target image in the image dataset multiple times by using the diffusion process of the image diffusion model to obtain a first noise image corresponding to the target image, the target image after each noise addition being calculated as follows: based on the target image after the previous noise addition and the base noise obtained by the previous sampling, the target image after the current noise addition is calculated according to the noise calculation formula, where the base noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, Gaussian noise is noise that follows a Gaussian distribution, and the target image after all of the noise additions is the first noise image; predicting the plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted noises from the first noise image to obtain a restored first denoised image corresponding to the target image; and generating a data-enhanced image dataset by using the target image and the first denoised image, where the number of first denoised images corresponding to the target image is controlled by the number of times the target image is processed with the image diffusion model, which in turn controls the size of the data-enhanced image dataset.
Any combination of the above optional solutions may be adopted to form an optional embodiment of the present application, which is not described herein in detail.
The following are device embodiments of the present disclosure that may be used to perform method embodiments of the present disclosure. For details not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the method of the present disclosure.
Fig. 3 is a schematic diagram of an image data enhancement device based on adding noise to an image according to an embodiment of the present disclosure. As shown in fig. 3, the image data enhancing apparatus based on adding noise to an image includes:
an acquisition module 301 configured to acquire an image dataset to be data enhanced;
a diffusion module 302 configured to continuously add noise to the target image in the image dataset multiple times by using the diffusion process of the image diffusion model to obtain a first noise image corresponding to the target image;
a back diffusion module 303 configured to predict the plurality of noises added in the diffusion process by using the back diffusion process of the image diffusion model, and sequentially remove the predicted noises from the first noise image to obtain a restored first denoised image corresponding to the target image;
an enhancement module 304 configured to generate a data-enhanced image dataset using the target image and the first denoised image.
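One way to picture how the four modules cooperate is as a small object that wires three callables together and exposes the enhancement step. This is only a structural sketch: the class name `NoiseAugmentationDevice`, the field names, and the toy callables (additive noise and clipping in place of a real diffusion model) are assumptions and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List

import numpy as np

@dataclass
class NoiseAugmentationDevice:
    acquire: Callable[[], List[np.ndarray]]           # role of acquisition module 301
    diffuse: Callable[[np.ndarray], np.ndarray]       # role of diffusion module 302
    back_diffuse: Callable[[np.ndarray], np.ndarray]  # role of back diffusion module 303

    def enhance(self) -> List[np.ndarray]:
        """Role of enhancement module 304: originals plus their denoised variants."""
        targets = self.acquire()
        denoised = [self.back_diffuse(self.diffuse(t)) for t in targets]
        return targets + denoised

rng = np.random.default_rng(0)
device = NoiseAugmentationDevice(
    acquire=lambda: [np.zeros((4, 4)), np.ones((4, 4))],
    # Toy stand-ins: a real device would call the image diffusion model here.
    diffuse=lambda x: x + 0.1 * rng.standard_normal(x.shape),
    back_diffuse=lambda x: np.clip(x, 0.0, 1.0),
)
enhanced = device.enhance()
```

Keeping the modules as separate callables mirrors the device description, where each module can be replaced without changing the others.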
The image diffusion model has two processes, namely a diffusion process and a back diffusion process. The diffusion process is to continuously add noise for a target image in the image data set for a plurality of times (the added noise is obtained through calculation), so as to obtain a first noise image corresponding to the target image; the back diffusion process predicts a plurality of noises added in the diffusion process, and sequentially removes the predicted plurality of noises from the first noise image to obtain a restored first denoising image.
The image dataset comprises a plurality of target images; for ease of understanding, a single target image can be considered. The first noise image corresponding to the target image is obtained through the diffusion process, and the restored first denoised image corresponding to the target image is obtained through the back diffusion process. A first denoised image is obtained in this way for each target image, and the data-enhanced image dataset is then formed from all the target images and their corresponding first denoised images.
The disclosed embodiments are for image data enhancement using a trained image diffusion model.
According to the technical scheme provided by the embodiment of the disclosure, an image dataset to be enhanced by data is obtained; continuously adding noise for a target image in an image data set for multiple times by utilizing a diffusion process of an image diffusion model to obtain a first noise image corresponding to the target image; predicting a plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted plurality of noises in the first noise image to obtain a restored first denoising image corresponding to the target image; the target image and the first denoising image are utilized to generate the image dataset with enhanced data, so that the problem that an image obtained by a traditional data enhancement method in the prior art lacks reality can be solved by adopting the technical means, and the image obtained by data enhancement accords with the change of the image in an actual application scene.
Optionally, the diffusion module 302 is further configured to calculate the target image after each noise addition as follows: based on the target image after the previous noise addition and the base noise obtained by the previous sampling, the target image after the current noise addition is calculated according to a noise calculation formula, where the base noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, and Gaussian noise is noise that follows a Gaussian distribution.
The noise calculation formula (formula image BDA0004166997250000101 in the original publication; rendered here from the symbol definitions that follow, assuming the standard diffusion-model noising step) is:
$$x_t = \sqrt{1-\beta_t}\,x_{t-1} + \sqrt{\beta_t}\,\epsilon_{t-1}$$
where $x_t$ is the target image after the $t$-th noise addition, $x_{t-1}$ is the target image after the $(t-1)$-th noise addition, $\beta_t$ is a constant with a value between 0 and 1, and $\epsilon_{t-1}$ is the base noise obtained by the $(t-1)$-th sampling. When $t$ equals 1, $x_{t-1} = x_0$ is the target image itself; when the number of noise additions is $N$, $x_N$ is the target image after the $N$-th noise addition, namely the first noise image corresponding to the target image.
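As a worked illustration of what repeated noise addition does, the derivation below assumes the noising step has the standard form x_t = sqrt(1 - beta_t) x_{t-1} + sqrt(beta_t) eps_{t-1} with independent standard Gaussian base noises; the shorthand alpha_t = 1 - beta_t and the symbol for the merged noise are introduced here and are not part of the disclosure. Under these assumptions the first noise image can be written directly in terms of the target image.

```latex
% Two steps, with \alpha_t = 1 - \beta_t:
\begin{aligned}
x_1 &= \sqrt{\alpha_1}\,x_0 + \sqrt{\beta_1}\,\epsilon_0,\\
x_2 &= \sqrt{\alpha_2}\,x_1 + \sqrt{\beta_2}\,\epsilon_1
     = \sqrt{\alpha_1\alpha_2}\,x_0
       + \sqrt{\alpha_2\beta_1}\,\epsilon_0 + \sqrt{\beta_2}\,\epsilon_1 .
\end{aligned}

% The two independent Gaussian terms merge into one Gaussian with variance
% \alpha_2\beta_1 + \beta_2 = \alpha_2(1-\alpha_1) + (1-\alpha_2) = 1 - \alpha_1\alpha_2 .

% By induction, after N steps:
x_N = \sqrt{\bar\alpha_N}\,x_0 + \sqrt{1-\bar\alpha_N}\,\bar\epsilon,
\qquad \bar\alpha_N = \prod_{t=1}^{N} (1-\beta_t),
\qquad \bar\epsilon \sim \mathcal{N}(0, I).
```

The more noise additions are applied, the smaller the product becomes, so the first noise image retains less of the target image and more of the Gaussian noise.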
The back diffusion process is the reverse process of the diffusion process, the trained image diffusion model predicts a plurality of noises added in the diffusion process in the back diffusion process, then the predicted plurality of noises are sequentially removed from the first noise image, and finally the restored first denoising image corresponding to the target image is obtained. The image diffusion model learns and stores the corresponding relation between the noise added in the diffusion process and the noise predicted in the inverse diffusion process through training.
Optionally, the enhancement module 304 is further configured to control the number of times the target image is processed by the image diffusion model, thereby controlling the number of first denoising images corresponding to the target image and, in turn, the size of the data-enhanced image dataset.
Each time the target image is processed by the image diffusion model, one first denoising image corresponding to the target image is obtained. Processing the target image many times therefore yields a large number of first denoising images corresponding to the target image. Because the sampled basic noise is random, the first denoising image finally obtained differs from one pass to the next, which is why the image diffusion model can be used for image data enhancement.
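How the number of passes controls the size of the enhanced dataset can be sketched as follows. The `process` callable is a hypothetical placeholder for one full pass through a trained image diffusion model (diffusion followed by inverse diffusion); the toy version below only perturbs the image with a little random noise so that the example runs without a trained model.

```python
import numpy as np

def augment(images, process, copies_per_image, rng):
    """Return the originals plus `copies_per_image` processed variants of each."""
    enhanced = []
    for img in images:
        enhanced.append(img)                     # keep the target image itself
        for _ in range(copies_per_image):
            enhanced.append(process(img, rng))   # one pass yields one new image
    return enhanced

def toy_process(img, rng):
    """Stand-in for 'diffuse, then denoise with a trained model' (assumption)."""
    return img + 0.05 * rng.standard_normal(img.shape)

rng = np.random.default_rng(0)
images = [rng.uniform(0.0, 1.0, size=(4, 4)) for _ in range(3)]
enhanced = augment(images, toy_process, copies_per_image=2, rng=rng)
```

With 3 target images and 2 passes per image, the enhanced dataset holds 3 × (1 + 2) = 9 images, and the variants of one image differ from each other because each pass draws fresh random noise.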
Optionally, the diffusion module 302 is further configured to: acquire a training dataset; add noise to a training image in the training dataset multiple times in succession by using the diffusion process of the image diffusion model, to obtain a second noise image corresponding to the training image; predict the plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and remove the predicted noises from the second noise image in sequence, to obtain a restored second denoising image corresponding to the training image; calculate a total loss between the plurality of noises added in the diffusion process and the plurality of noises predicted in the inverse diffusion process; and update the model parameters of the image diffusion model according to the total loss, to complete training of the image diffusion model.
The training image is of the same kind as the target image, and the second noise image is of the same kind as the first noise image; the different names serve only to distinguish model training from model use.
The loss between the plurality of noises added in the diffusion process and the plurality of noises predicted in the inverse diffusion process may be calculated from the similarity between each pair of corresponding noises (for example, the noise added the first time corresponds to the prediction of the noise added the first time, and the noise added the second time corresponds to the prediction of the noise added the second time).
Optionally, the diffusion module 302 is further configured to calculate the training image after each noise addition as follows: based on the training image after the previous noise addition and the basic noise obtained by the previous sampling, the training image after the current noise addition is calculated according to the noise calculation formula, wherein the basic noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, Gaussian noise is noise that follows a Gaussian distribution, and the training image after the multiple noise additions is the second noise image corresponding to the training image.
Optionally, the diffusion module 302 is further configured to calculate the loss between each noise added in the diffusion process and the corresponding noise predicted in the inverse diffusion process, and to take the sum of all the calculated losses as the total loss.
Optionally, the loss function may be the mean squared error.
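The total loss described above (a per-step loss between each added noise and its prediction, summed over all steps, with mean squared error as the per-step loss) can be sketched as follows. The function name and the tiny 2×2 arrays are illustrative only.

```python
import numpy as np

def total_noise_loss(added, predicted):
    """Sum over steps of the mean squared error between added and predicted noise."""
    if len(added) != len(predicted):
        raise ValueError("each added noise needs exactly one predicted noise")
    # Pair the t-th added noise with the prediction for the t-th added noise.
    return float(sum(np.mean((a - p) ** 2) for a, p in zip(added, predicted)))

added = [np.zeros((2, 2)), np.ones((2, 2))]
predicted = [np.zeros((2, 2)), np.zeros((2, 2))]

perfect = total_noise_loss(added, added)        # identical noises give zero loss
imperfect = total_noise_loss(added, predicted)  # step 1: 0.0, step 2: 1.0
```

In training, this scalar would be what the model parameters are updated against; the update itself depends on the framework and model, which the text does not specify.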
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present disclosure.
Fig. 4 is a schematic diagram of an electronic device 4 provided by an embodiment of the present disclosure. As shown in Fig. 4, the electronic device 4 of this embodiment includes: a processor 401, a memory 402, and a computer program 403 stored in the memory 402 and executable on the processor 401. The processor 401 implements the steps of the method embodiments described above when executing the computer program 403. Alternatively, the processor 401, when executing the computer program 403, performs the functions of the modules/units in the apparatus embodiments described above.
The electronic device 4 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or the like. The electronic device 4 may include, but is not limited to, the processor 401 and the memory 402. Those skilled in the art will appreciate that Fig. 4 is merely an example of the electronic device 4 and does not limit it; the electronic device 4 may include more or fewer components than shown, or different components.
The processor 401 may be a central processing unit (Central Processing Unit, CPU) or other general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
The memory 402 may be an internal storage unit of the electronic device 4, for example, a hard disk or a memory of the electronic device 4. The memory 402 may also be an external storage device of the electronic device 4, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) or the like, which are provided on the electronic device 4. Memory 402 may also include both internal storage units and external storage devices of electronic device 4. The memory 402 is used to store computer programs and other programs and data required by the electronic device.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, all or part of the flow of the methods of the above embodiments may be implemented by a computer program instructing related hardware. The computer program may be stored in a computer readable storage medium and, when executed by a processor, may implement the steps of the method embodiments described above. The computer program may comprise computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunications signals.
The above embodiments are merely for illustrating the technical solution of the present disclosure, and are not limiting thereof; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the disclosure, and are intended to be included in the scope of the present disclosure.

Claims (10)

1. An image data enhancement method based on adding noise to an image, comprising:
acquiring an image data set to be data enhanced;
continuously adding noise for a target image in the image data set for multiple times by utilizing a diffusion process of an image diffusion model to obtain a first noise image corresponding to the target image;
predicting a plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted plurality of noises from the first noise image to obtain a restored first denoising image corresponding to the target image;
and generating the data-enhanced image data set by utilizing the target image and the first denoising image.
2. The method according to claim 1, wherein adding noise to the target image in the image dataset multiple times in succession by using the diffusion process of the image diffusion model, to obtain the first noise image corresponding to the target image, comprises:
calculating the target image after each noise addition as follows:
based on the target image after the previous noise addition and the basic noise obtained by the previous sampling, calculating the target image after the current noise addition according to a noise calculation formula, wherein the basic noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, and Gaussian noise is noise that follows a Gaussian distribution.
3. The method of claim 2, wherein the noise calculation formula is:
Figure FDA0004166997240000011
x_t is the target image after the t-th noise addition, x_{t-1} is the target image after the (t-1)-th noise addition, β_t is a constant with a value between 0 and 1, and ε_{t-1} is the basic noise obtained by the (t-1)-th sampling; when t equals 1, x_0 is the target image; and when noise is added N times, x_N is the first noise image corresponding to the target image.
4. The method according to claim 1, further comprising:
controlling the number of times the target image is processed by the image diffusion model, thereby controlling the number of first denoising images corresponding to the target image and the size of the data-enhanced image dataset.
5. The method according to claim 1, wherein, before adding noise to the target image in the image dataset multiple times in succession by using the diffusion process of the image diffusion model to obtain the first noise image corresponding to the target image, the method further comprises:
acquiring a training data set;
continuously adding noise for the training images in the training data set for multiple times by utilizing the diffusion process of the image diffusion model to obtain second noise images corresponding to the training images;
predicting a plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted plurality of noises from the second noise image to obtain a restored second denoising image corresponding to the training image;
calculating a total loss between the plurality of noise added during the diffusion and the plurality of noise predicted during the back-diffusion;
and updating model parameters of the image diffusion model according to the total loss to complete training of the image diffusion model.
6. The method of claim 5, wherein, before the second noise image corresponding to the training image is obtained by adding noise to the training image in the training dataset multiple times in succession by using the diffusion process of the image diffusion model, the method further comprises:
calculating the training image after each noise addition as follows:
based on the training image after the previous noise addition and the basic noise obtained by the previous sampling, calculating the training image after the current noise addition according to a noise calculation formula, wherein the basic noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, Gaussian noise is noise that follows a Gaussian distribution, and the training image after the multiple noise additions is the second noise image corresponding to the training image.
7. The method of claim 5, wherein said calculating a total loss between the plurality of noise added during said diffusing and the plurality of noise predicted during said back-diffusing comprises:
calculating a loss between each noise added in the diffusion process and a noise predicted in the inverse diffusion process corresponding to the noise;
the sum of all the losses calculated is taken as the total loss.
8. An image data enhancement apparatus based on adding noise to an image, comprising:
an acquisition module configured to acquire an image dataset to be data enhanced;
the diffusion module is configured to continuously add noise to a target image in the image data set for multiple times by utilizing the diffusion process of the image diffusion model to obtain a first noise image corresponding to the target image;
a back diffusion module configured to predict a plurality of noises added in the diffusion process by using a back diffusion process of the image diffusion model, and sequentially remove the predicted plurality of noises in the first noise image, so as to obtain a restored first denoising image corresponding to the target image;
an enhancement module configured to generate the data-enhanced image dataset using the target image and the first denoised image.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 7.
CN202310364619.XA 2023-04-07 2023-04-07 Image data enhancement method and device based on noise addition to image Pending CN116385328A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310364619.XA CN116385328A (en) 2023-04-07 2023-04-07 Image data enhancement method and device based on noise addition to image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310364619.XA CN116385328A (en) 2023-04-07 2023-04-07 Image data enhancement method and device based on noise addition to image

Publications (1)

Publication Number Publication Date
CN116385328A (en) 2023-07-04

Family

ID=86961180

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310364619.XA Pending CN116385328A (en) 2023-04-07 2023-04-07 Image data enhancement method and device based on noise addition to image

Country Status (1)

Country Link
CN (1) CN116385328A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117036197A (en) * 2023-08-18 2023-11-10 杭州食方科技有限公司 Image denoising model generation method, device, equipment and computer readable medium
CN117473397A (en) * 2023-12-25 2024-01-30 清华大学 Diffusion model data enhancement-based emotion recognition method and system
CN117473397B (en) * 2023-12-25 2024-03-19 清华大学 Diffusion model data enhancement-based emotion recognition method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination