CN116385328A - Image data enhancement method and device based on noise addition to image - Google Patents
- Publication number
- CN116385328A (application CN202310364619.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- noise
- diffusion
- target image
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The disclosure relates to the technical field of image processing and provides an image data enhancement method and device based on adding noise to an image. The method comprises: acquiring an image dataset to be data-enhanced; adding noise to a target image in the image dataset multiple times in succession, using the diffusion process of an image diffusion model, to obtain a first noise image corresponding to the target image; predicting, using the inverse diffusion process of the image diffusion model, the plurality of noises added in the diffusion process, and sequentially removing the predicted noises from the first noise image to obtain a restored first denoising image corresponding to the target image; and generating a data-enhanced image dataset using the target image and the first denoising image. These technical means address the problem in the prior art that images obtained by conventional data enhancement methods lack realism.
Description
Technical Field
The disclosure relates to the technical field of image processing, and in particular relates to an image data enhancement method and device based on noise addition to an image.
Background
In the field of computer vision, image data enhancement is commonly used to enrich training datasets and improve the generalization ability of models. Existing image data enhancement methods typically generate new image data by applying a series of affine transformations to the original image, such as random rotation, flipping, and cropping. For example, an existing method randomly selects a region of an original image to crop, randomly rotates, slightly stretches, or flips the cropped image, and adds the transformed image to the training dataset. The main disadvantage of this approach is a lack of realism: random transformations do not accurately reproduce how images change in practice and do not effectively simulate real visual and environmental changes, such as changes in lighting or viewing angle. The resulting images are therefore often unrealistic, meaning that the data-enhanced images do not correspond to the changes that images undergo in real application scenarios.
In the course of implementing the disclosed concept, the inventors found at least the following technical problem in the related art: images obtained by conventional data enhancement methods lack realism.
Disclosure of Invention
In view of the above, embodiments of the present disclosure provide an image data enhancement method, an apparatus, an electronic device, and a computer-readable storage medium based on adding noise to an image, so as to solve the problem in the prior art that images obtained by conventional data enhancement methods lack realism.
In a first aspect of the embodiments of the present disclosure, there is provided an image data enhancement method based on adding noise to an image, including: acquiring an image data set to be data enhanced; continuously adding noise for a target image in an image data set for multiple times by utilizing a diffusion process of an image diffusion model to obtain a first noise image corresponding to the target image; predicting a plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted plurality of noises in the first noise image to obtain a restored first denoising image corresponding to the target image; a data-enhanced image dataset is generated using the target image and the first denoised image.
In a second aspect of the embodiments of the present disclosure, there is provided an image data enhancement apparatus based on adding noise to an image, including: an acquisition module configured to acquire an image dataset to be data enhanced; the diffusion module is configured to continuously add noise to the target image in the image data set for multiple times by utilizing the diffusion process of the image diffusion model to obtain a first noise image corresponding to the target image; the inverse diffusion module is configured to predict a plurality of noises added in the diffusion process by utilizing the inverse diffusion process of the image diffusion model, and sequentially remove the predicted plurality of noises in the first noise image to obtain a restored first denoising image corresponding to the target image; an enhancement module configured to generate a data-enhanced image dataset using the target image and the first denoised image.
In a third aspect of the disclosed embodiments, an electronic device is provided, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
In a fourth aspect of the disclosed embodiments, a computer-readable storage medium is provided, which stores a computer program which, when executed by a processor, implements the steps of the above-described method.
Compared with the prior art, the embodiments of the present disclosure have the following beneficial effects. The embodiments acquire an image dataset to be data-enhanced; add noise to a target image in the image dataset multiple times in succession, using the diffusion process of an image diffusion model, to obtain a first noise image corresponding to the target image; predict, using the inverse diffusion process of the image diffusion model, the plurality of noises added in the diffusion process, and sequentially remove the predicted noises from the first noise image to obtain a restored first denoising image corresponding to the target image; and generate a data-enhanced image dataset using the target image and the first denoising image. These technical means solve the problem in the prior art that images obtained by conventional data enhancement methods lack realism, so that the images obtained by data enhancement correspond to the changes that images undergo in real application scenarios.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are required for the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings may be obtained according to these drawings without inventive effort for a person of ordinary skill in the art.
Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure;
Fig. 2 is a flow chart of an image data enhancement method based on adding noise to an image provided by an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of an image data enhancement device based on adding noise to an image according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the disclosed embodiments. However, it will be apparent to one skilled in the art that the present disclosure may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.
An image data enhancing method and apparatus based on adding noise to an image according to embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure. The application scenario may include terminal devices 101, 102, and 103, server 104, and network 105.
The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, and 103 are hardware, they may be various electronic devices having a display screen and supporting communication with the server 104, including but not limited to smartphones, tablets, laptop and desktop computers, etc.; when the terminal devices 101, 102, and 103 are software, they may be installed in the electronic device as above. Terminal devices 101, 102, and 103 may be implemented as multiple software or software modules, or as a single software or software module, as embodiments of the present disclosure are not limited in this regard. Further, various applications, such as a data processing application, an instant messaging tool, social platform software, a search class application, a shopping class application, and the like, may be installed on the terminal devices 101, 102, and 103.
The server 104 may be a server that provides various services, for example, a background server that receives a request transmitted from a terminal device with which communication connection is established, and the background server may perform processing such as receiving and analyzing the request transmitted from the terminal device and generate a processing result. The server 104 may be a server, a server cluster formed by a plurality of servers, or a cloud computing service center, which is not limited in the embodiments of the present disclosure.
The server 104 may be hardware or software. When the server 104 is hardware, it may be various electronic devices that provide various services to the terminal devices 101, 102, and 103. When the server 104 is software, it may be a plurality of software or software modules providing various services to the terminal devices 101, 102, and 103, or may be a single software or software module providing various services to the terminal devices 101, 102, and 103, which is not limited by the embodiments of the present disclosure.
The network 105 may be a wired network connected by coaxial cable, twisted pair, or optical fiber, or a wireless network that interconnects communication devices without wiring, for example Bluetooth, near field communication (NFC), or infrared, which is not limited by the embodiments of the present disclosure.
The user can establish a communication connection with the server 104 via the network 105 through the terminal devices 101, 102, and 103 to receive or transmit information or the like. It should be noted that the specific types, numbers and combinations of the terminal devices 101, 102 and 103, the server 104 and the network 105 may be adjusted according to the actual requirements of the application scenario, which is not limited by the embodiment of the present disclosure.
Fig. 2 is a flow chart of an image data enhancement method based on adding noise to an image according to an embodiment of the present disclosure. The image data enhancement method of Fig. 2 may be performed by the computer or server of Fig. 1, or by software on the computer or server. As shown in Fig. 2, the image data enhancement method based on adding noise to an image includes:
S201, acquiring an image dataset to be data-enhanced;
S202, adding noise to a target image in the image dataset multiple times in succession, using the diffusion process of an image diffusion model, to obtain a first noise image corresponding to the target image;
S203, predicting, using the back diffusion process of the image diffusion model, the plurality of noises added in the diffusion process, and sequentially removing the predicted noises from the first noise image to obtain a restored first denoising image corresponding to the target image;
S204, generating a data-enhanced image dataset using the target image and the first denoising image.
The image diffusion model has two processes, namely a diffusion process and a back diffusion process. The diffusion process is to continuously add noise for a target image in the image data set for a plurality of times (the added noise is obtained through calculation), so as to obtain a first noise image corresponding to the target image; the back diffusion process predicts a plurality of noises added in the diffusion process, and sequentially removes the predicted plurality of noises from the first noise image to obtain a restored first denoising image.
The image dataset comprises a plurality of target images; for ease of understanding, consider a single target image. The first noise image corresponding to the target image is obtained through the diffusion process, and the restored first denoising image corresponding to the target image is obtained through the back diffusion process. A first denoising image is obtained for each target image in this way, and the data-enhanced image dataset is then formed from all the target images and their corresponding first denoising images.
Structurally, the diffusion model is essentially a denoising model. It may be a U-Net composed of a plurality of convolution layers and deconvolution layers; the input and output of the U-Net have the same shape, and the U-Net is used to predict the noise of each step. The embodiments of the present disclosure perform image data enhancement using a trained image diffusion model.
According to the technical solution provided by the embodiments of the present disclosure, an image dataset to be data-enhanced is acquired; noise is added to a target image in the image dataset multiple times in succession, using the diffusion process of an image diffusion model, to obtain a first noise image corresponding to the target image; the plurality of noises added in the diffusion process are predicted using the inverse diffusion process of the image diffusion model, and the predicted noises are sequentially removed from the first noise image to obtain a restored first denoising image corresponding to the target image; and a data-enhanced image dataset is generated using the target image and the first denoising image. These technical means solve the problem in the prior art that images obtained by conventional data enhancement methods lack realism, so that the images obtained by data enhancement correspond to the changes that images undergo in real application scenarios.
Adding noise to a target image in the image dataset multiple times in succession, using the diffusion process of the image diffusion model, to obtain a first noise image corresponding to the target image comprises calculating the target image after each noise addition as follows: based on the target image after the previous noise addition and the basic noise obtained by the previous sampling, the target image after the current noise addition is calculated with a noise calculation formula, where the basic noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, and Gaussian noise is noise that follows a Gaussian distribution.
The noise calculation formula is defined over the following quantities: $x_t$ is the target image after the $t$-th noise addition; $x_{t-1}$ is the target image after the $(t-1)$-th noise addition; $\beta_t$ is a constant with a value between 0 and 1; and $\epsilon_{t-1}$ is the basic noise obtained by the $(t-1)$-th sampling. When $t = 1$, $x_{t-1} = x_0$ is the target image itself. When noise is added $N$ times, $x_N$, the target image after the $N$-th noise addition, is the first noise image corresponding to the target image.
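The iterative noising described above can be sketched in Python. The formula itself is not legible in this text, so the update below assumes the standard forward step of a denoising diffusion probabilistic model, x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps_{t-1}. That form is consistent with the symbols defined above, but it is an assumption and not the source's confirmed formula. The function name `forward_diffuse`, the linear schedule of beta values, and the 8x8 stand-in image are all illustrative.

```python
import numpy as np


def forward_diffuse(x0, betas, rng):
    """Add Gaussian noise to x0 once per entry of betas.

    Assumed update (standard DDPM forward step, not confirmed by the source):
        x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps_{t-1}
    Returns the final noisy image x_N and the list of sampled noises.
    """
    x = np.asarray(x0, dtype=np.float64)
    noises = []
    for beta in betas:
        eps = rng.standard_normal(x.shape)  # basic noise sampled from N(0, 1)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps
        noises.append(eps)
    return x, noises


rng = np.random.default_rng(0)
target_image = rng.random((8, 8))      # stand-in for a target image in [0, 1)
betas = np.linspace(1e-4, 0.02, 10)    # hypothetical schedule, each beta in (0, 1)
first_noise_image, added_noises = forward_diffuse(target_image, betas, rng)
```

The returned list of noises is what a training procedure would compare against the model's predictions; at inference time only the final noisy image is needed.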
The back diffusion process is the reverse process of the diffusion process, the trained image diffusion model predicts a plurality of noises added in the diffusion process in the back diffusion process, then the predicted plurality of noises are sequentially removed from the first noise image, and finally the restored first denoising image corresponding to the target image is obtained. The image diffusion model learns and stores the corresponding relation between the noise added in the diffusion process and the noise predicted in the inverse diffusion process through training.
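A literal reading of "sequentially removing the predicted noises" can be sketched as follows. The sketch assumes the forward step x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps (the standard DDPM form, an assumption here) and simply inverts it algebraically; practical samplers for diffusion models usually use a different, stochastic update. `predict_noise` is a hypothetical stand-in for the trained image diffusion model. In the demonstration it is an oracle that replays the true noises, only to show that perfect predictions would recover the input exactly.

```python
import numpy as np


def remove_predicted_noise(x_n, betas, predict_noise):
    """Undo the assumed forward steps from t = N down to t = 1.

    predict_noise(x_t, t) is a hypothetical callable standing in for the
    trained image diffusion model; it returns the noise estimated for step t.
    """
    x = np.asarray(x_n, dtype=np.float64)
    for t in range(len(betas), 0, -1):
        beta = betas[t - 1]
        eps_hat = predict_noise(x, t)
        # Algebraic inverse of x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * eps
        x = (x - np.sqrt(beta) * eps_hat) / np.sqrt(1.0 - beta)
    return x


# Demonstration with an oracle predictor that replays the true noises.
rng = np.random.default_rng(0)
x0 = rng.random((4, 4))
betas = np.linspace(1e-4, 0.02, 10)
true_noises = []
x = x0.copy()
for beta in betas:
    eps = rng.standard_normal(x.shape)
    true_noises.append(eps)
    x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps

restored = remove_predicted_noise(x, betas, lambda x_t, t: true_noises[t - 1])
```

With a learned predictor the estimates are only approximate, so the restored image differs somewhat from the input; that difference is what makes the output usable as a new sample.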
In an alternative embodiment, the number of times the target image is processed with the image diffusion model determines the number of first denoising images corresponding to the target image, and thereby controls the size of the data-enhanced image dataset.
Each time the image diffusion model is used for processing the target image, a first denoising image corresponding to the target image is obtained, when the image diffusion model is used for processing the target image for a plurality of times, a large number of first denoising images corresponding to the target image are obtained (because the basic noise obtained by sampling is random, each time the image diffusion model is used for processing the target image, the finally obtained first denoising images are different, and therefore, the image data enhancement can be realized by using the image diffusion model).
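How the number of processing passes sets the size of the enhanced dataset can be sketched as below. `process_with_diffusion_model` is a hypothetical placeholder for one full diffusion and back diffusion pass; here it only perturbs the image with a little random noise so that the example runs without a trained model. The names and the 8x8 stand-in images are illustrative.

```python
import numpy as np


def process_with_diffusion_model(image, rng):
    """Hypothetical placeholder for one noising + denoising pass.

    A real pass would run the trained image diffusion model; this stand-in
    just returns a randomly perturbed copy so that repeated calls differ.
    """
    return image + 0.01 * rng.standard_normal(image.shape)


def build_enhanced_dataset(target_images, passes_per_image, rng):
    """Keep every target image and add `passes_per_image` denoised variants."""
    enhanced = []
    for image in target_images:
        enhanced.append(image)
        for _ in range(passes_per_image):
            enhanced.append(process_with_diffusion_model(image, rng))
    return enhanced


rng = np.random.default_rng(0)
targets = [rng.random((8, 8)) for _ in range(5)]
enhanced = build_enhanced_dataset(targets, passes_per_image=3, rng=rng)
```

With 5 target images and 3 passes each, the enhanced dataset holds 5 originals plus 15 variants, so `passes_per_image` is the single knob that scales the dataset.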
Before adding noise to the target image in the image dataset multiple times in succession, using the diffusion process of the image diffusion model, to obtain a first noise image corresponding to the target image, the method further comprises: acquiring a training data set; adding noise to the training images in the training data set multiple times in succession, using the diffusion process of the image diffusion model, to obtain second noise images corresponding to the training images; predicting, using the inverse diffusion process of the image diffusion model, the plurality of noises added in the diffusion process, and sequentially removing the predicted noises from the second noise image to obtain a restored second denoising image corresponding to the training image; calculating a total loss between the plurality of noises added in the diffusion process and the plurality of noises predicted in the inverse diffusion process; and updating the model parameters of the image diffusion model according to the total loss to complete training of the image diffusion model.
The training image plays the same role as the target image, and the second noise image plays the same role as the first noise image; the names differ only to distinguish model training from model use.
The loss between the plurality of noises added in the diffusion process and the plurality of noises predicted in the inverse diffusion process may be calculated as the similarity between each pair of corresponding noises: for example, the first added noise is paired with the prediction of the first added noise, and the second added noise is paired with the prediction of the second added noise.
Adding noise to the training images in the training data set multiple times in succession, using the diffusion process of the image diffusion model, to obtain second noise images corresponding to the training images comprises calculating the training image after each noise addition as follows: based on the training image after the previous noise addition and the basic noise obtained by the previous sampling, the training image after the current noise addition is calculated with the noise calculation formula, where the basic noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, Gaussian noise is noise that follows a Gaussian distribution, and the training image after multiple noise additions is the second noise image corresponding to the training image.
Calculating a total loss between the plurality of noise added during diffusion and the plurality of noise predicted during back diffusion, comprising: calculating a loss between each noise added in the diffusion process and the noise predicted in the inverse diffusion process corresponding to the noise; the sum of all losses calculated is taken as the total loss.
Optionally, the loss function may be the mean squared error.
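The total loss described above, with mean squared error as the per-step loss, can be sketched as a short function. The name `total_noise_loss` and the tiny 2x2 arrays are illustrative; the pairing rule (the i-th added noise with the i-th predicted noise) follows the text.

```python
import numpy as np


def total_noise_loss(added_noises, predicted_noises):
    """Sum of per-step mean squared errors between added and predicted noise."""
    if len(added_noises) != len(predicted_noises):
        raise ValueError("each added noise needs exactly one predicted noise")
    return sum(
        float(np.mean((np.asarray(pred) - np.asarray(added)) ** 2))
        for added, pred in zip(added_noises, predicted_noises)
    )


added = [np.zeros((2, 2)), np.ones((2, 2))]
predicted = [np.ones((2, 2)), np.ones((2, 2))]
loss = total_noise_loss(added, predicted)  # first pair contributes 1.0, second 0.0
```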
Optionally, image data enhancement is performed as follows:
acquiring a training data set;
adding noise to the training images in the training data set multiple times in succession, using the diffusion process of the image diffusion model, to obtain second noise images corresponding to the training images;
predicting, using the inverse diffusion process of the image diffusion model, the plurality of noises added in the diffusion process, and sequentially removing the predicted noises from the second noise image to obtain a restored second denoising image corresponding to the training image;
calculating a total loss between the plurality of noises added in the diffusion process and the plurality of noises predicted in the back diffusion process;
updating the model parameters of the image diffusion model according to the total loss to complete training of the image diffusion model;
acquiring an image dataset to be data-enhanced;
adding noise to a target image in the image dataset multiple times in succession, using the diffusion process of the image diffusion model, to obtain a first noise image corresponding to the target image, where the target image after each noise addition is calculated with the noise calculation formula from the target image after the previous noise addition and the basic noise obtained by the previous sampling, the basic noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, Gaussian noise is noise that follows a Gaussian distribution, and the target image after multiple noise additions is the first noise image;
predicting, using the inverse diffusion process of the image diffusion model, the plurality of noises added in the diffusion process, and sequentially removing the predicted noises from the first noise image to obtain a restored first denoising image corresponding to the target image;
generating a data-enhanced image dataset using the target image and the first denoising image, where the number of times the target image is processed with the image diffusion model determines the number of first denoising images corresponding to the target image and thereby controls the size of the data-enhanced image dataset.
Any combination of the above optional solutions may be adopted to form an optional embodiment of the present application, which is not described herein in detail.
The following are device embodiments of the present disclosure that may be used to perform method embodiments of the present disclosure. For details not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the method of the present disclosure.
Fig. 3 is a schematic diagram of an image data enhancement device based on adding noise to an image according to an embodiment of the present disclosure. As shown in fig. 3, the image data enhancing apparatus based on adding noise to an image includes:
an acquisition module 301 configured to acquire an image dataset to be data enhanced;
the diffusion module 302 is configured to continuously add noise to the target image in the image data set for multiple times by using the diffusion process of the image diffusion model to obtain a first noise image corresponding to the target image;
a back diffusion module 303, configured to predict a plurality of noises added in a diffusion process by using a back diffusion process of an image diffusion model, and sequentially remove the predicted plurality of noises from the first noise image, so as to obtain a restored first denoising image corresponding to the target image;
the enhancement module 304 is configured to generate a data-enhanced image dataset using the target image and the first denoised image.
The image diffusion model has two processes, namely a diffusion process and a back diffusion process. The diffusion process is to continuously add noise for a target image in the image data set for a plurality of times (the added noise is obtained through calculation), so as to obtain a first noise image corresponding to the target image; the back diffusion process predicts a plurality of noises added in the diffusion process, and sequentially removes the predicted plurality of noises from the first noise image to obtain a restored first denoising image.
The image dataset comprises a plurality of target images; for ease of understanding, consider a single target image. The first noise image corresponding to the target image is obtained through the diffusion process, and the restored first denoising image corresponding to the target image is obtained through the back diffusion process. A first denoising image is obtained for each target image in this way, and the data-enhanced image dataset is then formed from all the target images and their corresponding first denoising images.
The disclosed embodiments are for image data enhancement using a trained image diffusion model.
According to the technical solution provided by the embodiments of the present disclosure, an image dataset to be data-enhanced is acquired; noise is added to a target image in the image dataset multiple times in succession, using the diffusion process of an image diffusion model, to obtain a first noise image corresponding to the target image; the plurality of noises added in the diffusion process are predicted using the inverse diffusion process of the image diffusion model, and the predicted noises are sequentially removed from the first noise image to obtain a restored first denoising image corresponding to the target image; and a data-enhanced image dataset is generated using the target image and the first denoising image. These technical means solve the problem in the prior art that images obtained by conventional data enhancement methods lack realism, so that the images obtained by data enhancement correspond to the changes that images undergo in real application scenarios.
Optionally, the diffusion module 302 is further configured to calculate the target image after each noise addition as follows: based on the target image after the previous noise addition and the basic noise obtained by the previous sampling, the target image after the current noise addition is calculated with a noise calculation formula, where the basic noise obtained by the previous sampling is sampled from Gaussian noise at the previous noise addition, and Gaussian noise is noise that follows a Gaussian distribution.
The noise calculation formula is:
In the formula, x_t is the target image after the t-th noise addition; x_{t-1} is the target image after the (t-1)-th noise addition; β_t is a constant with a value between 0 and 1; and ε_{t-1} is the basic noise obtained by the (t-1)-th sampling. When t equals 1, x_{t-1} = x_0 is the target image itself. When noise is added N times in total, x_N, the target image after the N-th noise addition, is the first noise image corresponding to the target image.
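By way of non-limiting illustration, the repeated noise addition may be sketched as follows. The noise calculation formula itself is not reproduced in this text, so the sketch assumes the step commonly used in denoising diffusion models, x_t = sqrt(1 - β_t) · x_{t-1} + sqrt(β_t) · ε_{t-1}, which is consistent with the symbols defined here but is not confirmed by the source. The function names, the β schedule, and the random stand-in image are hypothetical.

```python
import numpy as np


def add_noise_step(x_prev, beta_t, eps_prev):
    """One noise addition under the ASSUMED form
    x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps_{t-1}."""
    if not 0.0 < beta_t < 1.0:
        raise ValueError("beta_t must lie strictly between 0 and 1")
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * eps_prev


def diffuse(x0, betas, rng):
    """Add noise len(betas) times in succession.

    Returns x_N (the noise image) and the list of sampled basic noises.
    """
    x = np.asarray(x0, dtype=np.float64)
    noises = []
    for beta_t in betas:
        eps = rng.standard_normal(x.shape)  # basic noise sampled from a Gaussian
        noises.append(eps)
        x = add_noise_step(x, beta_t, eps)
    return x, noises


rng = np.random.default_rng(0)
target_image = rng.random((8, 8))      # stand-in for a real image
betas = np.linspace(1e-4, 0.02, 10)    # hypothetical schedule, N = 10
first_noise_image, added_noises = diffuse(target_image, betas, rng)
```

Keeping the sampled noises alongside x_N is a design choice of this sketch: they are what the inverse diffusion process is later asked to predict.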
The inverse diffusion process is the reverse of the diffusion process. In the inverse diffusion process, the trained image diffusion model predicts the noises that were added in the diffusion process; the predicted noises are then removed from the first noise image one after another, finally yielding the restored first denoising image corresponding to the target image. Through training, the image diffusion model learns and stores the correspondence between the noise added in the diffusion process and the noise predicted in the inverse diffusion process.
Optionally, the enhancement module 304 is further configured to control the number of first denoising images corresponding to the target image by controlling the number of times the target image is processed with the image diffusion model, and thereby to control the scale of the data-enhanced image dataset.
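By way of non-limiting illustration, the sequential removal of predicted noises may be sketched as follows. The sketch assumes that each noise addition has the form x_t = sqrt(1 - β_t) · x_{t-1} + sqrt(β_t) · ε_{t-1} (an assumption, since the formula is not reproduced in this text) and simply inverts that step. The `predict_noise` callable stands in for the trained image diffusion model and is hypothetical; in the demonstration it is an "oracle" that replays the true noises, which shows only that a perfect prediction restores the image.

```python
import numpy as np


def remove_noise_step(x_t, beta_t, eps_pred):
    """Undo one ASSUMED forward step x_t = sqrt(1-b)*x_{t-1} + sqrt(b)*eps."""
    return (x_t - np.sqrt(beta_t) * eps_pred) / np.sqrt(1.0 - beta_t)


def inverse_diffuse(x_n, betas, predict_noise):
    """Remove the predicted noises one by one, from step N down to step 1."""
    x = x_n
    for t in range(len(betas), 0, -1):
        eps_pred = predict_noise(x, t)  # hypothetical model call
        x = remove_noise_step(x, betas[t - 1], eps_pred)
    return x


# Build a noise image with known noises so the result can be checked.
rng = np.random.default_rng(0)
x0 = rng.random((4, 4))
betas = np.linspace(1e-4, 0.02, 5)
noises = [rng.standard_normal(x0.shape) for _ in betas]
x = x0
for beta_t, eps in zip(betas, noises):
    x = np.sqrt(1.0 - beta_t) * x + np.sqrt(beta_t) * eps

# Oracle predictor: returns the noise actually added at step t.
restored = inverse_diffuse(x, betas, lambda x_t, t: noises[t - 1])
```

With a trained model the predictions are only approximate, so the restored image would resemble the target image without being identical to it.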
Each time the target image is processed with the image diffusion model, one first denoising image corresponding to the target image is obtained; processing the target image many times therefore yields a large number of first denoising images. Because the sampled basic noise is random, the first denoising image obtained differs from one pass to the next, which is why the image diffusion model can be used for image data enhancement.
Optionally, the diffusion module 302 is further configured to: acquire a training data set; add noise to a training image in the training data set multiple times in succession using the diffusion process of the image diffusion model, yielding a second noise image corresponding to the training image; predict the noises added in the diffusion process using the inverse diffusion process of the image diffusion model, and remove the predicted noises from the second noise image one after another, yielding a restored second denoising image corresponding to the training image; calculate a total loss between the noises added in the diffusion process and the noises predicted in the inverse diffusion process; and update the model parameters of the image diffusion model according to the total loss to complete training of the image diffusion model.
The training image plays the same role as the target image, and the second noise image plays the same role as the first noise image; the different names serve only to distinguish model training from model use.
The loss between the noises added in the diffusion process and the noises predicted in the inverse diffusion process may be calculated as the similarity between each pair of corresponding noises (for example, the noise added the first time is paired with the prediction of the noise added the first time, and the noise added the second time with the prediction of the noise added the second time).
Optionally, the diffusion module 302 is further configured to calculate the training image after each noise addition as follows: based on the training image after the previous noise addition and the basic noise obtained by the previous sampling, the training image after the current noise addition is calculated according to the noise calculation formula, wherein the basic noise obtained by the previous sampling was sampled from Gaussian noise at the previous noise addition, Gaussian noise is noise that follows a Gaussian distribution, and the training image after all of the noise additions is the second noise image corresponding to the training image.
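By way of non-limiting illustration, controlling the scale of the enhanced dataset through the number of processing passes may be sketched as follows. The `process` callable stands in for one full pass of diffusion followed by inverse diffusion; `toy_process` is a hypothetical placeholder that merely perturbs the image at random and is not the image diffusion model.

```python
import numpy as np


def enhance_dataset(images, process, times_per_image):
    """Return each original image plus `times_per_image` processed copies."""
    enhanced = []
    for image in images:
        enhanced.append(image)  # keep the target image itself
        for _ in range(times_per_image):
            enhanced.append(process(image))  # one denoising image per pass
    return enhanced


rng = np.random.default_rng(0)


def toy_process(image):
    """Placeholder for diffusion + inverse diffusion (random perturbation)."""
    return image + 0.01 * rng.standard_normal(image.shape)


dataset = [rng.random((8, 8)) for _ in range(3)]
enhanced = enhance_dataset(dataset, toy_process, times_per_image=4)
print(len(enhanced))  # 3 * (1 + 4) = 15
```

The size of the enhanced dataset grows linearly with `times_per_image`, so that single parameter sets the degree of enhancement.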
Optionally, the diffusion module 302 is further configured to calculate a loss between each noise added during diffusion and the noise predicted during inverse diffusion corresponding to the noise; the sum of all losses calculated is taken as the total loss.
Optionally, the loss function may be the mean squared error.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the order in which the processes are executed should be determined by their functions and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the disclosure.
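By way of non-limiting illustration, the total loss may be sketched as the sum of per-step mean squared errors between each added noise and its corresponding predicted noise. The function name and the small example arrays are hypothetical.

```python
import numpy as np


def total_noise_loss(added_noises, predicted_noises):
    """Sum of per-step MSE between each added noise and its prediction."""
    if len(added_noises) != len(predicted_noises):
        raise ValueError("each added noise needs exactly one predicted noise")
    return sum(
        float(np.mean((added - predicted) ** 2))
        for added, predicted in zip(added_noises, predicted_noises)
    )


added = [np.zeros((2, 2)), np.ones((2, 2))]
predicted = [np.zeros((2, 2)), np.full((2, 2), 3.0)]
loss = total_noise_loss(added, predicted)  # step 1: 0.0, step 2: 4.0, total 4.0
```

Pairing the noises by position reflects the correspondence described above: the i-th added noise is compared with the prediction of the i-th added noise.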
Fig. 4 is a schematic diagram of an electronic device 4 provided by an embodiment of the present disclosure. As shown in fig. 4, the electronic apparatus 4 of this embodiment includes: a processor 401, a memory 402 and a computer program 403 stored in the memory 402 and executable on the processor 401. The steps of the various method embodiments described above are implemented by processor 401 when executing computer program 403. Alternatively, the processor 401, when executing the computer program 403, performs the functions of the modules/units in the above-described apparatus embodiments.
The electronic device 4 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or the like. The electronic device 4 may include, but is not limited to, a processor 401 and a memory 402. It will be appreciated by those skilled in the art that fig. 4 is merely an example of the electronic device 4 and does not limit it; the electronic device 4 may include more or fewer components than shown, or different components.
The processor 401 may be a central processing unit (Central Processing Unit, CPU) or other general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
The memory 402 may be an internal storage unit of the electronic device 4, for example, a hard disk or a memory of the electronic device 4. The memory 402 may also be an external storage device of the electronic device 4, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) or the like, which are provided on the electronic device 4. Memory 402 may also include both internal storage units and external storage devices of electronic device 4. The memory 402 is used to store computer programs and other programs and data required by the electronic device.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the methods of the above-described embodiments may also be implemented by a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium and, when executed by a processor, may implement the steps of the method embodiments described above. The computer program may comprise computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunications signals.
The above embodiments are merely for illustrating the technical solution of the present disclosure, and are not limiting thereof; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the disclosure, and are intended to be included in the scope of the present disclosure.
Claims (10)
1. An image data enhancement method based on adding noise to an image, comprising:
acquiring an image data set to be data enhanced;
continuously adding noise for a target image in the image data set for multiple times by utilizing a diffusion process of an image diffusion model to obtain a first noise image corresponding to the target image;
predicting a plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted plurality of noises from the first noise image to obtain a restored first denoising image corresponding to the target image;
and generating the data-enhanced image data set by utilizing the target image and the first denoising image.
2. The method according to claim 1, wherein the diffusing process using the image diffusion model adds noise to the target image in the image dataset a plurality of times in succession, to obtain a first noise image corresponding to the target image, comprising:
the target image after each noise addition is calculated by:
and calculating to obtain the target image after the noise is added according to a noise calculation formula based on the target image after the noise is added last time and the basic noise obtained by the last time, wherein the basic noise obtained by the last time is obtained by sampling from Gaussian noise when the noise is added last time, and the Gaussian noise is noise meeting Gaussian distribution.
3. The method of claim 2, wherein the noise calculation formula is:
x_t is the target image after the t-th noise addition; x_{t-1} is the target image after the (t-1)-th noise addition; β_t is a constant with a value between 0 and 1; ε_{t-1} is the basic noise obtained by the (t-1)-th sampling; when t equals 1, x_0 is the target image; and when noise is added N times, x_N is the first noise image corresponding to the target image.
4. The method according to claim 1, characterized in that it comprises:
and controlling the number of times of processing the target image by using the image diffusion model, controlling the number of first denoising images corresponding to the target image, and controlling the scale of the image dataset after data enhancement.
5. The method according to claim 1, wherein the diffusion process using the image diffusion model adds noise to the target image in the image dataset a plurality of times in succession, and before obtaining the first noise image corresponding to the target image, the method further comprises:
acquiring a training data set;
continuously adding noise for the training images in the training data set for multiple times by utilizing the diffusion process of the image diffusion model to obtain second noise images corresponding to the training images;
predicting a plurality of noises added in the diffusion process by using the inverse diffusion process of the image diffusion model, and sequentially removing the predicted plurality of noises in the second noise image to obtain a restored second denoising image corresponding to the training image;
calculating a total loss between the plurality of noise added during the diffusion and the plurality of noise predicted during the back-diffusion;
and updating model parameters of the image diffusion model according to the total loss to complete training of the image diffusion model.
6. The method of claim 5, wherein the diffusion process using the image diffusion model adds noise to the training image in the training dataset multiple times in succession, and further comprising, before obtaining the second noise image corresponding to the training image:
the training image after each noise addition is calculated by:
based on the training image after the last noise addition and the basic noise obtained by the last sampling, the training image after the current noise addition is obtained through calculation of a noise calculation formula, wherein the basic noise obtained by the last sampling is obtained by sampling from Gaussian noise when the noise is added last time, the Gaussian noise is noise meeting Gaussian distribution, and the training image after the noise addition for many times is a second noise image corresponding to the training image.
7. The method of claim 5, wherein said calculating a total loss between the plurality of noise added during said diffusing and the plurality of noise predicted during said back-diffusing comprises:
calculating a loss between each noise added in the diffusion process and a noise predicted in the inverse diffusion process corresponding to the noise;
the sum of all the losses calculated is taken as the total loss.
8. An image data enhancement apparatus based on adding noise to an image, comprising:
an acquisition module configured to acquire an image dataset to be data enhanced;
the diffusion module is configured to continuously add noise to a target image in the image data set for multiple times by utilizing the diffusion process of the image diffusion model to obtain a first noise image corresponding to the target image;
a back diffusion module configured to predict a plurality of noises added in the diffusion process by using a back diffusion process of the image diffusion model, and sequentially remove the predicted plurality of noises in the first noise image, so as to obtain a restored first denoising image corresponding to the target image;
an enhancement module configured to generate the data-enhanced image dataset using the target image and the first denoised image.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310364619.XA CN116385328A (en) | 2023-04-07 | 2023-04-07 | Image data enhancement method and device based on noise addition to image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116385328A true CN116385328A (en) | 2023-07-04 |
Family
ID=86961180
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310364619.XA Pending CN116385328A (en) | 2023-04-07 | 2023-04-07 | Image data enhancement method and device based on noise addition to image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116385328A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117036197A (en) * | 2023-08-18 | 2023-11-10 | 杭州食方科技有限公司 | Image denoising model generation method, device, equipment and computer readable medium |
CN117473397A (en) * | 2023-12-25 | 2024-01-30 | 清华大学 | Diffusion model data enhancement-based emotion recognition method and system |
CN117473397B (en) * | 2023-12-25 | 2024-03-19 | 清华大学 | Diffusion model data enhancement-based emotion recognition method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||