CN113052776A - Unsupervised image defogging method based on multi-scale depth image prior - Google Patents

Unsupervised image defogging method based on multi-scale depth image prior

Info

Publication number
CN113052776A
Authority
CN
China
Prior art keywords
image
size
network
output
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110381898.1A
Other languages
Chinese (zh)
Inventor
姜竹青
汪千淞
门爱东
王海婴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202110381898.1A priority Critical patent/CN113052776A/en
Publication of CN113052776A publication Critical patent/CN113052776A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to an unsupervised image defogging method based on multi-scale depth image prior, belonging to the technical field of computer vision. First, the original image is downsampled to generate a small-scale image prior: three noise images of the same size as the downsampled foggy image are fed into three encoder-decoder neural networks, yielding three intermediate results that represent the atmospheric illumination map, the transmission map, and the defogged clear image; the three intermediate results are then combined through the atmospheric scattering model to obtain a reconstructed foggy image. Second, a noise image of the same size as the original image is fed into the same network, which is initialized with the prior acquired from the small-scale image. The method is reasonably designed: it fully accounts for the difficulty of prior extraction in unsupervised image defogging, reduces that difficulty with a multi-scale strategy, and improves the visual quality and stability of the reconstructed image.

Description

Unsupervised image defogging method based on multi-scale depth image prior
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to an unsupervised image defogging method based on multi-scale depth image prior.
Background
Haze is a typical atmospheric phenomenon caused by the accumulation of small water droplets, dust, smoke, or other particulate matter in the air. These particles absorb and scatter light, directly reducing visibility. Images captured in such weather lose contrast and visual detail, which creates difficulties for subsequent applications. Beyond the direct impact on visual quality, good image quality is also the foundation of high-level vision tasks such as object detection and semantic segmentation. Image defogging has therefore been widely studied in recent years as an image preprocessing and visual enhancement technique, and has achieved remarkable results.
Image defogging addresses the low visibility and low contrast of haze-laden images by removing the haze layer that obscures them and restoring the original colors and contrast of the image. Defogged images help computers better observe, analyze, and process pictures, and have substantial application value in many fields, such as video surveillance, remote sensing, and autonomous driving.
Conventional image defogging algorithms employ hand-crafted priors derived from intrinsic image properties such as texture, contrast, and color difference. A classic example is Dark Channel Prior (DCP) defogging, which observes that local patches of outdoor haze-free images contain pixels with very low intensity in at least one color channel (the dark channel), and accordingly uses this prior to estimate the transmission map and atmospheric illumination map with which a haze-free image is reconstructed. The Color Attenuation Prior (CAP) defogging algorithm assumes a positive correlation between scene depth and the difference between brightness and saturation, from which it estimates the transmission map. Although both methods achieve notable results, the quality of defogging depends largely on how well the adopted prior matches the actual properties of the image.
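For concreteness, the following is a minimal sketch of the dark-channel and atmospheric-light computation that DCP relies on; the patch size, the 0.1% pixel fraction, and the function names are common choices used for illustration, not values taken from this document:

    import numpy as np
    from scipy.ndimage import minimum_filter

    def dark_channel(image: np.ndarray, patch: int = 15) -> np.ndarray:
        # Dark channel of an H x W x 3 float image: per-pixel minimum over
        # color channels, then a minimum filter over a local patch.
        min_rgb = image.min(axis=2)                 # channel-wise minimum
        return minimum_filter(min_rgb, size=patch)  # local-patch minimum

    def estimate_atmospheric_light(image: np.ndarray, dark: np.ndarray) -> np.ndarray:
        # Estimate atmospheric light from the brightest 0.1% of dark-channel pixels.
        n = max(1, dark.size // 1000)
        idx = np.argpartition(dark.ravel(), -n)[-n:]   # haziest pixel indices
        return image.reshape(-1, 3)[idx].max(axis=0)   # brightest color among them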
Benefiting from the development of deep learning, more and more researchers apply neural networks to the image defogging problem. Unlike conventional methods based on hand-crafted priors, deep-learning-based methods generate defogged images in a data-driven manner. Examples include DehazeNet, which estimates the transmission map with a trainable convolutional neural network supervised by ground-truth transmission maps; MSCNN, which combines coarse- and fine-scale networks in a multi-scale convolutional neural network to estimate the transmission map; DehazeGAN, which estimates the transmission map and atmospheric light simultaneously with a generative adversarial network; and AOD-Net, which directly generates the defogged image with an end-to-end trainable network, without relying on the atmospheric scattering model. However, like neural networks in other tasks, these deep-learning-based defogging methods rely on large training datasets, which inevitably introduces dataset pairing and image-domain coverage problems.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an unsupervised image defogging method based on multi-scale depth image prior. Addressing the drawback of existing defogging algorithms that different hand-crafted priors or datasets must be designed for different scenes, the method uses multi-scale information to fully extract the intrinsic information of the image.
The technical problem to be solved by the invention is addressed by the following technical scheme:
An unsupervised image defogging method based on multi-scale depth image prior comprises the following steps:
step 1, in the small-scale prior extraction stage, the input foggy image is downsampled to 1/2 the size of the original image;
step 2, three noise images of the same size as the output of step 1 are fed into three encoder-decoder neural networks, respectively, to obtain three intermediate results representing an atmospheric illumination map, a transmission map, and a defogged image;
step 3, the three intermediate results output in step 2 are combined according to the atmospheric scattering model to obtain a synthesized foggy image; a loss function is constructed between this image and the output of step 1, and at the same time a depth-of-field prior constrains the defogged image, so as to optimize the three encoder-decoder networks of step 2;
step 4, in the original-scale image recovery stage, the atmospheric illumination map generated by the small-scale network serves as the prior of the original-scale network. The training process of the original-scale network is the same as in steps 2-3, except that the input image is replaced by the foggy image at its original size, the input noise image is resized to the original image size, and the depth-of-field prior constraint is removed. Finally, the reconstructed defogged image is obtained from the original-size network.
Further, the specific details of step 1 include the following:
(1) let the original foggy image have width W and height H; the small-size image input to the small-scale prior extraction stage then has width W/2 and height H/2;
(2) the image is downsampled with a bicubic interpolation algorithm.
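For illustration, a minimal PyTorch sketch of this downsampling step follows; the (1, 3, H, W) tensor layout and the function name are assumptions of the example:

    import torch
    import torch.nn.functional as F

    def downsample_half(hazy: torch.Tensor) -> torch.Tensor:
        # Bicubic 1/2 downsampling of a (1, 3, H, W) foggy image, as in step 1.
        _, _, h, w = hazy.shape
        return F.interpolate(hazy, size=(h // 2, w // 2),
                             mode="bicubic", align_corners=False)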
Further, the specific details of step 2 include the following:
(1) network J takes a noise image as input, with 8 channels, width W/2, and height H/2. The output of the network is the defogged sharp image J_1(x), with 3 channels, width W/2, and height H/2;
(2) network T takes a noise image as input, with 8 channels, width W/2, and height H/2. The output of the network is the transmission map T_1(x), with 1 channel, width W/2, and height H/2;
(3) network A has the same structure as network T and takes a noise image as input, with 8 channels, width W/2, and height H/2. The output of the network is the atmospheric illumination map A_1(x), with 3 channels, width W/2, and height H/2.
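A short sketch of these three inputs follows; uniform noise and the example image size are assumptions, since the patent specifies only the shapes:

    import torch

    H, W = 480, 640                 # example original size (illustrative assumption)
    h, w = H // 2, W // 2           # small-scale size from step 1

    z_J = torch.rand(1, 8, h, w)    # input to network J -> J_1(x): (1, 3, h, w)
    z_T = torch.rand(1, 8, h, w)    # input to network T -> T_1(x): (1, 1, h, w)
    z_A = torch.rand(1, 8, h, w)    # input to network A -> A_1(x): (1, 3, h, w)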
Further, the specific details of item (1) above (network J) include the following:
① network J is an encoder-decoder structure: the feature maps output by its convolutional layers first shrink gradually and then grow gradually, the final output feature map has the same size as the input, and the structure resembles the letter U;
② the feature-map reduction stage has 6 convolutional layers, divided into 3 groups; each group halves the feature-map size. Each group consists of two convolutional layers: the first has stride 2 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel. Let the feature map output by the N-th group be U_N, N ∈ [1, 3];
③ the feature-map enlargement stage has 6 convolutional layers, divided into 3 groups; each group doubles the feature-map size. Each group consists of two convolutional layers and a bilinear interpolation layer: the first convolutional layer has stride 1 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel, followed by a 2× bilinear interpolation layer. The input to each group consists of the output of the previous group (absent for the first group) and the skip-connection information, stacked along the channel dimension; the skip information for the N-th group is the feature map U_(4-N) passed through a 1 × 1 convolution with 16 output channels and stride 1;
④ the final output image of network J is obtained by passing the output of step ③ through a Sigmoid layer.
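The following is a minimal PyTorch sketch of an encoder-decoder with this shape, parameterized so that it also covers the deeper networks T and A described next; the internal feature width, the LeakyReLU activations, and the class name CodecDIP are assumptions of the example, not values taken from this document:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CodecDIP(nn.Module):
        # U-shaped encoder-decoder sketch of the structure described above.
        # depth=3, skip_ch=16, out_ch=3 matches network J; depth=5, skip_ch=4
        # matches networks T and A. The internal width (feat=64) and LeakyReLU
        # activations are assumptions; input height and width are assumed
        # divisible by 2**depth.
        def __init__(self, in_ch=8, out_ch=3, depth=3, skip_ch=16, feat=64):
            super().__init__()
            self.depth = depth
            self.enc = nn.ModuleList()
            self.skips = nn.ModuleList()
            self.dec = nn.ModuleList()
            ch = in_ch
            for _ in range(depth):
                # One reduction group: stride-2 3x3 conv, then stride-1 3x3 conv.
                self.enc.append(nn.Sequential(
                    nn.Conv2d(ch, feat, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
                    nn.Conv2d(feat, feat, 3, stride=1, padding=1), nn.LeakyReLU(0.2)))
                # 1x1 convolution producing the skip features from U_N.
                self.skips.append(nn.Conv2d(feat, skip_ch, 1))
                ch = feat
            for i in range(depth):
                # Decoder group input: previous group's output (absent for the
                # first group) concatenated with the skip features.
                dec_in = skip_ch if i == 0 else feat + skip_ch
                self.dec.append(nn.Sequential(
                    nn.Conv2d(dec_in, feat, 3, stride=1, padding=1), nn.LeakyReLU(0.2),
                    nn.Conv2d(feat, feat, 3, stride=1, padding=1), nn.LeakyReLU(0.2)))
            self.head = nn.Conv2d(feat, out_ch, 1)

        def forward(self, z):
            feats = []
            x = z
            for g in self.enc:                    # reduction stage: U_1 .. U_depth
                x = g(x)
                feats.append(x)
            x = None
            for i, g in enumerate(self.dec):      # enlargement stage
                skip = self.skips[self.depth - 1 - i](feats[self.depth - 1 - i])
                x = skip if x is None else torch.cat([x, skip], dim=1)
                x = g(x)
                x = F.interpolate(x, scale_factor=2, mode="bilinear",
                                  align_corners=False)
            return torch.sigmoid(self.head(x))    # final Sigmoid layer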
Further, the specific details of item (2) above (network T) include the following:
① network T is an encoder-decoder structure: the feature maps output by its convolutional layers first shrink gradually and then grow gradually, the final output feature map has the same size as the input, and the structure resembles the letter U;
② the feature-map reduction stage has 10 convolutional layers, divided into 5 groups; each group halves the feature-map size. Each group consists of two convolutional layers: the first has stride 2 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel. Let the feature map output by the N-th group be U_N, N ∈ [1, 5];
③ the feature-map enlargement stage has 10 convolutional layers, divided into 5 groups; each group doubles the feature-map size. Each group consists of two convolutional layers and a bilinear interpolation layer: the first convolutional layer has stride 1 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel, followed by a 2× bilinear interpolation layer. The input to each group consists of the output of the previous group (absent for the first group) and the skip-connection information, stacked along the channel dimension; the skip information for the N-th group is the feature map U_(6-N) passed through a 1 × 1 convolution with 4 output channels and stride 1;
④ the final output image of network T is obtained by passing the output of step ③ through a Sigmoid layer.
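Under the assumptions of the CodecDIP sketch above, the three networks of step 2 can then be instantiated as follows (network A shares network T's depth and skip width but outputs 3 channels):

    net_J = CodecDIP(in_ch=8, out_ch=3, depth=3, skip_ch=16)  # sharp image J_1(x)
    net_T = CodecDIP(in_ch=8, out_ch=1, depth=5, skip_ch=4)   # transmission T_1(x)
    net_A = CodecDIP(in_ch=8, out_ch=3, depth=5, skip_ch=4)   # illumination A_1(x)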
Further, the specific details of step 3 include the following:
(1) the foggy image I(x) is synthesized from the outputs of networks J, A, and T according to the atmospheric scattering model: I(x) = J(x)t(x) + (1 - t(x))A(x);
(2) a loss function between the output I(x) of step (1) and the downsampled original foggy image is optimized. The first 500 iterations use the mean square error loss, the last 200 iterations use the structural similarity loss, for 700 iterations in total. During optimization, a depth-of-field prior loss constrains J(x): the mean square error between J_v(x) and J_S(x) is minimized, where J_v(x) is the brightness (value) image of J(x) and J_S(x) is the saturation image of J(x). This constraint helps network A generate the correct atmospheric illumination map.
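To make this schedule concrete, here is a minimal optimization sketch: the Adam optimizer and its learning rate, the noise inputs z_J, z_T, z_A and networks net_J, net_T, net_A from the sketches above, the downsampled image hazy_small (the output of downsample_half), and the third-party pytorch_msssim package for the SSIM term are all assumptions of the example:

    import torch
    import torch.nn.functional as F
    from pytorch_msssim import ssim   # any differentiable SSIM implementation works

    def reconstruct(J, t, A):
        # Atmospheric scattering model: I(x) = J(x) t(x) + (1 - t(x)) A(x).
        return J * t + (1.0 - t) * A

    def depth_prior_loss(J):
        # Depth-of-field prior: MSE between the brightness (value) and
        # saturation images of J, computed from RGB as in the HSV model.
        v, _ = J.max(dim=1)
        mn, _ = J.min(dim=1)
        s = (v - mn) / (v + 1e-6)
        return F.mse_loss(v, s)

    params = (list(net_J.parameters()) + list(net_T.parameters())
              + list(net_A.parameters()))
    opt = torch.optim.Adam(params, lr=1e-3)   # learning rate is an assumption
    for it in range(700):
        J1, t1, A1 = net_J(z_J), net_T(z_T), net_A(z_A)
        I_hat = reconstruct(J1, t1, A1)
        if it < 500:                          # first 500 iterations: MSE
            loss = F.mse_loss(I_hat, hazy_small)
        else:                                 # last 200 iterations: SSIM
            loss = 1.0 - ssim(I_hat, hazy_small, data_range=1.0)
        loss = loss + depth_prior_loss(J1)
        opt.zero_grad()
        loss.backward()
        opt.step()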
Further, the specific details of step 4 include the following:
(1) the same network structure as in step 2 is used, the depth-of-field prior loss is removed, the input noise image has the same size as the original input foggy image, and the number of channels is 8;
(2) the first 200 iterations use the mean square error loss between the synthesized foggy image and the original foggy image, and the last 1800 iterations use the structural similarity loss. For the first 300 iterations, an additional mean square error loss is applied between A_1(x) upsampled 2× by bicubic interpolation and the atmospheric illumination map A_2(x) output by the original-scale network A; this transfers the atmospheric illumination prior extracted by the small-scale network to the original-scale network;
(3) the image J_2(x) output by the original-scale network J is the defogged image finally obtained by the method.
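A matching sketch of this original-scale stage follows, reusing reconstruct and ssim from above; the fresh network copies net_J2, net_T2, net_A2, the full-size noise inputs z_J_full, z_T_full, z_A_full, and the full-size image hazy_full are illustrative names, not identifiers from this document:

    A1_up = F.interpolate(A1.detach(), scale_factor=2, mode="bicubic",
                          align_corners=False)   # 2x-upsampled small-scale prior
    opt2 = torch.optim.Adam(list(net_J2.parameters()) + list(net_T2.parameters())
                            + list(net_A2.parameters()), lr=1e-3)
    for it in range(2000):
        J2, t2, A2 = net_J2(z_J_full), net_T2(z_T_full), net_A2(z_A_full)
        I_hat = reconstruct(J2, t2, A2)
        if it < 200:                              # first 200 iterations: MSE
            loss = F.mse_loss(I_hat, hazy_full)
        else:                                     # last 1800 iterations: SSIM
            loss = 1.0 - ssim(I_hat, hazy_full, data_range=1.0)
        if it < 300:                              # atmospheric-light prior transfer
            loss = loss + F.mse_loss(A2, A1_up)
        opt2.zero_grad()
        loss.backward()
        opt2.step()
    # After optimization, J2 = net_J2(z_J_full) is the final defogged image.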
The invention has the advantages and positive effects that:
1. The invention divides the whole defogging process into two stages. The first stage acquires a small-scale atmospheric illumination prior: performing prior extraction on the small-scale image effectively reduces the solution space of the neural network parameters, and constraining the output small-scale defogged image with the depth-of-field prior helps the small-scale prior extraction network obtain a more accurate atmospheric illumination prior. The second stage generates the original-size atmospheric illumination map from this prior, which effectively guarantees the recovery quality of the original-size defogged image.
2. The method is reasonably designed: the atmospheric scattering model serves as the theoretical basis of the overall framework, a deep learning algorithm guarantees the expressive power of the network, and the unsupervised training procedure avoids the dataset pairing and image-domain coverage problems, as well as the severe degradation of defogging performance across image domains.
Drawings
FIG. 1 is an overall flow diagram of multi-scale defogging according to the present invention;
FIG. 2 is a flow diagram of a small-scale prior extraction module of the present invention;
FIG. 3 is a flow diagram of a full-scale defogged image generation module according to the present invention;
FIG. 4 is a network configuration diagram of the defogged image generating portion of the present invention;
fig. 5 is a network configuration diagram of an atmospheric map and transmission map generating section of the present invention.
Detailed Description
The embodiments of the present invention will be described in detail with reference to the accompanying drawings.
An unsupervised image defogging method based on multi-scale depth image prior, as shown in FIGS. 1 to 5, comprises the following steps:
step S1, in the small-scale prior extraction stage, the input foggy image is downsampled to 1/2 the size of the original image;
step S2, three noise images of the same size as the output of step S1 are fed into three encoder-decoder neural networks, respectively, to obtain three intermediate results representing an atmospheric illumination map, a transmission map, and a defogged image;
step S3, the three intermediate results output in step S2 are combined according to the atmospheric scattering model to obtain a synthesized foggy image; a loss function is constructed between this image and the output of step S1, and at the same time a depth-of-field prior constrains the defogged image, so as to optimize the three encoder-decoder networks of step S2;
step S4, in the original-scale image recovery stage, the atmospheric illumination map generated by the small-scale network serves as the prior of the original-scale network. The training process of the original-scale network is the same as in steps S2-S3, except that the input image is replaced by the foggy image at its original size, the input noise image is resized to the original image size, and the depth-of-field prior constraint is removed. Finally, the reconstructed defogged image is obtained from the original-size network.
The specific implementation method of step S1 is as follows:
step S1.1, let the original foggy image have width W and height H; the small-size image input to the small-scale prior extraction stage then has width W/2 and height H/2;
step S1.2, the image is downsampled with a bicubic interpolation algorithm.
The specific implementation method of step S2 is as follows:
step S2.1, network J takes a noise image as input, with 8 channels, width W/2, and height H/2. The output of the network is the defogged sharp image J_1(x), with 3 channels, width W/2, and height H/2;
step S2.2, network T takes a noise image as input, with 8 channels, width W/2, and height H/2. The output of the network is the transmission map T_1(x), with 1 channel, width W/2, and height H/2;
step S2.3, network A has the same structure as network T and takes a noise image as input, with 8 channels, width W/2, and height H/2. The output of the network is the atmospheric illumination map A_1(x), with 3 channels, width W/2, and height H/2.
The specific implementation method of step S2.1 is as follows:
step S2.1.1, network J is an encoder-decoder structure: the feature maps output by its convolutional layers first shrink gradually and then grow gradually, the final output feature map has the same size as the input, and the structure resembles the letter U;
step S2.1.2, the feature-map reduction stage has 6 convolutional layers, divided into 3 groups; each group halves the feature-map size. Each group consists of two convolutional layers: the first has stride 2 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel. Let the feature map output by the N-th group be U_N, N ∈ [1, 3];
step S2.1.3, the feature-map enlargement stage has 6 convolutional layers, divided into 3 groups; each group doubles the feature-map size. Each group consists of two convolutional layers and a bilinear interpolation layer: the first convolutional layer has stride 1 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel, followed by a 2× bilinear interpolation layer. The input to each group consists of the output of the previous group (absent for the first group) and the skip-connection information, stacked along the channel dimension; the skip information for the N-th group is the feature map U_(4-N) passed through a 1 × 1 convolution with 16 output channels and stride 1;
step S2.1.4, the final output image of network J is obtained by passing the output of step S2.1.3 through a Sigmoid layer.
The specific implementation method of step S2.2 is as follows:
step S2.2.1, network T is an encoder-decoder structure: the feature maps output by its convolutional layers first shrink gradually and then grow gradually, the final output feature map has the same size as the input, and the structure resembles the letter U;
step S2.2.2, the feature-map reduction stage has 10 convolutional layers, divided into 5 groups; each group halves the feature-map size. Each group consists of two convolutional layers: the first has stride 2 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel. Let the feature map output by the N-th group be U_N, N ∈ [1, 5];
step S2.2.3, the feature-map enlargement stage has 10 convolutional layers, divided into 5 groups; each group doubles the feature-map size. Each group consists of two convolutional layers and a bilinear interpolation layer: the first convolutional layer has stride 1 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel, followed by a 2× bilinear interpolation layer. The input to each group consists of the output of the previous group (absent for the first group) and the skip-connection information, stacked along the channel dimension; the skip information for the N-th group is the feature map U_(6-N) passed through a 1 × 1 convolution with 4 output channels and stride 1;
step S2.2.4, the final output image of network T is obtained by passing the output of step S2.2.3 through a Sigmoid layer.
The specific implementation method of step S3 is as follows:
step S3.1, the foggy image I(x) is synthesized from the outputs of networks J, A, and T according to the atmospheric scattering model: I(x) = J(x)t(x) + (1 - t(x))A(x);
step S3.2, a loss function between the output I(x) of step S3.1 and the downsampled original foggy image is optimized. The first 500 iterations use the mean square error loss, the last 200 iterations use the structural similarity loss, for 700 iterations in total. During optimization, a depth-of-field prior loss constrains J(x): the mean square error between J_v(x) and J_S(x) is minimized, where J_v(x) is the brightness (value) image of J(x) and J_S(x) is the saturation image of J(x). This constraint helps network A generate the correct atmospheric illumination map.
The specific implementation method of step S4 is as follows:
step S4.1, the same network structure as in step S2 is used, the depth-of-field prior loss is removed, the input noise image has the same size as the original input foggy image, and the number of channels is 8;
step S4.2, the first 200 iterations use the mean square error loss between the synthesized foggy image and the original foggy image, and the last 1800 iterations use the structural similarity loss. For the first 300 iterations, an additional mean square error loss is applied between A_1(x) upsampled 2× by bicubic interpolation and the atmospheric illumination map A_2(x) output by the original-scale network A; this transfers the atmospheric illumination prior extracted by the small-scale network to the original-scale network;
step S4.3, the image J_2(x) output by the original-scale network J is the defogged image finally obtained by the method.
Through the above steps, the defogged clear image is obtained.
Finally, network performance is evaluated with PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index), as follows:
Test environment: Python 3.9; PyTorch framework; Ubuntu 16.04; NVIDIA RTX 2080 Ti GPU.
Test data: the Hybrid Subjective Testing Set (HSTS) from the REalistic Single Image DEhazing (RESIDE) benchmark, containing 10 pairs of foggy and fog-free images.
Test method: all image pairs in HSTS are used to evaluate the network quantitatively and qualitatively.
Test metrics: PSNR and SSIM. The same metrics are computed for currently popular algorithms, and the comparison shows that the method achieves better results in real-image defogging.
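A minimal sketch of how these two metrics can be computed for a dehazed result against its ground truth (the pytorch_msssim package and the omitted data loading are assumptions of the example):

    import torch
    from pytorch_msssim import ssim

    def psnr(x: torch.Tensor, y: torch.Tensor, data_range: float = 1.0) -> float:
        # Peak signal-to-noise ratio between images with values in [0, data_range].
        mse = torch.mean((x - y) ** 2)
        return (10.0 * torch.log10(data_range ** 2 / mse)).item()

    # For each foggy/fog-free pair in HSTS (loading omitted):
    # score_psnr = psnr(dehazed, ground_truth)
    # score_ssim = ssim(dehazed, ground_truth, data_range=1.0).item()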
Content not described in detail in this specification belongs to the prior art known to those skilled in the art.
It should be emphasized that the embodiments described herein are illustrative rather than restrictive, and thus the present invention is not limited to the embodiments described in the detailed description, but also includes other embodiments that can be derived from the technical solutions of the present invention by those skilled in the art.

Claims (7)

1. An unsupervised image defogging method based on multi-scale depth image prior, characterized by comprising the following steps:
step 1, in the small-scale prior extraction stage, the input foggy image is downsampled to 1/2 the size of the original image;
step 2, three noise images of the same size as the output of step 1 are fed into three encoder-decoder neural networks, respectively, to obtain three intermediate results representing an atmospheric illumination map, a transmission map, and a defogged image;
step 3, the three intermediate results output in step 2 are combined according to the atmospheric scattering model to obtain a synthesized foggy image; a loss function is constructed between this image and the output of step 1, and at the same time a depth-of-field prior constrains the defogged image, so as to optimize the three encoder-decoder networks of step 2;
step 4, in the original-scale image recovery stage, the atmospheric illumination map generated by the small-scale network serves as the prior of the original-scale network. The training process of the original-scale network is the same as in steps 2-3, except that the input image is replaced by the foggy image at its original size, the input noise image is resized to the original image size, and the depth-of-field prior constraint is removed. Finally, the reconstructed defogged image is obtained from the original-size network.
2. The unsupervised image defogging method based on multi-scale depth image prior according to claim 1, wherein the specific details of step 1 comprise the following:
(1) let the original foggy image have width W and height H; the small-size image input to the small-scale prior extraction stage then has width W/2 and height H/2;
(2) the image is downsampled with a bicubic interpolation algorithm.
3. The unsupervised image defogging method based on multi-scale depth image prior according to claim 1, wherein the specific details of step 2 comprise the following:
(1) network J takes a noise image as input, with 8 channels, width W/2, and height H/2. The output of the network is the defogged sharp image J_1(x), with 3 channels, width W/2, and height H/2;
(2) network T takes a noise image as input, with 8 channels, width W/2, and height H/2. The output of the network is the transmission map T_1(x), with 1 channel, width W/2, and height H/2;
(3) network A has the same structure as network T and takes a noise image as input, with 8 channels, width W/2, and height H/2. The output of the network is the atmospheric illumination map A_1(x), with 3 channels, width W/2, and height H/2.
4. The unsupervised image defogging method based on multi-scale depth image prior according to claim 3, wherein the specific details of item (1) comprise the following:
① network J is an encoder-decoder structure: the feature maps output by its convolutional layers first shrink gradually and then grow gradually, the final output feature map has the same size as the input, and the structure resembles the letter U;
② the feature-map reduction stage has 6 convolutional layers, divided into 3 groups; each group halves the feature-map size. Each group consists of two convolutional layers: the first has stride 2 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel. Let the feature map output by the N-th group be U_N, N ∈ [1, 3];
③ the feature-map enlargement stage has 6 convolutional layers, divided into 3 groups; each group doubles the feature-map size. Each group consists of two convolutional layers and a bilinear interpolation layer: the first convolutional layer has stride 1 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel, followed by a 2× bilinear interpolation layer. The input to each group consists of the output of the previous group (absent for the first group) and the skip-connection information, stacked along the channel dimension; the skip information for the N-th group is the feature map U_(4-N) passed through a 1 × 1 convolution with 16 output channels and stride 1;
④ the final output image of network J is obtained by passing the output of step ③ through a Sigmoid layer.
5. The unsupervised image defogging method based on multi-scale depth image prior according to claim 3, wherein the specific details of item (2) comprise the following:
① network T is an encoder-decoder structure: the feature maps output by its convolutional layers first shrink gradually and then grow gradually, the final output feature map has the same size as the input, and the structure resembles the letter U;
② the feature-map reduction stage has 10 convolutional layers, divided into 5 groups; each group halves the feature-map size. Each group consists of two convolutional layers: the first has stride 2 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel. Let the feature map output by the N-th group be U_N, N ∈ [1, 5];
③ the feature-map enlargement stage has 10 convolutional layers, divided into 5 groups; each group doubles the feature-map size. Each group consists of two convolutional layers and a bilinear interpolation layer: the first convolutional layer has stride 1 and a 3 × 3 kernel, the second has stride 1 and a 3 × 3 kernel, followed by a 2× bilinear interpolation layer. The input to each group consists of the output of the previous group (absent for the first group) and the skip-connection information, stacked along the channel dimension; the skip information for the N-th group is the feature map U_(6-N) passed through a 1 × 1 convolution with 4 output channels and stride 1;
④ the final output image of network T is obtained by passing the output of step ③ through a Sigmoid layer.
6. The unsupervised image defogging method based on multi-scale depth image prior according to claim 1, wherein the specific details of step 3 comprise the following:
(1) the foggy image I(x) is synthesized from the outputs of networks J, A, and T according to the atmospheric scattering model: I(x) = J(x)t(x) + (1 - t(x))A(x);
(2) a loss function between the output I(x) of step (1) and the downsampled original foggy image is optimized. The first 500 iterations use the mean square error loss, the last 200 iterations use the structural similarity loss, for 700 iterations in total. During optimization, a depth-of-field prior loss constrains J(x): the mean square error between J_v(x) and J_S(x) is minimized, where J_v(x) is the brightness (value) image of J(x) and J_S(x) is the saturation image of J(x). This constraint helps network A generate the correct atmospheric illumination map.
7. The unsupervised image defogging method based on multi-scale depth image prior according to claim 1, wherein the specific details of step 4 comprise the following:
(1) the same network structure as in step 2 is used, the depth-of-field prior loss is removed, the input noise image has the same size as the original input foggy image, and the number of channels is 8;
(2) the first 200 iterations use the mean square error loss between the synthesized foggy image and the original foggy image, and the last 1800 iterations use the structural similarity loss. For the first 300 iterations, an additional mean square error loss is applied between A_1(x) upsampled 2× by bicubic interpolation and the atmospheric illumination map A_2(x) output by the original-scale network A; this transfers the atmospheric illumination prior extracted by the small-scale network to the original-scale network;
(3) the image J_2(x) output by the original-scale network J is the defogged image finally obtained by the method.
CN202110381898.1A 2021-04-09 2021-04-09 Unsupervised image defogging method based on multi-scale depth image prior Pending CN113052776A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110381898.1A CN113052776A (en) 2021-04-09 2021-04-09 Unsupervised image defogging method based on multi-scale depth image prior

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110381898.1A CN113052776A (en) 2021-04-09 2021-04-09 Unsupervised image defogging method based on multi-scale depth image prior

Publications (1)

Publication Number Publication Date
CN113052776A true CN113052776A (en) 2021-06-29

Family

ID=76519261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110381898.1A Pending CN113052776A (en) 2021-04-09 2021-04-09 Unsupervised image defogging method based on multi-scale depth image prior

Country Status (1)

Country Link
CN (1) CN113052776A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272122A (en) * 2022-07-31 2022-11-01 中国人民解放军火箭军工程大学 Priori-guided single-stage distillation image defogging method
CN115272122B (en) * 2022-07-31 2023-03-21 中国人民解放军火箭军工程大学 Priori-guided single-stage distillation image defogging method
CN117789041A (en) * 2024-02-28 2024-03-29 浙江华是科技股份有限公司 Ship defogging method and system based on atmospheric scattering priori diffusion model
CN117789041B (en) * 2024-02-28 2024-05-10 浙江华是科技股份有限公司 Ship defogging method and system based on atmospheric scattering priori diffusion model

Similar Documents

Publication Publication Date Title
CN111915530B (en) End-to-end-based haze concentration self-adaptive neural network image defogging method
CN110517203B (en) Defogging method based on reference image reconstruction
Huang et al. Deep hyperspectral image fusion network with iterative spatio-spectral regularization
CN110866879B (en) Image rain removing method based on multi-density rain print perception
CN113052210A (en) Fast low-illumination target detection method based on convolutional neural network
CN112241939B (en) Multi-scale and non-local-based light rain removal method
CN113052776A (en) Unsupervised image defogging method based on multi-scale depth image prior
CN112365414A (en) Image defogging method based on double-path residual convolution neural network
CN112508960A (en) Low-precision image semantic segmentation method based on improved attention mechanism
CN112070688A (en) Single image defogging method for generating countermeasure network based on context guidance
CN111553856B (en) Image defogging method based on depth estimation assistance
CN114565539B (en) Image defogging method based on online knowledge distillation
CN112419163B (en) Single image weak supervision defogging method based on priori knowledge and deep learning
CN115526779A (en) Infrared image super-resolution reconstruction method based on dynamic attention mechanism
CN111861939A (en) Single image defogging method based on unsupervised learning
CN115063434A (en) Low-low-light image instance segmentation method and system based on feature denoising
CN113256538B (en) Unsupervised rain removal method based on deep learning
CN112785517B (en) Image defogging method and device based on high-resolution representation
CN113628143A (en) Weighted fusion image defogging method and device based on multi-scale convolution
CN116128768B (en) Unsupervised image low-illumination enhancement method with denoising module
CN117036182A (en) Defogging method and system for single image
CN110675320A (en) Method for sharpening target image under spatial parameter change and complex scene
CN113936022A (en) Image defogging method based on multi-modal characteristics and polarization attention
CN115705493A (en) Image defogging modeling method based on multi-feature attention neural network
CN112435200A (en) Infrared image data enhancement method applied to target detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination