CN112215766B - Image defogging method combining image restoration and image enhancement and convolution network thereof

Info

Publication number: CN112215766B
Authority: CN (China)
Prior art keywords: network, defogging, image, foggy, diagram
Legal status: Active (granted)
Application number: CN202010988868.2A
Other languages: Chinese (zh)
Other versions: CN112215766A
Inventors: 刘春晓, 章理登, 李彪
Current and original assignee: Zhejiang Gongshang University
Filing date and priority date: 2020-09-18
Publication of CN112215766A: 2021-01-12
Publication of CN112215766B (grant): 2024-03-01

Classifications

    • G06T 5/73 - Image enhancement or restoration: deblurring; sharpening
    • G06N 3/045 - Neural network architectures: combinations of networks
    • G06N 3/084 - Learning methods: backpropagation, e.g. using gradient descent
    • G06T 5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 5/77 - Retouching; inpainting; scratch removal
    • G06T 2207/20081 - Indexing scheme, special algorithmic details: training; learning
    • G06T 2207/20084 - Indexing scheme, special algorithmic details: artificial neural networks [ANN]
    • G06T 2207/20221 - Indexing scheme, special algorithmic details: image fusion; image merging
    • Y02A 90/10 - Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation


Abstract

The invention relates to the technical field of image enhancement and computer vision, in particular to an image defogging method that fuses image restoration and image enhancement, and to its convolutional network. The method performs fused defogging with three different convolutional neural networks and realizes defogging by learning the mapping from an original foggy image to a fog-free image. A gated fusion network computes an adaptive weight map for each of the two defogging maps produced by the restoration network and the enhancement network; these weight maps fuse the strong parts of both networks and improve defogging performance. The method remedies design shortcomings of conventional physical-model-based and end-to-end convolutional defogging networks, improves the defogging map, widens the applicable range of the defogging algorithm, and strengthens its robustness.

Description

Image defogging method combining image restoration and image enhancement and convolution network thereof
Technical Field
The invention relates to the technical field of image enhancement and computer vision, in particular to an image defogging method that fuses image restoration and image enhancement, and to its convolutional network.
Background
Small solid particles and small liquid droplets suspended in the air cause a haze effect. In hazy weather, the light reflected from objects is attenuated and scattered, which reduces the visibility and contrast of the scene. Images captured by a sensor in such weather contain haze and are called foggy images. A defogging algorithm is an image enhancement algorithm that aims to improve the quality, sharpness and contrast of an image without corrupting the image content, or serves as a preprocessing step for other high-level vision tasks. Its input is a single foggy image and its target output is a clean, clear defogged image. Current methods fall into two classes: traditional methods and deep-learning-based methods.
Traditional methods dominated early defogging work, but with the progress of deep learning in other computer vision fields, deep-learning-based defogging has become widespread. Deep-learning methods divide into restoration methods based on a physical model and end-to-end enhancement methods. Restoration methods based on a physical model require the input foggy image to conform to the assumed physical model, otherwise the result degrades. End-to-end enhancement methods lack physical guidance and may perform poorly in complex scenes.
The AtJ network (Guo T, Li X, Cherukuri V, et al. Dense Scene Information Estimation Network for Dehazing. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019) employs a network comprising one encoder and three decoders, in which an atmospheric scattering model is used for defogging.
Disclosure of Invention
In order to solve the above problems, the invention provides an image defogging convolutional network that fuses image restoration and image enhancement; it makes full use of the atmospheric scattering model and combines the advantages of restoration and enhancement methods to achieve the best defogging result.
The technical solution adopted by the invention to solve the technical problem is as follows:
the image defogging method combining image restoration and image enhancement utilizes three different convolutional neural networks to perform fusion defogging, and the method simulates the mapping from an original foggy image to a foggy image to realize defogging, and comprises the following implementation steps:
step one: three different convolutional neural networks are designed, wherein the three convolutional neural networks are respectively responsible for three tasks of foggy image restoration, foggy image enhancement and defogging image fusion, the network responsible for the foggy image restoration task is a restoration network, the network responsible for the foggy image enhancement is an enhancement network, and the network responsible for defogging image fusion is a gating fusion network;
step two: extracting a haze-free image and a depth image from the NYU Depth Dataset V data set to synthesize a haze image;
step three: setting training parameters;
step four: training a recovery network;
step five: training the enhancement network in combination with the defogging diagram of the restoration network;
step six: training a gating fusion network;
the fourth, fifth and sixth steps are realized based on a deep learning framework;
further, the restoration network uses one encoder and two decoders; the enhancement network uses one encoder and one decoder; and the gated fusion network uses one encoder and one decoder.
The deep learning framework is Pytorch.
Step four stops when its output converges; step five stops when its output converges; step six stops when its output converges.
The restoration network outputs the restored defogging map; the enhancement network outputs the enhanced defogging map; the gated fusion network takes the outputs of the restoration network and the enhancement network as its input images and outputs the final defogging map.
During the synthesis of the foggy images in step two, the scattering coefficient and the atmospheric light value take random values.
In step four, the images output in step two are used to update the network parameters with the back-propagation algorithm provided by the deep learning framework library.
Step four stops when the output of the restoration network converges; step five stops when the output of the enhancement network converges; and step six stops when the output of the gated fusion network converges.
The method uses the atmospheric scattering model as its physical model:

I = J × t + A × (1 − t)

where I is the foggy image; J is the haze-free image, the same size as I; A is the atmospheric light map, the same size as I; and t is the transmission map, which describes how strongly the haze affects the foggy image, depends on the depth d and the scattering coefficient β, and is the same size as I:

t = e^(−β×d)

where e is the mathematical constant; β is the scattering coefficient, describing the haze concentration per unit of space; and d is the depth map, the same size as I with a single channel, whose value at each pixel is the scene depth at the corresponding position of the foggy (or haze-free) image.
The image defogging method comprises the following specific steps:
Step one: design the image defogging convolutional network fusing image restoration and image enhancement, together with the corresponding loss functions and training strategy.
The input of the restoration network is the foggy image I, and its outputs are the transmission map t̂ and the atmospheric light map Â; the foggy image I, the transmission map t̂ and the atmospheric light map Â each have size 608×448×3. The restoration network comprises one encoder and two decoders: the encoder extracts the relevant feature information from the foggy image, and the two decoders recover the transmission map t̂ and the atmospheric light map Â from it. The defogging map J_1 is then obtained by inverting the atmospheric scattering model:

J_1 = (I − Â × (1 − t̂)) / t̂

The defogging map J_1 has size 608×448×3 and is the output of the restoration network, which is therefore a defogging network based on the atmospheric scattering model.
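As an illustration, a minimal PyTorch sketch of this inversion (function and variable names are ours, not from the filing):

```python
import torch

def restore_defog(I: torch.Tensor, t_hat: torch.Tensor, A_hat: torch.Tensor,
                  eps: float = 1e-3) -> torch.Tensor:
    # Invert I = J*t + A*(1-t) for J, i.e. J_1 = (I - A_hat*(1 - t_hat)) / t_hat.
    t_hat = t_hat.clamp(min=eps)               # guard against division by zero in dense haze
    J1 = (I - A_hat * (1.0 - t_hat)) / t_hat
    return J1.clamp(0.0, 1.0)                  # keep the defogging map in the valid range
```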
The input of the enhancement network is the foggy image I, and its output is the defogging map J_2, of size 608×448×3. The enhancement network comprises one encoder and one decoder: the encoder extracts the relevant feature information from the foggy image, and the decoder recovers the defogging map J_2 from it.
The inputs of the gated fusion network are the defogging maps J_1 and J_2, and its output is the defogging map J_final, the final defogged image. The gated fusion network comprises one encoder and one decoder: the encoder extracts the relevant information from the defogging maps J_1 and J_2, and the decoder recovers the corresponding weight maps W_1 and W_2, each of size 608×448×3. They are fused according to

J_final = W_1 ⊙ J_1 + W_2 ⊙ J_2

where the symbol ⊙ denotes pixel-wise multiplication.
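A minimal sketch of this fusion step, assuming all four tensors share the same shape; the pixel-wise product ⊙ is plain element-wise multiplication:

```python
import torch

def gated_fuse(J1: torch.Tensor, J2: torch.Tensor,
               W1: torch.Tensor, W2: torch.Tensor) -> torch.Tensor:
    # J_final = W_1 (*) J_1 + W_2 (*) J_2; '*' is element-wise (pixel-wise) multiplication
    return W1 * J1 + W2 * J2
```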
The loss function L_re used in the training of the restoration network is built from L_mse terms between the network outputs and their targets, where A_gt is the target atmospheric light map, t_gt is the target transmission map, and J_gt is the target defogged image. L_mse(·,·) computes the square mean of the pixel-by-pixel difference of its two inputs, called the mse loss; for example,

L_mse(J_1, J_gt) = (1/n) × Σ |J_1 − J_gt|²

where n represents the number of pixels.
The loss function L_en used in the training of the enhancement network weights the mse loss map pixel by pixel, with ⊙ the pixel-wise product. M_mse(J_2, J_gt) is the mse loss map:

M_mse(J_2, J_gt) = |J_2 − J_gt|²

of size 608×448×3. W_mse(J_1, J_gt) is the pixel-wise weight of the mse loss map, computed from the defogging map J_1 of the restoration network using the mathematical constant e; W_mse(J_1, J_gt) has size 608×448×3.
In the training of the gated fusion network, the following loss function is used:

L_fu = L_mse(J_final, J_gt)

where J_final is the output of the gated fusion network.
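The exact expressions for L_re and W_mse appear as formula images in the original filing and are not recoverable from this text; the sketch below therefore assumes that L_re sums the mse losses of the three restoration targets and that W_mse exponentiates the restoration error (so pixels where the physical-model branch fails weigh more). Both forms are consistent with the surrounding description but are our reading, not the filed formulas:

```python
import torch

def l_mse(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # square mean of the pixel-by-pixel difference (the mse loss L_mse)
    return ((x - y) ** 2).mean()

def loss_restoration(t_hat, t_gt, A_hat, A_gt, J1, J_gt):
    # assumed form of L_re: one mse term per restoration target
    return l_mse(t_hat, t_gt) + l_mse(A_hat, A_gt) + l_mse(J1, J_gt)

def loss_enhancement(J1, J2, J_gt):
    M = (J2 - J_gt) ** 2              # mse loss map M_mse(J_2, J_gt)
    W = torch.exp((J1 - J_gt) ** 2)   # assumed weight W_mse(J_1, J_gt): larger where J_1 erred
    return (W * M).mean()             # L_en: pixel-wise weighted mse, averaged over pixels

def loss_fusion(J_final, J_gt):
    # L_fu = L_mse(J_final, J_gt), as filed
    return l_mse(J_final, J_gt)
```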
step two: extract haze-free images and depth maps from the NYU Depth Dataset V2 to synthesize foggy images.
To generate the data, the method extracts a haze-free image J_gt and a depth map d from the NYU Depth Dataset V2 and synthesizes a foggy image according to the atmospheric scattering model:

t_gt = e^(−β_gt×d)
I = J_gt × t_gt + A_gt × (1 − t_gt)

where the scattering coefficient β_gt ∈ [0.8, 1.2] and the atmospheric light map A_gt ∈ [0.7, 1.0] take random values, all pixel values of A_gt are equal, t_gt is the target transmission map, and I is the foggy image waiting to be defogged;
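A minimal synthesis sketch under stated assumptions (the depth map d is taken as already normalized to roughly [0, 1]; function and variable names are ours):

```python
import numpy as np

def synthesize_foggy(J_gt: np.ndarray, d: np.ndarray,
                     rng: np.random.Generator) -> tuple[np.ndarray, np.ndarray]:
    # J_gt: haze-free image in [0, 1], shape (H, W, 3); d: depth map, shape (H, W, 1)
    beta = rng.uniform(0.8, 1.2)                     # scattering coefficient beta_gt
    A = np.full_like(J_gt, rng.uniform(0.7, 1.0))    # atmospheric light map, all pixels equal
    t = np.exp(-beta * d)                            # t_gt = e^(-beta_gt * d), broadcast to 3 channels
    I = J_gt * t + A * (1.0 - t)                     # atmospheric scattering model
    return I, t

# usage: I, t_gt = synthesize_foggy(J_gt, d, np.random.default_rng(0))
```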
step three: set the training parameters. The network structure is implemented and trained on the deep learning framework Pytorch; the optimizer is Adam with an initial learning rate of 4×10⁻⁴, an exponential decay rate of 0.9 for the first-moment estimate and 0.999 for the second-moment estimate; training runs for 200 rounds in total, and the learning rate is multiplied by 0.7 every 20 rounds;
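These settings map directly onto PyTorch's Adam and StepLR; a minimal setup sketch (make_optimizer is our own helper name, and model stands for any of the three sub-networks):

```python
import torch
from torch import nn

def make_optimizer(model: nn.Module):
    # Adam: initial lr 4e-4, first/second moment decay rates 0.9 / 0.999
    optimizer = torch.optim.Adam(model.parameters(), lr=4e-4, betas=(0.9, 0.999))
    # multiply the learning rate by 0.7 every 20 of the 200 training rounds
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.7)
    return optimizer, scheduler
```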
step four: train the restoration network; its input is the foggy image I, its outputs are the transmission map t̂ and the atmospheric light map Â, and the defogging map J_1 is then obtained via the atmospheric scattering model;
step five: train the enhancement network; its input is the foggy image I and its output is the defogging map J_2, trained in combination with the defogging map J_1 of the restoration network;
step six: train the gated fusion network; its inputs are the defogging maps J_1 and J_2, and its output is the defogging map J_final.
In steps four, five and six, the images output by the networks are used to update the network parameters with the back-propagation algorithm provided by the deep learning framework Pytorch.
The training data are the data from which the networks learn; they comprise the original foggy images (the images fed into the network) and the target images (the corresponding haze-free images). When an input image passes through the network, an output image (the output defogging map) is produced, and the network parameters are updated from the difference between the output image and the target image (the loss function prescribes how this difference is computed); in this way the network learns the mapping, which is the training process.
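A sketch of one training iteration of this process (names hypothetical; loss_fn stands for L_re, L_en or L_fu depending on which step is being trained):

```python
import torch

def train_step(network, I, target, loss_fn, optimizer):
    # forward pass: the network maps the foggy input to an output defogging map
    output = network(I)
    # the loss function prescribes how the output/target difference is computed
    loss = loss_fn(output, target)
    optimizer.zero_grad()
    loss.backward()        # back-propagate the difference
    optimizer.step()       # update the network model parameters
    return loss.item()
```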
In addition, the invention provides an image defogging convolutional network fusing image restoration and image enhancement, comprising a restoration network responsible for the foggy image restoration task, an enhancement network responsible for foggy image enhancement, and a gated fusion network responsible for defogged image fusion.
Defogging is performed with convolutional neural networks and realized by learning the mapping from a foggy image to a fog-free image; the implementation steps are the design of three encoder-decoder convolutional neural networks, training data synthesis, image preprocessing, training parameter setting and model training. Investigation shows that both the physical-model-based and the end-to-end defogging methods have problems, but fusing their outputs can resolve them. Incorporating the defogging map of the restoration network into the loss of the enhancement network improves the enhancement network's ability to defog images that do not fit the physical model. The gated fusion network computes an adaptive weight map for each of the two defogging maps of the restoration network and the enhancement network; these weight maps fuse the strong parts of both networks and improve defogging performance. The method remedies design shortcomings of conventional physical-model-based and end-to-end convolutional defogging networks, improves the defogging map, widens the applicable range of the defogging algorithm and strengthens its robustness.
The invention has the following advantages: it fuses a deep-learning restoration defogging method with an embedded physical model and an end-to-end deep-learning enhancement defogging method, and feeds the output results of both into a gated fusion network to obtain the final defogging map. The restoration network, combined with the atmospheric scattering model, effectively reduces the complexity of the defogging task. The enhancement network improves its ability to defog images that do not fit the physical model by incorporating the defogging map of the restoration network. The gated fusion network flexibly fuses the defogging maps of the two different defogging strategies and learns the optimal proportion for combining them, so that the advantages of both are combined and the shortcomings of both are avoided.
Drawings
Fig. 1 is a network structure of a restoration network.
Fig. 2 is a network architecture of an enhanced network and a gated converged network.
Detailed Description
The invention will be further described below with reference to the drawings and the detailed description, so that the technical means, creative features, objectives and effects of the invention are easy to understand.
This is an image defogging convolutional network fusing image restoration and image enhancement; it uses three different convolutional neural networks and realizes defogging by learning the mapping from an original foggy image to a fog-free image.
The method uses the atmospheric scattering model as its physical model:

I = J × t + A × (1 − t)

where I is the foggy image; J is the haze-free image, the same size as I; A is the atmospheric light map, the same size as I; and t is the transmission map, which describes how strongly the haze affects the foggy image, depends on the depth d and the scattering coefficient β, and is the same size as I:

t = e^(−β×d)

where e is the mathematical constant; β is the scattering coefficient, describing the haze concentration per unit of space; and d is the depth map, the same size as I with a single channel, whose value at each pixel is the scene depth at the corresponding position of the foggy (or haze-free) image.
Step 1: design the image defogging convolutional network fusing image restoration and image enhancement, together with the corresponding loss functions and training strategy.
The input of the restoration network is the foggy image I, and its outputs are the transmission map t̂ and the atmospheric light map Â; the foggy image I, the transmission map t̂ and the atmospheric light map Â each have size 608×448×3. The restoration network comprises one encoder and two decoders: the encoder extracts the relevant feature information from the foggy image, and the two decoders recover the transmission map t̂ and the atmospheric light map Â from it. The defogging map J_1 is then obtained by inverting the atmospheric scattering model:

J_1 = (I − Â × (1 − t̂)) / t̂

The defogging map J_1 has size 608×448×3 and is the output of the restoration network, which is therefore a defogging network based on the atmospheric scattering model. The restoration network structure comes from the AtJ network, as shown in Fig. 1.
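The filing fixes only the topology here (one shared encoder and two decoders, borrowed from the AtJ network of Fig. 1), not the concrete layers; the toy module below illustrates that topology with made-up layer widths and embeds the scattering-model inversion in the forward pass:

```python
import torch
from torch import nn

class RestorationNet(nn.Module):
    """One shared encoder; two decoders for the transmission and atmospheric light maps."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.dec_t = nn.Sequential(nn.Conv2d(ch, 3, 3, padding=1), nn.Sigmoid())  # t-hat
        self.dec_A = nn.Sequential(nn.Conv2d(ch, 3, 3, padding=1), nn.Sigmoid())  # A-hat

    def forward(self, I: torch.Tensor):
        f = self.encoder(I)                        # shared haze features
        t_hat = self.dec_t(f).clamp(min=1e-3)      # guard the division below
        A_hat = self.dec_A(f)
        J1 = (I - A_hat * (1.0 - t_hat)) / t_hat   # embedded atmospheric scattering model
        return J1, t_hat, A_hat
```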
The input of the enhancement network is the foggy image I, and its output is the defogging map J_2, of size 608×448×3. The enhancement network comprises one encoder and one decoder: the encoder extracts the relevant feature information from the foggy image, and the decoder recovers the defogging map J_2 from it. The enhancement network structure comes from AtJ, as shown in Fig. 2.
The inputs of the gated fusion network are the defogging maps J_1 and J_2, and its output is the defogging map J_final, the final defogged image. The gated fusion network comprises one encoder and one decoder: the encoder extracts the relevant information from the defogging maps J_1 and J_2, and the decoder recovers the corresponding weight maps W_1 and W_2, each of size 608×448×3. They are fused according to

J_final = W_1 ⊙ J_1 + W_2 ⊙ J_2

where the symbol ⊙ denotes pixel-wise multiplication. The gated fusion network structure comes from the AtJ network, as shown in Fig. 2, and is the same as that of the enhancement network.
The loss function L_re used in the training of the restoration network is built from L_mse terms between the network outputs and their targets, where A_gt is the target atmospheric light map, t_gt is the target transmission map, and J_gt is the target defogged image. L_mse(·,·) computes the square mean of the pixel-by-pixel difference of its two inputs, called the mse loss; for example,

L_mse(J_1, J_gt) = (1/n) × Σ |J_1 − J_gt|²

where n represents the number of pixels.
The loss function L_en used in the training of the enhancement network weights the mse loss map pixel by pixel, with ⊙ the pixel-wise product. M_mse(J_2, J_gt) is the mse loss map:

M_mse(J_2, J_gt) = |J_2 − J_gt|²

of size 608×448×3. W_mse(J_1, J_gt) is the pixel-wise weight of the mse loss map, computed from the defogging map J_1 of the restoration network using the mathematical constant e; W_mse(J_1, J_gt) has size 608×448×3.
In the training of the gated fusion network, the method uses the following loss function:

L_fu = L_mse(J_final, J_gt)

where J_final is the output of the gated fusion network.
Step two: extract haze-free images and depth maps from the NYU Depth Dataset V2 and synthesize foggy images.
To generate the data, the method extracts a haze-free image J_gt and a depth map d from the NYU Depth Dataset V2 and synthesizes a foggy image according to the atmospheric scattering model:

t_gt = e^(−β_gt×d)
I = J_gt × t_gt + A_gt × (1 − t_gt)

where the scattering coefficient β_gt ∈ [0.8, 1.2] and the atmospheric light map A_gt ∈ [0.7, 1.0] take random values, all pixel values of A_gt are equal, t_gt is the target transmission map, and I is the foggy image waiting to be defogged.
Step three: set the training parameters. The network structure is implemented and trained on the deep learning framework Pytorch; the optimizer is Adam with an initial learning rate of 4×10⁻⁴, an exponential decay rate of 0.9 for the first-moment estimate and 0.999 for the second-moment estimate; training runs for 200 rounds in total, and the learning rate is multiplied by 0.7 every 20 rounds.
Step four: train the restoration network; its input is the foggy image I, its outputs are the transmission map t̂ and the atmospheric light map Â, and the defogging map J_1 is then obtained via the atmospheric scattering model.
Step five: train the enhancement network; its input is the foggy image I and its output is the defogging map J_2, trained in combination with the defogging map J_1 of the restoration network.
Step six: train the gated fusion network; its inputs are the defogging maps J_1 and J_2, and its output is the defogging map J_final.
In steps four, five and six, the images output by the networks are used to update the network parameters with the back-propagation algorithm provided by the deep learning framework Pytorch.

Claims (9)

1. An image defogging method fusing image restoration and image enhancement, characterized in that defogging is realized with convolutional neural networks, the method comprising:
learning the mapping from a foggy image to a fog-free image to realize defogging, with the following implementation steps:
step one: three different convolutional neural networks are designed, wherein the three convolutional neural networks are respectively responsible for three tasks of foggy image restoration, foggy image enhancement and defogging image fusion, the network responsible for the foggy image restoration task is a restoration network, the network responsible for the foggy image enhancement task is an enhancement network, and the network responsible for the defogging image fusion task is a gating fusion network;
step two: extracting a haze-free image and a depth image from the NYU Depth Dataset V data set to synthesize a haze image;
step three: setting training parameters;
step four: training a recovery network;
step five: training the enhancement network in combination with the defogging diagram of the restoration network;
step six: training a gating fusion network;
the fourth, fifth and sixth steps are realized based on a deep learning framework Pytorch;
the method comprises the following specific steps:
step one: design the image defogging convolutional network fusing image restoration and image enhancement, together with the corresponding loss functions and training strategy;
wherein the input of the restoration network is the foggy image I, and its outputs are the transmission map t̂ and the atmospheric light map Â; the foggy image I, the transmission map t̂ and the atmospheric light map Â each have size 608×448×3; the restoration network comprises one encoder and two decoders, the encoder extracting the relevant feature information from the foggy image and the two decoders recovering the transmission map t̂ and the atmospheric light map Â from it; the defogging map J_1 is then obtained by inverting the atmospheric scattering model:

J_1 = (I − Â × (1 − t̂)) / t̂

the defogging map J_1 has size 608×448×3 and is the output of the restoration network, which is therefore a defogging network based on the atmospheric scattering model;
the input of the enhancement network is the foggy image I and its output is the defogging map J_2, of size 608×448×3; the enhancement network comprises one encoder and one decoder, the encoder extracting the relevant feature information from the foggy image and the decoder recovering the defogging map J_2 from it;
the inputs of the gated fusion network are the defogging maps J_1 and J_2, and its output is the defogging map J_final, the final defogged image; the gated fusion network comprises one encoder and one decoder, the encoder extracting the relevant information from the defogging maps J_1 and J_2 and the decoder recovering the corresponding weight maps W_1 and W_2, each of size 608×448×3, which are fused according to

J_final = W_1 ⊙ J_1 + W_2 ⊙ J_2

where ⊙ denotes pixel-wise multiplication;
the loss function L_re used in the training of the restoration network is built from L_mse terms between the network outputs and their targets, where A_gt is the target atmospheric light map, t_gt is the target transmission map, and J_gt is the target defogged image; L_mse(·,·) computes the square mean of the pixel-by-pixel difference of its two inputs, called the mse loss, for example

L_mse(J_1, J_gt) = (1/n) × Σ |J_1 − J_gt|²

where n represents the number of pixels;
the loss function L_en used in the training of the enhancement network weights the mse loss map pixel by pixel, with ⊙ the pixel-wise product; M_mse(J_2, J_gt) is the mse loss map:

M_mse(J_2, J_gt) = |J_2 − J_gt|²

of size 608×448×3; W_mse(J_1, J_gt) is the pixel-wise weight of the mse loss map, computed from the defogging map J_1 of the restoration network using the mathematical constant e, and has size 608×448×3;
in the training of the gated fusion network, the following loss function is used:

L_fu = L_mse(J_final, J_gt)

where J_final is the output of the gated fusion network;
step two: extract haze-free images and depth maps from the NYU Depth Dataset V2 to synthesize foggy images;
in generating the data, the method extracts a haze-free image J_gt and a depth map d from the NYU Depth Dataset V2 and synthesizes a foggy image according to the atmospheric scattering model:

t_gt = e^(−β_gt×d)
I = J_gt × t_gt + A_gt × (1 − t_gt)

where the scattering coefficient β_gt ∈ [0.8, 1.2] and the atmospheric light map A_gt ∈ [0.7, 1.0] take random values, all pixel values of A_gt are equal, t_gt is the target transmission map, and I is the foggy image waiting to be defogged;
step three: set the training parameters; the network structure is implemented and trained on the deep learning framework Pytorch; the optimizer is Adam with an initial learning rate of 4×10⁻⁴, an exponential decay rate of 0.9 for the first-moment estimate and 0.999 for the second-moment estimate; training runs for 200 rounds in total, and the learning rate is multiplied by 0.7 every 20 rounds;
step four: train the restoration network; its input is the foggy image I, its outputs are the transmission map t̂ and the atmospheric light map Â, and the defogging map J_1 is then obtained via the atmospheric scattering model;
step five: train the enhancement network; its input is the foggy image I and its output is the defogging map J_2, trained in combination with the defogging map J_1 of the restoration network;
step six: train the gated fusion network; its inputs are the defogging maps J_1 and J_2, and its output is the defogging map J_final;
in steps four, five and six, the images output by the networks are used to update the network parameters with the back-propagation algorithm provided by the deep learning framework Pytorch.
2. The image defogging method according to claim 1, wherein in step one the restoration network has a structure of one encoder and two decoders; the enhancement network has a structure of one encoder and one decoder; and the gated fusion network has a structure of one encoder and one decoder.
3. The image defogging method according to claim 2, wherein the restoration network outputs the restored defogging map; the enhancement network outputs the enhanced defogging map; and the gated fusion network takes the outputs of the restoration network and the enhancement network as its input images and outputs the final defogging map.
4. The image defogging method according to claim 1, wherein the scattering coefficient and the atmospheric light value take random values during the synthesis of the foggy images in step two.
5. The image defogging method according to claim 1, wherein in step four the network parameters are updated using the back-propagation algorithm provided by the deep learning framework library.
6. The image defogging method according to claim 1, wherein step four stops when the output of the restoration network converges; step five stops when the output of the enhancement network converges; and step six stops when the output of the gated fusion network converges.
7. The image defogging method of claim 1, wherein the deep learning framework is Pytorch.
8. The image defogging method according to claim 1, wherein the method uses an atmospheric scattering model as the physical model, expressed as:

I = J × t + A × (1 − t)

where I is the foggy image; J is the haze-free image, the same size as I; A is the atmospheric light map, the same size as I; and t is the transmission map, which describes how strongly the haze affects the foggy image, depends on the depth d and the scattering coefficient β, and is the same size as I:

t = e^(−β×d)

where e is the mathematical constant; β is the scattering coefficient, describing the haze concentration per unit of space; and d is the depth map, the same size as I with a single channel, whose value at each pixel is the scene depth at the corresponding position of the foggy (or haze-free) image.
9. An image defogging convolutional network fusing image restoration and image enhancement, characterized by comprising a restoration network responsible for the foggy image restoration task, an enhancement network responsible for foggy image enhancement, and a gated fusion network responsible for defogged image fusion; image defogging is performed using the defogging method of any one of claims 1 to 8.
CN202010988868.2A 2020-09-18 2020-09-18 Image defogging method combining image restoration and image enhancement and convolution network thereof Active CN112215766B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010988868.2A CN112215766B (en) 2020-09-18 2020-09-18 Image defogging method combining image restoration and image enhancement and convolution network thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010988868.2A CN112215766B (en) 2020-09-18 2020-09-18 Image defogging method combining image restoration and image enhancement and convolution network thereof

Publications (2)

Publication Number Publication Date
CN112215766A CN112215766A (en) 2021-01-12
CN112215766B (en) 2024-03-01

Family

ID=74050063

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010988868.2A Active CN112215766B (en) 2020-09-18 2020-09-18 Image defogging method combining image restoration and image enhancement and convolution network thereof

Country Status (1)

Country Link
CN (1) CN112215766B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114283078B (en) * 2021-12-09 2024-06-18 北京理工大学 Self-adaptive fusion image defogging method based on two-way convolutional neural network
CN115205135B (en) * 2022-05-20 2023-03-17 中国人民解放军火箭军工程大学 Single-image multi-decoder defogging method based on transmission image guidance

Citations (3)

Publication number Priority date Publication date Assignee Title
AU2020100274A4 (en) * 2020-02-25 2020-03-26 Huang, Shuying DR A Multi-Scale Feature Fusion Network based on GANs for Haze Removal
CN111192219A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image defogging method based on improved inverse atmospheric scattering model convolution network
CN111489301A (en) * 2020-03-19 2020-08-04 山西大学 Image defogging method based on image depth information guide for migration learning

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
WO2019186407A1 (en) * 2018-03-26 2019-10-03 Artomatix Limited Systems and methods for generative ensemble networks

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN111192219A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image defogging method based on improved inverse atmospheric scattering model convolution network
AU2020100274A4 (en) * 2020-02-25 2020-03-26 Huang, Shuying DR A Multi-Scale Feature Fusion Network based on GANs for Haze Removal
CN111489301A (en) * 2020-03-19 2020-08-04 山西大学 Image defogging method based on image depth information guide for migration learning

Non-Patent Citations (1)

Title
Single Image Dehazing Method Based on Improved Multi-Scale Convolutional Neural Network; Ju Qingqing, Li Chaofeng, Sang Qingbing; Computer Engineering and Applications, (10); full text *

Also Published As

Publication number Publication date
CN112215766A (en) 2021-01-12


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant