CN108230278B

CN108230278B - Image raindrop removal method based on a generative adversarial network

Info

Publication number: CN108230278B
Application number: CN201810157009.1A
Authority: CN
Inventors: 曾坤; 郭浩翀; 林谋广
Original assignee: Sun Yat Sen University
Current assignee: Sun Yat Sen University
Priority date: 2018-02-24
Filing date: 2018-02-24
Publication date: 2021-08-06
Anticipated expiration: 2038-02-24
Also published as: CN108230278A

Abstract

The embodiment of the invention discloses an image raindrop removal method based on a generative adversarial network (GAN). By constructing a generative adversarial network and applying a deep rain removal algorithm, the method provides a more efficient and more effective way to remove rain from images. In actual use, a picture only needs to be input into the generator network, and the result picture is obtained through a single forward propagation.

Description

Image raindrop removal method based on a generative adversarial network

Technical Field

The invention relates to the technical fields of image filtering and machine learning, and in particular to an image raindrop removal method based on a generative adversarial network.

Background

With the rapid development of smartphones in recent years, more and more people use mobile phones to photograph outdoor scenes. When an outdoor scene is shot on a rainy day, the resulting picture often contains raindrops. To obtain a clearer image, the picture therefore needs to be processed. With the development of computers and the continued research on deep learning in recent years, it has become feasible and more effective to solve such traditional research problems with deep learning methods.

The Convolutional Neural Network (CNN) is a variant of the multi-layer perceptron (MLP). A CNN does not perform as well as an MLP on conventional samples, but it performs better on image samples. Whereas conventional methods take one-dimensional data samples as input, a CNN can take the image directly as the network input, which avoids some of the feature extraction and data processing operations otherwise required. The CNN convolves the output of the previous layer with convolution kernels and can automatically extract feature maps from that output, including abstract structures of color, texture and shape; in particular, it is robust to displacement, scaling and other forms of distortion.

The Generative Adversarial Network (GAN) is a training model proposed in 2014. It trains two models against each other (a generator network G and a discriminator network D), borrowing the idea of the minimax problem from game theory, so that both models ultimately improve. The goal of a GAN is as follows: given a set of samples drawn from the true sample distribution, the generator G and the discriminator D are trained on this set continuously and iteratively, so that the generator G can generate, from a noise signal, samples that conform to the true sample distribution as closely as possible, and the discriminator D can judge whether a sample conforms to the true sample distribution.

Perceptual similarity was proposed in 2016. Its main contribution is a new metric that helps a GAN produce sharp images. The method replaces the loss metric in the original pixel space with a loss metric in feature space. When training the GAN, two loss terms are added to the adversarial loss of the original GAN, giving three loss terms in total: feature space loss, adversarial loss and pixel space loss.

An existing image rain removal method uses a single convolutional neural network, with the following specific steps: 1) acquiring clear images and artificially generated rain images, and constructing an image library; 2) designing a convolutional neural network; 3) training the designed convolutional neural network with the clear/rainy image pairs in the image library; 4) training for a certain number of rounds to obtain a trained convolutional neural network; 5) inputting a rainy image into the trained convolutional neural network to obtain the corresponding clear image.

However, the pictures generated by the above technique have some defects; in particular, regions where the background resembles raindrops are often distorted.

Disclosure of Invention

The invention aims to overcome the defects of the existing method and provides an image raindrop removal method based on a generative adversarial network. By constructing a generative adversarial network and applying a deep rain removal algorithm, the invention provides a more efficient and more effective image rain removal method. In actual use, a picture only needs to be input into the generator network, and the result picture is obtained through a single forward propagation.

In order to solve the above problem, the present invention provides an image raindrop removal method based on a generative adversarial network, the method including:

acquiring an outdoor scene picture set from a database;

image preprocessing, namely adding a rain effect to the acquired outdoor scene picture set, and constructing a training set and a test set;

constructing a generator network, wherein the input of the generator network is a rainy scene image and the output is a clear scene image;

training the generator network according to the error in pixel space;

adding the error in feature space and training the generator network again;

constructing a discriminator network, wherein the input of the discriminator network is a real sample or a sample generated by the generator, and the output is a single label of true or false;

adding the discriminator network to the model, and training the generator network with the error back propagation algorithm;

and inputting the rainy scene images in the test set into the trained generator network, and outputting the corresponding clear scene images.

Preferably, adding the error in feature space and training the generator network again specifically comprises:

introducing a trained comparator network, inputting the generated clear picture and the actual clear picture separately into the comparator network, obtaining the feature map of each picture in the comparator network, and calculating the Euclidean distance between the two feature maps as the error in feature space; then combining the error in pixel space and the error in feature space into an overall error, and training the generator network with the error back propagation algorithm.

The embodiment of the invention provides an image raindrop removal method based on a generative adversarial network. By constructing a generative adversarial network and applying a deep rain removal algorithm, the invention provides a more efficient and more effective image rain removal method. In actual use, a picture only needs to be input into the generator network, and the result picture is obtained through a single forward propagation.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is an overall flow diagram of an embodiment of the present invention;

FIG. 2 is a structural diagram of the generator network G according to an embodiment of the present invention;

FIG. 3 is a structural diagram of the discriminator network D according to an embodiment of the present invention;

FIG. 4 is an overall schematic diagram of an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

FIG. 1 is an overall flow chart of an embodiment of the present invention. As shown in FIG. 1, the method includes:

S1, acquiring an outdoor scene picture set from the database;

S2, image preprocessing: adding a rain effect to the acquired outdoor scene picture set, and constructing a training set and a test set;

S3, constructing a generator network, wherein the input is a rainy scene image and the output is a clear scene image;

S4, training the generator network according to the error in pixel space;

S5, adding the error in feature space and training the generator network again;

S6, constructing a discriminator network, wherein the input of the discriminator network is a real sample or a sample generated by the generator, and the output is a single label of true or false;

S7, adding the discriminator network to the model, and training the generator network with the error back propagation algorithm;

S8, inputting the rainy scene images in the test set into the trained generator network, and outputting the corresponding clear scene images.
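The staged training in S4, S5 and S7 can be summarized as a small table of which error terms are active in each step. The sketch below is illustrative only: the names `GENERATOR_STAGES` and `active_losses` are hypothetical, and the iteration counts are the ones stated later in this description (200,000 for S4, 200,000 for S5, and 10 rounds of 10,000 generator iterations for S7).

```python
# (step, error terms active in that step, generator iterations stated in the text)
GENERATOR_STAGES = [
    ("S4", ("pixel",), 200_000),
    ("S5", ("pixel", "feature"), 200_000),
    ("S7", ("pixel", "feature", "adversarial"), 10 * 10_000),
]


def active_losses(step):
    """Return the error terms used to train the generator in the given step."""
    for name, losses, _ in GENERATOR_STAGES:
        if name == step:
            return losses
    raise KeyError(step)


# Total number of generator updates across the three training steps.
total_generator_iterations = sum(iters for _, _, iters in GENERATOR_STAGES)
print(active_losses("S5"), total_generator_iterations)
```

Each step keeps the error terms of the previous one and adds a new term, so the generator is refined progressively rather than trained adversarially from scratch.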

Step S1 is to obtain 1000 outdoor scene pictures from the SUN database.

Step S2 is specifically as follows:

S21, using Photoshop (PS) software, motion blur filters are constructed with blur angles of 75, 80, 90, 100 and 105 degrees and blur distances of 30 and 50 pixels; a total of 10 filters are constructed to simulate different rain effects.

S22, the filters described in S21 are applied to each outdoor scene picture, so that each picture yields 10 pictures with a rain effect, forming 10,000 clear/rainy pairs in total; 8,000 pairs are randomly selected and cut into about 3 million rainy/clear pairs of 32x32 blocks, which serve as the training set; the remaining 2,000 pairs serve as the test set for evaluating the network.
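The counts in S21 and S22 follow from simple arithmetic, which the short sketch below reproduces. The function names are hypothetical, and the per-pair block count is only the average implied by "about 3 million" blocks; the text does not say how the blocks are sampled.

```python
from itertools import product

# Blur angles (degrees) and blur distances (pixels) listed in S21.
ANGLES_DEG = [75, 80, 90, 100, 105]
DISTANCES_PX = [30, 50]


def rain_filter_configs():
    """Return every (angle, distance) combination, one per rain filter."""
    return list(product(ANGLES_DEG, DISTANCES_PX))


def dataset_sizes(num_clear=1000, num_train_pairs=8000):
    """Count the clear/rainy pairs and the train/test split described in S22."""
    total_pairs = num_clear * len(rain_filter_configs())
    return total_pairs, num_train_pairs, total_pairs - num_train_pairs


configs = rain_filter_configs()
total, train, test = dataset_sizes()
# Average number of 32x32 blocks per training pair implied by ~3 million blocks.
blocks_per_pair = 3_000_000 / train
print(len(configs), total, train, test, blocks_per_pair)
```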

Step S3 is specifically as follows:

As shown in FIG. 2, a generator network composed of 24 residual convolution blocks is constructed; the input of the network is a rainy scene image and the output is a clear scene image. Each of the 24 blocks comprises a convolutional layer, a Batch Normalization (BN) layer and a Swish layer. Since the input image is a color image, the convolution kernel size of the first layer is 7x7x3 with a convolution stride of 1, and 16 convolution kernels generate 16 output feature maps; the convolution kernel size of the 2nd to 23rd layers is 3x3x16 with a stride of 1, and 16 convolution kernels generate 16 output feature maps; the convolution kernel size of the last layer is 3x3x3 with a stride of 1, and 3 convolution kernels generate 3 output feature maps. The output of the (N+1)th layer is the output of the (N+1)th block plus the output of the (N-1)th block (N >= 2).
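To make the layer layout concrete, the following sketch lists the 24 convolution blocks and counts their convolution parameters. It is an illustration under stated assumptions, not the patented implementation: the helper names are hypothetical, the last block is assumed to take the 16 feature maps of the preceding block as input (the kernel depth must match the number of input channels), and Batch Normalization parameters are left out of the count. The Swish activation is written in its usual form, x multiplied by sigmoid(x).

```python
import math


def swish(x):
    """Swish activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def generator_layers(num_blocks=24, width=16):
    """List (kernel_size, in_channels, out_channels) for each convolution block."""
    layers = [(7, 3, width)]                          # block 1: 7x7 kernels on the RGB input
    layers += [(3, width, width)] * (num_blocks - 2)  # blocks 2..23: 3x3 kernels, 16 maps
    layers.append((3, width, 3))                      # block 24: 3 output maps (16 inputs assumed)
    return layers


def conv_param_count(layers):
    """Count convolution weights plus biases; BN parameters are not included."""
    return sum(k * k * c_in * c_out + c_out for k, c_in, c_out in layers)


layers = generator_layers()
print(len(layers), conv_param_count(layers))
```

Under these assumptions the network is small (tens of thousands of convolution parameters), which is consistent with the text's emphasis on obtaining the result in a single forward propagation.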

Step S4 is specifically as follows:

The generator network is trained with the error back propagation algorithm, using the Euclidean distance between images in pixel space as the error, to train the neural network described in S3. The learning rate is set to 0.1, 40 pictures are input per batch, and 200,000 iterations constitute one complete training run, which gives an initially trained generator network.

The error in pixel space, $L_1$, is the mean squared error (MSE), given by:

$$L_1 = \frac{1}{n}\sum_{i=1}^{n}\left\| G_{\theta_G}(X_i) - Y_i \right\|^2$$

where $\theta_G$ denotes the parameters of the model constructed in S3, $G$ is the mapping function of the model, $X_i$ is the i-th rainy picture, $Y_i$ is the corresponding clear picture, and $n$ is the total number of training samples. $L_1$ is thus the sum of the squared Euclidean distances between the rain-removed picture $G_{\theta_G}(X_i)$ generated by the model and the real picture $Y_i$, averaged over the $n$ samples. This error is minimized by back propagation to optimize the model.

Back propagation uses stochastic gradient descent based on standard back propagation, and the weight matrix is updated by the following expression:

$$W^{l}_{i+1} = W^{l}_{i} - \eta \frac{\partial L_1}{\partial W^{l}_{i}}$$

where $l$ is the layer index, $i$ is the iteration number, $\eta$ is the learning rate, and $\partial L_1 / \partial W^{l}_{i}$ is the gradient.

With the above scheme, the generator model can produce a preliminary rain-removed picture.
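The pixel space training of S4 can be illustrated with a toy sketch. Everything here is a simplifying assumption: the "generator" is reduced to a single scalar gain w applied to every pixel, pictures are short lists of numbers, and each update uses the whole toy batch rather than stochastic mini-batches of 40 pictures. Only the error (mean over samples of the squared Euclidean distance) and the learning rate of 0.1 are taken from the text.

```python
def pixel_loss(w, rainy, clear):
    """Mean over samples of the squared Euclidean distance between w*X_i and Y_i."""
    n = len(rainy)
    return sum(
        sum((w * x - y) ** 2 for x, y in zip(xs, ys))
        for xs, ys in zip(rainy, clear)
    ) / n


def pixel_loss_grad(w, rainy, clear):
    """Analytic derivative of pixel_loss with respect to the single parameter w."""
    n = len(rainy)
    return sum(
        sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))
        for xs, ys in zip(rainy, clear)
    ) / n


def gradient_descent(w, rainy, clear, lr=0.1, steps=50):
    """Repeat the update w <- w - lr * dL/dw on the full toy batch."""
    for _ in range(steps):
        w -= lr * pixel_loss_grad(w, rainy, clear)
    return w


# Toy data: the "clear" pictures are the "rainy" ones scaled by 0.5.
rainy = [[1.0, 2.0], [0.5, 1.5]]
clear = [[0.5 * x for x in xs] for xs in rainy]
w_trained = gradient_descent(1.0, rainy, clear)
print(w_trained, pixel_loss(w_trained, rainy, clear))
```

In this toy setting the parameter converges to the gain 0.5 that maps each rainy picture onto its clear counterpart, which is the same mechanism the full network relies on at a much larger scale.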

Step S5 is specifically as follows:

S51, a trained VGG19 network is introduced into the model as a comparator network; the generated clear picture and the actual clear picture are each input into the VGG19 network, the 2nd feature map before the 2nd max pooling layer of the VGG19 network is obtained for each of them, and the Euclidean distance between the two feature maps is calculated as the error in feature space.

The specific calculation method of the error on the feature space is as follows:

The comparator model is the trained VGG19 model. The picture $G_{\theta_G}(X_i)$ generated by the generator and the corresponding clear picture $Y_i$ are each input into the comparator model, the feature maps output by the intermediate layer are obtained, and the mean squared Euclidean distance between them is calculated by the following formula:

$$L_2 = \frac{1}{n}\sum_{i=1}^{n}\left\| \phi(G_{\theta_G}(X_i)) - \phi(Y_i) \right\|^2$$

where the function $\phi$ maps a picture to its feature map, and the other parameters are the same as in the pixel space error in S4.

S52, the error in pixel space from S4 and the error in feature space from S51 are combined into an overall error, and the convolutional neural network generator is trained with the error back propagation algorithm for a certain number of iterations to optimize it. The weight of the feature space error is set to 0.1, and the generator network is trained again by back propagation for 200,000 iterations.

Specifically, the overall error of the model is:

$$L = L_1 + \lambda_1 L_2$$

where $\lambda_1$ is the weight of the feature space error in the overall error of the model. Based on this overall error, the model already trained in S4 is trained further: stochastic gradient descent based on standard back propagation is used again to update the network parameters of the generator model.

With the above preferred scheme, the pictures generated by the generator model come closer to natural images in detail.
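The combination of the pixel space error and the feature space error can be sketched as follows. This is a toy illustration: `toy_features` is a hypothetical stand-in for the VGG19 feature map (it simply takes differences of neighbouring pixels), and pictures are short lists of numbers. Only the structure of the overall error and the weight 0.1 come from the text.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def toy_features(img):
    """Hypothetical stand-in for the comparator's feature map: neighbour differences."""
    return [b - a for a, b in zip(img, img[1:])]


def total_loss(generated, clear, feature_weight=0.1):
    """Return (pixel error, feature error, pixel error + feature_weight * feature error)."""
    n = len(generated)
    l1 = sum(sq_dist(g, y) for g, y in zip(generated, clear)) / n
    l2 = sum(
        sq_dist(toy_features(g), toy_features(y)) for g, y in zip(generated, clear)
    ) / n
    return l1, l2, l1 + feature_weight * l2


# One generated picture with a single wrong pixel, compared with an all-zero clear picture.
generated = [[0.0, 1.0, 0.0]]
clear = [[0.0, 0.0, 0.0]]
l1, l2, total = total_loss(generated, clear)
print(l1, l2, total)
```

The single wrong pixel costs 1.0 in pixel space but 2.0 in this toy feature space, because it creates two spurious edges; a feature space term therefore penalizes structural artifacts more strongly than a pure pixel error does.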

Step S6 is specifically as follows:

As shown in FIG. 3, a discriminator network is constructed; its input is a real sample or a sample generated by the generator, and its output is a single label whose value is true or false. The front part of the network consists of four convolution blocks, each comprising a convolutional layer, a Batch Normalization (BN) layer and a Swish layer. The input is a 32x32 generated image or real image. The convolution kernel size of the first layer is 5x5x3 with a convolution stride of 2, and 128 convolution kernels generate 128 output feature maps; the convolution kernel size of the second layer is 3x3x128 with a stride of 2, and 512 convolution kernels generate 512 output feature maps; the convolution kernel size of the third layer is 3x3x512 with a stride of 2, and 1024 convolution kernels generate 1024 output feature maps. These are followed by a fully connected layer with a 1024-dimensional output and a Leaky ReLU layer, and then by a one-dimensional fully connected layer and a sigmoid, which give the output.
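The spatial sizes inside the discriminator can be traced with the standard convolution output-size formula. The sketch below covers the three convolution layers whose kernel sizes are given. The padding is not stated in the text, so a padding of half the kernel size is assumed here purely for illustration; the names `conv_out`, `DISC_CONVS` and `discriminator_shapes` are hypothetical.

```python
def conv_out(size, kernel, stride, pad):
    """Output width of a convolution: floor((size + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1


# (kernel size, stride, number of kernels) for the three layers described in the text.
DISC_CONVS = [(5, 2, 128), (3, 2, 512), (3, 2, 1024)]


def discriminator_shapes(size=32):
    """Trace (height, width, channels) after each convolution for a square input."""
    shapes = []
    for kernel, stride, channels in DISC_CONVS:
        size = conv_out(size, kernel, stride, pad=kernel // 2)  # padding is an assumption
        shapes.append((size, size, channels))
    return shapes


shapes = discriminator_shapes()
# Number of values fed to the first fully connected layer after flattening.
flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(shapes, flat)
```

With this padding each stride-2 convolution halves the resolution, so a 32x32 block shrinks to 4x4 before the fully connected layers.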

Step S7 is specifically as follows:

The discriminator network is added to the model, as shown overall in FIG. 4. The adversarial error between the generator network and the discriminator network is taken as a new error term and combined with the overall error from S5, and the convolutional neural network generator is trained with the error back propagation algorithm:

S71, in each training round, the generator network parameters are fixed and the discriminator network is trained. Each batch contains 20 generated clear pictures as negative samples and 20 real clear pictures as positive samples; the discriminator network is trained with the error back propagation algorithm, with 10,000 iterations constituting one complete training run.

S72, the discriminator network parameters are fixed and the generator network is trained. The adversarial error from the discriminator network is added to the overall error of the generator network, and the generator network is trained by back propagation for 10,000 iterations;

S73, steps S71 and S72 are repeated for 10 rounds.
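The alternation between S71 and S72 amounts to a fixed schedule, sketched below. The function name `alternating_schedule` is hypothetical; the round count and the 10,000 iterations per phase are taken from the text, and the actual network updates are omitted.

```python
def alternating_schedule(rounds=10, iters_per_phase=10_000):
    """Yield (round, network being trained, iterations) in the S71 -> S72 order."""
    for r in range(1, rounds + 1):
        yield r, "discriminator", iters_per_phase  # S71: generator parameters fixed
        yield r, "generator", iters_per_phase      # S72: discriminator parameters fixed


schedule = list(alternating_schedule())
updates = {"discriminator": 0, "generator": 0}
for _, network, iters in schedule:
    updates[network] += iters
print(len(schedule), updates)
```

Over the 10 rounds each network receives 100,000 updates, always with the other network frozen.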

The adversarial error is calculated as follows:

For the newly added discriminator network D, the input is a real sample or a generated sample, and the training target is: when the input is a real sample, the network output is 1; when the input is a generated sample, the network output is 0. The adversarial error is therefore expressed as:

$$L_3 = \frac{1}{n}\sum_{i=1}^{n}\left[ \log D_{\theta_D}(Y_i) + \log\left(1 - D_{\theta_D}(G_{\theta_G}(X_i))\right) \right]$$

where $\theta_D$ denotes the parameters of the discriminator network D. For the generator network G, the smaller the value of this expression the better, since a smaller value means the pictures generated by G are closer to real clear images; for the discriminator network D, the larger the value the better, since a larger value means D distinguishes generated pictures from real pictures more accurately.
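The opposing goals of the two networks can be checked numerically. The sketch below assumes the standard GAN value function, log D(real) + log(1 - D(generated)) averaged over a batch, which matches the behaviour described above (the discriminator wants a larger value, the generator a smaller one); the exact expression used in the patent may differ. The function name and the sample discriminator outputs are hypothetical.

```python
import math


def adversarial_value(d_real, d_fake):
    """Mean of log D(real) plus mean of log(1 - D(generated)) over a batch."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Discriminator outputs are probabilities that the input is a real clear picture.
undecided = adversarial_value([0.5, 0.5], [0.5, 0.5])  # D cannot tell the samples apart
sharp = adversarial_value([0.9, 0.9], [0.1, 0.1])      # D separates real from generated
fooled = adversarial_value([0.9, 0.9], [0.9, 0.9])     # G's pictures pass as real
print(sharp, undecided, fooled)
```

The value is highest when the discriminator separates the two kinds of sample and lowest when generated pictures are scored as real, so training D pushes the value up and training G pushes it down.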

Following the training procedure of a generative adversarial network, and since the generator network G has already been trained in steps S4 and S5, the parameters of G are locked first and the discriminator network D is trained by stochastic gradient descent; after a certain number of iterations, the parameters of D are locked and G is trained again by stochastic gradient descent using the adversarial error combined with the overall error from S5.

Step S7 enables the model to ultimately generate pictures that more closely resemble natural pictures.

Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the associated hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.

In addition, the image raindrop removal method based on a generative adversarial network provided by the embodiments of the present invention has been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims

1. An image raindrop removal method based on a generative adversarial network, the method comprising:

acquiring an outdoor scene picture set from a database;

training a generator network according to the error in pixel space;

adding the error in feature space and training the generator network again;

2. The image raindrop removal method based on a generative adversarial network as claimed in claim 1, wherein adding the error in feature space and training the generator network again specifically comprises:

introducing a trained comparator network, inputting the generated clear picture and the actual clear picture separately into the comparator network, obtaining the feature map of each picture in the comparator network, calculating the Euclidean distance between the two feature maps as the error in feature space, combining the error in pixel space and the error in feature space into an overall error, and training the generator network with the error back propagation algorithm.