CN113658074B - Single image raindrop removing method based on LAB color space multi-scale fusion network


Info

Publication number
CN113658074B
CN113658074B
Authority
CN
China
Prior art keywords
image
color space
channel
lab color
raindrop
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110938534.9A
Other languages
Chinese (zh)
Other versions
CN113658074A (en)
Inventor
牛玉贞
陈锋
林闽沪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202110938534.9A priority Critical patent/CN113658074B/en
Publication of CN113658074A publication Critical patent/CN113658074A/en
Application granted granted Critical
Publication of CN113658074B publication Critical patent/CN113658074B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06T5/73
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10016 Video; Image sequence
    • G06T2207/10024 Color image
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention provides a single image raindrop removing method based on an LAB color space multi-scale fusion network, comprising the following steps. Step A: generate training image pairs from the raindrop-degraded state and the clean state of the same scene, and apply a color space transformation to obtain the corresponding training image pairs in the LAB color space. Step B: preprocess the training image pairs to obtain an image block data set in the LAB color space. Step C: based on the characteristics of the image pairs in the LAB color space, design a convolutional neural network for single-image raindrop removal using a multi-scale learning strategy. Step D: design a target loss function, take the image block data set as training data, compute the gradients of all parameters in the convolutional neural network by back propagation, and update the parameters by stochastic gradient descent. Step E: generate the raindrop-removed image with the image raindrop removal network and convert it back to the RGB color space. The invention significantly improves the performance of image raindrop removal.

Description

Single image raindrop removing method based on LAB color space multi-scale fusion network
Technical Field
The invention relates to the fields of image and video processing and computer vision, in particular to a single-image raindrop removing method based on an LAB color space multi-scale fusion network.
Background
With the continuous development of autonomous driving, smart cities, intelligent transportation and related fields, how to use various image acquisition devices to capture useful outdoor images and improve their quality has become a problem these fields must solve. High-quality outdoor images facilitate the subsequent use of image recognition systems for automatic navigation, vehicle tracking, pedestrian detection and the like. However, outdoor image acquisition is particularly susceptible to weather. Rain is a common weather phenomenon: when raindrops adhere to a glass window or a camera lens, they obstruct the visibility of the background scene and severely degrade the image, for example by reducing contrast and saturation and blurring visible detail. This makes detailed information in the image unrecognizable and greatly reduces its usable value. Therefore, removing raindrops from a single image to restore a clean background is of great research importance for improving the stability and applicability of outdoor computer vision systems.
Most current work focuses on removing rain streaks from a single image, and comparatively little research addresses removing raindrops from a single image. Although raindrops in an image are not as dense as rain streaks, they are generally larger than rain streaks and can completely occlude the image background. Moreover, the appearance of a raindrop is affected by many parameters, and its shape usually differs from that of a thin, vertical rain streak, so its physical model also differs considerably from that of rain streaks. Therefore, the well-studied single-image rain streak removal methods cannot be used directly for single-image raindrop removal, which makes raindrop removal from a single image even more difficult.
Current single-image raindrop removal methods fall largely into two broad categories: model-based methods and deep learning-based methods. Model-based methods typically use a filter to decompose the image into high- and low-frequency components and then separate the rain and non-rain components within the high-frequency part by dictionary learning. These methods mostly rely on manually preset parameters for feature extraction and performance optimization, can only extract shallow information from the image, and lack deep reasoning about its content, so they are not robust to changes in the input, such as raindrops with varying blur, brightness and size. Deep learning-based methods are data-driven: a convolutional neural network is trained on a large amount of data, and its strong feature learning and representation capability extracts image features better and achieves a better raindrop removal effect. However, these deep learning-based methods do not fully explore and exploit the physical properties of raindrops, which are a powerful prior for the raindrop removal task, and their single-scale frameworks have difficulty capturing the cross-scale correlations of raindrops. Therefore, how to combine deep learning with the rich physical characteristics of raindrops is a key question for future research.
Disclosure of Invention
The invention provides a single image raindrop removing method based on an LAB color space multi-scale fusion network, which combines the physical characteristics of raindrops with a deep convolutional neural network and significantly improves the performance of image raindrop removal.
The invention adopts the following technical scheme.
A single image raindrop removing method based on an LAB color space multi-scale fusion network, built on a multi-scale adaptive fusion network, comprises the following steps:
Step A: generate training image pairs from the raindrop-degraded state and the clean state of the same scene, and apply a color space transformation to obtain training image pairs of the raindrop-degraded image and the clean image in the LAB color space;
Step B: preprocess the training image pairs of the LAB color space raindrop-degraded image and clean image obtained in step A to obtain an image block data set composed of training image pairs of raindrop-degraded and clean image blocks in the LAB color space;
Step C: based on the characteristics of the image pairs in the LAB color space, design a convolutional neural network for single-image raindrop removal using a multi-scale learning strategy;
Step D: design a target loss function for optimizing the network, take the image block data set as training data, compute the gradients of all parameters in the convolutional neural network by back propagation according to the designed loss function, update the parameters by stochastic gradient descent, and finally learn the optimal parameters of the model;
Step E: input the image to be processed into the designed image raindrop removal network, predict the raindrop-removed clean image in the LAB color space with the trained model, and finally transform the clean image back to the RGB color space.
The step A comprises the following steps:
Step A1: convert the training image pair of the raindrop-degraded image and the clean image from the RGB color space to the XYZ color space;
Step A2: convert the XYZ color space training image pair obtained in step A1 into the LAB color space training image pair.
The step A1 comprises the following steps:
Step A11: in the RGB color space, denote the values of the three channels of any pixel as r, g and b, each in the range [0, 255]; the conversion is carried out according to the following conversion formula:
where the gamma() function performs nonlinear tone editing on the image to improve the image contrast; the following gamma function is used:
Step A12: arrange the R, G and B components obtained in step A11 into a vector and multiply it by the coefficient matrix M to complete the conversion from the RGB color space to the XYZ color space, according to the conversion formula [X, Y, Z]^T = M·[R, G, B]^T, where M is the RGB-to-XYZ coefficient matrix.
The step A2 comprises the following steps:
Step A21: let L*, A*, B* be the values of the three channels of the final LAB color space; the XYZ color space is converted to the LAB color space according to the following conversion formulas:
L* = 116·f(Y/Yn) − 16
A* = 500·[f(X/Xn) − f(Y/Yn)]
B* = 200·[f(Y/Yn) − f(Z/Zn)]
where X, Y, Z are the values computed by the RGB-to-XYZ conversion, Xn, Yn, Zn generally default to 95.047, 100.0 and 108.883, and the specific formula of f(t) is as follows:
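The gamma function, the coefficient matrix M and f(t) are not reproduced in this text. The following NumPy sketch of steps A11 to A21 therefore assumes the standard sRGB gamma curve, the standard sRGB-to-XYZ coefficient matrix and the CIE definition of f(t) with the D65 white point given above; the patent's exact constants may differ.

```python
import numpy as np

M = np.array([[0.4124, 0.3576, 0.1805],
              [0.2126, 0.7152, 0.0722],
              [0.0193, 0.1192, 0.9505]])    # assumed sRGB -> XYZ matrix (D65)
WHITE = np.array([95.047, 100.0, 108.883])  # Xn, Yn, Zn

def gamma(c):
    # Assumed standard sRGB gamma curve (step A11); c is in [0, 1].
    return np.where(c > 0.04045, ((c + 0.055) / 1.055) ** 2.4, c / 12.92)

def f(t):
    # Assumed CIE piecewise cube-root mapping used in step A21.
    delta = 6.0 / 29.0
    return np.where(t > delta ** 3, np.cbrt(t), t / (3 * delta ** 2) + 4.0 / 29.0)

def rgb_to_lab(img):
    """img: H x W x 3 uint8 RGB image; returns an H x W x 3 float LAB image."""
    rgb = gamma(img.astype(np.float64) / 255.0) * 100.0
    xyz = rgb @ M.T                          # step A12: [X, Y, Z]^T = M [R, G, B]^T
    fx, fy, fz = f(xyz / WHITE).transpose(2, 0, 1)
    L = 116.0 * fy - 16.0                    # L* = 116 f(Y/Yn) - 16
    A = 500.0 * (fx - fy)                    # A* = 500 [f(X/Xn) - f(Y/Yn)]
    B = 200.0 * (fy - fz)                    # B* = 200 [f(Y/Yn) - f(Z/Zn)]
    return np.stack([L, A, B], axis=-1)
```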
Step B comprises the following steps:
Step B1: crop the LAB color space raindrop-degraded image and its corresponding clean image in the same way to obtain W×W image blocks; the W×W blocks are cropped every m pixels to avoid overlapping crops, and after cropping each W×W raindrop-degraded block corresponds one-to-one with a W×W clean block.
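As an illustration of step B1, the following sketch crops aligned W×W blocks with a stride of m pixels; non-overlapping blocks are obtained when m is at least W, and the variable names are illustrative only.

```python
def crop_patch_pairs(rain_lab, clean_lab, w, m):
    """Cut aligned w x w blocks from a raindrop-degraded / clean LAB image pair,
    stepping m pixels between crops (step B1); blocks do not overlap when m >= w."""
    pairs = []
    height, width = rain_lab.shape[:2]
    for y in range(0, height - w + 1, m):
        for x in range(0, width - w + 1, m):
            pairs.append((rain_lab[y:y + w, x:x + w],
                          clean_lab[y:y + w, x:x + w]))
    return pairs
```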
Step C comprises the following steps:
Step C1: separate the L channel and the B channel of the LAB color space and design a two-branch network, where one branch removes raindrops from the L channel of the raindrop-degraded image and the other branch removes raindrops from its B channel;
Step C2: design a multi-scale adaptive fusion network for each branch to extract raindrop-related features and thus optimize the raindrop removal task;
Step C3: design a multi-scale adaptive feature fusion module within the multi-scale adaptive fusion network to optimize image feature extraction and feature fusion.
In step C1 the multi-scale adaptive fusion networks of the two branches have the same structure, and step C2 comprises the following steps:
Step C21: feed the L-channel image block and the B-channel image block of the raindrop-degraded image into the multi-scale adaptive fusion network of the corresponding branch;
Step C22: the processing flow of the L-channel and B-channel image blocks in their respective branch networks is the same, so the L channel is taken as an example; the L-channel image block of the raindrop-degraded image is input into a convolution layer with a ReLU activation function to convert the image into a feature map, and the features are output according to the following formula:
F0 = Conv1(Xl)
where Conv1 denotes the convolution layer with ReLU activation, Xl is the L-channel image block of the raindrop-degraded image, and F0 is the extracted feature map;
Step C23: input the feature map F0 into a cascade of designed multi-scale adaptive feature fusion modules; the number of modules is 5, computed according to the following formula:
F1 = MSAFF(MSAFF(MSAFF(MSAFF(MSAFF(F0)))))
where MSAFF denotes the designed multi-scale adaptive feature fusion module;
Step C24: input the output F1 of step C23 into a convolution layer with a ReLU activation function to complete the conversion from feature map back to image, and output the single-channel raindrop-removed L-channel image according to the following formula:
Yl = Conv2(F1)
where Conv2 denotes the convolution layer with ReLU activation and Yl is the raindrop-removed L-channel image with channel number 1;
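The per-branch pipeline of steps C22 to C24 can be sketched as follows in PyTorch; the feature width of 64 and the 3×3 kernel size are assumptions, and MSAFF refers to the module sketched after step C3 below.

```python
import torch.nn as nn

class BranchNet(nn.Module):
    """One branch of the two-branch network (steps C22 to C24); the other branch
    has the same structure and processes the B channel."""
    def __init__(self, feats=64, num_msaff=5):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(1, feats, 3, padding=1), nn.ReLU())
        # five cascaded multi-scale adaptive feature fusion modules (step C23);
        # MSAFF is the module class sketched after step C3 below
        self.msaff = nn.Sequential(*[MSAFF(feats) for _ in range(num_msaff)])
        self.conv2 = nn.Sequential(nn.Conv2d(feats, 1, 3, padding=1), nn.ReLU())

    def forward(self, x_l):              # x_l: N x 1 x W x W channel image blocks
        f0 = self.conv1(x_l)             # F0 = Conv1(Xl)
        f1 = self.msaff(f0)              # F1 = MSAFF(...(MSAFF(F0)))
        return self.conv2(f1)            # Yl = Conv2(F1)
```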
Step C3 comprises the following steps:
Step C31: in the multi-scale adaptive feature fusion module, the input feature F′ is first fed into 3 dilated convolutions with different dilation rates, computed according to the following formulas:
F1′ = LeakyReLU(Dilated_{r=2}(F′))
F2′ = LeakyReLU(Dilated_{r=4}(F′))
F3′ = LeakyReLU(Dilated_{r=8}(F′))
where F1′, F2′, F3′ are the output features of the dilated convolutions with dilation rates 2, 4 and 8, and Dilated_{r=n} denotes dilated (atrous) convolution, which enlarges the receptive field through the dilation rate parameter r and effectively aggregates spatial context information to extract better features; the dilation rate r controls the spacing between elements of the convolution kernel: when r = 1, dilated convolution is the same as ordinary convolution and the kernel elements are adjacent with no zeros between them; when r > 1, r − 1 zeros are inserted between kernel elements to enlarge the receptive field; the formula of LeakyReLU is as follows:
wherein x represents the input value of the LeakyReLU function, and a is a fixed linear coefficient;
Step C32: feed the output features F1′, F2′, F3′ of the 3 dilated convolutions obtained in step C31 into the adaptive feature fusion module, computed according to the following formula:
F4′ = AFF(F1′, F2′, F3′)
where F4′ is the output of the adaptive feature fusion module and AFF(·) denotes the adaptive feature fusion module, defined as follows:
First, the 3 output features F1′, F2′, F3′ of step C31 are summed element-wise:
L′ = F1′ + F2′ + F3′
The result L′ of the element-wise summation is then fed into 1 global average pooling layer and 1 convolution layer of size 1×1, computed as follows:
s′ = AvgPool_r(L′)
z = σ(Conv_{1×1}(s′))
where AvgPool_r denotes global average pooling with step size r, Conv_{1×1} denotes the 1×1 convolution layer used for dimension reduction, the resulting feature vector z has spatial size 1×1, C is the number of channels of the element-wise sum L′, and σ(·) denotes the PReLU activation function, defined as follows:
where i indexes the channels and a_i is a parameter to be learned.
The feature vector z is then fed into 3 parallel convolution branches, computed according to the following formula:
s_i = Softmax(v_i), i = 1, 2, 3
where v_i denotes the output of the 1×1 convolution layer of the i-th branch, which is used for dimension expansion, so that v_i has size 1×1×C, and Softmax is the softmax activation function;
Finally, s1, s2, s3 are multiplied with F1′, F2′, F3′ respectively and the products are summed, giving the output of the adaptive feature fusion module, computed as follows:
F4′ = AFF_out = s1·F1′ + s2·F2′ + s3·F3′
Step C33: the input feature F′ of the multi-scale adaptive feature fusion module and the feature F4′ obtained in step C32 are summed element-wise, computed as:
F5′ = F′ + F4′
Step C34: the feature F5′ obtained in step C33 is fed into a standard residual block and a channel attention block, computed according to the following formula:
F_out = SE(RB(F5′))
where RB(·) denotes the standard residual block, SE(·) denotes the channel attention block, and F_out is the output of the multi-scale adaptive feature fusion module.
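A possible PyTorch realization of the multi-scale adaptive feature fusion module of steps C31 to C34 is sketched below; the reduced width of the 1×1 squeeze convolution, the softmax taken across the three branches, and the internals of the residual block and the SE-style channel attention block are assumptions not fixed by the text above.

```python
import torch
import torch.nn as nn

class MSAFF(nn.Module):
    """Sketch of the multi-scale adaptive feature fusion module (steps C31-C34)."""
    def __init__(self, feats=64, mid=16):
        super().__init__()
        # step C31: three dilated convolutions with dilation rates 2, 4 and 8
        self.dil = nn.ModuleList([
            nn.Sequential(nn.Conv2d(feats, feats, 3, padding=r, dilation=r),
                          nn.LeakyReLU(0.2))
            for r in (2, 4, 8)])
        # AFF: global average pooling + 1x1 reduction convolution + PReLU
        self.squeeze = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                     nn.Conv2d(feats, mid, 1), nn.PReLU())
        # three parallel 1x1 expansion convolutions producing v1, v2, v3
        self.expand = nn.ModuleList([nn.Conv2d(mid, feats, 1) for _ in range(3)])
        # assumed standard residual block and SE-style channel attention (step C34)
        self.rb = nn.Sequential(nn.Conv2d(feats, feats, 3, padding=1), nn.ReLU(),
                                nn.Conv2d(feats, feats, 3, padding=1))
        self.se = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                nn.Conv2d(feats, mid, 1), nn.ReLU(),
                                nn.Conv2d(mid, feats, 1), nn.Sigmoid())

    def forward(self, f):
        f1, f2, f3 = (branch(f) for branch in self.dil)      # F1', F2', F3'
        z = self.squeeze(f1 + f2 + f3)                       # L' -> s' -> z
        v = torch.stack([e(z) for e in self.expand], dim=1)  # N x 3 x C x 1 x 1
        s = torch.softmax(v, dim=1)                          # s1, s2, s3 per channel
        fused = s[:, 0] * f1 + s[:, 1] * f2 + s[:, 2] * f3   # F4' = AFF output
        f5 = f + fused                                       # step C33: F5' = F' + F4'
        out = self.rb(f5) + f5                               # residual block RB
        return out * self.se(out)                            # channel attention SE
```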
Step D comprises the following steps:
Step D1: use a common loss function as a constraint to optimize the network model; the formula is as follows:
where SSIM is the structural similarity loss function; suppose the training image block pairs in the LAB color space obtained in step B are given, where i = 1, …, N and N is the total number of training samples; each pair consists of the image blocks on the L channel and the B channel of the raindrop-degraded input image in the LAB color space and the corresponding image blocks on the L channel and the B channel of the clean image; Yl′ denotes the raindrop-removed image block predicted by the L-channel branch network and Yb′ denotes the raindrop-removed image block predicted by the B-channel branch network;
Step D2: randomly divide the image block data set into several batches, each containing the same number of image block pairs, and train and optimize the designed network until the loss value L computed in step D1 converges below a threshold or the number of iterations reaches a threshold; then save the trained model to complete the network training process.
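The exact objective of step D1 is not reproduced above. A hedged sketch of a per-branch loss combining an L1 term with a structural similarity term, together with the mini-batch training of step D2, is given below; the function ssim and the weight alpha are illustrative assumptions rather than values taken from the text.

```python
import torch.nn.functional as F

def objective(y_l_pred, y_b_pred, y_l_gt, y_b_gt, ssim, alpha=0.2):
    """Per-branch L1 plus (1 - SSIM) loss; `ssim` is a hypothetical structural
    similarity function and `alpha` an assumed weighting, not taken from the text."""
    l1 = F.l1_loss(y_l_pred, y_l_gt) + F.l1_loss(y_b_pred, y_b_gt)
    ssim_term = (1 - ssim(y_l_pred, y_l_gt)) + (1 - ssim(y_b_pred, y_b_gt))
    return l1 + alpha * ssim_term

# Step D2 outline: mini-batch stochastic gradient descent with back propagation.
# optimizer = torch.optim.SGD(list(net_l.parameters()) + list(net_b.parameters()), lr=1e-2)
# for x_l, x_b, y_l, y_b in loader:
#     loss = objective(net_l(x_l), net_b(x_b), y_l, y_b, ssim)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
```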
Step E comprises the following steps:
Step E1: convert the generated raindrop-removed clean image in the LAB color space from the LAB color space to the XYZ color space;
Step E2: convert the XYZ color space result obtained in step E1 into the final result in the RGB color space.
Step E1 comprises the following steps:
Step E11: in the LAB color space, denote the values of the three channels of any pixel as l, a and b, where l is in the range [0, 100] and a and b are in the range [−127, 128]; the conversion is carried out according to the following conversion formula:
where X′, Y′, Z′ are the values computed by the LAB-to-XYZ conversion, Xn, Yn, Zn default to 95.047, 100.0 and 108.883, and the specific formula of f⁻¹(t) is as follows:
Step E2 comprises the following steps:
Step E21: arrange the three components X′, Y′, Z′ obtained in step E11 into a vector and multiply it by the coefficient matrix M′ to complete the preliminary conversion from the XYZ color space to the RGB color space, according to the conversion formula [R′, G′, B′]^T = M′·[X′, Y′, Z′]^T, where M′ is the XYZ-to-RGB coefficient matrix;
Step E22: convert the preliminary results R′, G′, B′ obtained in step E21 again according to the following conversion formula to obtain the values of the three channels of each pixel in the final RGB color space, each in the range [0, 255]:
where the specific formula of γ⁻¹(x) is as follows:
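The inverse formulas of steps E11 to E22 are likewise not reproduced above; the following sketch inverts the earlier rgb_to_lab function under the same D65 and sRGB assumptions, reusing np, M and WHITE from that sketch.

```python
def lab_to_rgb(lab):
    """Inverse of rgb_to_lab above (steps E11 to E22); lab is H x W x 3 float."""
    L, A, B = lab[..., 0], lab[..., 1], lab[..., 2]
    fy = (L + 16.0) / 116.0                   # invert L* = 116 f(Y/Yn) - 16
    fx = fy + A / 500.0                       # invert A* = 500 [f(X/Xn) - f(Y/Yn)]
    fz = fy - B / 200.0                       # invert B* = 200 [f(Y/Yn) - f(Z/Zn)]
    delta = 6.0 / 29.0

    def f_inv(t):                             # inverse of the piecewise f(t)
        return np.where(t > delta, t ** 3, 3 * delta ** 2 * (t - 4.0 / 29.0))

    xyz = np.stack([f_inv(fx), f_inv(fy), f_inv(fz)], axis=-1) * WHITE
    rgb_lin = np.clip((xyz / 100.0) @ np.linalg.inv(M).T, 0.0, None)   # step E21

    def gamma_inv(c):                         # assumed inverse sRGB gamma (step E22)
        return np.where(c > 0.0031308, 1.055 * c ** (1.0 / 2.4) - 0.055, 12.92 * c)

    return np.clip(gamma_inv(rgb_lin) * 255.0, 0.0, 255.0).astype(np.uint8)
```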
Compared with the prior art, the invention has the following beneficial effects. Based on the characteristics of the raindrop-degraded image in the LAB color space, the method first decomposes the raindrop removal task into two subtasks that independently remove raindrops on the L channel and the B channel, realized by a two-branch network. Exploiting the multi-scale nature of raindrops, each branch network is designed as a multi-scale adaptive fusion network to remove raindrops more effectively. Specifically, dilated convolution modules with different dilation rates and an adaptive feature fusion module are adopted to better extract multi-scale image features and fuse them effectively. The convolutional neural network designed for image raindrop removal in the LAB color space ensures the image quality after raindrop removal, and its number of network parameters is smaller than that of other methods, so it has high practical value.
Drawings
The invention is described in further detail below with reference to the attached drawings and detailed description:
FIG. 1 is a schematic diagram of the process flow for carrying out the method of the present invention;
FIG. 2 is a schematic structural diagram of a single image raindrop removal method model based on a LAB color space multi-scale adaptive fusion network in an embodiment of the invention;
FIG. 3 is a schematic structural diagram of a multi-scale adaptive feature fusion module according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an adaptive feature fusion module in the multi-scale adaptive feature fusion module according to an embodiment of the present invention.
Detailed Description
As shown in the figures, the single image raindrop removing method based on the LAB color space multi-scale fusion network is built on a multi-scale adaptive fusion network and comprises the following steps:
Step A: generate training image pairs from the raindrop-degraded state and the clean state of the same scene, and apply a color space transformation to obtain training image pairs of the raindrop-degraded image and the clean image in the LAB color space;
Step B: preprocess the training image pairs of the LAB color space raindrop-degraded image and clean image obtained in step A to obtain an image block data set composed of training image pairs of raindrop-degraded and clean image blocks in the LAB color space;
Step C: based on the characteristics of the image pairs in the LAB color space, design a convolutional neural network for single-image raindrop removal using a multi-scale learning strategy;
Step D: design a target loss function for optimizing the network, take the image block data set as training data, compute the gradients of all parameters in the convolutional neural network by back propagation according to the designed loss function, update the parameters by stochastic gradient descent, and finally learn the optimal parameters of the model;
Step E: input the image to be processed into the designed image raindrop removal network, predict the raindrop-removed clean image in the LAB color space with the trained model, and finally transform the clean image back to the RGB color space.
The step A comprises the following steps:
Step A1: convert the training image pair of the raindrop-degraded image and the clean image from the RGB color space to the XYZ color space;
Step A2: convert the XYZ color space training image pair obtained in step A1 into the LAB color space training image pair.
The step A1 comprises the following steps:
Step A11: in the RGB color space, denote the values of the three channels of any pixel as r, g and b, each in the range [0, 255]; the conversion is carried out according to the following conversion formula:
where the gamma() function performs nonlinear tone editing on the image to improve the image contrast; the following gamma function is used:
Step A12: arrange the R, G and B components obtained in step A11 into a vector and multiply it by the coefficient matrix M to complete the conversion from the RGB color space to the XYZ color space, according to the conversion formula [X, Y, Z]^T = M·[R, G, B]^T, where M is the RGB-to-XYZ coefficient matrix.
The step A2 comprises the following steps:
Step A21: let L*, A*, B* be the values of the three channels of the final LAB color space; the XYZ color space is converted to the LAB color space according to the following conversion formulas:
L* = 116·f(Y/Yn) − 16
A* = 500·[f(X/Xn) − f(Y/Yn)]
B* = 200·[f(Y/Yn) − f(Z/Zn)]
where X, Y, Z are the values computed by the RGB-to-XYZ conversion, Xn, Yn, Zn generally default to 95.047, 100.0 and 108.883, and the specific formula of f(t) is as follows:
Step B comprises the following steps:
Step B1: crop the LAB color space raindrop-degraded image and its corresponding clean image in the same way to obtain W×W image blocks; the W×W blocks are cropped every m pixels to avoid overlapping crops, and after cropping each W×W raindrop-degraded block corresponds one-to-one with a W×W clean block.
Step C comprises the following steps:
Step C1: separate the L channel and the B channel of the LAB color space and design a two-branch network, where one branch removes raindrops from the L channel of the raindrop-degraded image and the other branch removes raindrops from its B channel;
Step C2: design a multi-scale adaptive fusion network for each branch to extract raindrop-related features and thus optimize the raindrop removal task;
Step C3: design a multi-scale adaptive feature fusion module within the multi-scale adaptive fusion network to optimize image feature extraction and feature fusion.
In step C1 the multi-scale adaptive fusion networks of the two branches have the same structure, and step C2 comprises the following steps:
Step C21: feed the L-channel image block and the B-channel image block of the raindrop-degraded image into the multi-scale adaptive fusion network of the corresponding branch;
Step C22: the processing flow of the L-channel and B-channel image blocks in their respective branch networks is the same, so the L channel is taken as an example; the L-channel image block of the raindrop-degraded image is input into a convolution layer with a ReLU activation function to convert the image into a feature map, and the features are output according to the following formula:
F0 = Conv1(Xl)
where Conv1 denotes the convolution layer with ReLU activation, Xl is the L-channel image block of the raindrop-degraded image, and F0 is the extracted feature map;
Step C23: input the feature map F0 into a cascade of designed multi-scale adaptive feature fusion modules; the number of modules is 5, computed according to the following formula:
F1 = MSAFF(MSAFF(MSAFF(MSAFF(MSAFF(F0)))))
where MSAFF denotes the designed multi-scale adaptive feature fusion module;
Step C24: input the output F1 of step C23 into a convolution layer with a ReLU activation function to complete the conversion from feature map back to image, and output the single-channel raindrop-removed L-channel image according to the following formula:
Yl = Conv2(F1)
where Conv2 denotes the convolution layer with ReLU activation and Yl is the raindrop-removed L-channel image with channel number 1;
Step C3 comprises the following steps:
Step C31: in the multi-scale adaptive feature fusion module, the input feature F′ is first fed into 3 dilated convolutions with different dilation rates, computed according to the following formulas:
F1′ = LeakyReLU(Dilated_{r=2}(F′))
F2′ = LeakyReLU(Dilated_{r=4}(F′))
F3′ = LeakyReLU(Dilated_{r=8}(F′))
where F1′, F2′, F3′ are the output features of the dilated convolutions with dilation rates 2, 4 and 8, and Dilated_{r=n} denotes dilated (atrous) convolution, which enlarges the receptive field through the dilation rate parameter r and effectively aggregates spatial context information to extract better features; the dilation rate r controls the spacing between elements of the convolution kernel: when r = 1, dilated convolution is the same as ordinary convolution and the kernel elements are adjacent with no zeros between them; when r > 1, r − 1 zeros are inserted between kernel elements to enlarge the receptive field; the formula of LeakyReLU is as follows:
wherein x represents the input value of the LeakyReLU function, and a is a fixed linear coefficient;
Step C32: feed the output features F1′, F2′, F3′ of the 3 dilated convolutions obtained in step C31 into the adaptive feature fusion module, computed according to the following formula:
F4′ = AFF(F1′, F2′, F3′)
where F4′ is the output of the adaptive feature fusion module and AFF(·) denotes the adaptive feature fusion module, defined as follows: first, the 3 output features F1′, F2′, F3′ of step C31 are summed element-wise:
L′ = F1′ + F2′ + F3′
The result L′ of the element-wise summation is then fed into 1 global average pooling layer and 1 convolution layer of size 1×1, computed as follows:
s′ = AvgPool_r(L′)
z = σ(Conv_{1×1}(s′))
where AvgPool_r denotes global average pooling with step size r, Conv_{1×1} denotes the 1×1 convolution layer used for dimension reduction, the resulting feature vector z has spatial size 1×1, C is the number of channels of the element-wise sum L′, and σ(·) denotes the PReLU activation function, defined as follows:
where i indexes the channels and a_i is a parameter to be learned.
The feature vector z is then fed into 3 parallel convolution branches, computed according to the following formula:
s_i = Softmax(v_i), i = 1, 2, 3
where v_i denotes the output of the 1×1 convolution layer of the i-th branch, which is used for dimension expansion, so that v_i has size 1×1×C, and Softmax is the softmax activation function;
Finally, s1, s2, s3 are multiplied with F1′, F2′, F3′ respectively and the products are summed, giving the output of the adaptive feature fusion module, computed as follows:
F4′ = AFF_out = s1·F1′ + s2·F2′ + s3·F3′
Step C33: the input feature F′ of the multi-scale adaptive feature fusion module and the feature F4′ obtained in step C32 are summed element-wise, computed as:
F5′ = F′ + F4′
Step C34: the feature F5′ obtained in step C33 is fed into a standard residual block and a channel attention block, computed according to the following formula:
F_out = SE(RB(F5′))
where RB(·) denotes the standard residual block, SE(·) denotes the channel attention block, and F_out is the output of the multi-scale adaptive feature fusion module.
Step D comprises the following steps:
Step D1: use a common loss function as a constraint to optimize the network model; the formula is as follows:
where SSIM is the structural similarity loss function; suppose the training image block pairs in the LAB color space obtained in step B are given, where i = 1, …, N and N is the total number of training samples; each pair consists of the image blocks on the L channel and the B channel of the raindrop-degraded input image in the LAB color space and the corresponding image blocks on the L channel and the B channel of the clean image; Yl′ denotes the raindrop-removed image block predicted by the L-channel branch network and Yb′ denotes the raindrop-removed image block predicted by the B-channel branch network;
Step D2: randomly divide the image block data set into several batches, each containing the same number of image block pairs, and train and optimize the designed network until the loss value L computed in step D1 converges below a threshold or the number of iterations reaches a threshold; then save the trained model to complete the network training process.
Step E comprises the following steps:
Step E1: convert the generated raindrop-removed clean image in the LAB color space from the LAB color space to the XYZ color space;
Step E2: convert the XYZ color space result obtained in step E1 into the final result in the RGB color space.
Step E1 comprises the following steps:
Step E11: in the LAB color space, denote the values of the three channels of any pixel as l, a and b, where l is in the range [0, 100] and a and b are in the range [−127, 128]; the conversion is carried out according to the following conversion formula:
where X′, Y′, Z′ are the values computed by the LAB-to-XYZ conversion, Xn, Yn, Zn default to 95.047, 100.0 and 108.883, and the specific formula of f⁻¹(t) is as follows:
Step E2 comprises the following steps:
Step E21: arrange the three components X′, Y′, Z′ obtained in step E11 into a vector and multiply it by the coefficient matrix M′ to complete the preliminary conversion from the XYZ color space to the RGB color space, according to the conversion formula [R′, G′, B′]^T = M′·[X′, Y′, Z′]^T, where M′ is the XYZ-to-RGB coefficient matrix;
Step E22: convert the preliminary results R′, G′, B′ obtained in step E21 again according to the following conversion formula to obtain the values of the three channels of each pixel in the final RGB color space, each in the range [0, 255]:
where the specific formula of γ⁻¹(x) is as follows:
The above is a preferred embodiment of the present invention; all changes made according to the technical solution of the present invention fall within the protection scope of the present invention as long as the resulting functional effects do not exceed the scope of the technical solution.

Claims (6)

1. A single image raindrop removing method based on an LAB color space multi-scale fusion network, built on a multi-scale adaptive fusion network, characterized in that it comprises the following steps:
step A: generate training image pairs from the raindrop-degraded state and the clean state of the same scene, and apply a color space transformation to obtain training image pairs of the raindrop-degraded image and the clean image in the LAB color space;
step B: preprocess the training image pairs of the LAB color space raindrop-degraded image and clean image obtained in step A to obtain an image block data set composed of training image pairs of raindrop-degraded and clean image blocks in the LAB color space;
step C: based on the characteristics of the image pairs in the LAB color space, design a convolutional neural network for single-image raindrop removal using a multi-scale learning strategy;
step D: design a target loss function for optimizing the network, take the image block data set as training data, compute the gradients of all parameters in the convolutional neural network by back propagation according to the designed loss function, update the parameters by stochastic gradient descent, and finally learn the optimal parameters of the model;
step E: input the image to be processed into the designed convolutional neural network for single-image raindrop removal, predict the raindrop-removed clean image in the LAB color space with the trained model, and finally transform the clean image back to the RGB color space; step B comprises the following steps:
step B1: crop the LAB color space raindrop-degraded image and its corresponding clean image in the same way to obtain W×W image blocks; the blocks are cropped every m pixels to avoid overlapping crops, and after cropping each W×W raindrop-degraded block corresponds one-to-one with a W×W clean block;
step C comprises the following steps:
step C1: separate the L channel and the B channel of the LAB color space and design a two-branch network, where one branch removes raindrops from the L channel of the raindrop-degraded image and the other branch removes raindrops from its B channel;
step C2: design a multi-scale adaptive fusion network for each branch to extract raindrop-related features and thus optimize the raindrop removal task;
step C3: design a multi-scale adaptive feature fusion module within the multi-scale adaptive fusion network to optimize image feature extraction and feature fusion;
in step C1 the multi-scale adaptive fusion networks of the two branches have the same structure, and step C2 comprises the following steps:
step C21: feed the L-channel image block and the B-channel image block of the raindrop-degraded image into the multi-scale adaptive fusion network of the corresponding branch;
step C22: the processing flow of the L-channel and B-channel image blocks in their respective branch networks is the same, so the L channel is taken as an example; the L-channel image block of the raindrop-degraded image is input into a convolution layer with a ReLU activation function to convert the image into a feature map, and the features are output according to the following formula:
F0 = Conv1(Xl)
where Conv1 denotes the convolution layer with ReLU activation, Xl is the L-channel image block of the raindrop-degraded image, and F0 is the extracted feature map;
step C23: input the feature map F0 into a cascade of designed multi-scale adaptive feature fusion modules; the number of modules is 5, computed according to the following formula:
F1 = MSAFF(MSAFF(MSAFF(MSAFF(MSAFF(F0)))))
where MSAFF denotes the designed multi-scale adaptive feature fusion module;
step C24: input the output F1 of step C23 into a convolution layer with a ReLU activation function to complete the conversion from feature map back to image, and output the single-channel raindrop-removed L-channel image according to the following formula:
Yl = Conv2(F1)
where Conv2 denotes the convolution layer with ReLU activation and Yl is the raindrop-removed L-channel image with channel number 1;
step C3 comprises the following steps:
step C31: in the multi-scale adaptive feature fusion module, the input feature F′ is first fed into 3 dilated convolutions with different dilation rates, computed according to the following formulas:
F1′ = LeakyReLU(Dilated_{r=2}(F′))
F2′ = LeakyReLU(Dilated_{r=4}(F′))
F3′ = LeakyReLU(Dilated_{r=8}(F′))
where F1′, F2′, F3′ are the output features of the dilated convolutions with dilation rates 2, 4 and 8, and Dilated_{r=n} denotes dilated (atrous) convolution, which enlarges the receptive field through the dilation rate parameter r and effectively aggregates spatial context information to extract better features; the dilation rate r controls the spacing between elements of the convolution kernel: when r = 1, dilated convolution is the same as ordinary convolution and the kernel elements are adjacent with no zeros between them; when r > 1, r − 1 zeros are inserted between kernel elements to enlarge the receptive field; the formula of LeakyReLU is as follows:
wherein x represents the input value of the LeakyReLU function, and a is a fixed linear coefficient;
step C32: feed the output features F1′, F2′, F3′ of the 3 dilated convolutions obtained in step C31 into the adaptive feature fusion module, computed according to the following formula:
F4′ = AFF(F1′, F2′, F3′)
where F4′ is the output of the adaptive feature fusion module and AFF(·) denotes the adaptive feature fusion module, defined as follows:
first, the 3 output features F1′, F2′, F3′ of step C31 are summed element-wise:
L′ = F1′ + F2′ + F3′
the result L′ of the element-wise summation is then fed into 1 global average pooling layer and 1 convolution layer of size 1×1, computed as follows:
s′ = AvgPool_r(L′)
z = σ(Conv_{1×1}(s′))
where AvgPool_r denotes global average pooling with step size r, Conv_{1×1} denotes the 1×1 convolution layer used for dimension reduction, the resulting feature vector z has spatial size 1×1, C is the number of channels of the element-wise sum L′, and σ(·) denotes the PReLU activation function, defined as follows:
where i indexes the channels and a_i is a parameter to be learned;
the feature vector z is then fed into 3 parallel convolution branches, computed according to the following formula:
s_i = Softmax(v_i), i = 1, 2, 3
where v_i denotes the output of the 1×1 convolution layer of the i-th branch, which is used for dimension expansion, so that v_i has size 1×1×C, and Softmax is the softmax activation function;
finally, s1, s2, s3 are multiplied with F1′, F2′, F3′ respectively and the products are summed, giving the output of the adaptive feature fusion module, computed as follows:
F4′ = AFF_out = s1·F1′ + s2·F2′ + s3·F3′
step C33: the input feature F′ of the multi-scale adaptive feature fusion module and the feature F4′ obtained in step C32 are summed element-wise, computed as:
F5′ = F′ + F4′
step C34: the feature F5′ obtained in step C33 is fed into a standard residual block and a channel attention block, computed according to the following formula:
F_out = SE(RB(F5′))
where RB(·) denotes the standard residual block, SE(·) denotes the channel attention block, and F_out is the output of the multi-scale adaptive feature fusion module.
2. The single image raindrop removal method based on the LAB color space multi-scale fusion network of claim 1, wherein step A comprises the following steps:
step A1: convert the training image pair of the raindrop-degraded image and the clean image from the RGB color space to the XYZ color space;
step A2: convert the XYZ color space training image pair obtained in step A1 into the LAB color space training image pair.
3. The single image raindrop removal method based on the LAB color space multi-scale fusion network of claim 2, wherein step A1 comprises the following steps:
step A11: in the RGB color space, denote the values of the three channels of any pixel as r, g and b, each in the range [0, 255]; the conversion is carried out according to the following conversion formula:
where the gamma() function performs nonlinear tone editing on the image to improve the image contrast; the following gamma function is used:
step A12: arrange the R, G and B components obtained in step A11 into a vector and multiply it by the coefficient matrix M to complete the conversion from the RGB color space to the XYZ color space, according to the conversion formula [X, Y, Z]^T = M·[R, G, B]^T, where M is the RGB-to-XYZ coefficient matrix;
step A2 comprises the following steps:
step A21: let L*, A*, B* be the values of the three channels of the final LAB color space; the XYZ color space is converted to the LAB color space according to the following conversion formulas:
L* = 116·f(Y/Yn) − 16
A* = 500·[f(X/Xn) − f(Y/Yn)]
B* = 200·[f(Y/Yn) − f(Z/Zn)]
where X, Y, Z are the values computed by the RGB-to-XYZ conversion, Xn, Yn, Zn generally default to 95.047, 100.0 and 108.883, and the specific formula of f(t) is as follows:
4. The single image raindrop removal method based on the LAB color space multi-scale fusion network of claim 1, wherein step D comprises the following steps:
step D1: use a common loss function as a constraint to optimize the network model; the formula is as follows:
where SSIM is the structural similarity loss function; suppose the training image block pairs in the LAB color space obtained in step B are given, where i = 1, …, N and N is the total number of training samples; each pair consists of the image blocks on the L channel and the B channel of the raindrop-degraded input image in the LAB color space and the corresponding image blocks on the L channel and the B channel of the clean image; Yl′ denotes the raindrop-removed image block predicted by the L-channel branch network and Yb′ denotes the raindrop-removed image block predicted by the B-channel branch network;
step D2: randomly divide the image block data set into several batches, each containing the same number of image block pairs, and train and optimize the designed network until the loss value L computed in step D1 converges below a threshold or the number of iterations reaches a threshold; then save the trained model to complete the network training process.
5. The single image raindrop removal method based on the LAB color space multi-scale fusion network of claim 1, wherein step E comprises the following steps:
step E1: convert the generated raindrop-removed clean image in the LAB color space from the LAB color space to the XYZ color space;
step E2: convert the XYZ color space result obtained in step E1 into the final result in the RGB color space.
6. The single image raindrop removal method based on the LAB color space multi-scale fusion network of claim 5, wherein step E1 comprises the following steps:
step E11: in the LAB color space, denote the values of the three channels of any pixel as l, a and b, where l is in the range [0, 100] and a and b are in the range [−127, 128]; the conversion is carried out according to the following conversion formula:
where X′, Y′, Z′ are the values computed by the LAB-to-XYZ conversion, Xn, Yn, Zn default to 95.047, 100.0 and 108.883, and the specific formula of f⁻¹(t) is as follows:
step E2 comprises the following steps:
step E21: arrange the three components X′, Y′, Z′ obtained in step E11 into a vector and multiply it by the coefficient matrix M′ to complete the preliminary conversion from the XYZ color space to the RGB color space, according to the conversion formula [R′, G′, B′]^T = M′·[X′, Y′, Z′]^T, where M′ is the XYZ-to-RGB coefficient matrix;
step E22: convert the preliminary results R′, G′, B′ obtained in step E21 again according to the following conversion formula to obtain the values of the three channels of each pixel in the final RGB color space, each in the range [0, 255]:
where the specific formula of γ⁻¹(x) is as follows:
CN202110938534.9A 2021-08-16 2021-08-16 Single image raindrop removing method based on LAB color space multi-scale fusion network Active CN113658074B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110938534.9A CN113658074B (en) 2021-08-16 2021-08-16 Single image raindrop removing method based on LAB color space multi-scale fusion network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110938534.9A CN113658074B (en) 2021-08-16 2021-08-16 Single image raindrop removing method based on LAB color space multi-scale fusion network

Publications (2)

Publication Number Publication Date
CN113658074A CN113658074A (en) 2021-11-16
CN113658074B 2023-07-28

Family

ID=78491144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110938534.9A Active CN113658074B (en) 2021-08-16 2021-08-16 Single image raindrop removing method based on LAB color space multi-scale fusion network

Country Status (1)

Country Link
CN (1) CN113658074B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109064419A (en) * 2018-07-12 2018-12-21 四川大学 A kind of removing rain based on single image method based on WLS filtering and multiple dimensioned sparse expression
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field
CN112884682A (en) * 2021-01-08 2021-06-01 福州大学 Stereo image color correction method and system based on matching and fusion
CN112907479A (en) * 2021-03-05 2021-06-04 西安电子科技大学 Residual single image rain removing method based on attention mechanism

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109064419A (en) * 2018-07-12 2018-12-21 四川大学 A kind of removing rain based on single image method based on WLS filtering and multiple dimensioned sparse expression
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field
CN112884682A (en) * 2021-01-08 2021-06-01 福州大学 Stereo image color correction method and system based on matching and fusion
CN112907479A (en) * 2021-03-05 2021-06-04 西安电子科技大学 Residual single image rain removing method based on attention mechanism

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Automatic single image-based rain streaks removal via image decomposition; Kang L W; IEEE (No. 04); full text *
Attention mechanism-based single raindrop image enhancement; Zheng Guping; Li Jinhua; Cao Jingang; Computer Applications and Software (No. 09); full text *
Single image rain removal method using a multi-scale dense temporal convolutional network; Zhao Jiaxing; Wang Xiali; Wang Lihong; Cao Chenjie; Computer Technology and Development (No. 05); full text *

Also Published As

Publication number Publication date
CN113658074A (en) 2021-11-16

Similar Documents

Publication Publication Date Title
CN110298266B (en) Deep neural network target detection method based on multiscale receptive field feature fusion
CN108230264B (en) Single image defogging method based on ResNet neural network
CN109285162A (en) A kind of image, semantic dividing method based on regional area conditional random field models
CN111639564B (en) Video pedestrian re-identification method based on multi-attention heterogeneous network
CN111652812A (en) Image defogging and rain removing algorithm based on selective attention mechanism
CN112365414B (en) Image defogging method based on double-path residual convolution neural network
CN110363727B (en) Image defogging method based on multi-scale dark channel prior cascade deep neural network
CN113887349A (en) Road area image identification method based on image and point cloud fusion network
CN111861925A (en) Image rain removing method based on attention mechanism and gate control circulation unit
CN110969171A (en) Image classification model, method and application based on improved convolutional neural network
CN111461006B (en) Optical remote sensing image tower position detection method based on deep migration learning
CN110838095B (en) Single image rain removing method and system based on cyclic dense neural network
CN111160356A (en) Image segmentation and classification method and device
CN111815526B (en) Rain image rainstrip removing method and system based on image filtering and CNN
CN112784834A (en) Automatic license plate identification method in natural scene
CN111199255A (en) Small target detection network model and detection method based on dark net53 network
Li et al. An end-to-end system for unmanned aerial vehicle high-resolution remote sensing image haze removal algorithm using convolution neural network
CN111274964A (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
CN113989785A (en) Driving scene classification method, device, equipment and storage medium
CN113627481A (en) Multi-model combined unmanned aerial vehicle garbage classification method for smart gardens
CN108171124B (en) Face image sharpening method based on similar sample feature fitting
CN113421210A (en) Surface point cloud reconstruction method based on binocular stereo vision
CN113658074B (en) Single image raindrop removing method based on LAB color space multi-scale fusion network
CN111612803B (en) Vehicle image semantic segmentation method based on image definition
CN111160282B (en) Traffic light detection method based on binary Yolov3 network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant