CN112767280A - Single image raindrop removing method based on loop iteration mechanism - Google Patents
- Publication number
- CN112767280A (application number CN202110134465.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- convolution
- network
- module
- raindrop
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06T5/73
- G06T2207/20081 — Training; Learning (G06T2207/20 Special algorithmic details, under G06T2207/00 Indexing scheme for image analysis or image enhancement)
- G06T2207/20084 — Artificial neural networks [ANN] (G06T2207/20 Special algorithmic details, under G06T2207/00 Indexing scheme for image analysis or image enhancement)
Abstract
The invention relates to a single-image raindrop removal method based on a loop iteration mechanism. The method comprises the following steps: preprocessing training image pairs of original raindrop-degraded images and clean images to obtain an image block data set consisting of such training pairs; designing a convolutional neural network for single-image raindrop removal, motivated by iteratively removing rain; designing a target loss function for optimizing the network, taking the image block data set as training data, computing the gradient of each parameter in the designed raindrop removal convolutional neural network by back propagation according to the designed loss function, updating the parameters by stochastic gradient descent, and finally learning the optimal parameters of the model; and inputting the image to be processed and predicting a clean, raindrop-free image with the trained model. The method can significantly improve raindrop removal performance while greatly reducing the network parameter size.
Description
Technical Field
The invention relates to the field of image and video processing and computer vision, in particular to a single image raindrop removing method based on a loop iteration mechanism.
Background
With the rapid development of the internet and multimedia technology, images have become an indispensable medium for human communication and information transfer, and are of great significance to many aspects of modern society. However, images are often acquired outdoors, where they are inevitably affected by bad weather such as rain, fog and snow. Such weather degrades captured images and videos visually, reducing contrast, saturation and visibility. Rain is a common weather phenomenon in daily life, and raindrops attached to a window, a windshield or a camera lens obstruct the visibility of the background scene, reduce image quality and cause severe degradation. Detailed information in the image can no longer be identified, its practical value is greatly reduced, and high-level visual understanding tasks such as object detection, pedestrian re-identification and image segmentation become more difficult. In rainy weather, raindrops inevitably adhere to the camera lens; these attached raindrops blur or occlude parts of the background or foreground scene, so removing raindrops from a single image to restore a clean background is of great significance.
Although single-image rain streak removal has been well explored, studies on single-image raindrop removal are few, and the well-explored rain streak removal methods cannot be applied directly to raindrop removal. Although raindrops are less dense than rain streaks, they are generally larger in size and can completely occlude the background. The appearance of a raindrop is also influenced by many parameters, and its shape is usually quite different from thin, vertical rain streaks. Meanwhile, the physical modeling of raindrops is completely different from that of rain streaks, which makes raindrop removal more difficult.
Current methods for single-image raindrop removal fall largely into two broad categories: model-based methods and deep-learning-based methods. Model-based methods generally use a filter to decompose the rainy image into high- and low-frequency components, and then separate the rain component from the non-rain component within the high-frequency part through dictionary learning. Most model-based methods rely on manually preset parameters for feature extraction and performance optimization, so they cannot extract raindrop features well and their raindrop removal performance is poor.
Deep-learning-based methods are data-driven: they train a convolutional neural network on a large amount of data. The powerful feature learning and representation capability of convolutional neural networks extracts image features better and achieves a better raindrop removal effect. However, these deep-learning-based methods share a problem: they cannot balance raindrop removal performance against network parameter count well. They either achieve good performance at the cost of a large number of parameters, which greatly limits their practical value in applications with limited computational resources, or use fewer parameters at the cost of poorer performance. How to design an efficient and practical single-image raindrop removal method is therefore a focus of future research.
Disclosure of Invention
The invention aims to provide a single image raindrop removing method based on a loop iteration mechanism, which can obviously improve the performance of removing raindrops of images and greatly reduce the size of network parameters.
In order to achieve the purpose, the technical scheme of the invention is as follows: a single image raindrop removing method based on a loop iteration mechanism comprises the following steps:
step A, preprocessing a training image pair of an original raindrop degradation image and a clean image to obtain an image block data set consisting of the training image pair of the original raindrop degradation image and the clean image;
step B, designing a convolutional neural network for single-image raindrop removal, based on the idea of "divide and conquer" and motivated by iteratively removing rain;
step C, designing a target loss function loss for optimizing the network, taking the image block data set as training data, calculating the gradient of each parameter in the designed single-image raindrop removal convolutional neural network by using a back propagation method according to the designed target loss function loss, updating the parameter by using a random gradient descent method, and finally learning the optimal parameter of the single-image raindrop removal convolutional neural network;
and step D, inputting the image to be processed into the designed single-image raindrop removal convolutional neural network, and predicting a clean, raindrop-free image with the trained network.
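The loop iteration mechanism behind steps B and D, in which each stage receives its predecessor's derained output stacked with the original rainy input, can be sketched as follows (a toy numpy sketch; `stage` is a hypothetical stand-in for the real shared-parameter sub-network of convolutions, ConvLSTM and aggregation modules):

```python
import numpy as np

def stage(x_concat):
    """Placeholder for the shared-parameter stage sub-network.
    Here it simply averages the two concatenated inputs; the real network
    would apply convolutions, a ConvLSTM, and the aggregation modules."""
    c = x_concat.shape[-1]
    half = c // 2
    return 0.5 * (x_concat[..., :half] + x_concat[..., half:])

def iterative_derain(i_o, num_stages=3):
    """Loop iteration: stage t sees [I_{t-1}, I_o] stacked on channels.
    For the first stage, I_{t-1} is the raindrop image I_o itself."""
    i_t = i_o
    for _ in range(num_stages):
        i_t = stage(np.concatenate([i_t, i_o], axis=-1))
    return i_t

img = np.random.rand(8, 8, 3)
out = iterative_derain(img)
print(out.shape)  # (8, 8, 3)
```

Because every stage shares one set of parameters, the number of iterations can be increased without growing the model, which is how the method keeps its parameter count small.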
In an embodiment of the present invention, step A is implemented as follows: cut the original raindrop-degraded image and the corresponding clean image into blocks in a consistent manner to obtain W × W image blocks, cutting one block every m pixels so as to avoid overlap between blocks; after cutting, the W × W raindrop image blocks and the W × W clean image blocks correspond one to one.
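A minimal numpy sketch of this dicing step (W and m are hypothetical values; the patent does not fix them):

```python
import numpy as np

def dice_pairs(rain_img, clean_img, W=64, m=64):
    """Cut an aligned (rain, clean) image pair into W x W blocks, taking one
    block every m pixels; with m >= W the blocks do not overlap, matching
    the non-overlapping dicing described in step A."""
    assert rain_img.shape == clean_img.shape
    H, Wd = rain_img.shape[:2]
    pairs = []
    for y in range(0, H - W + 1, m):
        for x in range(0, Wd - W + 1, m):
            # identical coordinates keep the rain/clean blocks in
            # one-to-one correspondence
            pairs.append((rain_img[y:y+W, x:x+W], clean_img[y:y+W, x:x+W]))
    return pairs

rain = np.random.rand(128, 128, 3)
clean = np.random.rand(128, 128, 3)
pairs = dice_pairs(rain, clean)
print(len(pairs))  # 4 non-overlapping 64x64 blocks from a 128x128 pair
```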
In an embodiment of the present invention, the step B is implemented as follows:
step B1, designing a multi-stage raindrop removal network, wherein the network is a convolutional neural network based on a loop iteration mechanism and specifically consists of a plurality of stage sub-networks with the same network structure and shared network parameters;
step B2, designing the stage sub-network, which extracts raindrop-related features so as to better remove raindrops;
step B3, designing the context aggregation module and the attention context aggregation module in the stage sub-network, used for aggregating the spatial context information that the network otherwise lacks.
In an embodiment of the present invention, the step B2 is implemented as follows:
step B21, concatenating the derained image block produced by the previous-stage sub-network with the corresponding original raindrop image block along the channel dimension as the input of each stage sub-network; for the first-stage sub-network, the input is the result of concatenating two copies of the original raindrop image block along the channel dimension;
step B22, inputting the concatenated result from step B21 into a convolutional layer with ReLU activation to convert the image into a feature map, output according to the following formula:

F_0 = Conv1(Concat(I_{t-1}, I_o))

where Conv1(·) denotes the convolutional layer with ReLU activation, I_o is the original raindrop image block, I_{t-1} is the derained image block from the previous stage (for the first stage, I_{t-1} is I_o), Concat(·,·) denotes concatenating features along the channel dimension, and F_0 denotes the extracted feature map;
step B23, feeding the feature map F_0 into a convolutional long short-term memory (ConvLSTM) module, which consists of a forgetting gate f, an input gate i and an output gate o, computed according to the following formulas:

f_t = σ(W_xf * F_0 + W_hf * H_{t-1} + W_cf ⊙ C_{t-1} + b_f)
i_t = σ(W_xi * F_0 + W_hi * H_{t-1} + W_ci ⊙ C_{t-1} + b_i)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ tanh(W_xc * F_0 + W_hc * H_{t-1} + b_c)
o_t = σ(W_xo * F_0 + W_ho * H_{t-1} + W_co ⊙ C_t + b_o)
F_1 = H_t = o_t ⊙ tanh(C_t)

where the forgetting gate f_t and the input gate i_t at time t take three inputs: the feature map F_0, the output H_{t-1} of the ConvLSTM module at the previous time (i.e. t-1), and the cell state C_{t-1} at the previous time; the output gate o_t at time t takes the feature map F_0, the previous output H_{t-1}, and the cell state C_t at time t. W_* and b_* denote the weight and bias parameters of the corresponding convolution kernels, tanh is the hyperbolic tangent function, σ is the Sigmoid function, * denotes convolution, and ⊙ denotes element-wise (dot) product. C_t is the cell state at the current time t, which is passed to the ConvLSTM module at the next time; H_t is the feature map output by the ConvLSTM module at the current time t. For convenience of description, H_t is denoted F_1;
Each time step above corresponds to one stage of the multi-stage network; for the first stage of the network, which has no previous stage, the recurrent inputs of its forgetting gate and input gate are set to 0;
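The gate equations above can be sketched in numpy; to keep it short, the convolutions are reduced to 1x1 (scalar) kernels so each `*` becomes an element-wise product, and the first-stage case is shown with H and C initialized to zero (all weight values are hypothetical placeholders):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(F0, H_prev, C_prev, p):
    """One step of the ConvLSTM gates from the text, sketched with 1x1
    kernels so each convolution reduces to an element-wise product.
    p is a dict of scalar weights W_* and biases b_* (hypothetical)."""
    f = sigmoid(p['Wxf']*F0 + p['Whf']*H_prev + p['Wcf']*C_prev + p['bf'])
    i = sigmoid(p['Wxi']*F0 + p['Whi']*H_prev + p['Wci']*C_prev + p['bi'])
    C = f*C_prev + i*np.tanh(p['Wxc']*F0 + p['Whc']*H_prev + p['bc'])
    o = sigmoid(p['Wxo']*F0 + p['Who']*H_prev + p['Wco']*C + p['bo'])
    H = o*np.tanh(C)  # F_1 = H_t
    return H, C

params = {k: 0.5 for k in
          ['Wxf', 'Whf', 'Wcf', 'bf', 'Wxi', 'Whi', 'Wci', 'bi',
           'Wxc', 'Whc', 'bc', 'Wxo', 'Who', 'Wco', 'bo']}
F0 = np.random.rand(4, 4)
# First stage: no previous stage, so H and C start at zero.
H, C = convlstm_step(F0, np.zeros_like(F0), np.zeros_like(F0), params)
print(H.shape)  # (4, 4)
```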
step B24, feeding the output F_1 of the ConvLSTM module through a sequence of context aggregation modules and attention context aggregation modules, in order: context aggregation module (rate 2) -> attention context aggregation module (rate 2) -> context aggregation module (rate 2) -> context aggregation module (rate 4) -> attention context aggregation module (rate 4) -> context aggregation module (rate 4), computed according to the following formula:

F_2 = CAU_4(SECAU_4(CAU_4(CAU_2(SECAU_2(CAU_2(F_1))))))

where CAU_r(·) denotes a context aggregation module with dilation rate r and SECAU_r(·) denotes an attention context aggregation module with dilation rate r;
step B25, inputting the output F_2 of step B24 into a standard residual module, then feeding the result into a convolutional layer with ReLU activation to convert the feature map back into a 3-channel image, outputting the derained image of the current stage sub-network t according to the following formula:

I_t = Conv2(Res(F_2))

where Res(·) denotes the standard residual module, Conv2(·) denotes the convolutional layer with ReLU activation, and I_t is the derained image of the current stage sub-network t.
In an embodiment of the present invention, the step B3 is implemented as follows:
step B31, in the context aggregation module, the input feature F is first fed into a smoothed dilated convolution module, computed according to the following formula:

F_3 = Dilated_r(Sep(F))

where F_3 is the output feature of the smoothed dilated convolution module, F is the input of the context aggregation module, Sep(·) is the separable shared convolutional layer, i.e. a channel-wise separable convolution in which all channels share parameters, and Dilated_r(·) is the dilated (atrous) convolution, whose dilation rate r enlarges the receptive field and effectively aggregates spatial context information for better feature extraction. The dilation rate r determines the zero spacing between kernel elements: when r = 1, dilated convolution is identical to ordinary convolution (kernel elements are adjacent, with no zeros); when r > 1, r-1 zeros are inserted between kernel elements to enlarge the receptive field. The rate r in step B24 is this dilation rate;
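The relationship between the dilation rate r and the inserted zeros can be checked directly: dilating a kernel with rate r inserts r-1 zeros between its taps, so a rate-r dilated convolution equals an ordinary convolution with the zero-padded kernel (1-D numpy sketch):

```python
import numpy as np

def dilate_kernel(k, r):
    """Insert r-1 zeros between kernel taps (1-D case)."""
    out = np.zeros((len(k) - 1) * r + 1)
    out[::r] = k
    return out

def conv1d_valid(x, k):
    """Plain 'valid' correlation, written out for clarity."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i+len(k)], k) for i in range(n)])

x = np.arange(10, dtype=float)
k = np.array([1.0, 2.0, 1.0])
# Rate 1 is identical to ordinary convolution; rate 2 spaces the same
# three taps over a receptive field of 5 without adding parameters.
y_r1 = conv1d_valid(x, dilate_kernel(k, 1))
y_r2 = conv1d_valid(x, dilate_kernel(k, 2))
print(len(dilate_kernel(k, 2)))  # 5
```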
The attention context aggregation module differs from the context aggregation module only in the added channel attention step; the remaining steps are identical, and it is computed according to the following formula:

F_3 = SE(Dilated_r(Sep(F)))

where SE(·) denotes the channel attention module;
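The patent does not spell out the internals of SE(·); a common realization is the squeeze-and-excitation recipe of global average pooling followed by a two-layer bottleneck and sigmoid gating, sketched here under that assumption (all weight shapes and values are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_channel_attention(F, W1, W2):
    """Squeeze-and-excitation style channel attention (assumed form).
    F: (H, W, C) feature map; W1: (C, C//r); W2: (C//r, C)."""
    s = F.mean(axis=(0, 1))                  # squeeze: global average pool per channel
    e = sigmoid(np.maximum(s @ W1, 0) @ W2)  # excitation: FC -> ReLU -> FC -> sigmoid
    return F * e                             # rescale each channel by its weight

C, r = 8, 2
F = np.random.rand(4, 4, C)
W1 = np.random.rand(C, C // r) * 0.1
W2 = np.random.rand(C // r, C) * 0.1
out = se_channel_attention(F, W1, W2)
print(out.shape)  # (4, 4, 8)
```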
step B32, in both the attention context aggregation module and the context aggregation module, the feature F_3 output by step B31 is fed into a residual module built from self-calibrated convolution, computed according to the following formula:

F_4 = LeakyReLU(F_3 + SCC(F_3))

F_4 is the output of this residual module, which comprises a self-calibrated convolution, a LeakyReLU function and a residual connection; LeakyReLU(x) is defined as:

LeakyReLU(x) = x if x > 0, and a·x otherwise

where x is the input of the LeakyReLU function and a is a fixed linear coefficient;
SCC(·) is the self-calibrated convolution, defined as follows:

First, the output feature F_3 of step B31 is fed into 1 × 1 convolutional layers with no activation function:

X_1, X_2 = Conv_{1×1}(F_3)

where Conv_{1×1} denotes a 1 × 1 convolutional layer, and X_1, X_2 are feature maps with the channel count halved by the 1 × 1 convolutions, i.e. if F_3 has C channels, X_1 and X_2 each have C/2 channels;
Then X_1 and X_2 are fed into their respective branches, where the self-calibration branch applied to X_1 is computed according to the following formulas:

T_1 = AvgPool_r(X_1)
X′_1 = Up(T_1 * K_2)
Y′_1 = (X_1 * K_3) ⊙ σ(X_1 + X′_1)
Y_1 = Y′_1 * K_4

where AvgPool_r(·) is average pooling with stride r, Up(·) is the upsampling operation, * is the convolution operation, ⊙ is the element-wise multiplication operator, + is the element-wise addition operator, and σ is the sigmoid activation function; K_2, K_3, K_4 are convolution kernels of the same size; Y_1 is the output of the self-calibration branch;
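The self-calibration branch can be sketched in numpy with scalar stand-ins for the kernels K_2, K_3, K_4 and a 1-D feature, showing how the pooled-and-upsampled signal gates the response through the sigmoid (all parameter values are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def self_calibrated_branch(X1, k2=1.0, k3=1.0, k4=1.0, r=2):
    """Sketch of the self-calibration branch with scalar (1x1) kernels.
    k2..k4 are hypothetical scalar stand-ins for the kernels K_2..K_4."""
    # T_1 = AvgPool_r(X_1): average pooling with stride r (1-D here)
    T1 = X1.reshape(-1, r).mean(axis=1)
    # X'_1 = Up(T_1 * K_2): convolve (scalar multiply) then upsample back
    X1p = np.repeat(T1 * k2, r)
    # Y'_1 = (X_1 * K_3) ⊙ σ(X_1 + X'_1): sigmoid calibration gate
    Y1p = (X1 * k3) * sigmoid(X1 + X1p)
    # Y_1 = Y'_1 * K_4
    return Y1p * k4

X1 = np.arange(8, dtype=float)
Y1 = self_calibrated_branch(X1)
print(Y1.shape)  # (8,)
```

The gating means each response is attenuated according to its own pooled surroundings, which is what lets the branch calibrate features with a larger effective receptive field at negligible parameter cost.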
Meanwhile, X_2 is fed into the corresponding ordinary convolution branch, computed according to the following formula:

Y_2 = X_2 * K_1

where K_1 is the convolution kernel of this branch. Finally, the outputs of the two branches are concatenated along the channel dimension to restore the channel count to the original C:

Y = Concat(Y_1, Y_2)
In an embodiment of the present invention, the step C is implemented as follows:
step C1, optimizing the single-image raindrop removal convolutional neural network using an SSIM-based loss function L as the constraint, where SSIM is the structural similarity index. Given training image pairs (X_i, Y_i), i = 1, …, N (N being the total number of training samples), X_i is an input image block corrupted by raindrops, Y_i is the image block of its corresponding clean image, and Y denotes the clean, derained image block predicted by the network when the training pair (X_i, Y_i) is used; the loss L measures the structural similarity between Y and Y_i;
and step C2, randomly dividing the image block data set into batches and training the designed network until the loss L computed in step C1 converges below a threshold or the number of iterations reaches a preset limit; the trained model is then saved and the network training process is complete.
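As a sketch of an SSIM-based objective (the patent elides the exact formula; negative SSIM, minimized during training, is a common choice in deraining work and is assumed here), using a single global window instead of the usual local Gaussian windows:

```python
import numpy as np

def ssim_global(x, y, c1=0.01**2, c2=0.03**2):
    """Single-window SSIM over whole patches (intensities in [0, 1]).
    The standard formulation averages SSIM over local windows; one
    global window keeps the sketch short."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2*mx*my + c1) * (2*cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

def neg_ssim_loss(pred_batch, gt_batch):
    """Negative mean SSIM over a batch, to be minimized during training."""
    return -np.mean([ssim_global(p, g) for p, g in zip(pred_batch, gt_batch)])

a = np.random.rand(2, 16, 16)
loss_same = neg_ssim_loss(a, a)
print(round(loss_same, 6))  # -1.0: identical patches have SSIM 1
```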
Compared with the prior art, the invention has the following beneficial effects: based on the divide-and-conquer concept, the raindrop removal task is decomposed into a loop iteration process, and raindrops are removed cyclically by a designed multi-stage convolutional neural network to achieve a better image restoration result. Meanwhile, recent smoothed dilated convolution and self-calibrated convolution modules are adopted to better aggregate spatial context information and to eliminate the artifacts that ordinary dilated convolution produces during raindrop removal. The method designs a dedicated convolutional neural network for the image raindrop removal problem; it ensures the quality of the derained image while having a smaller network parameter size than other methods, and therefore has higher practical value.
Drawings
FIG. 1 is a flow chart of an implementation of the method of the present invention.
Fig. 2 is a structural diagram of a single image raindrop removal method model based on a loop iteration mechanism in the embodiment of the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
The following is a specific implementation of the present invention.
As shown in fig. 1, a single image raindrop removal method based on a loop iteration mechanism includes the following steps:
step A, preprocessing a training image pair of an original raindrop degradation image and a clean image to obtain an image block data set consisting of the training image pair of the original raindrop degradation image and the clean image;
step B, designing a convolutional neural network for single-image raindrop removal, based on the idea of "divide and conquer" and motivated by iteratively removing rain;
step C, designing a target loss function loss for optimizing the network, taking the image block data set as training data, calculating the gradient of each parameter in the designed single-image raindrop removal convolutional neural network by using a back propagation method according to the designed target loss function loss, updating the parameter by using a random gradient descent method, and finally learning the optimal parameter of the single-image raindrop removal convolutional neural network;
and step D, inputting the image to be processed into the designed single-image raindrop removal convolutional neural network, and predicting a clean, raindrop-free image with the trained network.
Further, the step a comprises the steps of:
step A1: and dicing the original raindrop degraded image and the corresponding clean image according to a consistent mode to obtain W multiplied by W-sized image blocks, and simultaneously dicing every m pixel points in order to avoid overlapping dicing. After the dicing, the W multiplied by W image blocks with the raindrops and the W multiplied by W clean image blocks correspond to each other one by one.
Further, the step B includes the steps of:
step B1, designing a multi-stage raindrop removal network, wherein the network is a convolutional neural network based on a loop iteration mechanism and specifically consists of a plurality of stage sub-networks with the same network structure and shared network parameters;
step B2, designing a phase sub-network, which is used for extracting the relevant features of the raindrops so as to better remove the raindrops;
step B3, a context aggregation module and an attention context aggregation module in the sub-network in the design phase are used for aggregating the spatial context information lacking in the sub-network;
further, the step B2 includes the following steps:
and step B21, splicing the image block obtained by the previous-stage sub-network after the raindrops are removed and the corresponding original raindrop image block on the channel as the input of the sub-network of each stage. Note that, for the sub-network in the first stage, the input is the result of splicing the two original raindrop image blocks on the channel;
step B22, inputting the spliced result obtained in step B21 into a convolutional layer with an activation function of ReLU, converting the image into a feature map, and outputting the features according to the following formula:
wherein Conv1 represents the convolutional layer with the activation function ReLU, IoFor the original raindrop image block, It-1Representing the image block obtained in the previous stage after raindrop removal, for the first stage, It-1Is Io,Representing operation according to a channel splicing characteristic, F0Representing the extracted feature map.
Step B23, converting the feature map F0Inputting the data into a convolution long-short term memory network module which consists of a forgetting gate f, an input gate i and an output gate o and is calculated according to the following formula:
ft=σ(Wxf*F0+Whf*Ht-1+Wcf⊙Ct-1+bf)
it=σ(Wxi*F0+Whi*Ht-1+Wci⊙Ct-1+bi)
Ct=ft⊙Ct-1+it⊙tanh(Wxc*F0+Whc*Ht-1+bc)
ot=σ(Wxo*F0+Who*Ht-1+Wco⊙Ct+bo)
F1=Ht=ot⊙tanh(Ct)
wherein, forget the door f at the moment of ttAnd an input gate itIs inputted by the feature diagram F0Convolution of the output H of the long-short term memory network module at the last time (i.e. t-1)t-1And cell information state C at last momentt-1Formed by three parts, an output gate o at time ttIs inputted from the feature map F0Convolution of the output H of the long-short term memory network module at the last time (i.e. t-1)t-1And time t cell information state CtThese three parts constitute. W*And b*The weight parameter and the bias parameter are respectively corresponding to the convolution kernel, tanh represents a tangent function, σ represents a Sigmoid function, an operator indicates a convolution operation, and an operator indicates a dot-by-dot operation. CtFor the cell information state at the current time t, the cell information state is fed to the convolution long-short term memory network module at the next time, HtAnd (3) representing a characteristic diagram output by the convolution long-term and short-term memory network module at the current time t. For convenience to connectFrom the description of the method, we will turn HtIs marked as F1。
Our method converts each of the above-described time instants into each of the stages in a multi-stage manner. Note that: for the first phase of our network, we set the inputs of their forgetting gate and input gate to 0, since they have no previous phase.
Step B24, the output F of the convolution long-short term memory network module1Inputting into multiple context aggregation modules and attention context aggregation module, sequentially including context aggregation module with expansion rate of 2>Attention context aggregation Module with expansion Rate 2->Context aggregation Module with expansion Rate 2>Context aggregation Module with expansion ratio of 4->Attention context aggregation Module with expansion ratio of 4->The context aggregation module with the expansion rate of 4 is calculated according to the following formula:
F2=CAU4(SECAU4(CAU4(CAU2(SECAU2(CAU2(F1))))))
wherein, the CAUr(. indicates a context aggregation Module with an expansion Rate r, SECAUr(. x) denotes the attention context aggregation module with expansion rate r.
Step B25, and outputting the result F of the step B242Inputting the raindrop removal image into a standard residual module, then sending the raindrop removal image into a convolution layer with an activation function of ReLU to complete the conversion from the characteristic diagram to the image, and outputting the raindrop removal image of the sub-network t at the current stage with the channel number of 3 according to the following formula:
It=Conv2(Res(F2))
where Res (. + -.) denotes standard residual module, Conv2 denotes convolutional layer with activation function ReLU, ItThe image is removed for raindrops of the sub-network t at the current stage.
Further, the step B3 includes the following steps:
step B31, in the context aggregation module, firstly, the input feature F is sent to a smooth hole convolution module, and the calculation is carried out according to the following formula:
F3=Dilatedr(Sep(F))
wherein, F3For the output characteristics of the smooth hole convolution module, F is the input of the context aggregation module, Sep is the separable shared convolution layer, i.e. the separable convolution based on the channel and all the channels share the parameters, scaledrAnd the (star) is a hole (expansion) convolution, the receptive field is increased through the parameter of the expansion rate r, and the spatial context information is effectively aggregated to better extract the features. The expansion rate r represents the interval of the elements in the convolution kernel by 0, when r is 1, the void convolution is the same as the ordinary convolution, the elements in the convolution kernel are adjacent to each other, and no 0 exists; when r is more than 1, r-1 0 s are needed to be inserted between elements in the convolution kernel to enlarge the receptive field. The expansion rate r in step B24 is the expansion rate of the hole convolution.
The attention context aggregation module differs from the context aggregation module only in that a channel attention module is appended at this step; all other steps are identical. It is computed according to the following formula:
F_3 = SE(Dilated_r(Sep(F)))
where SE(·) denotes the channel attention module.
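A minimal sketch of what the channel attention (SE) module does, with hand-set weights w1 and w2 standing in for the learned fully connected layers (hypothetical values for illustration only):

```python
import math

def se_block(feature_maps, w1, w2):
    """Squeeze-and-excitation sketch: pool each channel to a scalar,
    pass through two small dense layers, and gate the channels with
    the resulting sigmoid weights.  w1/w2 are hypothetical weights."""
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    # Squeeze: global average pooling per channel.
    z = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation: FC -> ReLU -> FC -> sigmoid.
    hidden = [max(0.0, sum(zi * wi for zi, wi in zip(z, col))) for col in w1]
    gates = [sigmoid(sum(hi * wi for hi, wi in zip(hidden, col))) for col in w2]
    # Reweight each channel by its attention gate in (0, 1).
    return [[v * g for v in ch] for ch, g in zip(feature_maps, gates)]

# Two channels, each rescaled by its own learned gate.
out = se_block([[1.0, 1.0], [2.0, 2.0]], w1=[[1.0, 0.0]], w2=[[1.0], [1.0]])
```

The gate values lie in (0, 1), so the module can only attenuate channels, never amplify them; raindrop-relevant channels keep gates near 1.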
Step B32: in both the attention context aggregation module and the context aggregation module, the feature F_3 output by step B31 is fed into a residual module built from self-calibrated convolution and computed according to the following formula:
F_4 = LeakyReLU(F_3 + SCC(F_3))
where F_4 is the output of the residual module built from self-calibrated convolution; the module comprises a self-calibrated convolution, a LeakyReLU function and a residual connection, and LeakyReLU(x) is defined as:
LeakyReLU(x) = x, if x > 0; LeakyReLU(x) = a·x, otherwise
where x represents the input value of the LeakyReLU function and a is a fixed linear coefficient.
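The piecewise definition above in code; the slope a = 0.01 is a common default chosen for illustration, since the patent fixes a but does not state its value:

```python
def leaky_relu(x, a=0.01):
    """LeakyReLU: identity for positive inputs, small linear slope a
    for negative inputs, so gradients never vanish entirely."""
    return x if x > 0 else a * x
```

Unlike ReLU, negative inputs still pass a scaled signal, which keeps the residual branch trainable when F_3 + SCC(F_3) goes negative.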
SCC(·) is the self-calibrated convolution, defined as follows:
First, the output feature F_3 of step B31 is fed into two 1×1 convolutional layers without activation functions:
X_1, X_2 = Conv_1×1(F_3)
where Conv_1×1 denotes a 1×1 convolutional layer, and X_1, X_2 are the resulting feature maps with the channel count halved, i.e. if F_3 has C channels, then X_1 and X_2 each have C/2 channels.
X_1 and X_2 are then fed into their respective branches, where the branch applying the self-calibration operation to X_1 is computed according to the following formulas:
T_1 = AvgPool_r(X_1)
X′_1 = Up(T_1 * K_2)
Y′_1 = (X_1 * K_3) ⊙ σ(X_1 + X′_1)
Y_1 = Y′_1 * K_4
where AvgPool_r(·) is average pooling with stride r, Up(·) is the upsampling operation, * is the convolution operation, ⊙ is the element-wise multiplication operator, + is the element-wise addition operator, and σ is the sigmoid activation function. K_2, K_3, K_4 are convolution kernels of the same size. Y_1 is the output result of the self-calibration branch.
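The self-calibration branch can be sketched in one dimension as follows, with the convolutions K_2, K_3, K_4 replaced by identity maps so that only the pool-upsample-gate structure remains (an illustrative simplification following the standard self-calibrated convolution formulation, not the learned network):

```python
import math

def avg_pool(xs, r):
    """Average pooling with window and stride r over a 1-D signal."""
    return [sum(xs[i:i + r]) / r for i in range(0, len(xs) - r + 1, r)]

def upsample(xs, r):
    """Nearest-neighbour upsampling back to the original resolution."""
    return [v for v in xs for _ in range(r)]

def self_calibration_branch(x1, r=2):
    """1-D sketch: a low-resolution summary of X1 gates the
    full-resolution features through a sigmoid."""
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    t1 = avg_pool(x1, r)           # T1 = AvgPool_r(X1)
    x1p = upsample(t1, r)          # X'1 = Up(T1 * K2), K2 = identity here
    # Y'1 = (X1 * K3) ⊙ σ(X1 + X'1), with K3 (and K4) as identity maps
    return [a * sigmoid(a + b) for a, b in zip(x1, x1p)]

y1 = self_calibration_branch([1.0, 1.0, 2.0, 2.0], r=2)
```

The pooled summary acts as spatial context: each position is rescaled by how strongly it agrees with its local neighbourhood average.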
Meanwhile, X_2 is fed into the corresponding ordinary convolution branch and computed according to the following formula:
Y_2 = X_2 * K_1
Finally, the outputs of the two branches are concatenated along the channel dimension to restore the channel count to the original C, computed according to the following formula:
Y = Concat(Y_1, Y_2)
Further, the step C includes the steps of:
Step C1, a common loss function is used as the constraint to optimize the network model; the specific formula is as follows:
L = −SSIM(Y, Y_i)
where SSIM is the structural similarity loss function. Given training image pairs (X_i, Y_i), i = 1, …, N (N is the total number of training samples), X_i is an image block of a raindrop-corrupted input image, Y_i is the corresponding clean image block, and Y denotes the raindrop-removed clean image block predicted by the network when trained with the pair (X_i, Y_i);
and step C2, the image block data set is randomly divided into several batches, and the designed network is trained and optimized until the L value computed in step C1 converges below a threshold or the number of iterations reaches its upper limit; the trained model is then saved, completing the network training process.
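The stopping rule of step C2 (loss below a threshold, or iteration budget exhausted) can be sketched as follows; loss_fn and step_fn are hypothetical stand-ins for the network's forward pass and the stochastic-gradient update:

```python
import random

def train(batches, loss_fn, step_fn, loss_threshold=1e-3, max_iters=1000):
    """Iterate over randomly chosen batches until the loss value L
    converges below loss_threshold or max_iters is reached."""
    iters, loss = 0, float("inf")
    while loss > loss_threshold and iters < max_iters:
        batch = random.choice(batches)  # random batch division
        loss = loss_fn(batch)           # compute L for this batch
        step_fn(batch)                  # SGD parameter update
        iters += 1
    return iters, loss                  # the model would be saved here
```

With a loss sequence that first drops below the threshold on its third value, training stops after exactly three iterations.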
The above are preferred embodiments of the present invention, and all changes made according to the technical scheme of the present invention that produce functional effects do not exceed the scope of the technical scheme of the present invention belong to the protection scope of the present invention.
Claims (6)
1. A single image raindrop removing method based on a loop iteration mechanism is characterized by comprising the following steps:
step A, preprocessing a training image pair of an original raindrop degradation image and a clean image to obtain an image block data set consisting of the training image pair of the original raindrop degradation image and the clean image;
step B, designing a convolutional neural network for single-image raindrop removal, based on the idea of "divide and conquer" and motivated by iteratively removing rain;
step C, designing a target loss function loss for optimizing the network, taking the image block data set as training data, calculating the gradient of each parameter in the designed single-image raindrop removal convolutional neural network by using a back propagation method according to the designed target loss function loss, updating the parameter by using a random gradient descent method, and finally learning the optimal parameter of the single-image raindrop removal convolutional neural network;
and D, inputting the image to be detected into the designed single image raindrop removal convolutional neural network, and predicting to generate a clean image after raindrop removal by using the trained single image raindrop removal convolutional neural network.
2. The method for removing raindrops of a single image based on a loop iteration mechanism according to claim 1, wherein the step A is specifically implemented as follows: the original raindrop-attached degraded image and the corresponding clean image are cropped in the same manner into W×W image blocks, with a crop taken every m pixel points so as to avoid overlapping crops; after cropping, the W×W raindrop-attached image blocks and the W×W clean image blocks are in one-to-one correspondence.
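The cropping grid of step A can be sketched as follows; the values W = 64 and m = 64 are illustrative, since the claim leaves both unspecified:

```python
def dice(height, width, w=64, m=64):
    """Top-left coordinates of w x w crops taken every m pixels.

    With m >= w the crops do not overlap; applying the SAME grid to the
    raindrop image and the clean image keeps the patch pairs in
    one-to-one correspondence.
    """
    return [(y, x)
            for y in range(0, height - w + 1, m)
            for x in range(0, width - w + 1, m)]

coords = dice(128, 128)  # 2 x 2 grid of non-overlapping 64 x 64 patches
```

Each coordinate is used to cut both images, so patch i of the degraded image always matches patch i of the clean image.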
3. The method for removing raindrops of a single image based on a loop iteration mechanism according to claim 1, wherein the step B is implemented by the following steps:
step B1, designing a multi-stage raindrop removal network, wherein the network is a convolutional neural network based on a loop iteration mechanism and specifically consists of a plurality of stage sub-networks with the same network structure and shared network parameters;
step B2, designing a phase sub-network, which is used for extracting the relevant features of the raindrops so as to better remove the raindrops;
step B3, a context aggregation module and an attention context aggregation module in the design phase sub-network, for aggregating the spatial context information lacking in the network.
4. The method for removing raindrops from a single image based on the loop iteration mechanism according to claim 3, wherein the step B2 is implemented by the following steps:
step B21, the raindrop-removed image block obtained by the previous-stage sub-network and the corresponding original raindrop-attached image block are concatenated along the channel dimension as the input of each stage sub-network; for the first-stage sub-network, the input is the result of concatenating two copies of the original raindrop image block along the channel dimension;
step B22, the concatenated result obtained in step B21 is input into a convolutional layer with ReLU activation to convert the image into a feature map, outputting the features according to the following formula:
F_0 = Conv1(Concat(I_{t-1}, I_o))
where Conv1 represents the convolutional layer with ReLU activation, I_o is the original raindrop image block, I_{t-1} represents the raindrop-removed image block obtained in the previous stage (for the first stage, I_{t-1} is I_o), Concat(·,·) represents the operation of concatenating features along the channel dimension, and F_0 represents the extracted feature map;
step B23, the feature map F_0 is input into a convolutional long short-term memory (ConvLSTM) module, which consists of a forget gate f, an input gate i and an output gate o, computed according to the following formulas:
f_t = σ(W_xf * F_0 + W_hf * H_{t-1} + W_cf ⊙ C_{t-1} + b_f)
i_t = σ(W_xi * F_0 + W_hi * H_{t-1} + W_ci ⊙ C_{t-1} + b_i)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ tanh(W_xc * F_0 + W_hc * H_{t-1} + b_c)
o_t = σ(W_xo * F_0 + W_ho * H_{t-1} + W_co ⊙ C_t + b_o)
F_1 = H_t = o_t ⊙ tanh(C_t)
where the forget gate f_t and the input gate i_t at time t are each formed from three inputs: the feature map F_0, the output H_{t-1} of the ConvLSTM module at the previous time (i.e. t-1), and the cell state C_{t-1} at the previous time; the output gate o_t at time t is formed from the feature map F_0, the previous output H_{t-1}, and the cell state C_t at time t; W_* and b_* respectively represent the weight parameters and bias parameters of the corresponding convolution kernels, tanh represents the hyperbolic tangent function, σ represents the sigmoid function, * denotes the convolution operation, and ⊙ denotes the element-wise product; C_t is the cell state at the current time t, which is fed to the ConvLSTM module at the next time, and H_t is the feature map output by the ConvLSTM module at the current time t; for convenience of description, H_t is denoted F_1;
each time step corresponds to one stage of the multi-stage network; for the first stage of the network, which has no preceding stage, the inputs to its forget gate and input gate are set to 0;
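The gate equations of step B23 reduce to the following scalar sketch when the convolutions collapse to scalar multiplies and all weights are tied to a single value w (an illustrative simplification, not the learned parameters):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def convlstm_cell(f0, h_prev, c_prev, w=1.0, b=0.0):
    """One scalar ConvLSTM step: forget, input and output gates plus
    the cell-state update, mirroring the formulas in step B23."""
    f = sigmoid(w * f0 + w * h_prev + w * c_prev + b)        # forget gate f_t
    i = sigmoid(w * f0 + w * h_prev + w * c_prev + b)        # input gate i_t
    c = f * c_prev + i * math.tanh(w * f0 + w * h_prev + b)  # cell state C_t
    o = sigmoid(w * f0 + w * h_prev + w * c + b)             # output gate o_t
    h = o * math.tanh(c)                                     # H_t = F_1
    return h, c

# First stage: there is no previous stage, so H_{t-1} and C_{t-1} start at 0.
h, c = convlstm_cell(f0=1.0, h_prev=0.0, c_prev=0.0)
```

The returned (h, c) pair is what one stage hands to the next, which is how rain information accumulated in earlier iterations guides later ones.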
step B24, the output F_1 of the ConvLSTM module is input into the designed sequence of context aggregation modules and attention context aggregation modules, in the order: context aggregation module with dilation rate 2 -> attention context aggregation module with dilation rate 2 -> context aggregation module with dilation rate 2 -> context aggregation module with dilation rate 4 -> attention context aggregation module with dilation rate 4 -> context aggregation module with dilation rate 4, computed according to the following formula:
F_2 = CAU_4(SECAU_4(CAU_4(CAU_2(SECAU_2(CAU_2(F_1))))))
where CAU_r(·) denotes a context aggregation module with dilation rate r, and SECAU_r(·) denotes an attention context aggregation module with dilation rate r;
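The module chain of step B24 is function composition read inside-out; a sketch with tag-recording stand-ins for the CAU/SECAU modules (hypothetical placeholders, not the real modules):

```python
def compose(*modules):
    """Chain modules so the first listed is applied first,
    matching the inside-out reading of the F_2 formula."""
    def chain(x):
        for module in modules:
            x = module(x)
        return x
    return chain

def make(tag):
    # Stand-in module that just records its tag in a trace list.
    return lambda trace: trace + [tag]

f2 = compose(make("CAU2"), make("SECAU2"), make("CAU2"),
             make("CAU4"), make("SECAU4"), make("CAU4"))
trace = f2([])  # order in which the six modules are applied
```

The trace confirms the application order: the dilation-2 modules run before the dilation-4 modules, so the receptive field grows stage by stage.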
step B25, the output result F_2 of step B24 is input into a standard residual module and then fed into a convolutional layer with ReLU activation to complete the conversion from feature map back to image, outputting the raindrop-removed image of the current-stage sub-network t, with 3 channels, according to the following formula:
I_t = Conv2(Res(F_2))
where Res(·) denotes the standard residual module, Conv2 denotes the convolutional layer with ReLU activation, and I_t is the raindrop-removed image of the current-stage sub-network t.
5. The method for removing raindrops from a single image based on the loop iteration mechanism according to claim 3, wherein the step B3 is implemented by the following steps:
step B31, in the context aggregation module, the input feature F is first fed into a smoothed dilated convolution module and computed according to the following formula:
F_3 = Dilated_r(Sep(F))
where F_3 is the output feature of the smoothed dilated convolution module, F is the input of the context aggregation module, Sep(·) is the separable shared convolutional layer, i.e. a channel-wise separable convolution whose parameters are shared across all channels, and Dilated_r(·) is the dilated (atrous) convolution, which enlarges the receptive field through the dilation rate parameter r and effectively aggregates spatial context information to extract better features; the dilation rate r specifies the zero-filled spacing between kernel elements: when r = 1, the dilated convolution is identical to an ordinary convolution, with adjacent kernel elements and no inserted zeros; when r > 1, r-1 zeros are inserted between adjacent kernel elements to enlarge the receptive field; the rate r in step B24 refers to this dilation rate of the dilated convolution;
the attention context aggregation module differs from the context aggregation module only in that a channel attention module is appended at this step, all other steps being identical; it is computed according to the following formula:
F_3 = SE(Dilated_r(Sep(F)))
where SE(·) represents the channel attention module;
step B32, in both the attention context aggregation module and the context aggregation module, the feature F_3 output by step B31 is fed into a residual module built from self-calibrated convolution and computed according to the following formula:
F_4 = LeakyReLU(F_3 + SCC(F_3))
where F_4 is the output of the residual module built from self-calibrated convolution; the module comprises a self-calibrated convolution, a LeakyReLU function and a residual connection, and LeakyReLU(x) is defined as:
LeakyReLU(x) = x, if x > 0; LeakyReLU(x) = a·x, otherwise
where x represents the input value of the LeakyReLU function, and a is a fixed linear coefficient;
SCC(·) is the self-calibrated convolution, defined as follows:
first, the output feature F_3 of step B31 is fed into two 1×1 convolutional layers without activation functions:
X_1, X_2 = Conv_1×1(F_3)
where Conv_1×1 denotes a 1×1 convolutional layer, and X_1, X_2 are the resulting feature maps with the channel count halved, i.e. if F_3 has C channels, then X_1 and X_2 each have C/2 channels;
X_1 and X_2 are then fed into their respective branches, where the branch applying the self-calibration operation to X_1 is computed according to the following formulas:
T_1 = AvgPool_r(X_1)
X′_1 = Up(T_1 * K_2)
Y′_1 = (X_1 * K_3) ⊙ σ(X_1 + X′_1)
Y_1 = Y′_1 * K_4
where AvgPool_r(·) is average pooling with stride r, Up(·) is the upsampling operation, * is the convolution operation, ⊙ is the element-wise multiplication operator, + is the element-wise addition operator, and σ is the sigmoid activation function; K_2, K_3, K_4 are convolution kernels of the same size; Y_1 is the output result of the self-calibration branch;
at the same time, X_2 is fed into the corresponding ordinary convolution branch and computed according to the following formula:
Y_2 = X_2 * K_1
finally, the outputs of the two branches are concatenated along the channel dimension to restore the channel count to the original C, computed according to the following formula:
Y = Concat(Y_1, Y_2)
6. The method for removing raindrops of a single image based on a loop iteration mechanism according to claim 1, wherein the step C is implemented by the following steps:
step C1, the convolutional neural network model for single-image raindrop removal is optimized by using a loss function as a constraint; the specific formula is as follows:
L = −SSIM(Y, Y_i)
where SSIM is the structural similarity loss function; given training image pairs (X_i, Y_i), i = 1, …, N (N is the total number of training samples), X_i is an image block of a raindrop-corrupted input image, Y_i is the corresponding clean image block, and Y denotes the raindrop-removed clean image block predicted by the network when trained with the pair (X_i, Y_i);
and step C2, the image block data set is randomly divided into several batches, and the designed network is trained and optimized until the L value computed in step C1 converges below a threshold or the number of iterations reaches its upper limit; the trained model is then saved, completing the network training process.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110134465.6A CN112767280B (en) | 2021-02-01 | 2021-02-01 | Single image raindrop removing method based on loop iteration mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112767280A true CN112767280A (en) | 2021-05-07 |
CN112767280B CN112767280B (en) | 2022-06-14 |
Family
ID=75704404
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110134465.6A Active CN112767280B (en) | 2021-02-01 | 2021-02-01 | Single image raindrop removing method based on loop iteration mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112767280B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113450288A (en) * | 2021-08-04 | 2021-09-28 | 广东工业大学 | Single image rain removing method and system based on deep convolutional neural network and storage medium |
CN113610329A (en) * | 2021-10-08 | 2021-11-05 | 南京信息工程大学 | Short-time rainfall approaching forecasting method of double-current convolution long-short term memory network |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180268284A1 (en) * | 2017-03-15 | 2018-09-20 | Samsung Electronics Co., Ltd. | System and method for designing efficient super resolution deep convolutional neural networks by cascade network training, cascade network trimming, and dilated convolutions |
US10304193B1 (en) * | 2018-08-17 | 2019-05-28 | 12 Sigma Technologies | Image segmentation and object detection using fully convolutional neural network |
CN111861925A (en) * | 2020-07-24 | 2020-10-30 | 南京信息工程大学滨江学院 | Image rain removing method based on attention mechanism and gate control circulation unit |
CN112085678A (en) * | 2020-09-04 | 2020-12-15 | 国网福建省电力有限公司检修分公司 | Method and system suitable for removing raindrops from power equipment machine patrol image |
CN112132756A (en) * | 2019-06-24 | 2020-12-25 | 华北电力大学(保定) | Attention mechanism-based single raindrop image enhancement method |
CN112184566A (en) * | 2020-08-27 | 2021-01-05 | 北京大学 | Image processing method and system for removing attached water mist droplets |
CN112184573A (en) * | 2020-09-15 | 2021-01-05 | 西安理工大学 | Context aggregation residual single image rain removing method based on convolutional neural network |
Non-Patent Citations (3)
Title |
---|
TIE LIU ET AL.: "Removing Rain in Videos: A Large-Scale Database and a Two-Stream ConvLSTM Approach", 《2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME)》 * |
WENHAN YANG ET AL.: "Joint Rain Detection and Removal from a Single Image with Contextualized Deep Networks", 《 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 * |
DING YUYANG ET AL.: "Research on a dual-LSTM rain removal algorithm for light field images", 《Computer Engineering and Applications》 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113450288A (en) * | 2021-08-04 | 2021-09-28 | 广东工业大学 | Single image rain removing method and system based on deep convolutional neural network and storage medium |
CN113610329A (en) * | 2021-10-08 | 2021-11-05 | 南京信息工程大学 | Short-time rainfall approaching forecasting method of double-current convolution long-short term memory network |
CN113610329B (en) * | 2021-10-08 | 2022-01-04 | 南京信息工程大学 | Short-time rainfall approaching forecasting method of double-current convolution long-short term memory network |
Also Published As
Publication number | Publication date |
---|---|
CN112767280B (en) | 2022-06-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109543502B (en) | Semantic segmentation method based on deep multi-scale neural network | |
CN108717569B (en) | Expansion full-convolution neural network device and construction method thereof | |
CN111462013B (en) | Single-image rain removing method based on structured residual learning | |
CN112767280B (en) | Single image raindrop removing method based on loop iteration mechanism | |
CN108648159B (en) | Image rain removing method and system | |
CN111915530A (en) | End-to-end-based haze concentration self-adaptive neural network image defogging method | |
CN112884073B (en) | Image rain removing method, system, terminal and storage medium | |
CN112419191B (en) | Image motion blur removing method based on convolution neural network | |
CN111062329B (en) | Unsupervised pedestrian re-identification method based on augmented network | |
CN113052775B (en) | Image shadow removing method and device | |
CN110838095B (en) | Single image rain removing method and system based on cyclic dense neural network | |
CN109544475A (en) | Bi-Level optimization method for image deblurring | |
CN111414860A (en) | Real-time portrait tracking and segmenting method | |
CN114723630A (en) | Image deblurring method and system based on cavity double-residual multi-scale depth network | |
CN111815526B (en) | Rain image rainstrip removing method and system based on image filtering and CNN | |
CN116205821A (en) | Single-image rain removing method based on vertical stripe characteristic extraction cross convolution | |
CN114862711B (en) | Low-illumination image enhancement and denoising method based on dual complementary prior constraints | |
CN113627368B (en) | Video behavior recognition method based on deep learning | |
CN115239602A (en) | License plate image deblurring method based on cavity convolution expansion receptive field | |
CN115205148A (en) | Image deblurring method based on double-path residual error network | |
CN114943655A (en) | Image restoration system for generating confrontation network structure based on cyclic depth convolution | |
Jia et al. | Single-image snow removal based on an attention mechanism and a generative adversarial network | |
CN113870145A (en) | Image defogging method based on deep convolutional neural network under Bayes framework | |
CN113658074B (en) | Single image raindrop removing method based on LAB color space multi-scale fusion network | |
CN110415190B (en) | Method, device and processor for removing image compression noise based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||