CN112767280A - Single image raindrop removing method based on loop iteration mechanism - Google Patents


Info

Publication number
CN112767280A
Authority
CN
China
Prior art keywords
image, convolution, network, module, raindrop
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110134465.6A
Other languages
Chinese (zh)
Other versions
CN112767280B (en)
Inventor
牛玉贞
陈锋
郑路伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202110134465.6A priority Critical patent/CN112767280B/en
Publication of CN112767280A publication Critical patent/CN112767280A/en
Application granted granted Critical
Publication of CN112767280B publication Critical patent/CN112767280B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T 5/73
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]

Abstract

The invention relates to a single-image raindrop removal method based on a loop iteration mechanism. The method comprises the following steps: preprocessing training pairs of original raindrop-degraded images and clean images to obtain an image block data set composed of such training pairs; designing a convolutional neural network for single-image raindrop removal, motivated by iteratively removing rain; designing a target loss function for optimizing the network, taking the image block data set as training data, calculating the gradient of every parameter of the designed raindrop removal convolutional neural network by back propagation according to the designed loss, updating the parameters by stochastic gradient descent, and finally learning the optimal model parameters; and inputting the image to be processed, and predicting the raindrop-removed clean image with the trained model. The method can markedly improve image raindrop removal performance while greatly reducing the number of network parameters.

Description

Single image raindrop removing method based on loop iteration mechanism
Technical Field
The invention relates to the field of image and video processing and computer vision, in particular to a single image raindrop removing method based on a loop iteration mechanism.
Background
With the rapid development of the internet and multimedia technology, images have become an indispensable medium for human communication and information transmission, and are of great significance to many aspects of modern society. However, images are often captured outdoors, where acquisition is inevitably affected by bad weather such as rain, fog and snow. Such weather degrades captured images and videos in contrast, saturation, visibility and other visual qualities. Rain is a common weather phenomenon in daily life, and raindrops attached to a window, a windshield or a camera lens obstruct the visibility of the background scene, reduce image quality and cause severe degradation. Details in the image can no longer be identified, which greatly reduces the image's practical value and complicates high-level visual understanding tasks such as object detection, pedestrian re-identification and image segmentation. In rainy weather, raindrops inevitably adhere to the camera lens; these attached raindrops blur or occlude parts of the background or foreground scene, so removing raindrops from a single image to restore a clean background is of great significance.
Although single-image rain streak removal has been well explored, there are few studies on single-image raindrop removal, and the well-explored rain streak removal methods cannot be applied to raindrop removal directly. Raindrops are less dense than rain streaks, but they are generally larger, and they can completely occlude the background. The appearance of a raindrop is also influenced by many factors, and its shape usually differs from the thin, vertical shape of a rain streak. Moreover, the physical model of raindrops is entirely different from that of rain streaks, which makes raindrop removal the harder task.
Current single-image raindrop removal methods fall largely into two broad categories: model-based methods and deep-learning-based methods. Model-based methods generally use a filter to decompose the image into high- and low-frequency components, then separate the rain component from the non-rain component within the high-frequency part through dictionary learning. Most model-based methods rely on manually preset parameters for feature extraction and performance tuning, so they cannot extract raindrop features well, and their raindrop removal performance is poor.
Deep-learning-based methods are data-driven: they train a convolutional neural network with a large amount of data. The powerful feature learning and representation capability of convolutional neural networks extracts image features better and achieves a better raindrop removal effect. However, these deep-learning-based methods share a problem: they cannot balance raindrop removal performance against the number of network parameters. They either achieve good performance at the cost of a large parameter count, which greatly limits their practical value in applications with limited computational resources, or use few parameters at the cost of poorer performance. How to design an efficient and practical single-image raindrop removal method is therefore a focus of future research.
Disclosure of Invention
The invention aims to provide a single-image raindrop removal method based on a loop iteration mechanism, which can markedly improve image raindrop removal performance while greatly reducing the number of network parameters.
In order to achieve the purpose, the technical scheme of the invention is as follows: a single image raindrop removing method based on a loop iteration mechanism comprises the following steps:
step A, preprocessing a training image pair of an original raindrop degradation image and a clean image to obtain an image block data set consisting of the training image pair of the original raindrop degradation image and the clean image;
step B, designing a convolutional neural network for single-image raindrop removal, based on the idea of divide and conquer and motivated by iteratively removing rain;
step C, designing a target loss function for optimizing the network, taking the image block data set as training data, calculating the gradient of each parameter of the designed single-image raindrop removal convolutional neural network by back propagation according to the designed loss function, updating the parameters by stochastic gradient descent, and finally learning the optimal parameters of the single-image raindrop removal convolutional neural network;
and step D, inputting the image to be processed into the designed single-image raindrop removal convolutional neural network, and predicting the raindrop-removed clean image using the trained network.
In an embodiment of the present invention, step A is specifically implemented as follows: the original raindrop-degraded image and the corresponding clean image are cropped in the same manner into W × W image blocks, a block being taken every m pixels so that crops do not overlap; after cropping, the W × W raindrop image blocks and the W × W clean image blocks correspond to each other one by one.
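By way of non-limiting illustration, a minimal Python sketch of this cropping step follows; the concrete values W = 128 and m = 128, and the NumPy array layout, are assumptions of the sketch rather than part of the disclosure:

    import numpy as np

    def extract_patch_pairs(rainy, clean, W=128, m=128):
        """Crop co-located W x W blocks every m pixels from an aligned pair.

        rainy, clean: H x W x 3 arrays of identical size; with m >= W the
        crops do not overlap. Returns two lists in one-to-one correspondence.
        """
        assert rainy.shape == clean.shape
        h, w = rainy.shape[:2]
        rainy_blocks, clean_blocks = [], []
        for y in range(0, h - W + 1, m):
            for x in range(0, w - W + 1, m):
                rainy_blocks.append(rainy[y:y + W, x:x + W])
                clean_blocks.append(clean[y:y + W, x:x + W])
        return rainy_blocks, clean_blocks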
In an embodiment of the present invention, the step B is implemented as follows:
step B1, designing a multi-stage raindrop removal network, wherein the network is a convolutional neural network based on a loop iteration mechanism and specifically consists of several stage sub-networks with identical network structure and shared network parameters (a sketch of this iteration is given after this list);
step B2, designing a phase sub-network, which is used for extracting the relevant features of the raindrops so as to better remove the raindrops;
step B3, designing the context aggregation module and the attention context aggregation module within the stage sub-network, which aggregate the spatial context information the network otherwise lacks.
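As referenced under step B1, the following is a minimal PyTorch sketch of the loop iteration mechanism: a single stage sub-network (StageSubnet, sketched under step B2 below) is applied T times so that its parameters are shared across stages, each stage refining the previous stage's output; the stage count T = 4 and the channel width of 32 are assumptions of the sketch:

    import torch
    import torch.nn as nn

    class MultiStageDeraindrop(nn.Module):
        def __init__(self, ch=32):
            super().__init__()
            self.ch = ch
            # A single sub-network instance: reusing it at every stage is
            # what shares the parameters across stages.
            self.stage = StageSubnet(ch)

        def forward(self, I_o, T=4):
            n, _, h, w = I_o.shape
            H = I_o.new_zeros(n, self.ch, h, w)   # recurrent state, 0 at stage 1
            C = I_o.new_zeros(n, self.ch, h, w)
            I = I_o                               # stage 1 uses I_{t-1} = I_o
            for _ in range(T):
                I, H, C = self.stage(I, I_o, H, C)
            return I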
In an embodiment of the present invention, the step B2 is implemented as follows:
step B21, concatenating the raindrop-removed image block produced by the previous-stage sub-network with the corresponding original raindrop image block along the channel dimension, as the input of each stage sub-network; for the first-stage sub-network, the input is the original raindrop image block concatenated with itself along the channel dimension;
step B22, inputting the spliced result obtained in step B21 into a convolutional layer with an activation function of ReLU, converting the image into a feature map, and outputting the features according to the following formula:
F0=Conv1(It-1⊕Io)
wherein Conv1(*) denotes the convolutional layer with ReLU activation, Io is the original raindrop image block, It-1 is the raindrop-removed image block produced by the previous stage (for the first stage, It-1 is Io), ⊕ denotes concatenation of features along the channel dimension, and F0 is the extracted feature map;
step B23, inputting the feature map F0 into a convolutional long short-term memory (ConvLSTM) network module, which consists of a forget gate f, an input gate i and an output gate o and is calculated according to the following formulas:
ft=σ(Wxf*F0+Whf*Ht-1+Wcf⊙Ct-1+bf)
it=σ(Wxi*F0+Whi*Ht-1+Wci⊙Ct-1+bi)
Ct=ft⊙Ct-1+it⊙tanh(Wxc*F0+Whc*Ht-1+bc)
ot=σ(Wxo*F0+Who*Ht-1+Wco⊙Ct+bo)
F1=Ht=ot⊙tanh(Ct)
wherein, at time t, the forget gate ft and the input gate it take three inputs: the feature map F0, the output Ht-1 of the ConvLSTM module at the previous time (i.e. t-1), and the cell state Ct-1 at the previous time; the output gate ot likewise takes three inputs: the feature map F0, the previous output Ht-1, and the cell state Ct at time t. W* and b* denote the weight and bias parameters of the corresponding convolution kernels, tanh denotes the hyperbolic tangent function, σ denotes the Sigmoid function, * denotes the convolution operation, and ⊙ denotes the element-wise (Hadamard) product. Ct, the cell state at the current time t, is passed to the ConvLSTM module at the next time; Ht is the feature map output by the ConvLSTM module at the current time t, and for convenience of description Ht is denoted F1.
Each time step above corresponds to one stage of the multi-stage network; for the first stage, which has no preceding stage, the recurrent inputs Ht-1 and Ct-1 of its forget gate and input gate are set to 0;
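For illustration, a minimal sketch of such a ConvLSTM cell follows, implementing the gate formulas above, including the Hadamard peephole terms Wc ⊙ C; the channel width, the kernel size, and the per-channel (rather than per-position) peephole weights are simplifying assumptions of the sketch:

    import torch
    import torch.nn as nn

    class ConvLSTMCell(nn.Module):
        def __init__(self, ch=32, k=3):
            super().__init__()
            p = k // 2
            # W_x* and W_h*: one fused convolution per input computes the
            # pre-activations of all four gates (f, i, c, o) at once.
            self.conv_x = nn.Conv2d(ch, 4 * ch, k, padding=p, bias=True)
            self.conv_h = nn.Conv2d(ch, 4 * ch, k, padding=p, bias=False)
            # W_cf, W_ci, W_co: peephole weights applied by Hadamard product.
            self.w_cf = nn.Parameter(torch.zeros(1, ch, 1, 1))
            self.w_ci = nn.Parameter(torch.zeros(1, ch, 1, 1))
            self.w_co = nn.Parameter(torch.zeros(1, ch, 1, 1))

        def forward(self, F0, H_prev, C_prev):
            xf, xi, xc, xo = torch.chunk(
                self.conv_x(F0) + self.conv_h(H_prev), 4, dim=1)
            f = torch.sigmoid(xf + self.w_cf * C_prev)   # forget gate
            i = torch.sigmoid(xi + self.w_ci * C_prev)   # input gate
            C = f * C_prev + i * torch.tanh(xc)          # new cell state
            o = torch.sigmoid(xo + self.w_co * C)        # output gate
            H = o * torch.tanh(C)                        # F1 = H_t
            return H, C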
step B24, inputting the output F1 of the ConvLSTM module into the designed sequence of context aggregation modules and attention context aggregation modules, comprising, in order: a context aggregation module with dilation rate 2, an attention context aggregation module with dilation rate 2, a context aggregation module with dilation rate 2, a context aggregation module with dilation rate 4, an attention context aggregation module with dilation rate 4, and a context aggregation module with dilation rate 4, calculated according to the following formula:
F2=CAU4(SECAU4(CAU4(CAU2(SECAU2(CAU2(F1))))))
wherein CAUr(*) denotes the context aggregation module with dilation rate r and SECAUr(*) denotes the attention context aggregation module with dilation rate r;
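For illustration, this fixed chain can be written as a simple composition; CAU and SECAU refer to the module sketches given under step B3 below:

    import torch.nn as nn

    def make_aggregation_chain(ch=32):
        # F2 = CAU_4(SECAU_4(CAU_4(CAU_2(SECAU_2(CAU_2(F1))))))
        return nn.Sequential(
            CAU(ch, r=2), SECAU(ch, r=2), CAU(ch, r=2),
            CAU(ch, r=4), SECAU(ch, r=4), CAU(ch, r=4),
        )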
step B25, inputting the output F2 of step B24 into a standard residual module, then feeding the result into a convolutional layer with ReLU activation to complete the conversion from feature map to image, and outputting the 3-channel raindrop-removed image of the current-stage sub-network t according to the following formula:
It=Conv2(Res(F2))
wherein Res(*) denotes the standard residual module, Conv2 denotes the convolutional layer with ReLU activation, and It is the raindrop-removed image of the current-stage sub-network t.
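Assembling steps B21 to B25 gives one stage sub-network, which the multi-stage network reuses at every iteration; the sketch below relies on the ConvLSTMCell and make_aggregation_chain sketches above, and the two-convolution form of the standard residual module is an assumption:

    import torch
    import torch.nn as nn

    class StageSubnet(nn.Module):
        def __init__(self, ch=32):
            super().__init__()
            # B22: Conv1, a ReLU convolution over the 6-channel concatenation.
            self.conv1 = nn.Sequential(nn.Conv2d(6, ch, 3, padding=1), nn.ReLU())
            self.lstm = ConvLSTMCell(ch)            # B23
            self.agg = make_aggregation_chain(ch)   # B24
            self.res = nn.Sequential(               # standard residual block
                nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
                nn.Conv2d(ch, ch, 3, padding=1),
            )
            # B25: Conv2, a ReLU convolution back to a 3-channel image.
            self.conv2 = nn.Sequential(nn.Conv2d(ch, 3, 3, padding=1), nn.ReLU())

        def forward(self, I_prev, I_o, H_prev, C_prev):
            F0 = self.conv1(torch.cat([I_prev, I_o], dim=1))   # B21 + B22
            F1, C = self.lstm(F0, H_prev, C_prev)              # B23
            F2 = self.agg(F1)                                  # B24
            I_t = self.conv2(F2 + self.res(F2))                # B25
            return I_t, F1, C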
In an embodiment of the present invention, the step B3 is implemented as follows:
step B31, in the context aggregation module, the input feature F is first fed into a smooth dilated convolution module, calculated according to the following formula:
F3=Dilatedr(Sep(F))
wherein F3 is the output feature of the smooth dilated convolution module, F is the input of the context aggregation module, Sep(*) is the separable shared convolutional layer, i.e. a channel-wise separable convolution whose parameters are shared by all channels, and Dilatedr(*) is the dilated (atrous) convolution, which enlarges the receptive field through its dilation rate r and effectively aggregates spatial context information to extract features better; the dilation rate r specifies the zero-filled spacing between elements of the convolution kernel: when r = 1 the dilated convolution is identical to ordinary convolution, the kernel elements being adjacent with no zeros between them, and when r > 1, r - 1 zeros are inserted between adjacent kernel elements to enlarge the receptive field; the dilation rate r in step B24 is this dilation rate of the dilated convolution;
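A minimal sketch of this smooth dilated convolution follows; the shared-kernel size of 2r - 1, taken from the smoothed-dilation literature, and the 3 × 3 dilated kernel are assumptions of the sketch:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SmoothDilated(nn.Module):
        def __init__(self, ch=32, r=2, k=3):
            super().__init__()
            sk = 2 * r - 1
            # Sep: one (2r-1) x (2r-1) kernel shared by every channel.
            self.shared = nn.Parameter(torch.randn(1, 1, sk, sk) * 0.01)
            # Dilated_r: dilated convolution with rate r.
            self.dilated = nn.Conv2d(ch, ch, k, dilation=r,
                                     padding=r * (k - 1) // 2)

        def forward(self, x):
            c = x.size(1)
            w = self.shared.repeat(c, 1, 1, 1)   # same kernel on all channels
            x = F.conv2d(x, w, padding=self.shared.size(-1) // 2, groups=c)
            return self.dilated(x)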
the only difference between the attention context aggregation module and the context aggregation module is the step that the attention context aggregation module is added with the channel attention module, and the following steps are the same, and the attention context aggregation module is calculated according to the following formula:
F3=SE(Dilatedr(Sep(F)))
wherein SE (×) represents the channel attention module;
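For illustration, a minimal sketch of the channel attention (SE) module and of both aggregation units follows; it relies on the SmoothDilated sketch above and the SCCResidual sketch given after step B32 below, and the squeeze-excitation reduction ratio of 16 is an assumption:

    import torch.nn as nn

    class SE(nn.Module):
        """Squeeze-and-excitation channel attention."""
        def __init__(self, ch=32, red=16):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.fc = nn.Sequential(
                nn.Conv2d(ch, ch // red, 1), nn.ReLU(),
                nn.Conv2d(ch // red, ch, 1), nn.Sigmoid(),
            )

        def forward(self, x):
            return x * self.fc(self.pool(x))   # channel-wise reweighting

    class CAU(nn.Module):
        """Context aggregation unit: smooth dilated conv + residual SCC."""
        def __init__(self, ch=32, r=2, attention=False):
            super().__init__()
            self.sd = SmoothDilated(ch, r)
            self.se = SE(ch) if attention else nn.Identity()
            self.scc = SCCResidual(ch)

        def forward(self, x):
            return self.scc(self.se(self.sd(x)))   # F4 from F3

    def SECAU(ch=32, r=2):
        # SECAU differs from CAU only by the inserted SE module.
        return CAU(ch, r, attention=True)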
in the step B32, the attention context aggregation module and the context aggregation module, the feature F output by the step B313Is sent to a self-calibrationThe output of the residual module formed by positive convolution is calculated according to the following formula:
F4=LeakyReLU(F3+SCC(F3))
F4and outputting a residual module formed by the self-correcting convolution, wherein the module comprises a self-correcting convolution, a LeakyReLU function and residual concatenation, and the LeakyReLU (x) has the following formula:
Figure BDA0002926210950000041
wherein, x represents the input value of LeakyReLU function, and a is a fixed linear coefficient;
SCC(*) is the self-calibrated convolution, defined as follows:
first, the output feature F3 of step B31 is fed into two 1 × 1 convolutional layers without activation functions:
X1,X2=Conv1×1(F3)
wherein Conv1×1 denotes a 1 × 1 convolutional layer, and X1 and X2 are the feature maps whose channel number is halved by the respective 1 × 1 convolutions, i.e. if F3 has C channels, X1 and X2 each have C/2 channels;
X1 and X2 are then fed into their respective branches, the self-calibration branch taking X1 being calculated as follows:
T1=AvgPoolr(X1)
X′1=Up(T1*K2)
Y′1=(X1*K3)⊙σ(X1+X′1)
Y1=Y′1*K4
wherein AvgPoolr(*) is average pooling with stride r, Up(*) is the upsampling operation, * is the convolution operation, ⊙ is the element-wise multiplication operator, + is the element-wise addition operator, and σ is the Sigmoid activation function; K2, K3 and K4 are convolution kernels of the same size; Y1 is the output of the self-calibration branch;
meanwhile, X2 is fed into the corresponding ordinary convolution branch, calculated according to the following formula:
Y2=X2*K1
finally, the outputs of the two branches are concatenated along the channel dimension to restore the channel number to the channel number C of the original feature map, calculated according to the following formula:
Y=Y1⊕Y2
wherein ⊕ is the channel concatenation operation and Y is the output of the self-calibrated convolution module.
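A minimal sketch of this self-calibrated convolution and its residual wrapper follows, implementing the formulas above; the pooling stride r = 4, the 3 × 3 kernels K1 to K4, and the LeakyReLU slope a = 0.2 are assumptions of the sketch:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SCC(nn.Module):
        """Self-calibrated convolution: calibration branch + ordinary branch."""
        def __init__(self, ch=32, r=4, k=3):
            super().__init__()
            h, p = ch // 2, k // 2
            self.split1 = nn.Conv2d(ch, h, 1)   # 1x1, no activation
            self.split2 = nn.Conv2d(ch, h, 1)
            self.pool = nn.AvgPool2d(r, stride=r)
            self.k1 = nn.Conv2d(h, h, k, padding=p)
            self.k2 = nn.Conv2d(h, h, k, padding=p)
            self.k3 = nn.Conv2d(h, h, k, padding=p)
            self.k4 = nn.Conv2d(h, h, k, padding=p)

        def forward(self, x):
            x1, x2 = self.split1(x), self.split2(x)
            # X'1 = Up(AvgPool_r(X1) * K2)
            up = F.interpolate(self.k2(self.pool(x1)), size=x1.shape[2:])
            # Y1 = ((X1 * K3) (.) sigmoid(X1 + X'1)) * K4
            y1 = self.k4(self.k3(x1) * torch.sigmoid(x1 + up))
            y2 = self.k1(x2)                     # Y2 = X2 * K1
            return torch.cat([y1, y2], dim=1)    # Y = Y1 concat Y2

    class SCCResidual(nn.Module):
        """F4 = LeakyReLU(F3 + SCC(F3))."""
        def __init__(self, ch=32):
            super().__init__()
            self.scc = SCC(ch)
            self.act = nn.LeakyReLU(0.2)

        def forward(self, f3):
            return self.act(f3 + self.scc(f3))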
In an embodiment of the present invention, the step C is implemented as follows:
step C1, the single-image raindrop removal convolutional neural network model is optimized using an SSIM-based loss function as the constraint, with the specific formula as follows:
L=-(1/N)·Σ(i=1,…,N)SSIM(Ŷi,Yi)
wherein SSIM(*) is the structural similarity measure; given training image pairs (Xi, Yi), where i = 1, …, N and N is the total number of training samples, Xi is an image block of the input image corrupted by raindrops, Yi is the image block of its corresponding clean image, and Ŷi denotes the raindrop-removed clean image block predicted by the network for the training pair (Xi, Yi);
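For illustration, a sketch of such a negative-SSIM objective follows; the use of the third-party pytorch_msssim package is an assumption of the sketch, and any differentiable SSIM implementation serves equally:

    import torch
    from pytorch_msssim import ssim   # pip install pytorch-msssim

    def negative_ssim_loss(pred, clean):
        """pred, clean: (N, 3, H, W) tensors with values in [0, 1]."""
        return -ssim(pred, clean, data_range=1.0)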
and step C2, the image block data set is randomly divided into a number of batches, and the designed network is trained and optimized until the loss L calculated in step C1 converges below a threshold or the number of iterations reaches a threshold; the trained model is then saved, completing the network training process.
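A minimal sketch of this training procedure follows, reusing the negative_ssim_loss and multi-stage model sketches above; the SGD hyper-parameters, the batch size of 16, the stage count T = 4 and the stopping thresholds are assumptions of the sketch:

    import torch
    from torch.utils.data import DataLoader

    def train(model, dataset, T=4, max_iters=100_000, loss_thresh=-0.95):
        opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
        loader = DataLoader(dataset, batch_size=16, shuffle=True)
        it = 0
        while it < max_iters:
            for rainy, clean in loader:
                pred = model(rainy, T)            # T-stage recurrent forward
                loss = negative_ssim_loss(pred, clean)
                opt.zero_grad()
                loss.backward()                   # back-propagation
                opt.step()                        # stochastic gradient descent
                it += 1
                if loss.item() <= loss_thresh or it >= max_iters:
                    torch.save(model.state_dict(), "raindrop_removal.pth")
                    return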
Compared with the prior art, the invention has the following beneficial effects: based on the idea of divide and conquer, the raindrop removal task is decomposed into a loop iteration process, and raindrops are removed cyclically by a designed multi-stage convolutional neural network so as to achieve a better image restoration result. Meanwhile, recent smooth dilated convolution and self-calibrated convolution modules are adopted to aggregate spatial context information better and to eliminate the artifacts that ordinary dilated convolution produces during raindrop removal. The method designs a dedicated convolutional neural network for the image raindrop removal problem; it ensures the quality of the raindrop-removed image while having a smaller network parameter size than other methods, and thus has higher practical value.
Drawings
FIG. 1 is a flow chart of an implementation of the method of the present invention.
Fig. 2 is a structural diagram of a single image raindrop removal method model based on a loop iteration mechanism in the embodiment of the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
The following is a specific implementation of the present invention.
As shown in fig. 1, a single image raindrop removal method based on a loop iteration mechanism includes the following steps:
step A, preprocessing a training image pair of an original raindrop degradation image and a clean image to obtain an image block data set consisting of the training image pair of the original raindrop degradation image and the clean image;
step B, designing a convolutional neural network for single-image raindrop removal, based on the idea of divide and conquer and motivated by iteratively removing rain;
step C, designing a target loss function for optimizing the network, taking the image block data set as training data, calculating the gradient of each parameter of the designed single-image raindrop removal convolutional neural network by back propagation according to the designed loss function, updating the parameters by stochastic gradient descent, and finally learning the optimal parameters of the single-image raindrop removal convolutional neural network;
and step D, inputting the image to be processed into the designed single-image raindrop removal convolutional neural network, and predicting the raindrop-removed clean image using the trained network.
Further, the step a comprises the steps of:
step A1: and dicing the original raindrop degraded image and the corresponding clean image according to a consistent mode to obtain W multiplied by W-sized image blocks, and simultaneously dicing every m pixel points in order to avoid overlapping dicing. After the dicing, the W multiplied by W image blocks with the raindrops and the W multiplied by W clean image blocks correspond to each other one by one.
Further, the step B includes the steps of:
step B1, designing a multi-stage raindrop removal network, wherein the network is a convolutional neural network based on a loop iteration mechanism and specifically consists of a plurality of stage sub-networks with the same network structure and shared network parameters;
step B2, designing a phase sub-network, which is used for extracting the relevant features of the raindrops so as to better remove the raindrops;
step B3, designing the context aggregation module and the attention context aggregation module within the stage sub-network, which aggregate the spatial context information the sub-network otherwise lacks;
further, the step B2 includes the following steps:
step B21, concatenating the raindrop-removed image block produced by the previous-stage sub-network with the corresponding original raindrop image block along the channel dimension, as the input of each stage sub-network. Note that, for the first-stage sub-network, the input is the original raindrop image block concatenated with itself along the channel dimension;
step B22, inputting the spliced result obtained in step B21 into a convolutional layer with an activation function of ReLU, converting the image into a feature map, and outputting the features according to the following formula:
F0=Conv1(It-1⊕Io)
wherein Conv1(*) denotes the convolutional layer with ReLU activation, Io is the original raindrop image block, It-1 is the raindrop-removed image block produced by the previous stage (for the first stage, It-1 is Io), ⊕ denotes concatenation of features along the channel dimension, and F0 is the extracted feature map.
step B23, inputting the feature map F0 into a convolutional long short-term memory (ConvLSTM) network module, which consists of a forget gate f, an input gate i and an output gate o and is calculated according to the following formulas:
ft=σ(Wxf*F0+Whf*Ht-1+Wcf⊙Ct-1+bf)
it=σ(Wxi*F0+Whi*Ht-1+Wci⊙Ct-1+bi)
Ct=ft⊙Ct-1+it⊙tanh(Wxc*F0+Whc*Ht-1+bc)
ot=σ(Wxo*F0+Who*Ht-1+Wco⊙Ct+bo)
F1=Ht=ot⊙tanh(Ct)
wherein, at time t, the forget gate ft and the input gate it take three inputs: the feature map F0, the output Ht-1 of the ConvLSTM module at the previous time (i.e. t-1), and the cell state Ct-1 at the previous time; the output gate ot likewise takes three inputs: the feature map F0, the previous output Ht-1, and the cell state Ct at time t. W* and b* denote the weight and bias parameters of the corresponding convolution kernels, tanh denotes the hyperbolic tangent function, σ denotes the Sigmoid function, * denotes the convolution operation, and ⊙ denotes the element-wise (Hadamard) product. Ct, the cell state at the current time t, is passed to the ConvLSTM module at the next time; Ht is the feature map output by the ConvLSTM module at the current time t, and for convenience of description Ht is denoted F1.
Each time step above corresponds to one stage of the multi-stage network. Note: for the first stage of the network, which has no preceding stage, the recurrent inputs Ht-1 and Ct-1 of its forget gate and input gate are set to 0.
step B24, inputting the output F1 of the ConvLSTM module into the designed sequence of context aggregation modules and attention context aggregation modules, comprising, in order: a context aggregation module with dilation rate 2, an attention context aggregation module with dilation rate 2, a context aggregation module with dilation rate 2, a context aggregation module with dilation rate 4, an attention context aggregation module with dilation rate 4, and a context aggregation module with dilation rate 4, calculated according to the following formula:
F2=CAU4(SECAU4(CAU4(CAU2(SECAU2(CAU2(F1))))))
wherein CAUr(*) denotes the context aggregation module with dilation rate r and SECAUr(*) denotes the attention context aggregation module with dilation rate r.
step B25, inputting the output F2 of step B24 into a standard residual module, then feeding the result into a convolutional layer with ReLU activation to complete the conversion from feature map to image, and outputting the 3-channel raindrop-removed image of the current-stage sub-network t according to the following formula:
It=Conv2(Res(F2))
wherein Res(*) denotes the standard residual module, Conv2 denotes the convolutional layer with ReLU activation, and It is the raindrop-removed image of the current-stage sub-network t.
Further, the step B3 includes the following steps:
step B31, in the context aggregation module, the input feature F is first fed into a smooth dilated convolution module, calculated according to the following formula:
F3=Dilatedr(Sep(F))
wherein F3 is the output feature of the smooth dilated convolution module, F is the input of the context aggregation module, Sep(*) is the separable shared convolutional layer, i.e. a channel-wise separable convolution whose parameters are shared by all channels, and Dilatedr(*) is the dilated (atrous) convolution, which enlarges the receptive field through its dilation rate r and effectively aggregates spatial context information to extract features better. The dilation rate r specifies the zero-filled spacing between elements of the convolution kernel: when r = 1 the dilated convolution is identical to ordinary convolution, the kernel elements being adjacent with no zeros between them, and when r > 1, r - 1 zeros are inserted between adjacent kernel elements to enlarge the receptive field. The dilation rate r in step B24 is this dilation rate of the dilated convolution.
the attention context aggregation module differs from the context aggregation module only in that a channel attention module is added; the remaining steps are identical, and it is calculated according to the following formula:
F3=SE(Dilatedr(Sep(F)))
wherein SE(*) denotes the channel attention module.
step B32, in both the attention context aggregation module and the context aggregation module, the feature F3 output by step B31 is fed into a residual module built from self-calibrated convolution, whose output is calculated according to the following formula:
F4=LeakyReLU(F3+SCC(F3))
wherein F4 is the output of the residual module built from self-calibrated convolution; the module comprises a self-calibrated convolution, a LeakyReLU function and a residual connection, LeakyReLU(x) being defined as follows:
LeakyReLU(x) = x, if x ≥ 0; LeakyReLU(x) = ax, if x < 0
wherein x denotes the input value of the LeakyReLU function and a is a fixed linear coefficient.
SCC(*) is the self-calibrated convolution, defined as follows:
first, the output feature F3 of step B31 is fed into two 1 × 1 convolutional layers without activation functions:
X1,X2=Conv1×1(F3)
wherein Conv1×1 denotes a 1 × 1 convolutional layer, and X1 and X2 are the feature maps whose channel number is halved by the respective 1 × 1 convolutions, i.e. if F3 has C channels, X1 and X2 each have C/2 channels.
X1 and X2 are then fed into their respective branches, the self-calibration branch taking X1 being calculated as follows:
T1=AvgPoolr(X1)
X′1=Up(T1*K2)
Y′1=(X1*K3)⊙σ(X1+X′1)
Y1=Y′1*K4
wherein AvgPoolr(*) is average pooling with stride r, Up(*) is the upsampling operation, * is the convolution operation, ⊙ is the element-wise multiplication operator, + is the element-wise addition operator, and σ is the Sigmoid activation function. K2, K3 and K4 are convolution kernels of the same size. Y1 is the output of the self-calibration branch.
Meanwhile, X2 is fed into the corresponding ordinary convolution branch, calculated according to the following formula:
Y2=X2*K1
finally, the outputs of the two branches are concatenated along the channel dimension to restore the channel number to the channel number C of the original feature map, calculated according to the following formula:
Y=Y1⊕Y2
wherein ⊕ is the channel concatenation operation and Y is the output of the self-calibrated convolution module.
Further, the step C includes the steps of:
step C1, the single-image raindrop removal convolutional neural network model is optimized using an SSIM-based loss function as the constraint, with the specific formula as follows:
L=-(1/N)·Σ(i=1,…,N)SSIM(Ŷi,Yi)
wherein SSIM(*) is the structural similarity measure; given training image pairs (Xi, Yi), where i = 1, …, N and N is the total number of training samples, Xi is an image block of the input image corrupted by raindrops, Yi is the image block of its corresponding clean image, and Ŷi denotes the raindrop-removed clean image block predicted by the network for the training pair (Xi, Yi);
and step C2, the image block data set is randomly divided into a number of batches, and the designed network is trained and optimized until the loss L calculated in step C1 converges below a threshold or the number of iterations reaches a threshold; the trained model is then saved, completing the network training process.
The above are preferred embodiments of the present invention; all changes made according to the technical scheme of the present invention that produce equivalent functional effects, without exceeding the scope of the technical scheme of the present invention, belong to the protection scope of the present invention.

Claims (6)

1. A single image raindrop removing method based on a loop iteration mechanism is characterized by comprising the following steps:
step A, preprocessing a training image pair of an original raindrop degradation image and a clean image to obtain an image block data set consisting of the training image pair of the original raindrop degradation image and the clean image;
step B, designing a convolutional neural network for single-image raindrop removal, based on the idea of divide and conquer and motivated by iteratively removing rain;
step C, designing a target loss function for optimizing the network, taking the image block data set as training data, calculating the gradient of each parameter of the designed single-image raindrop removal convolutional neural network by back propagation according to the designed loss function, updating the parameters by stochastic gradient descent, and finally learning the optimal parameters of the single-image raindrop removal convolutional neural network;
and step D, inputting the image to be processed into the designed single-image raindrop removal convolutional neural network, and predicting the raindrop-removed clean image using the trained network.
2. The single-image raindrop removal method based on a loop iteration mechanism according to claim 1, wherein step A is specifically implemented as follows: the original raindrop-degraded image and the corresponding clean image are cropped in the same manner into W × W image blocks, a block being taken every m pixels so that crops do not overlap; after cropping, the W × W raindrop image blocks and the W × W clean image blocks correspond to each other one by one.
3. The method for removing raindrops of a single image based on a loop iteration mechanism according to claim 1, wherein the step B is implemented by the following steps:
step B1, designing a multi-stage raindrop removal network, wherein the network is a convolutional neural network based on a loop iteration mechanism and specifically consists of a plurality of stage sub-networks with the same network structure and shared network parameters;
step B2, designing a phase sub-network, which is used for extracting the relevant features of the raindrops so as to better remove the raindrops;
step B3, designing the context aggregation module and the attention context aggregation module within the stage sub-network, which aggregate the spatial context information the network otherwise lacks.
4. The method for removing raindrops from a single image based on the loop iteration mechanism according to claim 3, wherein the step B2 is implemented by the following steps:
step B21, concatenating the raindrop-removed image block produced by the previous-stage sub-network with the corresponding original raindrop image block along the channel dimension, as the input of each stage sub-network; for the first-stage sub-network, the input is the original raindrop image block concatenated with itself along the channel dimension;
step B22, inputting the spliced result obtained in step B21 into a convolutional layer with an activation function of ReLU, converting the image into a feature map, and outputting the features according to the following formula:
F0=Conv1(It-1⊕Io)
wherein Conv1(*) denotes the convolutional layer with ReLU activation, Io is the original raindrop image block, It-1 is the raindrop-removed image block produced by the previous stage (for the first stage, It-1 is Io), ⊕ denotes concatenation of features along the channel dimension, and F0 is the extracted feature map;
step B23, inputting the feature map F0 into a convolutional long short-term memory (ConvLSTM) network module, which consists of a forget gate f, an input gate i and an output gate o and is calculated according to the following formulas:
ft=σ(Wxf*F0+Whf*Ht-1+Wcf⊙Ct-1+bf)
it=σ(Wxi*F0+Whi*Ht-1+Wci⊙Ct-1+bi)
Ct=ft⊙Ct-1+it⊙tanh(Wxc*F0+Whc*Ht-1+bc)
ot=σ(Wxo*F0+Who*Ht-1+Wco⊙Ct+bo)
F1=Ht=ot⊙tanh(Ct)
wherein, at time t, the forget gate ft and the input gate it take three inputs: the feature map F0, the output Ht-1 of the ConvLSTM module at the previous time (i.e. t-1), and the cell state Ct-1 at the previous time; the output gate ot likewise takes three inputs: the feature map F0, the previous output Ht-1, and the cell state Ct at time t. W* and b* denote the weight and bias parameters of the corresponding convolution kernels, tanh denotes the hyperbolic tangent function, σ denotes the Sigmoid function, * denotes the convolution operation, and ⊙ denotes the element-wise (Hadamard) product. Ct, the cell state at the current time t, is passed to the ConvLSTM module at the next time; Ht is the feature map output by the ConvLSTM module at the current time t, and for convenience of description Ht is denoted F1.
Each time step above corresponds to one stage of the multi-stage network; for the first stage, which has no preceding stage, the recurrent inputs Ht-1 and Ct-1 of its forget gate and input gate are set to 0;
step B24, inputting the output F1 of the ConvLSTM module into the designed sequence of context aggregation modules and attention context aggregation modules, comprising, in order: a context aggregation module with dilation rate 2, an attention context aggregation module with dilation rate 2, a context aggregation module with dilation rate 2, a context aggregation module with dilation rate 4, an attention context aggregation module with dilation rate 4, and a context aggregation module with dilation rate 4, calculated according to the following formula:
F2=CAU4(SECAU4(CAU4(CAU2(SECAU2(CAU2(F1))))))
wherein CAUr(*) denotes the context aggregation module with dilation rate r and SECAUr(*) denotes the attention context aggregation module with dilation rate r;
step B25, inputting the output F2 of step B24 into a standard residual module, then feeding the result into a convolutional layer with ReLU activation to complete the conversion from feature map to image, and outputting the 3-channel raindrop-removed image of the current-stage sub-network t according to the following formula:
It=Conv2(Res(F2))
wherein Res(*) denotes the standard residual module, Conv2 denotes the convolutional layer with ReLU activation, and It is the raindrop-removed image of the current-stage sub-network t.
5. The method for removing raindrops from a single image based on the loop iteration mechanism according to claim 3, wherein the step B3 is implemented by the following steps:
step B31, in the context aggregation module, the input feature F is first fed into a smooth dilated convolution module, calculated according to the following formula:
F3=Dilatedr(Sep(F))
wherein F3 is the output feature of the smooth dilated convolution module, F is the input of the context aggregation module, Sep(*) is the separable shared convolutional layer, i.e. a channel-wise separable convolution whose parameters are shared by all channels, and Dilatedr(*) is the dilated (atrous) convolution, which enlarges the receptive field through its dilation rate r and effectively aggregates spatial context information to extract features better; the dilation rate r specifies the zero-filled spacing between elements of the convolution kernel: when r = 1 the dilated convolution is identical to ordinary convolution, the kernel elements being adjacent with no zeros between them, and when r > 1, r - 1 zeros are inserted between adjacent kernel elements to enlarge the receptive field; the dilation rate r in step B24 is this dilation rate of the dilated convolution;
the only difference between the attention context aggregation module and the context aggregation module is the step that the attention context aggregation module is added with the channel attention module, and the following steps are the same, and the attention context aggregation module is calculated according to the following formula:
F3=SE(Dilatedr(Sep(F)))
wherein SE (×) represents the channel attention module;
in the step B32, the attention context aggregation module and the context aggregation module, the feature F output by the step B313The output of a residual error module formed by self-correcting convolution is sent to be calculated according to the following formula:
F4=LeakyReLU(F3+SCC(F3))
F4and outputting a residual module formed by the self-correcting convolution, wherein the module comprises a self-correcting convolution, a LeakyReLU function and residual concatenation, and the LeakyReLU (x) has the following formula:
Figure FDA0002926210940000031
wherein, x represents the input value of LeakyReLU function, and a is a fixed linear coefficient;
SCC(*) is the self-calibrated convolution, defined as follows:
first, the output feature F3 of step B31 is fed into two 1 × 1 convolutional layers without activation functions:
X1,X2=Conv1×1(F3)
wherein Conv1×1 denotes a 1 × 1 convolutional layer, and X1 and X2 are the feature maps whose channel number is halved by the respective 1 × 1 convolutions, i.e. if F3 has C channels, X1 and X2 each have C/2 channels;
X1 and X2 are then fed into their respective branches, the self-calibration branch taking X1 being calculated as follows:
T1=AvgPoolr(X1)
X′1=Up(T1*K2)
Y′1=(X1*K3)⊙σ(X1+X′1)
Y1=Y′1*K4
wherein AvgPoolr(*) is average pooling with stride r, Up(*) is the upsampling operation, * is the convolution operation, ⊙ is the element-wise multiplication operator, + is the element-wise addition operator, and σ is the Sigmoid activation function; K2, K3 and K4 are convolution kernels of the same size; Y1 is the output of the self-calibration branch;
meanwhile, X2 is fed into the corresponding ordinary convolution branch, calculated according to the following formula:
Y2=X2*K1
finally, the outputs of the two branches are concatenated along the channel dimension to restore the channel number to the channel number C of the original feature map, calculated according to the following formula:
Y=Y1⊕Y2
wherein ⊕ is the channel concatenation operation and Y is the output of the self-calibrated convolution module.
6. The method for removing raindrops of a single image based on a loop iteration mechanism according to claim 1, wherein the step C is implemented by the following steps:
step C1, the single-image raindrop removal convolutional neural network model is optimized using an SSIM-based loss function as the constraint, with the specific formula as follows:
L=-(1/N)·Σ(i=1,…,N)SSIM(Ŷi,Yi)
wherein SSIM(*) is the structural similarity measure; given training image pairs (Xi, Yi), where i = 1, …, N and N is the total number of training samples, Xi is an image block of the input image corrupted by raindrops, Yi is the image block of its corresponding clean image, and Ŷi denotes the raindrop-removed clean image block predicted by the network for the training pair (Xi, Yi);
and step C2, the image block data set is randomly divided into a number of batches, and the designed network is trained and optimized until the loss L calculated in step C1 converges below a threshold or the number of iterations reaches a threshold; the trained model is then saved, completing the network training process.
CN202110134465.6A 2021-02-01 2021-02-01 Single image raindrop removing method based on loop iteration mechanism Active CN112767280B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110134465.6A CN112767280B (en) 2021-02-01 2021-02-01 Single image raindrop removing method based on loop iteration mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110134465.6A CN112767280B (en) 2021-02-01 2021-02-01 Single image raindrop removing method based on loop iteration mechanism

Publications (2)

Publication Number Publication Date
CN112767280A (en) 2021-05-07
CN112767280B (en) 2022-06-14

Family

ID=75704404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110134465.6A Active CN112767280B (en) 2021-02-01 2021-02-01 Single image raindrop removing method based on loop iteration mechanism

Country Status (1)

Country Link
CN (1) CN112767280B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450288A (en) * 2021-08-04 2021-09-28 广东工业大学 Single image rain removing method and system based on deep convolutional neural network and storage medium
CN113610329A (en) * 2021-10-08 2021-11-05 南京信息工程大学 Short-time rainfall approaching forecasting method of double-current convolution long-short term memory network

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268284A1 (en) * 2017-03-15 2018-09-20 Samsung Electronics Co., Ltd. System and method for designing efficient super resolution deep convolutional neural networks by cascade network training, cascade network trimming, and dilated convolutions
US10304193B1 (en) * 2018-08-17 2019-05-28 12 Sigma Technologies Image segmentation and object detection using fully convolutional neural network
CN111861925A (en) * 2020-07-24 2020-10-30 南京信息工程大学滨江学院 Image rain removing method based on attention mechanism and gate control circulation unit
CN112085678A (en) * 2020-09-04 2020-12-15 国网福建省电力有限公司检修分公司 Method and system suitable for removing raindrops from power equipment machine patrol image
CN112132756A (en) * 2019-06-24 2020-12-25 华北电力大学(保定) Attention mechanism-based single raindrop image enhancement method
CN112184566A (en) * 2020-08-27 2021-01-05 北京大学 Image processing method and system for removing attached water mist droplets
CN112184573A (en) * 2020-09-15 2021-01-05 西安理工大学 Context aggregation residual single image rain removing method based on convolutional neural network

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268284A1 (en) * 2017-03-15 2018-09-20 Samsung Electronics Co., Ltd. System and method for designing efficient super resolution deep convolutional neural networks by cascade network training, cascade network trimming, and dilated convolutions
US10304193B1 (en) * 2018-08-17 2019-05-28 12 Sigma Technologies Image segmentation and object detection using fully convolutional neural network
CN112132756A (en) * 2019-06-24 2020-12-25 华北电力大学(保定) Attention mechanism-based single raindrop image enhancement method
CN111861925A (en) * 2020-07-24 2020-10-30 南京信息工程大学滨江学院 Image rain removing method based on attention mechanism and gate control circulation unit
CN112184566A (en) * 2020-08-27 2021-01-05 北京大学 Image processing method and system for removing attached water mist droplets
CN112085678A (en) * 2020-09-04 2020-12-15 国网福建省电力有限公司检修分公司 Method and system suitable for removing raindrops from power equipment machine patrol image
CN112184573A (en) * 2020-09-15 2021-01-05 西安理工大学 Context aggregation residual single image rain removing method based on convolutional neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
TIE LIU ET AL.: "Removing Rain in Videos: A Large-Scale Database and a Two-Stream ConvLSTM Approach", 《2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME)》 *
WENHAN YANG ET AL.: "Joint Rain Detection and Removal from a Single Image with Contextualized Deep Networks", 《 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 *
丁宇阳等 (Ding Yuyang et al.): "双LSTM的光场图像去雨算法研究" (Research on rain removal from light field images with dual LSTM), 《计算机工程与应用》 (Computer Engineering and Applications) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450288A (en) * 2021-08-04 2021-09-28 广东工业大学 Single image rain removing method and system based on deep convolutional neural network and storage medium
CN113610329A (en) * 2021-10-08 2021-11-05 南京信息工程大学 Short-time rainfall approaching forecasting method of double-current convolution long-short term memory network
CN113610329B (en) * 2021-10-08 2022-01-04 南京信息工程大学 Short-time rainfall approaching forecasting method of double-current convolution long-short term memory network

Also Published As

Publication number Publication date
CN112767280B (en) 2022-06-14

Similar Documents

Publication Publication Date Title
CN109543502B (en) Semantic segmentation method based on deep multi-scale neural network
CN108717569B (en) Expansion full-convolution neural network device and construction method thereof
CN111462013B (en) Single-image rain removing method based on structured residual learning
CN112767280B (en) Single image raindrop removing method based on loop iteration mechanism
CN108648159B (en) Image rain removing method and system
CN111915530A (en) End-to-end-based haze concentration self-adaptive neural network image defogging method
CN112884073B (en) Image rain removing method, system, terminal and storage medium
CN112419191B (en) Image motion blur removing method based on convolution neural network
CN111062329B (en) Unsupervised pedestrian re-identification method based on augmented network
CN113052775B (en) Image shadow removing method and device
CN110838095B (en) Single image rain removing method and system based on cyclic dense neural network
CN109544475A (en) Bi-Level optimization method for image deblurring
CN111414860A (en) Real-time portrait tracking and segmenting method
CN114723630A (en) Image deblurring method and system based on cavity double-residual multi-scale depth network
CN111815526B (en) Rain image rainstrip removing method and system based on image filtering and CNN
CN116205821A (en) Single-image rain removing method based on vertical stripe characteristic extraction cross convolution
CN114862711B (en) Low-illumination image enhancement and denoising method based on dual complementary prior constraints
CN113627368B (en) Video behavior recognition method based on deep learning
CN115239602A (en) License plate image deblurring method based on cavity convolution expansion receptive field
CN115205148A (en) Image deblurring method based on double-path residual error network
CN114943655A (en) Image restoration system for generating confrontation network structure based on cyclic depth convolution
Jia et al. Single-image snow removal based on an attention mechanism and a generative adversarial network
CN113870145A (en) Image defogging method based on deep convolutional neural network under Bayes framework
CN113658074B (en) Single image raindrop removing method based on LAB color space multi-scale fusion network
CN110415190B (en) Method, device and processor for removing image compression noise based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant