CN112767280A - Single image raindrop removing method based on loop iteration mechanism - Google Patents


Info

Publication number
CN112767280A
Authority
CN
China
Prior art keywords
image, convolution, network, module, raindrop
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110134465.6A
Other languages
Chinese (zh)
Other versions
CN112767280B (en)
Inventor
牛玉贞
陈锋
郑路伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202110134465.6A priority Critical patent/CN112767280B/en
Publication of CN112767280A publication Critical patent/CN112767280A/en
Application granted granted Critical
Publication of CN112767280B publication Critical patent/CN112767280B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T 5/73
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]

Abstract

The invention relates to a single-image raindrop removal method based on a loop iteration mechanism. The method comprises the following steps: preprocessing training pairs of original raindrop-degraded images and clean images to obtain an image block data set composed of such training pairs; designing a convolutional neural network for single-image raindrop removal, motivated by iteratively removing rain; designing a target loss function for optimizing the network, taking the image block data set as training data, calculating the gradient of every parameter of the designed raindrop removal convolutional neural network by back propagation according to the designed loss, updating the parameters by stochastic gradient descent, and finally learning the optimal model parameters; and inputting the image to be processed, and predicting the raindrop-removed clean image with the trained model. The method can markedly improve image raindrop removal performance while greatly reducing the number of network parameters.

Description

Single image raindrop removing method based on loop iteration mechanism
Technical Field
The invention relates to the field of image and video processing and computer vision, in particular to a single image raindrop removing method based on a loop iteration mechanism.
Background
With the rapid development of the internet and multimedia technology, images have become an indispensable medium for human communication and information transmission, and are of great significance to many aspects of modern society. However, images are often captured outdoors, where acquisition is inevitably affected by bad weather such as rain, fog and snow. Such weather degrades captured images and videos in contrast, saturation, visibility and other visual qualities. Rain is a common weather phenomenon in daily life, and raindrops attached to a window, a windshield or a camera lens obstruct the visibility of the background scene, reduce image quality and cause severe degradation. Details in the image can no longer be identified, which greatly reduces the image's practical value and complicates high-level visual understanding tasks such as object detection, pedestrian re-identification and image segmentation. In rainy weather, raindrops inevitably adhere to the camera lens; these attached raindrops blur or occlude parts of the background or foreground scene, so removing raindrops from a single image to restore a clean background is of great significance.
Although single-image rain streak removal has been well explored, there are few studies on single-image raindrop removal, and the well-explored rain streak removal methods cannot be applied to raindrop removal directly. Raindrops are less dense than rain streaks, but they are generally larger, and they can completely occlude the background. The appearance of a raindrop is also influenced by many factors, and its shape usually differs from the thin, vertical shape of a rain streak. Moreover, the physical model of raindrops is entirely different from that of rain streaks, which makes raindrop removal the harder task.
Current single-image raindrop removal methods fall largely into two broad categories: model-based methods and deep-learning-based methods. Model-based methods generally use a filter to decompose the image into high- and low-frequency components, then separate the rain component from the non-rain component within the high-frequency part through dictionary learning. Most model-based methods rely on manually preset parameters for feature extraction and performance tuning, so they cannot extract raindrop features well, and their raindrop removal performance is poor.
Deep-learning-based methods are data-driven: they train a convolutional neural network with a large amount of data. The powerful feature learning and representation capability of convolutional neural networks extracts image features better and achieves a better raindrop removal effect. However, these deep-learning-based methods share a problem: they cannot balance raindrop removal performance against the number of network parameters. They either achieve good performance at the cost of a large parameter count, which greatly limits their practical value in applications with limited computational resources, or use few parameters at the cost of poorer performance. How to design an efficient and practical single-image raindrop removal method is therefore a focus of future research.
Disclosure of Invention
The invention aims to provide a single-image raindrop removal method based on a loop iteration mechanism, which can markedly improve image raindrop removal performance while greatly reducing the number of network parameters.
In order to achieve the purpose, the technical scheme of the invention is as follows: a single image raindrop removing method based on a loop iteration mechanism comprises the following steps:
step A, preprocessing a training image pair of an original raindrop degradation image and a clean image to obtain an image block data set consisting of the training image pair of the original raindrop degradation image and the clean image;
step B, designing a convolutional neural network for single-image raindrop removal, based on the idea of divide and conquer and motivated by iteratively removing rain;
step C, designing a target loss function for optimizing the network, taking the image block data set as training data, calculating the gradient of each parameter of the designed single-image raindrop removal convolutional neural network by back propagation according to the designed loss function, updating the parameters by stochastic gradient descent, and finally learning the optimal parameters of the single-image raindrop removal convolutional neural network;
and step D, inputting the image to be processed into the designed single-image raindrop removal convolutional neural network, and predicting the raindrop-removed clean image using the trained network.
In an embodiment of the present invention, step A is specifically implemented as follows: the original raindrop-degraded image and the corresponding clean image are cropped in the same manner into W × W image blocks, a block being taken every m pixels so that crops do not overlap; after cropping, the W × W raindrop image blocks and the W × W clean image blocks correspond to each other one by one.
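By way of non-limiting illustration, a minimal Python sketch of this cropping step follows; the concrete values W = 128 and m = 128, and the NumPy array layout, are assumptions of the sketch rather than part of the disclosure:

    import numpy as np

    def extract_patch_pairs(rainy, clean, W=128, m=128):
        """Crop co-located W x W blocks every m pixels from an aligned pair.

        rainy, clean: H x W x 3 arrays of identical size; with m >= W the
        crops do not overlap. Returns two lists in one-to-one correspondence.
        """
        assert rainy.shape == clean.shape
        h, w = rainy.shape[:2]
        rainy_blocks, clean_blocks = [], []
        for y in range(0, h - W + 1, m):
            for x in range(0, w - W + 1, m):
                rainy_blocks.append(rainy[y:y + W, x:x + W])
                clean_blocks.append(clean[y:y + W, x:x + W])
        return rainy_blocks, clean_blocks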
In an embodiment of the present invention, the step B is implemented as follows:
step B1, designing a multi-stage raindrop removal network, wherein the network is a convolutional neural network based on a loop iteration mechanism and specifically consists of several stage sub-networks with identical network structure and shared network parameters (a sketch of this iteration is given after this list);
step B2, designing a phase sub-network, which is used for extracting the relevant features of the raindrops so as to better remove the raindrops;
step B3, designing the context aggregation module and the attention context aggregation module within the stage sub-network, which aggregate the spatial context information the network otherwise lacks.
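As referenced under step B1, the following is a minimal PyTorch sketch of the loop iteration mechanism: a single stage sub-network (StageSubnet, sketched under step B2 below) is applied T times so that its parameters are shared across stages, each stage refining the previous stage's output; the stage count T = 4 and the channel width of 32 are assumptions of the sketch:

    import torch
    import torch.nn as nn

    class MultiStageDeraindrop(nn.Module):
        def __init__(self, ch=32):
            super().__init__()
            self.ch = ch
            # A single sub-network instance: reusing it at every stage is
            # what shares the parameters across stages.
            self.stage = StageSubnet(ch)

        def forward(self, I_o, T=4):
            n, _, h, w = I_o.shape
            H = I_o.new_zeros(n, self.ch, h, w)   # recurrent state, 0 at stage 1
            C = I_o.new_zeros(n, self.ch, h, w)
            I = I_o                               # stage 1 uses I_{t-1} = I_o
            for _ in range(T):
                I, H, C = self.stage(I, I_o, H, C)
            return I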
In an embodiment of the present invention, the step B2 is implemented as follows:
step B21, concatenating the raindrop-removed image block produced by the previous-stage sub-network with the corresponding original raindrop image block along the channel dimension, as the input of each stage sub-network; for the first-stage sub-network, the input is the original raindrop image block concatenated with itself along the channel dimension;
step B22, inputting the spliced result obtained in step B21 into a convolutional layer with an activation function of ReLU, converting the image into a feature map, and outputting the features according to the following formula:
F0=Conv1(It-1⊕Io)
wherein Conv1(*) denotes the convolutional layer with ReLU activation, Io is the original raindrop image block, It-1 is the raindrop-removed image block produced by the previous stage (for the first stage, It-1 is Io), ⊕ denotes concatenation of features along the channel dimension, and F0 is the extracted feature map;
step B23, inputting the feature map F0 into a convolutional long short-term memory (ConvLSTM) network module, which consists of a forget gate f, an input gate i and an output gate o and is calculated according to the following formulas:
ft=σ(Wxf*F0+Whf*Ht-1+Wcf⊙Ct-1+bf)
it=σ(Wxi*F0+Whi*Ht-1+Wci⊙Ct-1+bi)
Ct=ft⊙Ct-1+it⊙tanh(Wxc*F0+Whc*Ht-1+bc)
ot=σ(Wxo*F0+Who*Ht-1+Wco⊙Ct+bo)
F1=Ht=ot⊙tanh(Ct)
wherein, at time t, the forget gate ft and the input gate it take three inputs: the feature map F0, the output Ht-1 of the ConvLSTM module at the previous time (i.e. t-1), and the cell state Ct-1 at the previous time; the output gate ot likewise takes three inputs: the feature map F0, the previous output Ht-1, and the cell state Ct at time t. W* and b* denote the weight and bias parameters of the corresponding convolution kernels, tanh denotes the hyperbolic tangent function, σ denotes the Sigmoid function, * denotes the convolution operation, and ⊙ denotes the element-wise (Hadamard) product. Ct, the cell state at the current time t, is passed to the ConvLSTM module at the next time; Ht is the feature map output by the ConvLSTM module at the current time t, and for convenience of description Ht is denoted F1.
Each time step above corresponds to one stage of the multi-stage network; for the first stage, which has no preceding stage, the recurrent inputs Ht-1 and Ct-1 of its forget gate and input gate are set to 0;
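For illustration, a minimal sketch of such a ConvLSTM cell follows, implementing the gate formulas above, including the Hadamard peephole terms Wc ⊙ C; the channel width, the kernel size, and the per-channel (rather than per-position) peephole weights are simplifying assumptions of the sketch:

    import torch
    import torch.nn as nn

    class ConvLSTMCell(nn.Module):
        def __init__(self, ch=32, k=3):
            super().__init__()
            p = k // 2
            # W_x* and W_h*: one fused convolution per input computes the
            # pre-activations of all four gates (f, i, c, o) at once.
            self.conv_x = nn.Conv2d(ch, 4 * ch, k, padding=p, bias=True)
            self.conv_h = nn.Conv2d(ch, 4 * ch, k, padding=p, bias=False)
            # W_cf, W_ci, W_co: peephole weights applied by Hadamard product.
            self.w_cf = nn.Parameter(torch.zeros(1, ch, 1, 1))
            self.w_ci = nn.Parameter(torch.zeros(1, ch, 1, 1))
            self.w_co = nn.Parameter(torch.zeros(1, ch, 1, 1))

        def forward(self, F0, H_prev, C_prev):
            xf, xi, xc, xo = torch.chunk(
                self.conv_x(F0) + self.conv_h(H_prev), 4, dim=1)
            f = torch.sigmoid(xf + self.w_cf * C_prev)   # forget gate
            i = torch.sigmoid(xi + self.w_ci * C_prev)   # input gate
            C = f * C_prev + i * torch.tanh(xc)          # new cell state
            o = torch.sigmoid(xo + self.w_co * C)        # output gate
            H = o * torch.tanh(C)                        # F1 = H_t
            return H, C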
step B24, inputting the output F1 of the ConvLSTM module into the designed sequence of context aggregation modules and attention context aggregation modules, comprising, in order: a context aggregation module with dilation rate 2, an attention context aggregation module with dilation rate 2, a context aggregation module with dilation rate 2, a context aggregation module with dilation rate 4, an attention context aggregation module with dilation rate 4, and a context aggregation module with dilation rate 4, calculated according to the following formula:
F2=CAU4(SECAU4(CAU4(CAU2(SECAU2(CAU2(F1))))))
wherein CAUr(*) denotes the context aggregation module with dilation rate r and SECAUr(*) denotes the attention context aggregation module with dilation rate r;
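For illustration, this fixed chain can be written as a simple composition; CAU and SECAU refer to the module sketches given under step B3 below:

    import torch.nn as nn

    def make_aggregation_chain(ch=32):
        # F2 = CAU_4(SECAU_4(CAU_4(CAU_2(SECAU_2(CAU_2(F1))))))
        return nn.Sequential(
            CAU(ch, r=2), SECAU(ch, r=2), CAU(ch, r=2),
            CAU(ch, r=4), SECAU(ch, r=4), CAU(ch, r=4),
        )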
step B25, inputting the output F2 of step B24 into a standard residual module, then feeding the result into a convolutional layer with ReLU activation to complete the conversion from feature map to image, and outputting the 3-channel raindrop-removed image of the current-stage sub-network t according to the following formula:
It=Conv2(Res(F2))
wherein Res(*) denotes the standard residual module, Conv2 denotes the convolutional layer with ReLU activation, and It is the raindrop-removed image of the current-stage sub-network t.
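Assembling steps B21 to B25 gives one stage sub-network, which the multi-stage network reuses at every iteration; the sketch below relies on the ConvLSTMCell and make_aggregation_chain sketches above, and the two-convolution form of the standard residual module is an assumption:

    import torch
    import torch.nn as nn

    class StageSubnet(nn.Module):
        def __init__(self, ch=32):
            super().__init__()
            # B22: Conv1, a ReLU convolution over the 6-channel concatenation.
            self.conv1 = nn.Sequential(nn.Conv2d(6, ch, 3, padding=1), nn.ReLU())
            self.lstm = ConvLSTMCell(ch)            # B23
            self.agg = make_aggregation_chain(ch)   # B24
            self.res = nn.Sequential(               # standard residual block
                nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
                nn.Conv2d(ch, ch, 3, padding=1),
            )
            # B25: Conv2, a ReLU convolution back to a 3-channel image.
            self.conv2 = nn.Sequential(nn.Conv2d(ch, 3, 3, padding=1), nn.ReLU())

        def forward(self, I_prev, I_o, H_prev, C_prev):
            F0 = self.conv1(torch.cat([I_prev, I_o], dim=1))   # B21 + B22
            F1, C = self.lstm(F0, H_prev, C_prev)              # B23
            F2 = self.agg(F1)                                  # B24
            I_t = self.conv2(F2 + self.res(F2))                # B25
            return I_t, F1, C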
In an embodiment of the present invention, the step B3 is implemented as follows:
step B31, in the context aggregation module, the input feature F is first fed into a smooth dilated convolution module, calculated according to the following formula:
F3=Dilatedr(Sep(F))
wherein F3 is the output feature of the smooth dilated convolution module, F is the input of the context aggregation module, Sep(*) is the separable shared convolutional layer, i.e. a channel-wise separable convolution whose parameters are shared by all channels, and Dilatedr(*) is the dilated (atrous) convolution, which enlarges the receptive field through its dilation rate r and effectively aggregates spatial context information to extract features better; the dilation rate r specifies the zero-filled spacing between elements of the convolution kernel: when r = 1 the dilated convolution is identical to ordinary convolution, the kernel elements being adjacent with no zeros between them, and when r > 1, r - 1 zeros are inserted between adjacent kernel elements to enlarge the receptive field; the dilation rate r in step B24 is this dilation rate of the dilated convolution;
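A minimal sketch of this smooth dilated convolution follows; the shared-kernel size of 2r - 1, taken from the smoothed-dilation literature, and the 3 × 3 dilated kernel are assumptions of the sketch:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SmoothDilated(nn.Module):
        def __init__(self, ch=32, r=2, k=3):
            super().__init__()
            sk = 2 * r - 1
            # Sep: one (2r-1) x (2r-1) kernel shared by every channel.
            self.shared = nn.Parameter(torch.randn(1, 1, sk, sk) * 0.01)
            # Dilated_r: dilated convolution with rate r.
            self.dilated = nn.Conv2d(ch, ch, k, dilation=r,
                                     padding=r * (k - 1) // 2)

        def forward(self, x):
            c = x.size(1)
            w = self.shared.repeat(c, 1, 1, 1)   # same kernel on all channels
            x = F.conv2d(x, w, padding=self.shared.size(-1) // 2, groups=c)
            return self.dilated(x)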
the only difference between the attention context aggregation module and the context aggregation module is the step that the attention context aggregation module is added with the channel attention module, and the following steps are the same, and the attention context aggregation module is calculated according to the following formula:
F3=SE(Dilatedr(Sep(F)))
wherein SE (×) represents the channel attention module;
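For illustration, a minimal sketch of the channel attention (SE) module and of both aggregation units follows; it relies on the SmoothDilated sketch above and the SCCResidual sketch given after step B32 below, and the squeeze-excitation reduction ratio of 16 is an assumption:

    import torch.nn as nn

    class SE(nn.Module):
        """Squeeze-and-excitation channel attention."""
        def __init__(self, ch=32, red=16):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.fc = nn.Sequential(
                nn.Conv2d(ch, ch // red, 1), nn.ReLU(),
                nn.Conv2d(ch // red, ch, 1), nn.Sigmoid(),
            )

        def forward(self, x):
            return x * self.fc(self.pool(x))   # channel-wise reweighting

    class CAU(nn.Module):
        """Context aggregation unit: smooth dilated conv + residual SCC."""
        def __init__(self, ch=32, r=2, attention=False):
            super().__init__()
            self.sd = SmoothDilated(ch, r)
            self.se = SE(ch) if attention else nn.Identity()
            self.scc = SCCResidual(ch)

        def forward(self, x):
            return self.scc(self.se(self.sd(x)))   # F4 from F3

    def SECAU(ch=32, r=2):
        # SECAU differs from CAU only by the inserted SE module.
        return CAU(ch, r, attention=True)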
in the step B32, the attention context aggregation module and the context aggregation module, the feature F output by the step B313Is sent to a self-calibrationThe output of the residual module formed by positive convolution is calculated according to the following formula:
F4=LeakyReLU(F3+SCC(F3))
F4and outputting a residual module formed by the self-correcting convolution, wherein the module comprises a self-correcting convolution, a LeakyReLU function and residual concatenation, and the LeakyReLU (x) has the following formula:
Figure BDA0002926210950000041
wherein, x represents the input value of LeakyReLU function, and a is a fixed linear coefficient;
SCC(*) is the self-calibrated convolution, defined as follows:
first, the output feature F3 of step B31 is fed into two 1 × 1 convolutional layers without activation functions:
X1,X2=Conv1×1(F3)
wherein Conv1×1 denotes a 1 × 1 convolutional layer, and X1 and X2 are the feature maps whose channel number is halved by the respective 1 × 1 convolutions, i.e. if F3 has C channels, X1 and X2 each have C/2 channels;
X1 and X2 are then fed into their respective branches, the self-calibration branch taking X1 being calculated as follows:
T1=AvgPoolr(X1)
X′1=Up(T1*K2)
Y′1=(X1*K3)⊙σ(X1+X′1)
Y1=Y′1*K4
wherein AvgPoolr(*) is average pooling with stride r, Up(*) is the upsampling operation, * is the convolution operation, ⊙ is the element-wise multiplication operator, + is the element-wise addition operator, and σ is the Sigmoid activation function; K2, K3 and K4 are convolution kernels of the same size; Y1 is the output of the self-calibration branch;
meanwhile, X2 is fed into the corresponding ordinary convolution branch, calculated according to the following formula:
Y2=X2*K1
finally, the outputs of the two branches are concatenated along the channel dimension to restore the channel number to the channel number C of the original feature map, calculated according to the following formula:
Y=Y1⊕Y2
wherein ⊕ is the channel concatenation operation and Y is the output of the self-calibrated convolution module.
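A minimal sketch of this self-calibrated convolution and its residual wrapper follows, implementing the formulas above; the pooling stride r = 4, the 3 × 3 kernels K1 to K4, and the LeakyReLU slope a = 0.2 are assumptions of the sketch:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SCC(nn.Module):
        """Self-calibrated convolution: calibration branch + ordinary branch."""
        def __init__(self, ch=32, r=4, k=3):
            super().__init__()
            h, p = ch // 2, k // 2
            self.split1 = nn.Conv2d(ch, h, 1)   # 1x1, no activation
            self.split2 = nn.Conv2d(ch, h, 1)
            self.pool = nn.AvgPool2d(r, stride=r)
            self.k1 = nn.Conv2d(h, h, k, padding=p)
            self.k2 = nn.Conv2d(h, h, k, padding=p)
            self.k3 = nn.Conv2d(h, h, k, padding=p)
            self.k4 = nn.Conv2d(h, h, k, padding=p)

        def forward(self, x):
            x1, x2 = self.split1(x), self.split2(x)
            # X'1 = Up(AvgPool_r(X1) * K2)
            up = F.interpolate(self.k2(self.pool(x1)), size=x1.shape[2:])
            # Y1 = ((X1 * K3) (.) sigmoid(X1 + X'1)) * K4
            y1 = self.k4(self.k3(x1) * torch.sigmoid(x1 + up))
            y2 = self.k1(x2)                     # Y2 = X2 * K1
            return torch.cat([y1, y2], dim=1)    # Y = Y1 concat Y2

    class SCCResidual(nn.Module):
        """F4 = LeakyReLU(F3 + SCC(F3))."""
        def __init__(self, ch=32):
            super().__init__()
            self.scc = SCC(ch)
            self.act = nn.LeakyReLU(0.2)

        def forward(self, f3):
            return self.act(f3 + self.scc(f3))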
In an embodiment of the present invention, the step C is implemented as follows:
step C1, the single-image raindrop removal convolutional neural network model is optimized using an SSIM-based loss function as the constraint, with the specific formula as follows:
L=-(1/N)·Σ(i=1,…,N)SSIM(Ŷi,Yi)
wherein SSIM(*) is the structural similarity measure; given training image pairs (Xi, Yi), where i = 1, …, N and N is the total number of training samples, Xi is an image block of the input image corrupted by raindrops, Yi is the image block of its corresponding clean image, and Ŷi denotes the raindrop-removed clean image block predicted by the network for the training pair (Xi, Yi);
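For illustration, a sketch of such a negative-SSIM objective follows; the use of the third-party pytorch_msssim package is an assumption of the sketch, and any differentiable SSIM implementation serves equally:

    import torch
    from pytorch_msssim import ssim   # pip install pytorch-msssim

    def negative_ssim_loss(pred, clean):
        """pred, clean: (N, 3, H, W) tensors with values in [0, 1]."""
        return -ssim(pred, clean, data_range=1.0)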
and step C2, the image block data set is randomly divided into a number of batches, and the designed network is trained and optimized until the loss L calculated in step C1 converges below a threshold or the number of iterations reaches a threshold; the trained model is then saved, completing the network training process.
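A minimal sketch of this training procedure follows, reusing the negative_ssim_loss and multi-stage model sketches above; the SGD hyper-parameters, the batch size of 16, the stage count T = 4 and the stopping thresholds are assumptions of the sketch:

    import torch
    from torch.utils.data import DataLoader

    def train(model, dataset, T=4, max_iters=100_000, loss_thresh=-0.95):
        opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
        loader = DataLoader(dataset, batch_size=16, shuffle=True)
        it = 0
        while it < max_iters:
            for rainy, clean in loader:
                pred = model(rainy, T)            # T-stage recurrent forward
                loss = negative_ssim_loss(pred, clean)
                opt.zero_grad()
                loss.backward()                   # back-propagation
                opt.step()                        # stochastic gradient descent
                it += 1
                if loss.item() <= loss_thresh or it >= max_iters:
                    torch.save(model.state_dict(), "raindrop_removal.pth")
                    return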
Compared with the prior art, the invention has the following beneficial effects: based on the idea of divide and conquer, the raindrop removal task is decomposed into a loop iteration process, and raindrops are removed cyclically by a designed multi-stage convolutional neural network so as to achieve a better image restoration result. Meanwhile, recent smooth dilated convolution and self-calibrated convolution modules are adopted to aggregate spatial context information better and to eliminate the artifacts that ordinary dilated convolution produces during raindrop removal. The method designs a dedicated convolutional neural network for the image raindrop removal problem; it ensures the quality of the raindrop-removed image while having a smaller network parameter size than other methods, and thus has higher practical value.
Drawings
FIG. 1 is a flow chart of an implementation of the method of the present invention.
Fig. 2 is a structural diagram of a single image raindrop removal method model based on a loop iteration mechanism in the embodiment of the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
The following is a specific implementation of the present invention.
As shown in fig. 1, a single image raindrop removal method based on a loop iteration mechanism includes the following steps:
step A, preprocessing a training image pair of an original raindrop degradation image and a clean image to obtain an image block data set consisting of the training image pair of the original raindrop degradation image and the clean image;
step B, designing a convolutional neural network for single-image raindrop removal, based on the idea of divide and conquer and motivated by iteratively removing rain;
step C, designing a target loss function for optimizing the network, taking the image block data set as training data, calculating the gradient of each parameter of the designed single-image raindrop removal convolutional neural network by back propagation according to the designed loss function, updating the parameters by stochastic gradient descent, and finally learning the optimal parameters of the single-image raindrop removal convolutional neural network;
and step D, inputting the image to be processed into the designed single-image raindrop removal convolutional neural network, and predicting the raindrop-removed clean image using the trained network.
Further, the step a comprises the steps of:
step A1: and dicing the original raindrop degraded image and the corresponding clean image according to a consistent mode to obtain W multiplied by W-sized image blocks, and simultaneously dicing every m pixel points in order to avoid overlapping dicing. After the dicing, the W multiplied by W image blocks with the raindrops and the W multiplied by W clean image blocks correspond to each other one by one.
Further, the step B includes the steps of:
step B1, designing a multi-stage raindrop removal network, wherein the network is a convolutional neural network based on a loop iteration mechanism and specifically consists of a plurality of stage sub-networks with the same network structure and shared network parameters;
step B2, designing a phase sub-network, which is used for extracting the relevant features of the raindrops so as to better remove the raindrops;
step B3, designing the context aggregation module and the attention context aggregation module within the stage sub-network, which aggregate the spatial context information the sub-network otherwise lacks;
further, the step B2 includes the following steps:
step B21, concatenating the raindrop-removed image block produced by the previous-stage sub-network with the corresponding original raindrop image block along the channel dimension, as the input of each stage sub-network. Note that, for the first-stage sub-network, the input is the original raindrop image block concatenated with itself along the channel dimension;
step B22, inputting the spliced result obtained in step B21 into a convolutional layer with an activation function of ReLU, converting the image into a feature map, and outputting the features according to the following formula:
F0=Conv1(It-1⊕Io)
wherein Conv1(*) denotes the convolutional layer with ReLU activation, Io is the original raindrop image block, It-1 is the raindrop-removed image block produced by the previous stage (for the first stage, It-1 is Io), ⊕ denotes concatenation of features along the channel dimension, and F0 is the extracted feature map.
step B23, inputting the feature map F0 into a convolutional long short-term memory (ConvLSTM) network module, which consists of a forget gate f, an input gate i and an output gate o and is calculated according to the following formulas:
ft=σ(Wxf*F0+Whf*Ht-1+Wcf⊙Ct-1+bf)
it=σ(Wxi*F0+Whi*Ht-1+Wci⊙Ct-1+bi)
Ct=ft⊙Ct-1+it⊙tanh(Wxc*F0+Whc*Ht-1+bc)
ot=σ(Wxo*F0+Who*Ht-1+Wco⊙Ct+bo)
F1=Ht=ot⊙tanh(Ct)
wherein, at time t, the forget gate ft and the input gate it take three inputs: the feature map F0, the output Ht-1 of the ConvLSTM module at the previous time (i.e. t-1), and the cell state Ct-1 at the previous time; the output gate ot likewise takes three inputs: the feature map F0, the previous output Ht-1, and the cell state Ct at time t. W* and b* denote the weight and bias parameters of the corresponding convolution kernels, tanh denotes the hyperbolic tangent function, σ denotes the Sigmoid function, * denotes the convolution operation, and ⊙ denotes the element-wise (Hadamard) product. Ct, the cell state at the current time t, is passed to the ConvLSTM module at the next time; Ht is the feature map output by the ConvLSTM module at the current time t, and for convenience of description Ht is denoted F1.
Each time step above corresponds to one stage of the multi-stage network. Note: for the first stage of the network, which has no preceding stage, the recurrent inputs Ht-1 and Ct-1 of its forget gate and input gate are set to 0.
step B24, inputting the output F1 of the ConvLSTM module into the designed sequence of context aggregation modules and attention context aggregation modules, comprising, in order: a context aggregation module with dilation rate 2, an attention context aggregation module with dilation rate 2, a context aggregation module with dilation rate 2, a context aggregation module with dilation rate 4, an attention context aggregation module with dilation rate 4, and a context aggregation module with dilation rate 4, calculated according to the following formula:
F2=CAU4(SECAU4(CAU4(CAU2(SECAU2(CAU2(F1))))))
wherein CAUr(*) denotes the context aggregation module with dilation rate r and SECAUr(*) denotes the attention context aggregation module with dilation rate r.
step B25, inputting the output F2 of step B24 into a standard residual module, then feeding the result into a convolutional layer with ReLU activation to complete the conversion from feature map to image, and outputting the 3-channel raindrop-removed image of the current-stage sub-network t according to the following formula:
It=Conv2(Res(F2))
wherein Res(*) denotes the standard residual module, Conv2 denotes the convolutional layer with ReLU activation, and It is the raindrop-removed image of the current-stage sub-network t.
Further, the step B3 includes the following steps:
step B31, in the context aggregation module, the input feature F is first fed into a smooth dilated convolution module, calculated according to the following formula:
F3=Dilatedr(Sep(F))
wherein F3 is the output feature of the smooth dilated convolution module, F is the input of the context aggregation module, Sep(*) is the separable shared convolutional layer, i.e. a channel-wise separable convolution whose parameters are shared by all channels, and Dilatedr(*) is the dilated (atrous) convolution, which enlarges the receptive field through its dilation rate r and effectively aggregates spatial context information to extract features better. The dilation rate r specifies the zero-filled spacing between elements of the convolution kernel: when r = 1 the dilated convolution is identical to ordinary convolution, the kernel elements being adjacent with no zeros between them, and when r > 1, r - 1 zeros are inserted between adjacent kernel elements to enlarge the receptive field. The dilation rate r in step B24 is this dilation rate of the dilated convolution.
the attention context aggregation module differs from the context aggregation module only in that a channel attention module is added; the remaining steps are identical, and it is calculated according to the following formula:
F3=SE(Dilatedr(Sep(F)))
wherein SE(*) denotes the channel attention module.
step B32, in both the attention context aggregation module and the context aggregation module, the feature F3 output by step B31 is fed into a residual module built from self-calibrated convolution, whose output is calculated according to the following formula:
F4=LeakyReLU(F3+SCC(F3))
wherein F4 is the output of the residual module built from self-calibrated convolution; the module comprises a self-calibrated convolution, a LeakyReLU function and a residual connection, LeakyReLU(x) being defined as follows:
LeakyReLU(x) = x, if x ≥ 0; LeakyReLU(x) = ax, if x < 0
wherein x denotes the input value of the LeakyReLU function and a is a fixed linear coefficient.
SCC(*) is the self-calibrated convolution, defined as follows:
first, the output feature F3 of step B31 is fed into two 1 × 1 convolutional layers without activation functions:
X1,X2=Conv1×1(F3)
wherein Conv1×1 denotes a 1 × 1 convolutional layer, and X1 and X2 are the feature maps whose channel number is halved by the respective 1 × 1 convolutions, i.e. if F3 has C channels, X1 and X2 each have C/2 channels.
X1 and X2 are then fed into their respective branches, the self-calibration branch taking X1 being calculated as follows:
T1=AvgPoolr(X1)
X′1=Up(T1*K2)
Y′1=(X1*K3)⊙σ(X1+X′1)
Y1=Y′1*K4
wherein AvgPoolr(*) is average pooling with stride r, Up(*) is the upsampling operation, * is the convolution operation, ⊙ is the element-wise multiplication operator, + is the element-wise addition operator, and σ is the Sigmoid activation function. K2, K3 and K4 are convolution kernels of the same size. Y1 is the output of the self-calibration branch.
Meanwhile, X2 is fed into the corresponding ordinary convolution branch, calculated according to the following formula:
Y2=X2*K1
finally, the outputs of the two branches are concatenated along the channel dimension to restore the channel number to the channel number C of the original feature map, calculated according to the following formula:
Y=Y1⊕Y2
wherein ⊕ is the channel concatenation operation and Y is the output of the self-calibrated convolution module.
Further, the step C includes the steps of:
step C1, the single-image raindrop removal convolutional neural network model is optimized using an SSIM-based loss function as the constraint, with the specific formula as follows:
L=-(1/N)·Σ(i=1,…,N)SSIM(Ŷi,Yi)
wherein SSIM(*) is the structural similarity measure; given training image pairs (Xi, Yi), where i = 1, …, N and N is the total number of training samples, Xi is an image block of the input image corrupted by raindrops, Yi is the image block of its corresponding clean image, and Ŷi denotes the raindrop-removed clean image block predicted by the network for the training pair (Xi, Yi);
and step C2, the image block data set is randomly divided into a number of batches, and the designed network is trained and optimized until the loss L calculated in step C1 converges below a threshold or the number of iterations reaches a threshold; the trained model is then saved, completing the network training process.
The above are preferred embodiments of the present invention; all changes made according to the technical scheme of the present invention that produce equivalent functional effects, without exceeding the scope of the technical scheme of the present invention, belong to the protection scope of the present invention.

Claims (6)

1. A single image raindrop removing method based on a loop iteration mechanism is characterized by comprising the following steps:
step A, preprocessing a training image pair of an original raindrop degradation image and a clean image to obtain an image block data set consisting of the training image pair of the original raindrop degradation image and the clean image;
step B, designing a convolutional neural network for single-image raindrop removal, based on the idea of divide and conquer and motivated by iteratively removing rain;
step C, designing a target loss function for optimizing the network, taking the image block data set as training data, calculating the gradient of each parameter of the designed single-image raindrop removal convolutional neural network by back propagation according to the designed loss function, updating the parameters by stochastic gradient descent, and finally learning the optimal parameters of the single-image raindrop removal convolutional neural network;
and step D, inputting the image to be processed into the designed single-image raindrop removal convolutional neural network, and predicting the raindrop-removed clean image using the trained network.
2. The single-image raindrop removal method based on a loop iteration mechanism according to claim 1, wherein step A is specifically implemented as follows: the original raindrop-degraded image and the corresponding clean image are cropped in the same manner into W × W image blocks, a block being taken every m pixels so that crops do not overlap; after cropping, the W × W raindrop image blocks and the W × W clean image blocks correspond to each other one by one.
3. The method for removing raindrops of a single image based on a loop iteration mechanism according to claim 1, wherein the step B is implemented by the following steps:
step B1, designing a multi-stage raindrop removal network, wherein the network is a convolutional neural network based on a loop iteration mechanism and specifically consists of a plurality of stage sub-networks with the same network structure and shared network parameters;
step B2, designing a phase sub-network, which is used for extracting the relevant features of the raindrops so as to better remove the raindrops;
step B3, designing the context aggregation module and the attention context aggregation module within the stage sub-network, which aggregate the spatial context information the network otherwise lacks.
4. The method for removing raindrops from a single image based on the loop iteration mechanism according to claim 3, wherein the step B2 is implemented by the following steps:
step B21, concatenating the raindrop-removed image block produced by the previous-stage sub-network with the corresponding original raindrop image block along the channel dimension, as the input of each stage sub-network; for the first-stage sub-network, the input is the original raindrop image block concatenated with itself along the channel dimension;
step B22, inputting the spliced result obtained in step B21 into a convolutional layer with an activation function of ReLU, converting the image into a feature map, and outputting the features according to the following formula:
F0=Conv1(It-1⊕Io)
wherein Conv1(*) denotes the convolutional layer with ReLU activation, Io is the original raindrop image block, It-1 is the raindrop-removed image block produced by the previous stage (for the first stage, It-1 is Io), ⊕ denotes concatenation of features along the channel dimension, and F0 is the extracted feature map;
step B23, inputting the feature map F0 into a convolutional long short-term memory (ConvLSTM) network module, which consists of a forget gate f, an input gate i and an output gate o and is calculated according to the following formulas:
ft=σ(Wxf*F0+Whf*Ht-1+Wcf⊙Ct-1+bf)
it=σ(Wxi*F0+Whi*Ht-1+Wci⊙Ct-1+bi)
Ct=ft⊙Ct-1+it⊙tanh(Wxc*F0+Whc*Ht-1+bc)
ot=σ(Wxo*F0+Who*Ht-1+Wco⊙Ct+bo)
F1=Ht=ot⊙tanh(Ct)
wherein, at time t, the forget gate ft and the input gate it take three inputs: the feature map F0, the output Ht-1 of the ConvLSTM module at the previous time (i.e. t-1), and the cell state Ct-1 at the previous time; the output gate ot likewise takes three inputs: the feature map F0, the previous output Ht-1, and the cell state Ct at time t. W* and b* denote the weight and bias parameters of the corresponding convolution kernels, tanh denotes the hyperbolic tangent function, σ denotes the Sigmoid function, * denotes the convolution operation, and ⊙ denotes the element-wise (Hadamard) product. Ct, the cell state at the current time t, is passed to the ConvLSTM module at the next time; Ht is the feature map output by the ConvLSTM module at the current time t, and for convenience of description Ht is denoted F1.
Each time step above corresponds to one stage of the multi-stage network; for the first stage, which has no preceding stage, the recurrent inputs Ht-1 and Ct-1 of its forget gate and input gate are set to 0;
step B24, inputting the output F1 of the ConvLSTM module into the designed sequence of context aggregation modules and attention context aggregation modules, comprising, in order: a context aggregation module with dilation rate 2, an attention context aggregation module with dilation rate 2, a context aggregation module with dilation rate 2, a context aggregation module with dilation rate 4, an attention context aggregation module with dilation rate 4, and a context aggregation module with dilation rate 4, calculated according to the following formula:
F2=CAU4(SECAU4(CAU4(CAU2(SECAU2(CAU2(F1))))))
wherein CAUr(*) denotes the context aggregation module with dilation rate r and SECAUr(*) denotes the attention context aggregation module with dilation rate r;
step B25, inputting the output F2 of step B24 into a standard residual module, then feeding the result into a convolutional layer with ReLU activation to complete the conversion from feature map to image, and outputting the 3-channel raindrop-removed image of the current-stage sub-network t according to the following formula:
It=Conv2(Res(F2))
wherein Res(*) denotes the standard residual module, Conv2 denotes the convolutional layer with ReLU activation, and It is the raindrop-removed image of the current-stage sub-network t.
5. The method for removing raindrops from a single image based on the loop iteration mechanism according to claim 3, wherein the step B3 is implemented by the following steps:
step B31, in the context aggregation module, the input feature F is first fed into a smooth dilated convolution module, calculated according to the following formula:
F3=Dilatedr(Sep(F))
wherein F3 is the output feature of the smooth dilated convolution module, F is the input of the context aggregation module, Sep(*) is the separable shared convolutional layer, i.e. a channel-wise separable convolution whose parameters are shared by all channels, and Dilatedr(*) is the dilated (atrous) convolution, which enlarges the receptive field through its dilation rate r and effectively aggregates spatial context information to extract features better; the dilation rate r specifies the zero-filled spacing between elements of the convolution kernel: when r = 1 the dilated convolution is identical to ordinary convolution, the kernel elements being adjacent with no zeros between them, and when r > 1, r - 1 zeros are inserted between adjacent kernel elements to enlarge the receptive field; the dilation rate r in step B24 is this dilation rate of the dilated convolution;
the only difference between the attention context aggregation module and the context aggregation module is the step that the attention context aggregation module is added with the channel attention module, and the following steps are the same, and the attention context aggregation module is calculated according to the following formula:
F3=SE(Dilatedr(Sep(F)))
wherein SE (×) represents the channel attention module;
in the step B32, the attention context aggregation module and the context aggregation module, the feature F output by the step B313The output of a residual error module formed by self-correcting convolution is sent to be calculated according to the following formula:
F4=LeakyReLU(F3+SCC(F3))
F4and outputting a residual module formed by the self-correcting convolution, wherein the module comprises a self-correcting convolution, a LeakyReLU function and residual concatenation, and the LeakyReLU (x) has the following formula:
Figure FDA0002926210940000031
wherein, x represents the input value of LeakyReLU function, and a is a fixed linear coefficient;
SCC(*) is the self-calibrated convolution, defined as follows:
first, the output feature F3 of step B31 is fed into two 1 × 1 convolutional layers without activation functions:
X1,X2=Conv1×1(F3)
wherein Conv1×1 denotes a 1 × 1 convolutional layer, and X1 and X2 are the feature maps whose channel number is halved by the respective 1 × 1 convolutions, i.e. if F3 has C channels, X1 and X2 each have C/2 channels;
X1 and X2 are then fed into their respective branches, the self-calibration branch taking X1 being calculated as follows:
T1=AvgPoolr(X1)
X′1=Up(T1*K2)
Y′1=(X1*K3)⊙σ(X1+X′1)
Y1=Y′1*K4
wherein AvgPoolr(*) is average pooling with stride r, Up(*) is the upsampling operation, * is the convolution operation, ⊙ is the element-wise multiplication operator, + is the element-wise addition operator, and σ is the Sigmoid activation function; K2, K3 and K4 are convolution kernels of the same size; Y1 is the output of the self-calibration branch;
meanwhile, X2 is fed into the corresponding ordinary convolution branch, calculated according to the following formula:
Y2=X2*K1
finally, the outputs of the two branches are concatenated along the channel dimension to restore the channel number to the channel number C of the original feature map, calculated according to the following formula:
Y=Y1⊕Y2
wherein ⊕ is the channel concatenation operation and Y is the output of the self-calibrated convolution module.
6. The method for removing raindrops of a single image based on a loop iteration mechanism according to claim 1, wherein the step C is implemented by the following steps:
step C1, the single-image raindrop removal convolutional neural network model is optimized using an SSIM-based loss function as the constraint, with the specific formula as follows:
L=-(1/N)·Σ(i=1,…,N)SSIM(Ŷi,Yi)
wherein SSIM(*) is the structural similarity measure; given training image pairs (Xi, Yi), where i = 1, …, N and N is the total number of training samples, Xi is an image block of the input image corrupted by raindrops, Yi is the image block of its corresponding clean image, and Ŷi denotes the raindrop-removed clean image block predicted by the network for the training pair (Xi, Yi);
and step C2, the image block data set is randomly divided into a number of batches, and the designed network is trained and optimized until the loss L calculated in step C1 converges below a threshold or the number of iterations reaches a threshold; the trained model is then saved, completing the network training process.
CN202110134465.6A 2021-02-01 2021-02-01 Single image raindrop removing method based on loop iteration mechanism Active CN112767280B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110134465.6A CN112767280B (en) 2021-02-01 2021-02-01 Single image raindrop removing method based on loop iteration mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110134465.6A CN112767280B (en) 2021-02-01 2021-02-01 Single image raindrop removing method based on loop iteration mechanism

Publications (2)

Publication Number Publication Date
CN112767280A (en) 2021-05-07
CN112767280B (en) 2022-06-14

Family

ID=75704404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110134465.6A Active CN112767280B (en) 2021-02-01 2021-02-01 Single image raindrop removing method based on loop iteration mechanism

Country Status (1)

Country Link
CN (1) CN112767280B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450288A (en) * 2021-08-04 2021-09-28 广东工业大学 Single image rain removing method and system based on deep convolutional neural network and storage medium
CN113610329A (en) * 2021-10-08 2021-11-05 南京信息工程大学 Short-time rainfall approaching forecasting method of double-current convolution long-short term memory network

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268284A1 (en) * 2017-03-15 2018-09-20 Samsung Electronics Co., Ltd. System and method for designing efficient super resolution deep convolutional neural networks by cascade network training, cascade network trimming, and dilated convolutions
US10304193B1 (en) * 2018-08-17 2019-05-28 12 Sigma Technologies Image segmentation and object detection using fully convolutional neural network
CN111861925A (en) * 2020-07-24 2020-10-30 南京信息工程大学滨江学院 Image rain removing method based on attention mechanism and gate control circulation unit
CN112085678A (en) * 2020-09-04 2020-12-15 国网福建省电力有限公司检修分公司 Method and system suitable for removing raindrops from power equipment machine patrol image
CN112132756A (en) * 2019-06-24 2020-12-25 华北电力大学(保定) Attention mechanism-based single raindrop image enhancement method
CN112184566A (en) * 2020-08-27 2021-01-05 北京大学 Image processing method and system for removing attached water mist droplets
CN112184573A (en) * 2020-09-15 2021-01-05 西安理工大学 Context aggregation residual single image rain removing method based on convolutional neural network

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268284A1 (en) * 2017-03-15 2018-09-20 Samsung Electronics Co., Ltd. System and method for designing efficient super resolution deep convolutional neural networks by cascade network training, cascade network trimming, and dilated convolutions
US10304193B1 (en) * 2018-08-17 2019-05-28 12 Sigma Technologies Image segmentation and object detection using fully convolutional neural network
CN112132756A (en) * 2019-06-24 2020-12-25 华北电力大学(保定) Attention mechanism-based single raindrop image enhancement method
CN111861925A (en) * 2020-07-24 2020-10-30 南京信息工程大学滨江学院 Image rain removing method based on attention mechanism and gate control circulation unit
CN112184566A (en) * 2020-08-27 2021-01-05 北京大学 Image processing method and system for removing attached water mist droplets
CN112085678A (en) * 2020-09-04 2020-12-15 国网福建省电力有限公司检修分公司 Method and system suitable for removing raindrops from power equipment machine patrol image
CN112184573A (en) * 2020-09-15 2021-01-05 西安理工大学 Context aggregation residual single image rain removing method based on convolutional neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
TIE LIU ET AL.: "Removing Rain in Videos: A Large-Scale Database and a Two-Stream ConvLSTM Approach", 《2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME)》 *
WENHAN YANG ET AL.: "Joint Rain Detection and Removal from a Single Image with Contextualized Deep Networks", 《 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 *
丁宇阳等 (Ding Yuyang et al.): "双LSTM的光场图像去雨算法研究" (Research on rain removal from light field images with dual LSTM), 《计算机工程与应用》 (Computer Engineering and Applications) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450288A (en) * 2021-08-04 2021-09-28 广东工业大学 Single image rain removing method and system based on deep convolutional neural network and storage medium
CN113610329A (en) * 2021-10-08 2021-11-05 南京信息工程大学 Short-time rainfall approaching forecasting method of double-current convolution long-short term memory network
CN113610329B (en) * 2021-10-08 2022-01-04 南京信息工程大学 Short-time rainfall approaching forecasting method of double-current convolution long-short term memory network

Also Published As

Publication number Publication date
CN112767280B (en) 2022-06-14

Similar Documents

Publication Publication Date Title
CN109543502B (en) Semantic segmentation method based on deep multi-scale neural network
CN108717569B (en) Expansion full-convolution neural network device and construction method thereof
CN111462013B (en) Single-image rain removing method based on structured residual learning
CN112767280B (en) Single image raindrop removing method based on loop iteration mechanism
CN108648159B (en) Image rain removing method and system
CN111915530A (en) End-to-end-based haze concentration self-adaptive neural network image defogging method
CN112884073B (en) Image rain removing method, system, terminal and storage medium
CN112419191B (en) Image motion blur removing method based on convolution neural network
CN111062329B (en) Unsupervised pedestrian re-identification method based on augmented network
CN113052775B (en) Image shadow removing method and device
CN110838095B (en) Single image rain removing method and system based on cyclic dense neural network
CN109544475A (en) Bi-Level optimization method for image deblurring
CN111414860A (en) Real-time portrait tracking and segmenting method
CN114723630A (en) Image deblurring method and system based on cavity double-residual multi-scale depth network
CN111815526B (en) Rain image rainstrip removing method and system based on image filtering and CNN
CN116205821A (en) Single-image rain removing method based on vertical stripe characteristic extraction cross convolution
CN114862711B (en) Low-illumination image enhancement and denoising method based on dual complementary prior constraints
CN113627368B (en) Video behavior recognition method based on deep learning
CN115239602A (en) License plate image deblurring method based on cavity convolution expansion receptive field
CN115205148A (en) Image deblurring method based on double-path residual error network
CN114943655A (en) Image restoration system for generating confrontation network structure based on cyclic depth convolution
Jia et al. Single-image snow removal based on an attention mechanism and a generative adversarial network
CN113870145A (en) Image defogging method based on deep convolutional neural network under Bayes framework
CN113658074B (en) Single image raindrop removing method based on LAB color space multi-scale fusion network
CN110415190B (en) Method, device and processor for removing image compression noise based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant