CN113673590A - Rain removing method, system and medium based on multi-scale hourglass dense connection network

Info

Publication number: CN113673590A (application CN202110929946.6A; granted as CN113673590B)
Authority: CN (China)
Legal status: Granted; Active
Prior art keywords: rain, scale, hourglass, network, image
Inventors: 吴梦华, 罗玉, 凌捷
Assignee: Guangdong University of Technology (original assignee; application filed by Guangdong University of Technology)
Other languages: Chinese (zh)


Classifications

    • G06F18/214 Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/253 Pattern recognition: fusion techniques of extracted features
    • G06N3/045 Neural networks: combinations of networks
    • G06N3/048 Neural networks: activation functions
    • G06T3/4038 Scaling the whole image or part thereof for image mosaicing
    • G06T7/40 Image analysis: analysis of texture
    • G06T2200/32 Indexing scheme involving image mosaicing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a rain removing method based on a multi-scale hourglass dense connection network. First, a multi-scale hourglass parallel structure samples the extracted input feature maps multiple times to obtain multi-scale rain streak mapping features. Second, a parallel spatial and channel dual-attention module is added to the multi-scale hourglass structure, and residual skip connections fuse multi-level features to adaptively recalibrate the identification and extraction of rain streak layer feature information. The invention densely connects the multi-scale hourglass attention residual modules so that features extracted at different network depths are densely forwarded and fused with one another. Finally, a globally rich rain streak layer feature map is obtained through multi-level feature aggregation, and a clear rain-free background layer image is recovered using the linear superposition model of the rain image. The method combines multi-scale feature fusion, a dual-attention mechanism and hierarchical feature aggregation, improves the accuracy of the model's feature learning, effectively extracts and removes rain streak layer features, and produces clear derained images.

Description

Rain removing method, system and medium based on multi-scale hourglass dense connection network
Technical Field
The invention relates to the technical field of computer vision image processing, in particular to a rain removing method, system and medium based on a multi-scale hourglass dense connection network.
Background
Rain streaks in images captured in rainy scenes often occlude or blur key background content and severely degrade image quality, which in turn hinders downstream computer vision systems and limits practical applications. Image deraining is therefore an important pre-processing step for outdoor computer vision systems and has attracted wide attention from researchers. An effective deraining technique can recover a blurred rainy image into a clear, high-quality rain-free image, providing more accurate detection or recognition results and substantially improving the performance of computer vision systems; it has a wide range of applications and broad prospects.
In recent years, many single-image deraining algorithms have been proposed. They fall mainly into two categories. The first is model-based methods, which can be further divided into prior-based and filter-based methods. Prior-based methods include discriminative sparse coding, Gaussian mixture models and low-rank representations. Filter-based methods use edge and physical-property filters to obtain a rain-free, clear background image. The second category is data-driven deep neural network methods. Unlike model-based methods, data-driven methods cast image deraining as learning a nonlinear function and finding suitable parameters that separate the rain streak component from the background scene. Driven by deep learning, researchers have constructed such mapping functions with convolutional neural networks (CNNs) or generative adversarial networks (GANs). CNN methods learn a deterministic mapping from a rainy image to a clear background, while GANs generate a rain-free image from the input rainy image and synthesize visually appealing clear images. Fu et al. proposed a deraining method named DerainNet, which feeds the high-frequency part of the image into a convolutional neural network for training, but it causes a loss of image color, and traces of rain streaks remain in background regions. Zhang et al. proposed a single-image deraining algorithm based on a conditional generative adversarial network (CGAN), but the visual effect of the restoration is unnatural. RESCAN removes rain iteratively in multiple stages, feeding the output of each stage back as the input of the next, but the repeated conversions between images and features reduce its deraining effect.
The analysis above shows that existing deraining algorithms suffer from residual rain streaks or excessive rain removal, unnatural recovery of detail textures, and similar problems, which greatly limit the practical application of single-image deraining. To address these problems, the invention provides a single-image deraining algorithm based on a multi-scale hourglass dense connection network. A network called MHAR-Net is built by densely connecting multiple layers of attention residual modules at different scales. The multi-scale hourglass structure extracts rain streak feature information at several scales, while parallel channel and spatial dual-attention modules make better use of multi-scale information and feature attention. Residual skip connections inside each multi-scale hourglass dual-attention module unit fuse shallow features and improve the representation of rain features. In addition, the multiple multi-scale hourglass attention modules are densely connected so that deep layers can reuse shallow features, increasing information flow and aggregating multi-level features. This ensures that more image details and key information are extracted quickly, enables deraining in complex rainy conditions, restores image details and spatial information as far as possible, and reduces problems such as color distortion, brightness loss and excessive rain removal.
Disclosure of Invention
The invention aims to overcome the defects of existing image deraining technology, and provides a single image rain removing method based on end-to-end mapping of a multi-scale hourglass dense connection network.
The invention provides a rain removing method based on a multi-scale hourglass dense connection network, which comprises the following steps:
acquiring a synthesized image rain removal sample data set to obtain a data set for training a network model;
constructing a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network, wherein the network model comprises three convolution layers with convolution kernel size of 3 and four multi-scale hourglass attention residual error module units;
preprocessing the data set to obtain a training data set, training a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network by using the training data set based on a PyTorch deep learning framework, and obtaining a trained single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network;
inputting the rain image into the trained single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network to obtain a rain-removed image.
In the scheme, the method for constructing the single-image rain-removing network model of the end-to-end mapping of the multi-scale hourglass dense connection network specifically comprises the following steps:
the conversion from the image space to the feature space is realized by forming an input layer from a convolution layer with kernel size 3 and a LeakyReLU activation function, used to extract shallow rain image features. The process is expressed as:

F0 = σ(Conv3×3(Orain)),

where Orain denotes the rain image input to the network; Conv3×3(·) denotes a convolution operation with a 3×3 kernel, used to transform the image into the feature space; F0 denotes the shallow feature map extracted after the convolution and nonlinear operations; and σ(·) is the nonlinear activation function LeakyReLU;
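As a minimal sketch of this input layer in PyTorch (the output channel width of 64 and the LeakyReLU slope are assumptions; the patent does not state them):

```python
import torch
import torch.nn as nn

# Input layer sketch: a 3x3 convolution followed by LeakyReLU, mapping a
# 3-channel rain image O_rain to the shallow feature map F0. Channel count
# (64) and negative slope (0.2) are illustrative assumptions.
input_layer = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1),
    nn.LeakyReLU(negative_slope=0.2),
)

o_rain = torch.randn(1, 3, 64, 64)  # dummy rain image, (N, C, H, W)
f0 = input_layer(o_rain)            # padding=1 preserves the spatial size
```

With padding 1, the 3×3 convolution keeps the spatial resolution of the input unchanged, so F0 has the same height and width as Orain.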
further collecting rain streak features at different scales from the extracted shallow feature F0 by a multi-scale hourglass structure, which consists of down-sampling layers, up-sampling layers, pooling layers, convolution layers and deconvolution layers. Its mathematical expression is:

Hi = Fu(Fd(X; ηd); ηu),

where X denotes the data input to the network; Fu(X; ηu) denotes the up-sampling process performed on the data and Fd(X; ηd) the down-sampling process performed on the input data; i and j denote the numbers of down- and up-sampling layers; ηd and ηu denote the network parameters of down-sampling and up-sampling, respectively; and Hi denotes the feature information obtained after hourglass network processing;
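A sketch of one hourglass branch under assumed dimensions: the same number of down-sampling steps Fd and up-sampling steps Fu, so that the output resolution matches the input, as the text requires.

```python
import torch
import torch.nn as nn

# One hourglass branch H_i: `depth` down-sampling steps (conv + pooling)
# followed by `depth` up-sampling steps (transposed conv). Channel count,
# depth and the specific layer choices are assumptions for illustration.
class HourglassBranch(nn.Module):
    def __init__(self, channels=64, depth=2):
        super().__init__()
        self.down = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                          nn.MaxPool2d(2))
            for _ in range(depth)])
        self.up = nn.ModuleList([
            nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2)
            for _ in range(depth)])

    def forward(self, x):
        for d in self.down:   # F_d: halve the spatial size each step
            x = d(x)
        for u in self.up:     # F_u: double the spatial size each step
            x = u(x)
        return x

x = torch.randn(1, 64, 32, 32)
h_i = HourglassBranch()(x)   # equal down/up counts restore 32x32
```

Because the down- and up-sampling counts match, the branch output has the same spatial size as its input, which is what lets the network return a rain-free image of the original resolution.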
extracting and transmitting rain streak features of different scales with the multi-scale hourglass residual structure, and introducing a parallel spatial and channel dual-attention module to recalibrate spatial context feature dependencies while weighting important channel feature information, thereby suppressing useless channel information;
the channel attention module extracts and processes the channel feature information of the feature map to obtain the correlations and relative importance among channels. The channel attention map is given by:

Mapc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W2(W1(Favg)) + W2(W1(Fmax))),

where AvgPool denotes an average pooling operation, MaxPool denotes a maximum pooling operation, and MLP denotes the multi-layer perceptron forming the shared network that generates the channel attention map Mapc; F is the feature information input to the network; Favg and Fmax respectively denote the feature maps after the average pooling and maximum pooling operations in the channel attention module; W1 and W2 represent network weight parameters; and σ represents the sigmoid nonlinear activation function;
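A CBAM-style sketch of the channel attention map (an assumption consistent with the formula above; the reduction ratio of 16 is illustrative):

```python
import torch
import torch.nn as nn

# Channel attention sketch: a shared two-layer MLP (W1, W2) applied to the
# average- and max-pooled channel descriptors, summed and passed through a
# sigmoid, giving one weight per channel.
class ChannelAttention(nn.Module):
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # W1
            nn.ReLU(),
            nn.Linear(channels // reduction, channels))  # W2

    def forward(self, f):
        n, c, _, _ = f.shape
        avg = self.mlp(f.mean(dim=(2, 3)))   # MLP(AvgPool(F))
        mx = self.mlp(f.amax(dim=(2, 3)))    # MLP(MaxPool(F))
        return torch.sigmoid(avg + mx).view(n, c, 1, 1)

f = torch.randn(2, 64, 16, 16)
map_c = ChannelAttention()(f)   # one weight in (0, 1) per channel
```

The resulting (N, C, 1, 1) map broadcasts over the spatial dimensions when multiplied with the input feature, reweighting each channel.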
the spatial attention module extracts the distribution of spatial context information in the attention image. The spatial attention map is expressed as:

Maps(F) = σ(f7×7([Favg; Fmax])),

where the resulting map F ∈ R1×H×W is the spatial feature distribution, F(i) is the feature at the corresponding point, and C represents the corresponding number of channels; f7×7 denotes a convolution operation with a 7×7 filter; Favg and Fmax denote the feature maps obtained by the average pooling and maximum pooling operations in the spatial attention module, each of size 1×H×W; and σ represents the sigmoid function;
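A sketch of the spatial attention map matching the formula above (layer choices are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Spatial attention sketch: the channel-wise average and maximum maps
# (each 1 x H x W) are concatenated and passed through a 7x7 convolution
# and a sigmoid, yielding one attention weight per spatial position.
class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)  # f^(7x7)

    def forward(self, f):
        avg = f.mean(dim=1, keepdim=True)    # average over channels
        mx = f.amax(dim=1, keepdim=True)     # maximum over channels
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

f = torch.randn(2, 64, 16, 16)
map_s = SpatialAttention()(f)   # shape (2, 1, 16, 16)
```

Padding 3 keeps the 7×7 convolution from shrinking the map, so the attention weights align pixel-for-pixel with the input feature.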
the feature maps obtained by the channel attention module and the spatial attention module are concatenated and fused, which can be further expressed as:

Fout = Concat([Mapc(F), Maps(F)]),

where Concat is the concatenation operation; Mapc(F) and Maps(F) respectively denote the feature information produced by the channel attention module and the spatial attention module; and Fout denotes the output feature information obtained after processing by the whole multi-scale hourglass residual attention module unit;
the multi-scale hourglass structure and the dual-attention modules are connected by residual skip connections to form a multi-scale hourglass attention residual (MHAR) module unit. The residual network passes features extracted in shallow layers across layers to deeper network layers. The MHAR module is expressed as:

Fi = FMHARi([L0, ..., Ld], [W0, ..., Wd]),

where [L0, ..., Ld] are the inputs of the corresponding convolution layers of the network, [W0, ..., Wd] are the parameters of each layer of the MHAR module, and FMHARi denotes the mapping of the i-th MHAR module;
adding a dense connection mechanism into a single-image rain-removing network model of end-to-end mapping of a multi-scale hourglass dense connection network, wherein the specific implementation is as follows:
F1 = FMHAR1(F0) + F0,
F2 = FMHAR2(F1) + F1 + F0,
F3 = FMHAR3(F2) + F2 + F1 + F0,
F4 = FMHAR4(F3) + F3 + F2 + F1 + F0,
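The dense forward pattern above can be sketched as follows; each MHAR module is stubbed by a single 3×3 convolution (an assumption), since the point here is only the accumulating skip sums:

```python
import torch
import torch.nn as nn

# Dense connection sketch: F_k = MHAR_k(F_{k-1}) + F_{k-1} + ... + F_0.
# The accumulating sums require every module to keep the same channel
# count (64 here, an illustrative assumption).
channels = 64
mhar_stubs = nn.ModuleList([
    nn.Conv2d(channels, channels, 3, padding=1) for _ in range(4)])

f0 = torch.randn(1, channels, 32, 32)
features = [f0]              # F0, F1, ... accumulated for the dense sums
x = f0
for block in mhar_stubs:
    x = block(x) + sum(features)   # e.g. F2 = MHAR2(F1) + F1 + F0
    features.append(x)
f4 = x                        # input to the final fusion convolutions
```

Because every earlier output is re-added at each stage, shallow features reach the deepest layers directly, which is the information-flow benefit the text describes.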
convolution layers Conv1×1 and Conv3×3 with kernel sizes 1×1 and 3×3 and a LeakyReLU activation function fuse the aggregated depth features, expressed as:

R = σ(Conv3×3(σ(Conv1×1(FMHAR4(F3) + F3 + F2 + F1 + F0)))),

where σ is the nonlinear activation function LeakyReLU and R is the rain streak layer obtained by extraction and separation.
In the scheme, training a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network by using the training data set based on a PyTorch deep learning framework specifically comprises:
inputting training data sets into the constructed single image rain-removing network model of end-to-end mapping of the multi-scale hourglass dense connection network in batches for training, and finally obtaining the pre-trained single image rain-removing network model of end-to-end mapping of the multi-scale hourglass dense connection network;
B=O-R,
where O is the rain image input to the network, R is the rain streak layer image extracted and separated by the network, and B is the rain-free image produced by the deraining network.
In the scheme, training a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network by using the training data set based on a PyTorch deep learning framework specifically uses a reconstruction loss, an SSIM loss and a perceptual loss, as follows:

the reconstruction loss is constructed based on the mean square error, defined as:

LMSE = (1/(C×H×W)) Σi ‖G(Ri; P) − Bi‖F²,

where LMSE represents the mean square error; H and W represent the size, i.e. height and width, of the image and C is the number of channels; G(Ri; P) represents the i-th rain-free image output by the deraining model; P = {w1, w2, ..., wn; b1, b2, ..., bn} is the set of network training parameters, where w1, ..., wn are weight parameters and b1, ..., bn are bias vectors; Bi represents the corresponding i-th real rain-free image; and ‖·‖F is the Frobenius norm;
the SSIM loss is calculated from the structural similarity of the images, given by:

SSIM(x, y) = ((2μxμy + C1)(2σxy + C2)) / ((μx² + μy² + C1)(σx² + σy² + C2)),

where μx and μy represent the image means, σx² and σy² the image variances, and σxy the covariance of the two images; C1 and C2 are empirical constants of the equation. The SSIM loss function is defined as:

LSSIM = −SSIM(G(R; P), B),

i.e. LSSIM is the negative of the structural similarity;
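A simplified sketch of the SSIM formula above, evaluated globally over the whole image (real implementations typically use a sliding Gaussian window); C1 and C2 use the common constants for images in [0, 1]:

```python
import numpy as np

# Global SSIM between two images, following the formula term by term.
def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()          # sigma_x^2, sigma_y^2
    cov = ((x - mu_x) * (y - mu_y)).mean()   # sigma_xy
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

rng = np.random.default_rng(0)
img = rng.random((32, 32))
s_same = ssim(img, img)                      # identical images give SSIM = 1
l_ssim = -ssim(img, rng.random((32, 32)))    # L_SSIM = -SSIM(G(R;P), B)
```

For identical images the covariance equals the variance and the ratio collapses to exactly 1, so the loss reaches its minimum of −1 only on a perfect reconstruction.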
the perceptual loss is defined as:

LP = (1/(C×H×W)) ‖V(G(R; P)) − V(B)‖²,

where V represents a nonlinear neural network transformation; the goal is to minimize the distance between high-level features;
the final mixed error loss function can be expressed as:

Lloss = λ1·LMSE + λ2·LSSIM + λ3·LP,

where λ1, λ2 and λ3 are the parameters weighting the three components: the reconstruction loss, the SSIM loss and the perceptual loss.
In this scheme, λ1, λ2 and λ3 are set to 1, 0.2 and 0.04, respectively.
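With these weights, the mixed loss can be sketched as follows. The SSIM term is the simplified global SSIM and the perceptual term stubs the network transformation V with the identity map; both stand-ins are assumptions for illustration:

```python
import numpy as np

# Mixed loss sketch: L = lam1*L_MSE + lam2*L_SSIM + lam3*L_P with the
# stated weights (1, 0.2, 0.04).
def l_mse(pred, target):
    return float(np.mean((pred - target) ** 2))

def l_ssim(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    mx, my = pred.mean(), target.mean()
    cov = ((pred - mx) * (target - my)).mean()
    s = ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (pred.var() + target.var() + c2))
    return float(-s)                   # negative structural similarity

def l_perceptual(pred, target):
    v = lambda im: im                  # stand-in for the network V(.)
    return float(np.mean((v(pred) - v(target)) ** 2))

lam1, lam2, lam3 = 1.0, 0.2, 0.04
rng = np.random.default_rng(1)
pred, target = rng.random((16, 16)), rng.random((16, 16))
total = lam1 * l_mse(pred, target) + lam2 * l_ssim(pred, target) \
        + lam3 * l_perceptual(pred, target)
# A perfect prediction attains the minimum: 1*0 + 0.2*(-1) + 0.04*0 = -0.2
total_perfect = lam1 * l_mse(target, target) \
    + lam2 * l_ssim(target, target) + lam3 * l_perceptual(target, target)
```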
In a second aspect, the invention provides a rain removing system based on a multi-scale hourglass dense connection network, comprising: a memory and a processor, wherein the memory includes a program of a rain removing method based on a multi-scale hourglass dense connection network, and the program of the rain removing method based on the multi-scale hourglass dense connection network realizes the following steps when being executed by the processor:
acquiring a synthesized image rain removal sample data set to obtain a data set for training a network model;
constructing a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network, wherein the network model comprises three convolution layers with convolution kernel size of 3 and four multi-scale hourglass attention residual error module units;
preprocessing the data set to obtain a training data set, training a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network by using the training data set based on a PyTorch deep learning framework, and obtaining a trained single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network;
inputting the rain image into the trained single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network to obtain a rain-removed image.
In the scheme, training a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network by using the training data set based on a PyTorch deep learning framework specifically comprises:
inputting training data sets into the constructed single image rain-removing network model of end-to-end mapping of the multi-scale hourglass dense connection network in batches for training, and finally obtaining the pre-trained single image rain-removing network model of end-to-end mapping of the multi-scale hourglass dense connection network;
B=O-R,
where O is the rain image input to the network, R is the rain streak layer image extracted and separated by the network, and B is the rain-free image produced by the deraining network.
In the scheme, training a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network by using the training data set based on a PyTorch deep learning framework specifically uses a reconstruction loss, an SSIM loss and a perceptual loss, as follows:

the reconstruction loss is constructed based on the mean square error, defined as:

LMSE = (1/(C×H×W)) Σi ‖G(Ri; P) − Bi‖F²,

where LMSE represents the mean square error; H and W represent the size, i.e. height and width, of the image and C is the number of channels; G(Ri; P) represents the i-th rain-free image output by the deraining model; P = {w1, w2, ..., wn; b1, b2, ..., bn} is the set of network training parameters, where w1, ..., wn are weight parameters and b1, ..., bn are bias vectors; Bi represents the corresponding i-th real rain-free image; and ‖·‖F is the Frobenius norm;
the SSIM loss is calculated from the structural similarity of the images, given by:

SSIM(x, y) = ((2μxμy + C1)(2σxy + C2)) / ((μx² + μy² + C1)(σx² + σy² + C2)),

where μx and μy represent the image means, σx² and σy² the image variances, and σxy the covariance of the two images; C1 and C2 are empirical constants of the equation. The SSIM loss function is defined as:

LSSIM = −SSIM(G(R; P), B),

i.e. LSSIM is the negative of the structural similarity;
the perceptual loss is defined as:

LP = (1/(C×H×W)) ‖V(G(R; P)) − V(B)‖²,

where V represents a nonlinear neural network transformation; the goal is to minimize the distance between high-level features;
the final mixed error loss function can be expressed as:

Lloss = λ1·LMSE + λ2·LSSIM + λ3·LP,

where λ1, λ2 and λ3 are the parameters weighting the three components: the reconstruction loss, the SSIM loss and the perceptual loss.
In this scheme, λ1, λ2 and λ3 are set to 1, 0.2 and 0.04, respectively.
The third aspect of the invention provides a readable storage medium, wherein the readable storage medium comprises a rain removing method program based on a multi-scale hourglass dense connection network, and when the rain removing method program based on the multi-scale hourglass dense connection network is executed by a processor, the steps of the rain removing method based on the multi-scale hourglass dense connection network are realized.
The invention provides a single image rain removing method based on a multi-scale hourglass dense connection network. First, a multi-scale hourglass parallel structure samples the extracted input feature maps multiple times, extracting feature maps at different scales to obtain multi-scale rain streak mapping features. Second, parallel spatial and channel dual-attention modules are added to the multi-scale hourglass structure, and residual skip connections fuse multi-level features to form a complete multi-scale hourglass attention residual module unit (MHAR module) that adaptively recalibrates the identification and extraction of rain streak layer feature information. In addition, the invention densely connects the multi-scale hourglass attention residual modules so that features extracted at different network depths are densely forwarded and fused with one another. Finally, a globally rich rain streak layer feature map is obtained through multi-level feature aggregation, and a clear rain-free background layer image is recovered using the linear superposition model of the rain image. The method combines multi-scale feature fusion, a dual-attention mechanism and hierarchical feature aggregation, improves the accuracy of the model's feature learning, effectively extracts and removes rain streak layer features, and produces clear derained images.
Drawings
FIG. 1 is a single image rain removal network model framework diagram based on end-to-end mapping of a multi-scale hourglass dense connection network of the present invention;
FIG. 2 is a structural diagram of a multi-scale hourglass attention residual module unit MHAR of the rain removing method based on the multi-scale hourglass dense connection network of the invention;
FIG. 3 is a multi-scale hourglass structure diagram of the rain removal method based on the multi-scale hourglass dense connection network of the present invention;
FIG. 4 is a detail view of a channel attention structure CAM of the rain removing method based on the multi-scale hourglass dense connection network of the invention;
FIG. 5 is a detail view of a spatial attention structure SAM of the rain removing method based on the multi-scale hourglass dense connection network of the invention;
FIG. 6 is a flow chart of the rain removal method based on the multi-scale hourglass dense connection network of the present invention;
fig. 7 is a frame diagram of a rain removal system based on a multi-scale hourglass dense connection network of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a novel single-image deraining method based on a multi-scale hourglass dense connection network. To acquire and exploit shallow features more accurately during network training, together with the correlations among features and their more effective reuse and transfer, the method proposes a multi-scale hourglass residual attention module; these modules are densely connected with one another to form the overall single-image deraining network architecture.
As shown in fig. 6, the invention discloses a rain removing method based on a multi-scale hourglass dense connection network, which comprises the following steps:
s102, acquiring a synthesized image rain removal sample data set to obtain a data set for training a network model;
s104, constructing a single-image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network, wherein the network model comprises three convolution layers with convolution kernel size of 3 and four multi-scale hourglass attention residual error module units as shown in FIG. 1, and the multi-scale hourglass attention residual error module units are shown in FIG. 2;
S106, preprocessing the data set to obtain a training data set, training a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network by using the training data set based on a PyTorch deep learning framework, and obtaining a trained single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network;
and S108, inputting the rain image into the trained single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network to obtain a rain-removed image.
First, a convolution layer Conv3×3 with a 3×3 kernel and a LeakyReLU activation function form the input layer, which converts the rain image from the image space to the feature space and extracts shallow features of the rain image. The process can be expressed as:

F0 = σ(Conv3×3(Orain)),

where Orain denotes the rain image input to the network; Conv3×3(·) denotes the convolution layer with a 3×3 kernel, used to transform the image into the feature space and extract the shallow feature information F0; and σ is the nonlinear activation function LeakyReLU.
The multi-scale hourglass structure extracts feature maps of different scales from the input feature image, as shown in figure 3. Specifically, the several parallel branches of the hourglass network each sample the feature map multiple times to obtain multi-scale mapping features. Features of different network layers are fused without being constrained by the shapes and sizes of rain streaks, so the feature information extracted by the network can be abstracted quickly and efficiently and the model can more easily learn the differences among scales. The mathematical expression of the multi-scale hourglass network module structure can be written as:

Hi = Fu(Fd(X; ηd); ηu),

where X denotes the input data; Fu(X; ηu) denotes the up-sampling process performed on the data and Fd(X; ηd) the down-sampling process performed on the input data; ηd and ηu denote the network parameters of down-sampling and up-sampling, respectively; and Hi denotes the feature information obtained after hourglass network processing. X is the feature information input to the MHAR module of the network; for each MHAR module, the input is the feature information extracted by the preceding network. i and j denote the numbers of down- and up-sampling layers, i.e. how many times down- and up-sampling are performed. The numbers of down- and up-sampling steps are equal, which ensures that the rain image finally input to the network and the rain-free image output by the network have the same size.
The downsampling operation downsamples the original input feature image to obtain feature maps of different scales; each downsampling branch is composed of a convolution layer (Conv) with a kernel size of 3×3, 3×3, or 1×1 respectively, together with Batch Normalization and a ReLU activation function. The downsampling operation may be expressed as:

F_i = Pooling_i(Conv_{i×i}(ReLU(BatchNorm(F_0)))),

where i is the convolution kernel size, F_0 denotes the shallow feature data input to the hourglass structure, and F_i is the feature image obtained by convolving with an i×i kernel and downsampling with Pooling_i.
The upsampling operation (Upscale) upsamples the size-reduced deep feature maps produced by downsampling to recover rain image features at the original size, which can be expressed as:

\tilde{F}_i = Upscale_i(Conv_{i×i}(F_i)),

where Upscale_i(·) denotes the upsampling operation in the multi-scale hourglass structure, and \tilde{F}_i is the feature image obtained by convolving F_i with an i×i kernel and upsampling.
Finally, the upsampled feature maps \tilde{F}_i are input to the fusion module:

F_out = H(Concat(\tilde{F}_1, ..., \tilde{F}_n)),

where F_out denotes the fused output feature map of the multi-scale hourglass structure, \tilde{F}_i denotes the feature map obtained by convolution with an i×i kernel followed by upsampling, and H(·) denotes a series of operations consisting of two sets of 3×3 convolutions and one set of 1×1 convolutions.
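The downsample/upsample/fuse pattern above can be sketched as follows. This is a hedged, minimal reading of the description: each parallel branch downsamples (BN + ReLU + conv + pooling) and then upsamples back to the input size, and the branch outputs are concatenated and fused; the channel count, pooling type, and the exact arrangement of the 3×3/3×3/1×1 fusion convolutions are assumptions.

```python
import torch
import torch.nn as nn

class HourglassBranch(nn.Module):
    """One parallel hourglass branch (illustrative): downsample the input
    features, then upsample back so input/output sizes match, as the text
    requires equal numbers of down- and up-sampling steps."""
    def __init__(self, ch, k):
        super().__init__()
        pad = k // 2
        self.down = nn.Sequential(              # F_d: BN + ReLU + conv + pooling
            nn.BatchNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, k, padding=pad),
            nn.AvgPool2d(2),
        )
        self.up = nn.Sequential(                # F_u: conv + Upscale
            nn.Conv2d(ch, ch, k, padding=pad),
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
        )

    def forward(self, x):
        return self.up(self.down(x))

# Fusion H(.): concatenate branch outputs, then two 3x3 and one 1x1 conv
# (this arrangement is an assumption based on the description of H).
branches = nn.ModuleList(HourglassBranch(64, k) for k in (3, 3, 1))
fuse = nn.Sequential(
    nn.Conv2d(64 * 3, 64, 3, padding=1),
    nn.Conv2d(64, 64, 3, padding=1),
    nn.Conv2d(64, 64, 1),
)

f0 = torch.randn(1, 64, 32, 32)
f_out = fuse(torch.cat([b(f0) for b in branches], dim=1))
```

Because every branch downsamples and upsamples the same number of times, F_out has the same spatial size as the input features, matching the requirement that the output rain-free image keep the input size.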
A dual attention mechanism is introduced into the multi-scale hourglass structure: the extracted multi-scale rain streak features are passed through the dual attention mechanism, and attention modules are introduced at both the spatial and channel levels to discard a large amount of unimportant feature information, thereby enhancing and extracting the more important and useful hierarchical features and reducing the computational cost of network model training.
The multi-scale rain streak features learn a weight for each feature channel through a channel attention module, where the weight represents the importance of a feature map; this enables the network to filter out useless feature information and focus on useful information. The channel attention module is shown in FIG. 4. For an extracted feature map, the maximum and average pixel values on each channel are first taken to represent that channel; they are then used as the input of a two-layer multilayer perceptron network that adaptively adjusts the weights of different channels; feature fusion is then performed by addition; and finally a sigmoid function normalizes all values between 0 and 1, yielding the channel attention map:

Map_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W_2(W_1(F^c_avg)) + W_2(W_1(F^c_max))),

where MLP denotes the multilayer perceptron forming the shared network that generates the channel attention map Map_c; F is the feature information input to the network; F^c_avg and F^c_max denote the feature maps after the average pooling and maximum pooling operations in the channel attention module, respectively; W_1 and W_2 denote network weight parameters; and σ denotes the sigmoid nonlinear activation function. Pooling removes useless information and retains key information. The pooling layer is a downsampling layer mainly used for feature dimensionality reduction, data compression, and parameter reduction; it reduces overfitting while improving the fault tolerance of the model. AvgPool refers to the average pooling operation, which averages the feature points in a neighborhood of the feature map. MaxPool refers to the maximum pooling operation, which takes the maximum of the feature points in a neighborhood of the feature map.
The multi-scale rain streak features are processed by a spatial attention module, shown in FIG. 5, which focuses the network on locations where rain is present according to the differing degrees of correlation of the spatial pixel values of each feature map, helping the network identify and extract the rain streaks in the image. In the spatial attention module, global average and maximum pooling capture the dependencies between the spatial context information of different features, and the relationships between hierarchical features are mined using this context information. The spatial attention feature map is:

Map_s(F) = σ(f^{7×7}(Concat(F^s_avg, F^s_max))),

where F ∈ R^{1×H×W} is the spatial feature distribution, F(i) is the feature of the corresponding point, and C denotes the corresponding number of channels; f^{7×7} denotes a convolution operation with a filter size of 7×7; F^s_avg and F^s_max denote the feature maps obtained by the average pooling and maximum pooling operations in spatial attention, each of size 1×H×W; and σ denotes the sigmoid function.
The feature maps obtained by the channel attention module and the spatial attention module are further concatenated to obtain the more important information; this process can be expressed as:

F_out = Concat([Map_c(F), Map_s(F)]),

where Concat is the concatenation operation, Map_c(F) and Map_s(F) denote the feature information obtained after processing by the channel attention module and the spatial attention module, respectively, and F_out denotes the output feature information obtained after processing by the entire multi-scale hourglass attention residual module unit.
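A minimal sketch of the dual attention mechanism, in the spirit of the description above: channel attention pools each channel to its average and maximum, runs both through a shared two-layer MLP, adds, and applies sigmoid; spatial attention concatenates channel-wise average and maximum maps and applies a 7×7 convolution with sigmoid. The reduction ratio and the way the two attended maps are combined are assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Map_c: shared two-layer MLP over avg- and max-pooled channel
    descriptors, fused by addition, then sigmoid."""
    def __init__(self, ch, reduction=8):     # reduction=8 is an assumption
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(),
            nn.Conv2d(ch // reduction, ch, 1),
        )

    def forward(self, f):
        avg = self.mlp(torch.mean(f, dim=(2, 3), keepdim=True))  # F^c_avg branch
        mx = self.mlp(torch.amax(f, dim=(2, 3), keepdim=True))   # F^c_max branch
        return torch.sigmoid(avg + mx)                           # (N, C, 1, 1)

class SpatialAttention(nn.Module):
    """Map_s: channel-wise avg and max maps, concatenated, 7x7 conv, sigmoid."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, 7, padding=3)                # f^{7x7}

    def forward(self, f):
        avg = torch.mean(f, dim=1, keepdim=True)                 # F^s_avg
        mx = torch.amax(f, dim=1, keepdim=True)                  # F^s_max
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

f = torch.randn(1, 64, 32, 32)
ca, sa = ChannelAttention(64), SpatialAttention()
# Concatenate the two attended feature maps, doubling the channel count.
f_out = torch.cat([ca(f) * f, sa(f) * f], dim=1)
```

The attention maps broadcast over the feature tensor, so each branch keeps the input resolution and the concatenation yields 2C channels.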
Meanwhile, in order to better acquire local shallow information, residual dense connection branches are added to the multi-scale hourglass structure: the original input image undergoes feature expansion with a 1×1 convolution kernel, the enlarged feature map after deconvolution is then fused, and the multi-scale, multi-level features are fused by combining global and local information. The residual connections preserve the forward-propagated features, i.e. the result of each layer is obtained on the basis of the previous result; this connection pattern not only prevents the network gradient from vanishing but also aids network convergence, effectively fusing features from different sources into a comprehensive feature map. The multi-scale hourglass residual structure and the dual attention module together form the complete multi-scale hourglass attention residual module unit, i.e. the MHAR module, whose implementation can be expressed as:

F_i = F_{MHAR_i}([L_0, ..., L_d], [W_0, ..., W_d]),

where [L_0, ..., L_d] are the inputs of the corresponding convolution layers of the network, [W_0, ..., W_d] are the parameters of each layer of the MHAR module, and F_{MHAR_i} denotes the mapping process of the i-th MHAR module.
A dense connection mechanism is added to the single-image rain removal network model with end-to-end mapping of the multi-scale hourglass dense connection network: four MHAR modules are stacked and densely connected to form a deeper network that extracts higher-level abstract features and aggregates multi-level rain streak features. The implementation can be expressed as:

F_1 = MHAR_1(F_0) + F_0,
F_2 = MHAR_2(F_1) + F_1 + F_0,
F_3 = MHAR_3(F_2) + F_2 + F_1 + F_0,
F_4 = MHAR_4(F_3) + F_3 + F_2 + F_1 + F_0.

Because the number of layers of the deep convolutional network increases, rain streak features are lost through repeated multi-layer up/down sampling and transmission; the multi-scale hourglass attention residual modules are therefore densely connected. In addition, the shallow features extracted by the initial convolution layer are fused with the features processed by each MHAR module, so that the deep features can exploit the shallow ones. The features extracted by each hierarchical network are fully used to realize feature reuse, and the flow of information is increased to prevent the loss of key information.
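The dense wiring F_k = MHAR_k(F_{k-1}) + F_{k-1} + ... + F_0 can be sketched as below. The MHAR module itself is replaced by a single-convolution stand-in, since only the connection pattern is being illustrated.

```python
import torch
import torch.nn as nn

class MHARStub(nn.Module):
    """Stand-in for one multi-scale hourglass attention residual (MHAR)
    module; a single conv here, only to illustrate the dense wiring."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x):
        return self.body(x)

def dense_forward(f0, modules):
    """Dense connections: each stage's output is its module applied to the
    previous stage plus the sum of ALL previously computed features,
    i.e. F_k = MHAR_k(F_{k-1}) + F_{k-1} + ... + F_0."""
    feats = [f0]
    for m in modules:
        feats.append(m(feats[-1]) + sum(feats))
    return feats[-1]   # F_4, aggregated with all shallower features

mhar_blocks = nn.ModuleList(MHARStub(64) for _ in range(4))
f4 = dense_forward(torch.randn(1, 64, 32, 32), mhar_blocks)
```

Every skip term is an identity addition, so all four stages keep the feature resolution of F_0; this is what lets the final 1×1/3×3 decoding layers operate at the input image size.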
Finally, Conv_{1×1} and Conv_{3×3} convolution layers with kernel sizes of 1×1 and 3×3 and LeakyReLU activation functions fuse the aggregated depth features, which are then decoded and restored into a clear rain-free image:

R = σ(Conv_{3×3}(σ(Conv_{1×1}(MHAR_4(F_3) + F_3 + F_2 + F_1 + F_0)))),

where σ is the nonlinear activation function LeakyReLU and R is the rain streak layer obtained by extraction and separation.

In another aspect, a rain removal system based on a multi-scale hourglass dense connection network is disclosed herein. The system comprises a memory 71 and a processor 72, wherein the memory includes a program of the rain removal method based on the multi-scale hourglass dense connection network; when the program is executed by the processor, the following steps are realized:
acquiring a synthesized image rain removal sample data set to obtain a data set for training a network model;
constructing a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network, wherein the network model comprises three convolution layers with convolution kernel size of 3 and four multi-scale hourglass attention residual error module units;
preprocessing the data set to obtain a training data set, and training the single-image rain removal network model with end-to-end mapping of the multi-scale hourglass dense connection network using the training data set based on the PyTorch deep learning framework, to obtain the trained model;
inputting a rainy image into the trained single-image rain removal network model with end-to-end mapping of the multi-scale hourglass dense connection network to obtain a rain-removed image.
First, a convolution layer Conv_{3×3} with a convolution kernel size of 3×3 and a LeakyReLU activation function forms the input layer, which converts the rainy image from image space to feature space and extracts shallow features of the rain image. The process can be expressed as:

F_0 = σ(Conv_{3×3}(O_rain)),

where O_rain denotes the rainy image input to the network; Conv_{3×3}(·) denotes a convolution layer with a 3×3 kernel, used to transform the image into feature space and extract the shallow feature information F_0; and σ is the nonlinear activation function LeakyReLU.

The multi-scale hourglass structure extracts feature maps of different scales from the input feature image. Specifically, the feature maps are sampled multiple times by several parallel branches of the hourglass network to obtain multi-scale mapping features; when features of different network layers are fused, the structure is not constrained by the shape or size of the rain streaks, the feature information extracted by the network can be abstracted quickly and efficiently, and the model can more easily learn the differences between scales. The mathematical expression of the multi-scale hourglass network module structure can be written as:

H_i = F_u^{(j)}(F_d^{(i)}(X; η_d); η_u),

where X denotes the input data; F_u(X; η_u) performs an upsampling process on the data; F_d(X; η_d) performs a downsampling process on the input data; η_d and η_u denote the network parameters of downsampling and upsampling, respectively; and H_i denotes the feature information obtained after hourglass network processing. X is the feature information input to the MHAR module; for different MHAR modules, the input is the feature information extracted by the corresponding preceding network. i and j denote the numbers of downsampling and upsampling layers, i.e. the number of times each is performed; the numbers of downsampling and upsampling operations are equal, ensuring that the rainy image input to the network and the rain-free image output by the network have the same size.

The downsampling operation downsamples the original input feature image to obtain feature maps of different scales; each downsampling branch is composed of a convolution layer with a kernel size of 3×3, 3×3, or 1×1 respectively, together with Batch Normalization and a ReLU activation function. The downsampling operation may be expressed as:

F_i = Pooling_i(Conv_{i×i}(ReLU(BatchNorm(F_0)))),

where i is the convolution kernel size, F_0 denotes the shallow feature data input to the hourglass structure, and F_i is the feature image obtained by convolving with an i×i kernel and downsampling with Pooling_i.
The upsampling operation (Upscale) upsamples the size-reduced deep feature maps produced by downsampling to recover rain image features at the original size, which can be expressed as:

\tilde{F}_i = Upscale_i(Conv_{i×i}(F_i)),

where Upscale_i(·) denotes the upsampling operation in the multi-scale hourglass structure, and \tilde{F}_i is the feature image obtained by convolving F_i with an i×i kernel and upsampling.

Finally, the upsampled feature maps \tilde{F}_i are input to the fusion module:

F_out = H(Concat(\tilde{F}_1, ..., \tilde{F}_n)),

where F_out denotes the fused output feature map of the multi-scale hourglass structure, \tilde{F}_i denotes the feature map obtained by convolution with an i×i kernel followed by upsampling, and H(·) denotes a series of operations consisting of two sets of 3×3 convolutions and one set of 1×1 convolutions.
A dual attention mechanism is introduced into the multi-scale hourglass structure: the extracted multi-scale rain streak features are passed through the dual attention mechanism, and attention modules are introduced at both the spatial and channel levels to discard a large amount of unimportant feature information, thereby enhancing and extracting the more important and useful hierarchical features and reducing the computational cost of network model training.
The multi-scale rain streak features learn a weight for each feature channel through a channel attention module, where the weight represents the importance of a feature map; this enables the network to filter out useless feature information and focus on useful information. The channel attention module first takes the maximum and average pixel values on each channel of the extracted feature map to represent that channel, then uses them as the input of a two-layer multilayer perceptron network that adaptively adjusts the weights of different channels, then performs feature fusion by addition, and finally normalizes all values between 0 and 1 with a sigmoid function, yielding the channel attention map:

Map_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W_2(W_1(F^c_avg)) + W_2(W_1(F^c_max))),

where MLP denotes the multilayer perceptron forming the shared network that generates the channel attention map Map_c; F is the feature information input to the network; F^c_avg and F^c_max denote the feature maps after the average pooling and maximum pooling operations in the channel attention module, respectively; W_1 and W_2 denote network weight parameters; and σ denotes the sigmoid nonlinear activation function. Pooling removes useless information and retains key information. The pooling layer is a downsampling layer mainly used for feature dimensionality reduction, data compression, and parameter reduction; it reduces overfitting while improving the fault tolerance of the model. AvgPool refers to the average pooling operation, which averages the feature points in a neighborhood of the feature map. MaxPool refers to the maximum pooling operation, which takes the maximum of the feature points in a neighborhood of the feature map.
The multi-scale rain streak features are processed by a spatial attention module, which focuses the network on locations where rain is present according to the differing degrees of correlation of the spatial pixel values of each feature map, helping the network identify and extract the rain streaks in the image. In the spatial attention module, global average and maximum pooling capture the dependencies between the spatial context information of different features, and the relationships between hierarchical features are mined using this context information. The spatial attention feature map is:

Map_s(F) = σ(f^{7×7}(Concat(F^s_avg, F^s_max))),

where F ∈ R^{1×H×W} is the spatial feature distribution, F(i) is the feature of the corresponding point, and C denotes the corresponding number of channels; f^{7×7} denotes a convolution operation with a filter size of 7×7; F^s_avg and F^s_max denote the feature maps obtained by the average pooling and maximum pooling operations in spatial attention, each of size 1×H×W; and σ denotes the sigmoid function.
The feature maps obtained by the channel attention module and the spatial attention module are further concatenated to obtain the more important information; this process can be expressed as:

F_out = Concat([Map_c(F), Map_s(F)]),

where Concat is the concatenation operation, Map_c(F) and Map_s(F) denote the feature information obtained after processing by the channel attention module and the spatial attention module, respectively, and F_out denotes the output feature information obtained after processing by the entire multi-scale hourglass attention residual module unit.
Meanwhile, in order to better acquire local shallow information, residual dense connection branches are added to the multi-scale hourglass structure: the original input image undergoes feature expansion with a 1×1 convolution kernel, the enlarged feature map after deconvolution is then fused, and the multi-scale, multi-level features are fused by combining global and local information. The residual connections preserve the forward-propagated features, i.e. the result of each layer is obtained on the basis of the previous result; this connection pattern not only prevents the network gradient from vanishing but also aids network convergence, effectively fusing features from different sources into a comprehensive feature map. The multi-scale hourglass residual structure and the dual attention module together form the complete multi-scale hourglass attention residual module unit, i.e. the MHAR module, whose implementation can be expressed as:

F_i = F_{MHAR_i}([L_0, ..., L_d], [W_0, ..., W_d]),

where [L_0, ..., L_d] are the inputs of the corresponding convolution layers of the network, [W_0, ..., W_d] are the parameters of each layer of the MHAR module, and F_{MHAR_i} denotes the mapping process of the i-th MHAR module.
A dense connection mechanism is added to the single-image rain removal network model with end-to-end mapping of the multi-scale hourglass dense connection network: four MHAR modules are stacked and densely connected to form a deeper network that extracts higher-level abstract features and aggregates multi-level rain streak features. The implementation can be expressed as:

F_1 = MHAR_1(F_0) + F_0,
F_2 = MHAR_2(F_1) + F_1 + F_0,
F_3 = MHAR_3(F_2) + F_2 + F_1 + F_0,
F_4 = MHAR_4(F_3) + F_3 + F_2 + F_1 + F_0.

Because the number of layers of the deep convolutional network increases, rain streak features are lost through repeated multi-layer up/down sampling and transmission; the multi-scale hourglass attention residual modules are therefore densely connected. In addition, the shallow features extracted by the initial convolution layer are fused with the features processed by each MHAR module, so that the deep features can exploit the shallow ones. The features extracted by each hierarchical network are fully used to realize feature reuse, and the flow of information is increased to prevent the loss of key information.
Finally, Conv_{1×1} and Conv_{3×3} convolution layers with kernel sizes of 1×1 and 3×3 and LeakyReLU activation functions fuse the aggregated depth features, which are then decoded and restored into a clear rain-free image:

R = σ(Conv_{3×3}(σ(Conv_{1×1}(MHAR_4(F_3) + F_3 + F_2 + F_1 + F_0)))),

where σ is the nonlinear activation function LeakyReLU and R is the rain streak layer obtained by extraction and separation.
Finally, a trained end-to-end rain removal network model can be obtained, and the rain-free, clear background-layer image output by the model is obtained using the rain map linear superposition model, which can be expressed as:

B = O − R,

where O is the rainy image input to the network, R is the rain streak layer image extracted and separated by the network, and B is the rain-free image generated by the rain removal network.
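The superposition model above is a simple per-pixel subtraction. A minimal sketch, with clamping to the valid intensity range added as a safeguard not stated in the text:

```python
import torch

def derain(o_rain, rain_layer):
    """Rain map linear superposition model: the rain-free background B is
    the rainy input O minus the predicted rain streak layer R, B = O - R.
    Clamping to [0, 1] is an added safeguard, not from the text."""
    return torch.clamp(o_rain - rain_layer, 0.0, 1.0)

o = torch.rand(1, 3, 64, 64)     # rainy image O in [0, 1)
r = torch.zeros_like(o)          # a network-predicted rain layer in practice
b = derain(o, r)                 # with a zero rain layer, B equals O
```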
The above are the main process steps of the method of the present invention, and the loss optimization method employed in the present invention will be described in detail below.
The method of the invention adopts a mixed optimization objective function combining the reconstruction loss, the SSIM loss, and the perceptual loss: while minimizing the image reconstruction loss, it maintains the similarity of each pixel and, in addition, preserves the high similarity of the global image structure to the greatest extent, so that the rain removal result is more accurate. Specifically, it can be defined as:

L_loss = L_MSE + L_SSIM + L_P,

where L_loss is the overall objective function of the method. A fidelity term using the mean square error L_MSE serves as the reconstruction loss to constrain the error between the real rain-free image and the rain-free image output by the network, measuring the accuracy of the network output; L_SSIM is a structural similarity loss function measuring the similarity of two images; and L_P is a perceptual loss function that minimizes the perceptual difference between the model's output rain-removed image and the input real rainy image. λ is a scaling parameter used to adjust the relative weights of the loss components. In experimental verification, a loss-function ablation study showed that this mixed optimization objective performs better than other forms of objective function. The reconstruction loss, SSIM loss, and perceptual loss are defined as follows:
1) Reconstruction loss:

L_MSE = (1/N) Σ_{i=1}^{N} ||G(R_i; P) − B_i||²,

where P is the set of network training parameters, P = {w_1, w_2, ..., w_n; b_1, b_2, ..., b_n}; w_1, ..., w_n are weight parameters and b_1, ..., b_n are bias vectors. The reconstruction loss constrains the error between the real rain-free image and the rain-free image output by the network, measuring the accuracy of the network output.
2) The SSIM loss is defined as follows:

SSIM(x, y) = ((2μ_x μ_y + C_1)(2σ_xy + C_2)) / ((μ_x² + μ_y² + C_1)(σ_x² + σ_y² + C_2)),

L_SSIM = −SSIM(G(R; P), B),

where SSIM(·) is the SSIM function computing the similarity between two images, G(R; P) is the rain-removed image, and B is the real rain-free image. The goal of the SSIM loss is to maximize the SSIM value of the image as much as possible, so L_SSIM is a negative function.
3) Perceptual loss:

L_P = (1 / (C·H·W)) ||V(G(R)) − V(B)||²,

The perceptual loss regularizes the similarity between the high-dimensional features of the images, where G(R) is the rain-removed image obtained by the method and B is the corresponding real rain-free image. V(·) denotes a nonlinear CNN transformation; in this method, the feature loss is computed in a pre-trained VGG16 model.
Therefore, to regularize and constrain the fitting ability of the network model and achieve its optimization, the final mixed error loss function can be expressed as:

L_loss = λ_1 L_MSE + λ_2 L_SSIM + λ_3 L_P,

where λ_1, λ_2, and λ_3 are parameters adjusting the relative weights of the three components: reconstruction loss, SSIM loss, and perceptual loss. Typically λ_1, λ_2, and λ_3 are set to 1, 0.2, and 0.04, respectively.
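A hedged sketch of this mixed objective follows. The SSIM term here is a simplified single-window (global-statistics) SSIM rather than a full windowed implementation, and `feat_extractor` is a stand-in for a pre-trained VGG16 feature layer; the default weights mirror the λ values above.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(pred, target, feat_extractor, l1=1.0, l2=0.2, l3=0.04):
    """Sketch of L_loss = l1*L_MSE + l2*L_SSIM + l3*L_P.
    Simplified global SSIM; feat_extractor stands in for VGG16 features."""
    mse = F.mse_loss(pred, target)                       # L_MSE
    mu_x, mu_y = pred.mean(), target.mean()
    var_x, var_y = pred.var(), target.var()
    cov = ((pred - mu_x) * (target - mu_y)).mean()
    c1, c2 = 0.01 ** 2, 0.03 ** 2                        # standard SSIM constants
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    l_ssim = -ssim                                       # L_SSIM = -SSIM
    l_p = F.mse_loss(feat_extractor(pred), feat_extractor(target))  # L_P
    return l1 * mse + l2 * l_ssim + l3 * l_p

pred = torch.rand(1, 3, 32, 32)
# Identical images: MSE and perceptual terms vanish, SSIM is ~1,
# so the total loss is approximately -0.2 (i.e. l2 * (-1)).
loss = hybrid_loss(pred, pred, feat_extractor=lambda x: x)
```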
The third aspect of the invention provides a readable storage medium, wherein the readable storage medium comprises a rain removing method program based on a multi-scale hourglass dense connection network, and when the rain removing method program based on the multi-scale hourglass dense connection network is executed by a processor, the steps of the rain removing method based on the multi-scale hourglass dense connection network are realized.
The invention provides a single-image rain removal method based on a multi-scale hourglass dense connection network. First, the parallel multi-scale hourglass structure samples the extracted input feature images multiple times, extracting feature maps of different scales and obtaining multi-scale rain streak mapping features. Second, parallel spatial and channel dual attention modules are added to the multi-scale hourglass structure, and residual skip connections fuse the multi-level features to form the complete multi-scale hourglass attention residual module unit (MHAR module), realizing adaptive recalibration for identifying and extracting the rain streak layer feature information. In addition, the invention densely connects the multi-scale hourglass attention residual modules so that the features extracted by network layers of different depths are densely forward-propagated and fused with one another. Finally, a globally rich rain streak layer feature map is obtained through multi-level feature aggregation, and a clear rain-free background-layer image is obtained using the rain map linear superposition model. The method comprehensively exploits multi-scale feature fusion, a dual attention mechanism, and hierarchical feature aggregation, improves the accuracy of model feature learning, effectively extracts and removes the rain streak layer features, and obtains clear rain-removed images.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A rain removing method based on a multi-scale hourglass dense connection network is characterized by comprising the following steps:
acquiring a synthesized image rain removal sample data set to obtain a data set for training a network model;
constructing a single image rain removal network model of end-to-end mapping of the multi-scale hourglass dense connection network, wherein the network model comprises three convolution layers with convolution kernel size of 3 and four multi-scale hourglass attention residual error module units;
preprocessing the data set to obtain a training data set, and training the single-image rain removal network model with end-to-end mapping of the multi-scale hourglass dense connection network using the training data set based on the PyTorch deep learning framework, to obtain the trained model;
inputting a rainy image into the trained single-image rain removal network model with end-to-end mapping of the multi-scale hourglass dense connection network to obtain a rain-removed image.
2. The rain removal method based on the multi-scale hourglass dense connection network according to claim 1, wherein constructing the single-image rain removal network model with end-to-end mapping of the multi-scale hourglass dense connection network specifically comprises:
forming an input layer from a convolution layer with a convolution kernel size of 3 and a LeakyReLU activation function to realize the conversion from image space to feature space and to extract the shallow rain image features, the process being expressed as:

F_0 = σ(Conv_{3×3}(O_rain)),

where O_rain denotes the rainy image input to the network; Conv_{3×3}(·) is a convolution operation with a 3×3 kernel transforming the image into feature space; F_0 is the shallow feature image extracted after the convolution and nonlinear operations; and σ(·) is the nonlinear activation function LeakyReLU;
using the multi-scale hourglass structure to further aggregate rain streak features of different scales from the extracted shallow features F_0, the multi-scale hourglass structure consisting of downsampling, upsampling, pooling layers, convolution layers, and deconvolution layers, with the mathematical expression:

H_i = F_u^{(j)}(F_d^{(i)}(X; η_d); η_u),

where X denotes the data information input to the network; F_u(X; η_u) performs an upsampling process on the data; F_d(X; η_d) performs a downsampling process on the input data; i and j denote the numbers of upsampling and downsampling layers; η_d and η_u denote the network parameters of downsampling and upsampling, respectively; and H_i denotes the feature information obtained after hourglass network processing;
extracting and transmitting rain streak features of different scales with the multi-scale hourglass residual structure, and introducing parallel spatial and channel dual attention modules to recalibrate the spatial context feature dependencies while weighting the important channel feature information so as to suppress useless channel information;
the channel attention module extracts and processes the channel characteristic information of the characteristic diagram to obtain the correlation characteristic information and the importance degree among channels, and the channel attention characteristic diagram is shown as follows:
Mapc(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F))) = σ(W2(W1(Fcavg)) + W2(W1(Fcmax))),

wherein AvgPool refers to the average pooling operation and MaxPool to the maximum pooling operation; MLP denotes the multilayer perceptron constituting the shared network that generates the channel attention map Mapc; F is the feature information input to the network; Fcavg and Fcmax respectively represent the feature maps after the average pooling and maximum pooling operations in the channel attention module; W1 and W2 represent network weight parameters; and σ represents the sigmoid nonlinear activation function;
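The channel attention formula above matches the widely used CBAM-style channel attention, which can be sketched in PyTorch as follows; the reduction ratio of 16 and the use of 1×1 convolutions to implement the shared MLP (W1, W2) are assumptions:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Mapc(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))),
    with a two-layer MLP (weights W1, W2) shared between both branches."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.avg = nn.AdaptiveAvgPool2d(1)   # Fcavg: C x 1 x 1
        self.max = nn.AdaptiveMaxPool2d(1)   # Fcmax: C x 1 x 1
        self.mlp = nn.Sequential(            # shared W1, W2 as 1x1 convs
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False))

    def forward(self, f):
        attn = torch.sigmoid(self.mlp(self.avg(f)) + self.mlp(self.max(f)))
        return f * attn              # reweight each channel of F

f = torch.randn(2, 32, 16, 16)
out = ChannelAttention(32)(f)
print(out.shape)  # torch.Size([2, 32, 16, 16])
```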
the spatial attention module extracts the distribution of spatial context information of the image of interest; the spatial attention feature map is expressed as:
Maps(F) = σ(f7×7([AvgPool(F); MaxPool(F)])) = σ(f7×7([Fsavg; Fsmax])),

wherein F ∈ R^(1×H×W) is the spatial feature distribution, F(i) is the feature at the corresponding point, and C denotes the corresponding number of channels; f7×7 represents a convolution operation with a filter size of 7×7; Fsavg and Fsmax denote the feature maps obtained by the average pooling and maximum pooling operations in spatial attention, each of size 1×H×W; and σ represents the sigmoid function;
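A minimal PyTorch sketch of this spatial attention branch, assuming the average and maximum pooling run along the channel axis before the 7×7 convolution:

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Maps(F) = sigmoid(f7x7([AvgPool(F); MaxPool(F)])): pooling along the
    channel axis yields two 1 x H x W maps, which a 7x7 convolution turns
    into a single spatial attention map."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)

    def forward(self, f):
        avg = f.mean(dim=1, keepdim=True)        # Fsavg: 1 x H x W
        mx = f.max(dim=1, keepdim=True).values   # Fsmax: 1 x H x W
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return f * attn

f = torch.randn(2, 32, 16, 16)
out = SpatialAttention()(f)
print(out.shape)  # torch.Size([2, 32, 16, 16])
```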
the feature maps obtained by the channel attention module and the spatial attention module are concatenated and fused; this process can be expressed as:
Fout=Concat([Mapc(F),Maps(F)]),
wherein Concat is the concatenation operation; Mapc(F) and Maps(F) respectively represent the feature information obtained after processing by the channel attention module and the spatial attention module; and Fout represents the output feature information obtained after processing by the entire multi-scale hourglass residual attention module unit;
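The parallel arrangement and concatenation described above might be sketched as follows; the 1×1 convolution that reduces the concatenated output back to the original channel width, and the simplified average-pool-only channel branch, are assumptions not stated in the claim:

```python
import torch
import torch.nn as nn

class DualAttentionFusion(nn.Module):
    """Parallel channel + spatial attention branches; the two outputs are
    concatenated (Concat) and reduced with a 1x1 conv to give Fout."""
    def __init__(self, c, reduction=8):
        super().__init__()
        self.c_mlp = nn.Sequential(              # channel branch (avg only)
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(c, c // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(c // reduction, c, 1), nn.Sigmoid())
        self.s_conv = nn.Sequential(             # spatial branch
            nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())
        self.fuse = nn.Conv2d(2 * c, c, 1)       # reduce after Concat

    def forward(self, f):
        ch = f * self.c_mlp(f)                          # Mapc branch
        s_in = torch.cat([f.mean(1, keepdim=True),
                          f.max(1, keepdim=True).values], dim=1)
        sp = f * self.s_conv(s_in)                      # Maps branch
        return self.fuse(torch.cat([ch, sp], dim=1))    # Fout

f = torch.randn(1, 32, 16, 16)
out = DualAttentionFusion(32)(f)
print(out.shape)  # torch.Size([1, 32, 16, 16])
```

Running both branches in parallel and concatenating differs from sequential CBAM, where spatial attention is applied after channel attention.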
the multi-scale hourglass structure and the dual attention module are connected by a residual skip connection to form a multi-scale hourglass attention residual (MHAR) module unit; the residual network transfers the features extracted in the shallow layers across layers to deeper network layers via skip connections; the MHAR module is expressed as:
Fi=FMHARi([L0,……,Ld],[W0,……,Wd]),
wherein [L0, ……, Ld] are the inputs of the corresponding convolution layers of the network, [W0, ……, Wd] are the parameters of each layer of the MHAR module, and FMHARi represents the mapping process of the i-th MHAR module;
a dense connection mechanism is added to the end-to-end single-image rain removal network model of the multi-scale hourglass dense connection network, implemented as follows:
F1=FMHAR1(F0)+F0
F2=FMHAR2(F1)+F1+F0
F3=FMHAR3(F2)+F2+F1+F0
F4=FMHAR4(F3)+F3+F2+F1+F0
the aggregated depth features are fused by the convolution layers Conv1×1 and Conv3×3 with kernel sizes of 1×1 and 3×3, using the LeakyReLU activation function, expressed as:
R = σ(Conv3×3(σ(Conv1×1(FMHAR4(F3)+F3+F2+F1+F0)))),
wherein σ is the LeakyReLU nonlinear activation function, and R is the rain streak layer obtained by extraction and separation.
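A rough sketch of the dense connection pattern F1…F4 and the final 1×1/3×3 fusion convolutions; a plain two-convolution block stands in for the full MHAR unit here, and the 3-channel rain streak output is an assumption:

```python
import torch
import torch.nn as nn

class DenselyConnectedBackbone(nn.Module):
    """Dense skips over four MHAR-style units, each output adding all
    earlier features (Fk = MHARk(F_{k-1}) + F_{k-1} + ... + F0), followed
    by the 1x1 / 3x3 fusion convolutions producing the rain streak layer R."""
    def __init__(self, c=32):
        super().__init__()
        def unit():                     # placeholder for one MHAR unit
            return nn.Sequential(
                nn.Conv2d(c, c, 3, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(c, c, 3, padding=1))
        self.mhar = nn.ModuleList(unit() for _ in range(4))
        self.fuse1 = nn.Conv2d(c, c, 1)           # Conv1x1
        self.fuse3 = nn.Conv2d(c, 3, 3, padding=1)  # Conv3x3 -> R
        self.act = nn.LeakyReLU(0.2)

    def forward(self, f0):
        feats = [f0]
        x = f0
        for m in self.mhar:
            x = m(x) + sum(feats)       # dense connection: add F0..F_{k-1}
            feats.append(x)
        # R = sigma(Conv3x3(sigma(Conv1x1(...))))
        return self.fuse3(self.act(self.fuse1(self.act(x))))

f0 = torch.randn(1, 32, 64, 64)
r = DenselyConnectedBackbone()(f0)
print(r.shape)  # torch.Size([1, 3, 64, 64])
```

The running sum over `feats` reproduces the equations above exactly: after the fourth unit, `x` equals FMHAR4(F3)+F3+F2+F1+F0.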
3. The method according to claim 1, wherein the PyTorch deep learning framework is used to train the end-to-end single-image rain removal network model of the multi-scale hourglass dense connection network with the training data set, specifically comprising:
inputting the training data set in batches into the constructed end-to-end single-image rain removal network model of the multi-scale hourglass dense connection network for training, finally obtaining the pre-trained model; the rain-free image is then recovered as:
B=O-R,
wherein O is the rain image input to the network, R is the rain streak layer image extracted and separated by the network, and B is the rain-free image produced by the rain removal network.
4. The method according to claim 1, wherein the PyTorch deep learning framework is used to train the end-to-end single-image rain removal network model of the multi-scale hourglass dense connection network with the training data set, the loss function specifically comprising a reconstruction loss, an SSIM loss and a perceptual loss, specifically:
the reconstruction loss is constructed based on the mean square error and is defined as follows:
LMSE = (1/(C·H·W)) Σi ‖G(Ri; P) − Bi‖F²,

wherein LMSE represents the mean square error; H and W represent the size, i.e. height and width, of the image; C is the number of channels of the image; G(Ri; P) represents the i-th rain-free image output by the rain removal model; P is the set of network training parameters, P = {w1, w2, ……, wn; b1, b2, ……, bn}, where w1, …, wn are weight parameters and b1, …, bn are bias vectors; Bi represents the corresponding i-th real rain-free image; and ‖·‖F is the Frobenius norm;
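A direct reading of the reconstruction loss in PyTorch, normalising the squared Frobenius distance by the image volume C×H×W (and averaging over the batch, an assumption):

```python
import torch

def reconstruction_loss(output, target):
    """LMSE: squared Frobenius distance between the restored image
    G(Ri; P) and the ground-truth rain-free image Bi, normalised by
    C x H x W and averaged over the batch."""
    n, c, h, w = output.shape
    diff = output - target
    return (diff ** 2).sum() / (n * c * h * w)

out = torch.zeros(2, 3, 8, 8)
tgt = torch.ones(2, 3, 8, 8)
print(reconstruction_loss(out, tgt).item())  # 1.0
```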
the SSIM loss is calculated from the structural similarity of the images, which is given by:
SSIM(x, y) = ((2μxμy + C1)(2σxy + C2)) / ((μx² + μy² + C1)(σx² + σy² + C2)),

wherein μx and μy represent the means of the two images, σx² and σy² represent their variances, σxy represents the covariance of the two images, and C1 and C2 are empirical constants of the equation; the SSIM loss function is defined as:
LSSIM=-SSIM(G(R;P),B),
wherein LSSIM takes negative values, so that minimizing the loss maximizes the structural similarity;
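A simplified sketch of the SSIM loss; global whole-image statistics are used here for brevity, whereas SSIM is normally computed over local windows, and the constants C1 = (0.01)² and C2 = (0.03)² are the conventional values, assumed rather than stated in the claim:

```python
import torch

def ssim_loss(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """LSSIM = -SSIM(x, y), computed from global image statistics."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x = x.var(unbiased=False)
    var_y = y.var(unbiased=False)
    cov = ((x - mu_x) * (y - mu_y)).mean()
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return -ssim

img = torch.rand(1, 3, 16, 16)
loss = ssim_loss(img, img)
print(loss.item())  # -1.0 for identical images
```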
the perceptual loss is defined as:
LP = (1/(C·H·W)) ‖V(G(R; P)) − V(B)‖²,
where V represents a non-linear neural network transformation, with the goal of minimizing the distance between high-level features;
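The perceptual loss can be sketched as a feature-space mean squared error; a pretrained VGG slice is the typical choice for V, but a small random convolutional network stands in here so the example runs without downloading weights:

```python
import torch
import torch.nn as nn

def perceptual_loss(output, target, feature_net):
    """LP: mean squared distance between high-level features V(.) of the
    restored and ground-truth images, normalised by the feature volume."""
    f_out = feature_net(output)
    f_tgt = feature_net(target)
    return ((f_out - f_tgt) ** 2).mean()

# Stand-in for V; in practice a frozen pretrained VGG slice is typical.
V = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(8, 8, 3, stride=2, padding=1))
a = torch.rand(1, 3, 32, 32)
print(perceptual_loss(a, a, V).item())  # 0.0 for identical inputs
```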
the final hybrid loss function can be expressed as:
Lloss = λ1·LMSE + λ2·LSSIM + λ3·LP,
wherein λ1, λ2 and λ3 are the parameters of the relative weights on the three components: reconstruction loss, SSIM loss and perceptual loss.
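The hybrid loss is then a simple weighted sum; the default weights (1, 0.2, 0.04) used below are the values given in claim 5:

```python
import torch

def hybrid_loss(l_mse, l_ssim, l_p, lam=(1.0, 0.2, 0.04)):
    """Lloss = lambda1*LMSE + lambda2*LSSIM + lambda3*LP."""
    return lam[0] * l_mse + lam[1] * l_ssim + lam[2] * l_p

total = hybrid_loss(torch.tensor(0.5), torch.tensor(-1.0), torch.tensor(0.25))
print(total.item())  # approximately 0.31 (0.5 - 0.2 + 0.01)
```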
5. The rain removal method based on the multi-scale hourglass dense connection network according to claim 4, wherein λ1, λ2 and λ3 are set to 1, 0.2 and 0.04, respectively.
6. A rain removal system based on a multi-scale hourglass dense connection network, the system comprising a memory and a processor, wherein the memory stores a program of the rain removal method based on the multi-scale hourglass dense connection network, and the program, when executed by the processor, realizes the following steps:
acquiring a synthesized image rain removal sample data set to obtain a data set for training a network model;
constructing an end-to-end single-image rain removal network model of the multi-scale hourglass dense connection network, wherein the network model comprises three convolution layers with a convolution kernel size of 3 and four multi-scale hourglass attention residual module units;
preprocessing the data set to obtain a training data set, and training the end-to-end single-image rain removal network model of the multi-scale hourglass dense connection network with the training data set based on the PyTorch deep learning framework, obtaining the trained model;
inputting the rain image into the trained end-to-end single-image rain removal network model of the multi-scale hourglass dense connection network to obtain the rain-removed image.
7. The rain removal system based on the multi-scale hourglass dense connection network according to claim 6, wherein the PyTorch-based deep learning framework is used to train the end-to-end single-image rain removal network model of the multi-scale hourglass dense connection network with the training data set, specifically comprising:
inputting the training data set in batches into the constructed end-to-end single-image rain removal network model of the multi-scale hourglass dense connection network for training, finally obtaining the pre-trained model; the rain-free image is then recovered as:
B=O-R,
wherein O is the rain image input to the network, R is the rain streak layer image extracted and separated by the network, and B is the rain-free image produced by the rain removal network.
8. The rain removal system based on the multi-scale hourglass dense connection network according to claim 6, wherein the end-to-end single-image rain removal network model of the multi-scale hourglass dense connection network is trained with the training data set based on the PyTorch deep learning framework, the loss function specifically comprising a reconstruction loss, an SSIM loss and a perceptual loss, specifically:
the reconstruction loss is constructed based on the mean square error and is defined as follows:
LMSE = (1/(C·H·W)) Σi ‖G(Ri; P) − Bi‖F²,

wherein LMSE represents the mean square error; H and W represent the size, i.e. height and width, of the image; C is the number of channels of the image; G(Ri; P) represents the i-th rain-free image output by the rain removal model; P is the set of network training parameters, P = {w1, w2, ……, wn; b1, b2, ……, bn}, where w1, …, wn are weight parameters and b1, …, bn are bias vectors; Bi represents the corresponding i-th real rain-free image; and ‖·‖F is the Frobenius norm;
the SSIM loss is calculated from the structural similarity of the images, which is given by:
SSIM(x, y) = ((2μxμy + C1)(2σxy + C2)) / ((μx² + μy² + C1)(σx² + σy² + C2)),

wherein μx and μy represent the means of the two images, σx² and σy² represent their variances, σxy represents the covariance of the two images, and C1 and C2 are empirical constants of the equation; the SSIM loss function is defined as:
LSSIM=-SSIM(G(R;P),B),
wherein LSSIM takes negative values, so that minimizing the loss maximizes the structural similarity;
the perceptual loss is defined as:
LP = (1/(C·H·W)) ‖V(G(R; P)) − V(B)‖²,
where V represents a non-linear neural network transformation, with the goal of minimizing the distance between high-level features;
the final hybrid loss function can be expressed as:
Lloss = λ1·LMSE + λ2·LSSIM + λ3·LP,
wherein λ1, λ2 and λ3 are the parameters of the relative weights on the three components: reconstruction loss, SSIM loss and perceptual loss.
9. The rain removal system based on the multi-scale hourglass dense connection network according to claim 8, wherein λ1, λ2 and λ3 are set to 1, 0.2 and 0.04, respectively.
10. A readable storage medium, characterized in that the readable storage medium stores a program of the rain removal method based on the multi-scale hourglass dense connection network, and when the program is executed by a processor, the steps of the rain removal method based on the multi-scale hourglass dense connection network according to any one of claims 1 to 5 are realized.
CN202110929946.6A 2021-08-13 2021-08-13 Rain removing method, system and medium based on multi-scale hourglass dense connection network Active CN113673590B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110929946.6A CN113673590B (en) 2021-08-13 2021-08-13 Rain removing method, system and medium based on multi-scale hourglass dense connection network


Publications (2)

Publication Number Publication Date
CN113673590A true CN113673590A (en) 2021-11-19
CN113673590B CN113673590B (en) 2022-12-23


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114066959A (en) * 2021-11-25 2022-02-18 天津工业大学 Single-stripe image depth estimation method based on Transformer
CN114492522A (en) * 2022-01-24 2022-05-13 四川大学 Automatic modulation classification method based on improved stacked hourglass neural network
CN114638768A (en) * 2022-05-19 2022-06-17 武汉大学 Image rain removing method, system and equipment based on dynamic association learning network
CN114677306A (en) * 2022-03-29 2022-06-28 中国矿业大学 Context aggregation image rain removing method based on edge information guidance
CN114972105A (en) * 2022-06-10 2022-08-30 江苏海洋大学 Single image rain removing method based on multi-scale fusion residual error network
CN115937049A (en) * 2023-02-23 2023-04-07 华中科技大学 Rain removal model lightweight method, system, device and medium
WO2024040973A1 (en) * 2022-08-22 2024-02-29 南京邮电大学 Multi-scale fused dehazing method based on stacked hourglass network
CN114066959B (en) * 2021-11-25 2024-05-10 天津工业大学 Single fringe image depth estimation method based on transducer

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108710830A (en) * 2018-04-20 2018-10-26 浙江工商大学 A kind of intensive human body 3D posture estimation methods for connecting attention pyramid residual error network and equidistantly limiting of combination
CN111340738A (en) * 2020-03-24 2020-06-26 武汉大学 Image rain removing method based on multi-scale progressive fusion
CN111652812A (en) * 2020-04-30 2020-09-11 南京理工大学 Image defogging and rain removing algorithm based on selective attention mechanism
CN112907479A (en) * 2021-03-05 2021-06-04 西安电子科技大学 Residual single image rain removing method based on attention mechanism


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XU Aisheng et al.: "Research on a single-image rain removal method with an attention residual network", Journal of Chinese Computer Systems *
LI Jinjing: "Research on single-image rain removal based on convolutional neural networks", China Master's Theses Full-text Database, Information Science and Technology *
MA Jingjing et al.: "Research on a single-image rain removal algorithm with a multi-scale hourglass structure", Journal of Chinese Computer Systems *




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant