CN114841895A

CN114841895A - Image shadow removing method based on bidirectional mapping network

Info

Publication number: CN114841895A
Application number: CN202210570043.8A
Authority: CN
Inventors: 查正军; 傅雪阳; 朱禹睿; 黄杰; 王洋
Original assignee: University of Science and Technology of China USTC
Current assignee: University of Science and Technology of China USTC
Priority date: 2022-05-24
Filing date: 2022-05-24
Publication date: 2022-08-02
Anticipated expiration: 2042-05-24
Also published as: CN114841895B

Abstract

The invention discloses an image shadow removing method based on a bidirectional mapping network, which comprises the following steps: 1. inputting a shadow image to be processed, and constructing a color extraction network to obtain color invariance information of the shadow image as guide information for shadow removal; 2. constructing a bidirectional mapping module to realize the feature extraction and feature mapping of a bidirectional mapping network; 3. and inputting the shadow image and the related color guide information into the constructed bidirectional mapping network to obtain the reconstructed shadow-free image. The invention fully considers the auxiliary supervision function of shadow generation on shadow removal to construct a bidirectional mapping network to realize the removal of the shadow in the image, and simultaneously introduces color guide information during the shadow removal so as to reduce the problem of color deviation possibly occurring during the image restoration, thereby realizing better shadow removal effect.

Description

Image shadow removing method based on bidirectional mapping network

Technical Field

The invention belongs to the technical field of image processing, particularly relates to image shadow removal, and provides an image shadow removal method based on a bidirectional mapping network.

Background

The shadow phenomenon is caused by the light source being blocked by a specific object, which is common in everyday scenes. However, such degradation of the shadow phenomenon affects the illumination and color information of the target image, thereby presenting a great challenge to other computer vision tasks, such as: target detection, target tracking, face detection, and the like. Shadow removal can acquire better visual pictures of human eyes and is also an important preprocessing link in the computer vision tasks.

The current neural network-based method already dominates the image shadow removal field and obtains good recovery effect. However, the current supervision method mainly relies on paired image training neural network, and few methods consider utilizing shadow generation as auxiliary information for shadow removal. The two processes of shadow removal and shadow generation are themselves two processes that are reciprocal. Therefore, the introduction of the shadow generation can provide natural regular term constraint for the shadow removal to improve the performance of the shadow removal. Meanwhile, when the existing shadow removal method is used for removing the shadow, certain color deviation still exists in the processing result. Therefore, auxiliary color information is also integrated in the invention to guide better shadow removal so as to reduce color deviation existing in the processing result of the previous method. The core of the method is to use the shadow generation process as auxiliary information constraint so as to obtain better shadow removal effect, so the method naturally adopts the latest reversible neural technology to complete the construction of the double mapping network.

Disclosure of Invention

The invention provides an image shadow removing method based on a bidirectional mapping network for overcoming the defects of the prior art, so that illumination, texture and color information shielded by shadows can be restored, the problem of color deviation possibly occurring in image restoration is reduced, and more efficient image shadow removal is realized.

In order to achieve the purpose, the invention adopts the following technical scheme:

the invention relates to an image shadow removing method based on a bidirectional mapping network, which is characterized by comprising the following steps:

step 1, obtaining a shadow image to be processed and a corresponding shadow mask image and a shadow-free image I thereof _ns And preprocessing is carried out to obtain a preprocessed shadow image

Preprocessed shadow mask image

And a pre-processed shadow mask image

Wherein H and W represent the height and width of the image, respectively;

from pre-processed shadow images

Obtaining an initial color map

Step 2, constructing a structural formula consisting of n ₁ A convolution kernel of a ₁ ×a ₁ And obtaining a shadow image I using the formula (1) _s Color invariance information of

C _ns ＝θ(I _s ,C _s ) (1)

Step 3, constructing a network based on bidirectional mapping, comprising: the encoding characteristic preprocessing module En, the reversible neural network-based dual-input bidirectional condition mapping module IB and the decoding characteristic post-processing module De;

step 3.1, constructing the coding feature preprocessing module En and using the coding feature preprocessing module En for the shadow image I _s And (3) carrying out feature extraction:

the encoding characteristic preprocessing module En obtains a shallow network characteristic F of the shadow image by using a formula (2):

F＝En(I _s ) (2)

3.2, performing feature transformation and extraction on a reversible neural network-based dual-input bidirectional condition mapping module IB;

step 3.2.1, defining a variable k, and initializing k to be 0; taking the shallow network feature F of the shadow image as the input feature F of the kth bidirectional condition mapping module IB _k ；

Step 3.2.2, the reversible neural network-based dual-input bidirectional condition mapping module IB utilizes the formula (3) to perform k-th input feature F _k Performing several transformation operations to obtain the (k + 1) th input feature F _k+1 ：

In formula (3), split (. cndot.) represents a separation function; concat (. cndot.) represents a concatenation function;

ρ ₁ (. cndot.) and ρ ₂ (. h) represents four mapping networks;

representing the input characteristics of the first branch when the kth module IB is mapped forward,

representing the input bits of the second branch in the k-th block IB forward mappingThe step of performing the sign operation,

represents the output characteristics of the first branch after the forward mapping transformation of equation (3),

representing the output characteristics of the second branch subjected to forward mapping transformation of the formula (3); an indication of a dot-by-dot multiplication operation,

represents a point-by-point addition operation;

step 3.2.3, after K +1 is assigned to K, judging whether K is more than K, if so, indicating that the Kth input characteristic F is obtained _K And as the final shadow-free predecoding feature H; otherwise, returning to the step 3.2.2 for sequential execution; wherein K represents the number of the bidirectional condition mapping modules IB;

step 3.3, the decoding characteristic post-processing module De utilizes the formula (4) to obtain an estimated shadow-free image

Step 4, optimizing parameters of the bidirectional mapping network;

step 4.1, the shadow-free image I _ns Inputting the character into the bidirectional mapping network, and obtaining the shallow network character F of the shadow-free image after the character is decoded and processed by a decoding character post-processing module De ^ns ；

4.2, extracting reverse characteristics of a double-input bidirectional condition mapping module IB based on a reversible neural network;

step 4.2.1, defining a variable k, and initializing k to be 0; shallow network feature F of shadow-free image ^ns As input characteristics for the kth said bidirectional conditional mapping module IB

Step 4.2.2, the reversible neural network-based dual-input bidirectional condition mapping module IB utilizes the formula (6) to perform k-th input feature F _k Performing a plurality of transformation operations to obtain the (k + 1) th input feature

In the formula (6), the reaction mixture is,

a dot-by-dot division operation is shown,

representing a point-by-point subtraction operation;

representing the input characteristics of the first branch when the kth module IB is reverse mapped,

representing the input characteristics of the second branch when the kth module IB is reversely mapped;

represents the output characteristics of the first branch after the inverse mapping transformation of equation (3),

representing the output characteristics of the second branch after the inverse mapping transformation of the formula (3);

step 4.2.3, after K +1 is assigned to K, whether K is more than K is judged, if yes, the Kth input characteristic is obtained

And as a final shadow predecode feature

Otherwise, returning to the step 3.2.2 for sequential execution;

step 4.3, shadow predecoding feature

Inputting the shadow image into the encoding characteristic preprocessing module En to output an estimated shadow image

Step 5, training:

step 5.1, establishing a target loss function L by using the formula (8):

in formula (8), λ _inverse Representing the hyperparameter in the target loss function L;

and 5.2, based on a batch of batch _ size shadow image sets, corresponding shadow mask image sets and non-shadow image sets, performing supervised training on the bidirectional mapping network by using an Adam optimizer, and calculating the target loss function L to update the network parameters until the training times reach a set threshold value, so as to obtain a global optimal dual mapping network, which is used for removing shadows of the input shadow images and the shadow mask images.

Compared with the prior art, the invention has the beneficial effects that:

1. the invention utilizes the shadow generation process as the auxiliary information constraint of the shadow removal process and introduces the color auxiliary information. Under the condition of using a small quantity of neural network parameters, a good image restoration effect can be realized, and the degradation caused by shadow can be removed. Experimental results show that the method provided by the invention has better robustness on different image data sets, and is superior to the most advanced method on a plurality of public shadow removal image data sets.

2. The invention adopts the latest reversible neural technology to complete the construction of the double mapping network, thereby ensuring that the proposed double mapping network keeps high reversibility. The same network can not only remove the shadow, but also generate a shadow image through a reverse reasoning process, and the shadow generation can be utilized to assist the shadow removing process to only use one parameter sharing network, so that the neural network parameters required by training the model can be effectively saved, and the downstream model deployment is facilitated.

Drawings

FIG. 1 is a flow chart of an image shadow removal method based on a bidirectional mapping network according to the present invention;

FIG. 2 is a schematic diagram of a color network according to the present invention;

FIG. 3 is a network frame diagram of an image shadow removal method based on a bidirectional mapping network according to the present invention;

FIG. 4a is a schematic diagram of a forward mapping structure of a dual-input reversible module according to the present invention;

FIG. 4b is a schematic diagram of a reverse mapping structure of the dual-input reversible module according to the present invention;

FIG. 5 is a mapping network used by the present invention;

FIG. 6 is a graph of visual contrast effects in a test set according to the present invention;

FIG. 7 is a graph of comparative performance results of the shadow removal method of the present invention on a real world dataset ISTD dataset.

Detailed Description

In this embodiment, as shown in fig. 1, an image shadow removing method based on a bidirectional mapping network is performed as follows:

Preprocessed shadow mask image

And a pre-processed shadow mask image

Wherein H and W represent the height and width of the image, respectively; in the preprocessing, the same preprocessing process including operations such as clipping, flipping, rotating and the like is performed on the input shadow image data and the shadow mask image data.

From pre-processed shadow images

An initial color map can be obtained

Initial color map C of shadow image _s Is through the original image I _s Divided by the mean calculated along the RGB channels of the image.

Step 2, constructing a structural formula consisting of n ₁ A convolution kernel of a ₁ ×a ₁ The color extraction network θ of the convolutional layer composition of (2) is shown in FIG. 2, and a shadow image I is obtained by using the formula (1) _s Color invariance information of

C _ns ＝θ(I _s ,C _s ) (1)

Step 3, constructing a network based on bidirectional mapping, comprising: the encoding characteristic preprocessing module En, the reversible neural network-based dual-input bidirectional condition mapping module IB, the decoding characteristic post-processing module De and the overall network framework are shown in FIG. 3;

step 3.1, constructing a coding feature preprocessing module En and using the coding feature preprocessing module En for the shadow image I _s And (3) carrying out feature extraction:

the encoding characteristic preprocessing module En obtains the shallow network characteristic F of the shadow image by using the formula (2):

F＝En(I _s ) (2)

step 3.2, feature transformation and extraction of the reversible neural network-based dual-input bidirectional condition mapping module IB are performed, as shown in fig. 4 a:

In equation (3), split (·) represents a separation function, operating along the channel dimension of a feature; concat (·) represents a series function, operating along the channel dimension of the feature;

ρ ₁ (. and ρ) ₂ (. cndot.) represents four mapping transformation networks, the structures of which are consistent, as shown in FIG. 5;

representing the input characteristics of the second branch when the kth module IB is mapped forward,

represents a point-by-point addition operation; compared with a simple convolutional layer, the reversible neural network has good mathematical reversibility and information losslessness, namely, for the constructed bidirectional condition mapping module IB, the input can be transformed into corresponding output through the mapping of the formula (3), and meanwhile, for the output of the IB module, the input can also be obtained through the inverse mapping of the formula (6). The invention can tightly couple the shadow generation process and the shadow removal process due to the guarantee of reversibility, and the reversible neural network is also used to have the advantage of saving the parameter quantity of the neural network.

Step 3.2.3, after K +1 is assigned to K, judging whether K is more than K, if so, indicating that the Kth input characteristic F is obtained _K And as the final shadow-free predecoding feature H; otherwise, returning to the step 3.2.2 for sequential execution; wherein K represents the maximum number of modules IB;

step 3.3, the constructed decoding characteristic post-processing module De is used for obtaining the shadow-free image estimated by the network

The decoding characteristic post-processing module De utilizes the formula (4) to obtain the estimated shadow-free image

Step 4, optimizing parameters of the bidirectional mapping network;

step 4.1, the shadow-free image I _ns Inputting the image into a bidirectional mapping network, and obtaining a shallow network characteristic F of a shadow-free image after the image is processed by a decoding characteristic post-processing module De ^ns ；

In the formula (6), the reaction mixture is,

a dot-by-dot division operation is shown,

representing a point-by-point subtraction operation;

representing the second branch output bits after inverse mapping of equation (3)Performing identification;

And as a final shadow predecode feature

Otherwise, returning to the step 3.2.2 for sequential execution; where K represents the maximum number of modules IB, and in this embodiment, K is 4;

step 4.3 shadow Pre-decoding feature

Inputting into a coding feature preprocessing module En to output an estimated shadow image

As shown in fig. 4 b:

step 5, training:

step 5.1, establishing a target loss function L by using the formula (8):

in formula (8), λ _inverse Representing the hyperparameter in the target loss function L; wherein λ _inverse The value of (A) is 0.4.

And 5.2, based on a batch of batch _ size shadow image sets, the corresponding shadow mask image sets and the shadow-free image sets, carrying out supervised training on the bidirectional mapping network by using an Adam optimizer, and calculating an objective loss function L for updating network parameters until the training times reach a set threshold value, so as to obtain a global optimal dual mapping network for removing the shadows of the input shadow images and the shadow mask images.

In order to quantitatively evaluate the effect of the invention and verify the effectiveness of the invention, the method of the invention is compared with algorithms such as ST-CGAN, DSC, DHAN and the like in 2 public real-world data sets. Three performance indexes, namely PSNR (Peak Signal-to-Noise Ratio), SSIM (structural Similarity Index metric) structure Similarity distance and RMSE (root Mean Square error), are selected as evaluation indexes.

Numerical index evaluation is divided into three parts:

the first part is based on error-sensitive image quality assessment, and the evaluation criterion is PSNR as shown in formula (9):

in the formula (9), x and y are the network output image and the target image, respectively, MaxValue represents the maximum dynamic range value that can be obtained by the image, and H and W are the height and width of the image.

The second part is based on image quality evaluation with similar structure, and the evaluation criterion is SSIM as shown in formula (10):

in the formula (10), x and y are respectively the network output image and the target image, mu _x Is the mean value of x, μ _y Is the average value of the values of y,

is the variance of x and is the sum of the differences,

variance of y, σ _xy Is the covariance of x and y, c ₁ And c ₂ Is a constant.

The third part is to use the RMSE index under Lab color space to evaluate the color recovery error, as shown in formula (11);

in equation (11), x and y are the network output image and the target image respectively, Lab (-) means to convert the image x or y to Lab color space, and H and W are the height and width of the image.

Fig. 6 shows the comparison effect of the processing result of the present invention on the real-world shadow picture with other 6 shadow-removing methods. From left to right, the first two columns are respectively an original shadow picture Input and a reference shadow-free picture group Truth; following are different methods of removing shadows: guo et al, SP + M-Net, Param + M + D-Net, G2R, Jin et al, Fu et al, and the shadow removal method Ours as proposed by the present invention. Obvious shadow residual artifacts can be seen in the processing results of the method of Param + M + D-Net and G2R, and the obvious shadow residual artifacts are also proved to have a large promotion space in visual effect. Other methods, like Param + M + D-Net, also introduce very significant artifacts at the shadow edge positions in the processed results. In contrast, the method of the present invention can successfully remove the degradation effect caused by the shadow and successfully recover the content and color information in the original shadow region, which also shows that the shadow removal method of the present invention has superior performance compared with other methods.

FIG. 7 shows the quantitative index performance comparison results of the bidirectional mapping network-based shadow removal method (Ours) of the present invention on the real world dataset ISTD with other shadow removal methods (including Guo et al, ST-CGAN, Mask-ShadowGAN, DSC, DHAN, G2R, Fu et al). The shadow area (S), the non-shadow area (NS) and the whole image Area (ALL) are divided under each data set, and then the three quantitative evaluation indexes are used for measuring the shadow removing effect performance of different methods. Quantitative assessment for different regions has different meanings and purposes: the evaluation of the S area can measure the recovery effect of different methods on the shadow area; the evaluation of the NS region may measure whether different methods affect the non-shadow region; the metric measure for the ALL area is to comprehensively evaluate the performance of the different methods for the shadow removal capability. From the quantitative comparison result of fig. 7, the method of the present invention obtains a comprehensive and optimal performance result on three indexes of PSNR, SSIM, and RMSE, and particularly, the performance of the method of the present invention greatly surpasses that of the second method on the PSNR index.