WO2020103171A1

WO2020103171A1 - Bi-level optimization method for image deblurring

Info

Publication number: WO2020103171A1
Application number: PCT/CN2018/117635
Authority: WO
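Inventors: 李革 (Li Ge); 张毅伟 (Zhang Yiwei); 王荣刚 (Wang Ronggang); 王文敏 (Wang Wenmin); 高文 (Gao Wen)
Original assignee: 北京大学深圳研究生院 (Peking University Shenzhen Graduate School)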
Priority date: 2018-11-21
Filing date: 2018-11-27
Publication date: 2020-05-28
Also published as: CN109544475A

Abstract

A Bi-Level optimization method for image deblurring, designed so that two levels of loss functions alternately perform model optimization. The Bi-Level optimization mechanism is divided into two steps. In the first step, a basic model is trained under the MSE loss condition; in the second step, the two levels of loss are iterated alternately to fine-tune the model. At the initial stage of training, the divergence between the restored result and the clear image is relatively large, so the effect of noise can be ignored; during the later stage of training, the noise is continuously amplified and its negative effect becomes more obvious. Therefore, the perceptual loss is introduced for noise suppression, and the MSE loss is changed to the L1 loss to ensure structural continuity. The method achieves accurate texture restoration and matches the deep features with the pixel values.

Description

Bi-Level optimization method for image deblurring

Technical field

The invention relates to the field of digital image processing, and in particular to a Bi-Level optimization method for image deblurring, which is applied during the restoration of a blurred image.

Background art

Deblurring technology is the subject of extensive research in the field of image and video processing. Blurring caused by camera shake seriously degrades the imaging quality and visual perception of images. As an important branch of image preprocessing, deblurring directly affects the performance of other computer vision algorithms, such as foreground segmentation, object detection and behavior analysis. It also affects image coding performance. Therefore, it is imperative to study a high-performance deblurring algorithm.

Documents 1-3 provide background on the deep learning deblurring algorithms with which the present invention is compared:

Document 1: Kupyn O, Budzan V, Mykhailych M, et al. DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks [J]. arXiv preprint arXiv:1711.07064, 2017.

Document 2: Nah S, Kim T H, Lee K M. Deep multi-scale convolutional neural network for dynamic scene deblurring [C] // CVPR. 2017, 1 (2): 3.

Document 3: Sun J, Cao W, Xu Z, et al. Learning a convolutional neural network for non-uniform motion blur removal [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 769-777.

In general, image deblurring algorithms can be divided into traditional algorithms based on probability models and deblurring algorithms based on deep learning. Traditional algorithms use a convolution model to explain the cause of blur: the camera-shake process can be mapped to a blur kernel trajectory, the PSF (Point Spread Function). Restoring a clear image when the blur kernel is unknown is an ill-posed problem, so it is usually necessary to estimate the blur kernel first and then use the estimated blur kernel to perform deconvolution to obtain the restored image. Deep learning-based deblurring algorithms use a deep network structure to obtain the latent information of the image and thereby restore the blurred image. They can restore the image through the two operations of blur kernel estimation and non-blind deconvolution, or through a generative adversarial approach. This patent aims to overcome the following shortcomings of deep learning deblurring algorithms:

1) Texture restoration is not accurate;

2) Deep features do not match pixel values.
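For reference, the convolution model that traditional algorithms use to explain blur is commonly written as below. The symbols are assumptions chosen for illustration and do not appear in the original text: B is the blurred image, S the clear image, K the blur kernel (PSF), N an additive noise term, and \ast denotes two-dimensional convolution. Blind deblurring estimates K first and then recovers S by non-blind deconvolution.

```latex
B = K \ast S + N
```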

Disclosure of invention

The purpose of the present invention is to propose a Bi-Level optimization method for image deblurring. The Bi-Level optimization method is used to optimize a GAN (Generative Adversarial Network) and aims to overcome the shortcomings of existing deep learning deblurring algorithms. Compared with the best existing algorithms, the present invention improves image restoration performance by 1.3 dB on average.

The technical solutions provided by the present invention are as follows:

The MSE loss guarantees that the optimization process is consistent at the pixel level and the feature level, but it introduces a lot of noise. The perceptual loss is, to a certain extent, a good alternative, but it cannot guarantee this consistency. To resolve the inconsistency between the conditional losses and to reduce the complexity of optimization, the present invention proposes a Bi-Level optimization method.

Specifically, the design lets two levels of loss functions alternate with each other during model optimization. The L1 loss and the MSE loss satisfy the same consistency relationship; L1 introduces more noise than MSE but retains more texture, while the perceptual loss has an over-smoothing effect. The present invention therefore introduces all three losses into the optimization. To balance the roles of the various conditional losses in the optimization process, the present invention normalizes these losses to the same magnitude according to Equation 4. As shown in Figure 2, the Bi-Level optimization method is divided into two steps. In the first step, a basic model is trained under the MSE loss condition; in the second step, the two levels of loss are iterated alternately to fine-tune the model. This is because in the early stage of training the divergence between the restored result and the clear image is relatively large, so the role of noise can be ignored, whereas in the late stage of training the noise is continuously amplified and its negative effect becomes more obvious. The perceptual loss is therefore introduced to suppress noise, and the MSE loss is changed to the L1 loss to ensure structural continuity.
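The two-step mechanism described above can be sketched as a small loss-selection function. This is a minimal sketch, not the patent's implementation: the function name `select_loss` is hypothetical, the threshold N/3 and the two-iteration switching period are taken from the embodiment steps given later in this document, and starting the fine-tuning stage with the L1 loss (rather than the perceptual loss) is an assumption, since the text does not state which comes first.

```python
def select_loss(iteration: int, total: int, switch_period: int = 2) -> str:
    """Return the name of the conditional loss used at a 0-based iteration index."""
    if total <= 0 or switch_period <= 0:
        raise ValueError("total and switch_period must be positive")
    if not 0 <= iteration < total:
        raise ValueError("iteration must lie in [0, total)")

    # Step 1: the first third of training uses the pixel-level MSE loss.
    if 3 * iteration < total:
        return "mse"

    # Step 2: fine-tuning alternates between the pixel-level L1 loss and the
    # feature-level perceptual loss, switching every `switch_period` iterations.
    first_finetune_iteration = -(-total // 3)  # integer ceil(total / 3)
    block = (iteration - first_finetune_iteration) // switch_period
    # Assumption: the L1 loss comes first; the source does not specify the order.
    return "l1" if block % 2 == 0 else "perceptual"


if __name__ == "__main__":
    # Print the loss chosen at each of 12 iterations.
    print([select_loss(i, 12) for i in range(12)])
```

In a training loop, the returned name would be used to pick the conditional loss term that is combined with the adversarial loss for that iteration.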

The present invention has the following technical effects:

1. Texture restoration is accurate. In image restoration, similarity of image pixel values ensures similarity of deep features, but the converse does not hold; that is, using a feature-level perceptual loss function alone may fail to repair image detail information.

2. Deep features match pixel values. The pixel-level MSE and L1 losses often amplify noise, whereas the perceptual loss can effectively suppress it. By combining these loss functions, the invention ensures, to a certain extent, that pixel texture recovery is accurate and that the deep features match the pixel values.

Brief description of the drawings

Figure 1 shows the generative adversarial network process;

Figure 2 shows the Bi-Level optimization process;

Figure 3 is the Bi-Skip-Net structure diagram;

Figure 4 shows the generator design: Bi-Skip-Net + residual;

Figure 5 is a subjective comparison of the present invention with other algorithms, in which:

Figure 5a shows the blurred images;

Figure 5b shows the restoration results of Nah et al.;

Figure 5c shows the restoration results of Kupyn et al.;

Figure 5d shows the restoration results of the Bi-Level optimization method.

Best mode for carrying out the invention

The following describes the present invention in detail with reference to the drawings and embodiments, but does not limit the scope of the present invention in any way.

Figure 1 shows the generative adversarial network process, Figure 4 shows the generator design (Bi-Skip-Net + residual), and Table 1 is the discriminator parameter table.

Table 1. Discriminator parameter table

The specific steps of the Bi-Level optimization method for image deblurring of the present invention are as follows:

(1) A generative adversarial network process is used to restore the blurred image; the generator and the discriminator are designed according to Figure 4 and Table 1;

(2) As shown in Figure 1, the blurred image is input to the generator to obtain the restored image; the restored image and the clear image are then input to the discriminator, which tries to distinguish the clear image. As shown in Figure 4, the present invention uses Bi-Skip-Net to learn the image residual and restores the image as blurred image + image residual;

(3) Train the network using the following loss function, which consists of an adversarial loss function and a conditional loss function, where λ is the weight of the conditional loss function.

The discriminator D is optimized by maximizing its objective function;

The generator G is optimized by minimizing Equation 3;

The conditional loss function is designed as follows:

Here L and S respectively denote the model output and the ground truth at the different levels, α takes the value 1 or 2, and the entire conditional loss function is normalized by the number of channels c, the width w and the height h.

(4) Use the Bi-Level optimization method shown in Figure 2 to optimize the network;

The Bi-Level optimization method of the present invention includes two steps (the number of epochs in the training process is N):

Step 1: When the number of iterations is less than N/3, the present invention adopts the pixel-level mean square error (MSE) as the loss function to train the model;

Step 2: When the number of iterations is greater than or equal to N/3, the present invention alternately trains the model with a pixel-level L1 loss function and a feature-level perceptual loss function. In the experiments, the loss function is switched every two iterations.

(5) Take the trained network as the final restoration model.
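The conditional loss in step (3) can be illustrated with a short sketch. It assumes the loss is the sum of |L - S|^α over all elements divided by c·w·h, which matches the description (α is 1 or 2, normalization by channels, width and height) but is not confirmed by the text, because the equation itself is not reproduced here. The function name `conditional_loss` and the nested-list tensor layout are hypothetical.

```python
from typing import List

# Hypothetical layout: a tensor is indexed as [channel][row][column].
Tensor3D = List[List[List[float]]]


def conditional_loss(output: Tensor3D, target: Tensor3D, alpha: int) -> float:
    """Assumed form: sum(|L - S| ** alpha) / (c * w * h), with alpha in {1, 2}."""
    if alpha not in (1, 2):
        raise ValueError("alpha must be 1 (L1-style) or 2 (MSE-style)")

    c, h, w = len(output), len(output[0]), len(output[0][0])
    if (len(target), len(target[0]), len(target[0][0])) != (c, h, w):
        raise ValueError("output and target must have the same shape")

    total = 0.0
    for out_channel, tgt_channel in zip(output, target):
        for out_row, tgt_row in zip(out_channel, tgt_channel):
            for o, t in zip(out_row, tgt_row):
                total += abs(o - t) ** alpha

    # Normalizing by c * w * h keeps losses computed on tensors of different
    # sizes (pixels or feature maps) at a comparable magnitude.
    return total / (c * w * h)


if __name__ == "__main__":
    model_output = [[[1.0, 2.0], [3.0, 4.0]]]
    ground_truth = [[[0.0, 2.0], [3.0, 2.0]]]
    print(conditional_loss(model_output, ground_truth, alpha=1))
    print(conditional_loss(model_output, ground_truth, alpha=2))
```

With α = 1 this behaves like a normalized L1 loss and with α = 2 like MSE; applying it to feature maps instead of pixels would give a feature-level variant, given a feature extractor that is outside this sketch.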

Figure 3 is the structure diagram of Bi-Skip-Net. As shown in Figure 3, Bi-Skip-Net consists of three parts: a compression path (D*), a double-span connection path (S*) and an expansion path (U*). The compression path extracts the deep and shallow features of the image; the double-span connection path connects these image features with the upsampled features in the expansion path; the expansion path performs feature upsampling.
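Step (2) states that Bi-Skip-Net learns an image residual and that the restored image is obtained as blurred image + image residual. A minimal sketch of that final addition follows; the function name `restore_from_residual`, the single-channel nested-list image layout, and the clipping of the result to [0, 1] are assumptions made for illustration, and the residual would in practice come from the trained generator.

```python
from typing import List

# Hypothetical layout: a single-channel image as rows of intensities in [0, 1].
Image = List[List[float]]


def restore_from_residual(blurred: Image, residual: Image) -> Image:
    """Add the predicted residual to the blurred input, pixel by pixel."""
    if len(blurred) != len(residual) or any(
        len(b_row) != len(r_row) for b_row, r_row in zip(blurred, residual)
    ):
        raise ValueError("blurred image and residual must have the same shape")

    # Assumption: intensities are clipped to the valid range after the addition.
    return [
        [min(1.0, max(0.0, b + r)) for b, r in zip(b_row, r_row)]
        for b_row, r_row in zip(blurred, residual)
    ]


if __name__ == "__main__":
    blurred_image = [[0.25, 0.75], [0.5, 0.0]]
    predicted_residual = [[0.25, 0.5], [-0.25, -0.125]]
    print(restore_from_residual(blurred_image, predicted_residual))
```

Predicting a residual rather than the full image means the network only has to model the difference between the blurred and the clear image.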

Figure 5 is a subjective comparison between the present invention and other algorithms. Figure 5a shows the blurred images; Figure 5b shows the restoration results of Nah et al.; Figure 5c shows the restoration results of Kupyn et al.; Figure 5d shows the restoration results of the Bi-Level optimization method, that is, of the present invention. Figures 5a, 5b, 5c and 5d show the same three photos, and two regions in each picture are marked with boxes. Enlarged views of the two boxes are placed below the picture so that the restoration effect can be seen clearly. Table 2 shows the test comparison between the present invention and other algorithms on the GoPro dataset.

Table 2. Comparison between the present invention and other algorithms on GoPro dataset

It should be noted that the purpose of the disclosed embodiments is to help further understand the present invention, but those skilled in the art will understand that various replacements and modifications are possible without departing from the spirit and scope of the present invention and the appended claims. Therefore, the present invention should not be limited to the contents disclosed in the embodiments, and the scope of protection claimed by the present invention is subject to the scope defined by the claims.

Industrial applicability

The deep learning-based deblurring algorithm of the present invention uses a deep network structure to obtain the latent information of the image and thereby restores the blurred image. The deep learning deblurring algorithm can restore the image through the two operations of blur kernel estimation and non-blind deconvolution, and it can also restore the image through a generative adversarial approach.

Claims

A Bi-Level optimization method for image deblurring, the specific steps being:

The first step is to train a basic model under the MSE loss condition;

The second step is to fine-tune the model by alternately iterating two levels of loss.

The Bi-Level optimization method for image deblurring according to claim 1, characterized in that:

(1) The generator is designed as Bi-Skip-Net + residual, and the discriminator parameters are given in Table 1:

Table 1. Discriminator parameter table

#    Layer   Parameter dimension   Stride
1    conv    32x3x5x5              2
2    conv    64x32x5x5             1
3    conv    64x64x5x5             2
4    conv    128x64x5x5            1
5    conv    128x128x5x5           4
6    conv    256x128x5x5           1
7    conv    256x256x5x5           4
8    conv    512x256x5x5           1
9    conv    512x512x4x4           4
10   fc      512x1x1x1             -

(2) Use Bi-Skip-Net to learn the image residual, and restore the image as blurred image + image residual;

(3) Train a basic model under the MSE loss condition, and train the network using the following loss function, which consists of an adversarial loss function and a conditional loss function, where λ is the weight of the conditional loss function.

The discriminator D is optimized by maximizing its objective function;

The generator G is optimized by minimizing Equation 3;

The conditional loss function is designed as follows:

Here L and S respectively denote the model output and the ground truth at the different levels, α takes the value 1 or 2, and the entire conditional loss function is normalized by the number of channels c, the width w and the height h;

(4) Fine-tune the model by alternately iterating two levels of loss, specifically by using the Bi-Level optimization mechanism to optimize the network;

(5) Take the trained network as the final restoration model.

The Bi-Level optimization method for image deblurring according to claim 2, characterized in that:

The Bi-Skip-Net is composed of three parts: a compression path (D*), a double-span connection path (S*) and an expansion path (U*); the compression path extracts the deep and shallow features of the image; the double-span connection path connects these image features with the upsampled features in the expansion path; the expansion path performs feature upsampling.

The Bi-Level optimization method for image deblurring according to claim 2, wherein the Bi-Level optimization mechanism in step (4) includes two steps (the number of epochs in the training process being set to N):

Step 1: When the number of iterations is less than N/3, the present invention adopts the pixel-level mean square error (MSE) as the loss function to train the model;

Step 2: When the number of iterations is greater than or equal to N/3, the present invention alternately trains the model with a pixel-level L1 loss function and a feature-level perceptual loss function. In the experiments, the loss function is switched every two iterations.