WO2020087607A1

WO2020087607A1 - Bi-skip-net-based image deblurring method

Info

Publication number: WO2020087607A1
Application number: PCT/CN2018/117634
Authority: WO
Inventors: 李革; 张毅伟; 王荣刚; 王文敏; 高文
Original assignee: 北京大学深圳研究生院
Priority date: 2018-11-02
Filing date: 2018-11-27
Publication date: 2020-05-07
Also published as: CN109410146A

Abstract

The present invention relates to the field of digital image processing, in particular to a Bi-Skip-Net-based image deblurring method for realizing blurred image restoration by means of a Bi-Skip-Net, which aims to solve the problems in existing deep learning deblurring algorithms of high time complexity, inaccurate texture restoration, and a square effect of a restored image, etc. In the disclosure of the present invention, a Bi-Skip-Net serves as a generative network of a GAN (Generative Adversarial Network), which aims to overcome the defects in existing deep learning deblurring algorithms. Comparing the present invention with existing optimal algorithms, the time complexity is improved by 0.1s, and the image restoration performance is improved by 1dB on average.

Description

An Image Deblurring Method Based on Bi-Skip-Net

Technical field

The invention relates to the field of digital image processing, in particular to a Bi-Skip-Net-based image deblurring method. The method is to realize blurred image restoration through a Bi-Skip-Net network.

Background technique

Deblurring technology is the subject of extensive research in the field of image and video processing. To a certain extent, blurring caused by camera shake seriously affects the imaging quality and visual perception of images. As an important branch of image preprocessing, the improvement of deblurring technology directly affects the performance of other computer vision algorithms, such as foreground segmentation, object detection, behavior analysis, etc. At the same time, it also affects the image coding performance. Therefore, it is imperative to study a high-performance deblurring algorithm.

Documents 1-3 introduces the deblurring technology of image and video processing, deep learning deblurring algorithm; Document 1: Kupyn O, Budzan V, Mykhailych M, et al. DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks [J] .arXiv preprint arXiv: 1711.07064, 2017. Reference 2: Nah S, Kim T, H, Lee K, M. Deep multi-scale convolutional neural network for dynamic scene deblurring [C] // CVPR.2017, 1 (2): 3. Reference 3: Sun J, Cao W, Xu Z, et al. Learning a convolutional neural network for non-uniform motion blur removal [C] // Proceedings of the IEEE Conference Computer on Vision Vision and Pattern Recognition. 2015: 769-777. .

In general, image deblurring algorithms can be divided into traditional algorithms based on probability models and deblurring algorithms based on deep learning. The traditional algorithm uses a convolution model to explain the cause of blur. The process of camera shake can be mapped to a blur kernel trajectory PSF (Point Spread Function). The problem of restoring a clear image when the blur kernel is unknown is an ill-posed problem, so it is usually necessary to estimate the blur kernel first, and then use the evaluated blur kernel to perform the deconvolution operation to obtain the restored image. The deep learning-based deblurring algorithm uses the deep network structure to obtain the latent information of the image, and then realize the blurred image restoration. The deep learning deblurring algorithm can realize two operations of fuzzy kernel estimation and non-blind deconvolution to restore the image, and at the same time, it can also use the generated confrontation mechanism to restore the image. This patent aims to solve the shortcomings of the deblurring algorithm:

1) High time complexity,

2) The texture restoration is not accurate,

3) There is a grid effect in the restored image.

Disclosure of invention

The present invention proposes a Bi-Skip-Net network as a GAN (Generative Adversarial Network) generating network, aiming to solve the shortcomings of the existing deep learning deblurring algorithm. By comparing with the existing optimal algorithm, the present invention improves the time complexity by 0.1s and the original performance of the image complex image by an average of 1dB.

The technical solutions provided by the present invention are as follows: (Note: The technical solutions need to be explained in natural language, and cannot be described in a "picture" way. The technical solutions are best written in the following manner: Step 1: Step 2 ...

The invention adopts a generation-antagonism network mechanism to realize the restoration of blurred images, and a Bi-Skip-Net network is designed as a generator therein. Specific steps are as follows:

1): Input the blurred image, and obtain the shallow features through the convolutional layer with a convolution kernel size of 7x7 and a step size of 1;

2): Pass the shallow features through 3 residual blocks to get the depth features at the current scale;

3): The deep feature is down-sampled and the residual mode is used to obtain the shallow feature at the next scale;

4): According to the specified number of downsampling times n, repeat steps 2 and 3 to obtain shallow features and depth features at different scales, and do not acquire depth features at the smallest scale;

5): Take the shallow features of the smallest scale as basic features;

6): The shallow features of the previous scale are passed through a convolutional layer with a size of 1x1 and a step size of 1 to obtain shallow dimensionality reduction features; the corresponding depth features are passed through a convolutional core with a size of 3x3 and a step size Obtain the depth dimensionality reduction feature for the convolutional layer of 2 and connect it with the basic feature in series for upsampling; connect the upsampled feature and the shallow layer dimensionality reduction feature in series to obtain the basic feature at the current scale;

7): Repeat step 6 until the sampling operation ends;

8): The obtained basic features are passed through a convolution layer with a convolution kernel size of 7x7 and a step size of 1 to obtain residual features;

9): Add the residual features to the input image to obtain the restored image;

…

The Bi-Skip-Net plus residual mode is used as the generator.

In step 4), the number of downsampling is 5 according to the regulations.

Among them, the blurred image is obtained by the generator to obtain the restored image, the task of discrimination is to distinguish the restored image and the clear image as much as possible; and the task of the generator is to deceive the discriminator as much as possible to reduce the ability to distinguish between the two images.

The Bi-Skip-Net network is composed of three parts: contract path (D), Skip path (S) and expand path (U). The contract layer performs downsampling to achieve feature compression, the Skip layer is used to connect deep and shallow features, and the expand layer performs upsampling. D *, S *, U * are the features under the corresponding downsampling scale.

In the feature operation at the sampling scale, in the contract path, the current feature obtains deep features through 3 residual blocks (3xResBlock), and uses the residual mode of pooling and convolution to obtain the next feature. Scale features; in the Skip path, shallow features are compressed by 1x1 convolution, and depth features are compressed by 3x3 convolution; in the expand path, feature connection is achieved by concat, and features are achieved by 3x3 deconvolution Upsampling.

The present invention has the following technical effects: Since the present invention uses the Bi-Skip-Net network as the GAN (Generative Adversarial Network) generating network, compared with the prior art, it has the following technical effects:

1. Low time complexity; compared with the traditional method, the traditional de-motion blur method uses two steps of fuzzy kernel estimation and non-blind deconvolution, and these two steps require multiple iterations to achieve a good recovery effect. Because of this, it also takes a long time to process a single motion blurred image; and the model designed by the present invention can avoid the time loss caused by multiple iteration optimization.

2. Accurate texture recovery; compared with traditional methods, inaccurate fuzzy kernel estimation will cause error recovery of image information in the restoration process, and non-blind rewinding operations often cause ringing effects in the texture portion; The double-span connection network has extracted deep and shallow features at each layer scale. Through feature connection, the network can recover more detailed information to a certain extent.

3. There is no grid effect in the restored image. Compared with the existing deep learning methods, most of the existing deep learning methods are implemented by deconvolution layers during the upsampling process. Since each deconvolution has a certain aliasing effect, This causes some jaggedness in the final restored image, that is, the checker effect mentioned in the present invention.

In order to better understand the concept and principle of the present invention, the following describes the present invention in detail with reference to the accompanying drawings and embodiments. However, the description of specific embodiments does not limit the protection scope of the present invention in any way.

Brief description of the drawings

Figure 1 is a mechanism for generating an adversarial network of the present invention;

2 is a structural diagram of the Bi-Skip-Net network of the present invention;

FIG. 3 is a characteristic operation at a sampling scale of the present invention;

Figure 4 generator design: Bi-Skip-Net + residual;

Figures 5a-d are subjective comparisons between the present invention and other algorithms; where,

Figure 5a: Blurred image;

Figure 5b: The recovery effect of Nah et al;

Figure 5c: The recovery effect of Kupyn et al;

Figure 5d: Bi-Skip-Net recovery effect.

Best way to implement the invention

FIG. 1 is a mechanism for generating an adversarial network adopted by the present invention. Among them, the blurred image is obtained by the generator to obtain the restored image, the task of discrimination is to distinguish the restored image and the clear image as much as possible; and the task of the generator is to deceive the discriminator as much as possible to reduce the ability to distinguish between the two images.

The specific steps of the embodiments of the present invention are as follows:

(1) Design the generator and discriminator. The principle is shown in Figure 4. The blurred image of the building is obtained through the Bi-Skip-Net generator to obtain a clear picture of the building; any other blurred image can be generated using this model Clear picture.

(2) Use the following loss function to train the network,

among them

To fight the loss function,

For the conditional loss function, λ is the weight of the conditional loss function.

By maximizing

To optimize discriminator D;

Optimize generator G by minimizing Equation 3;

among them

The design is as follows:

Among them, L and S respectively represent the output and true value of the model at different levels, and the value of α is 1 or 2, the entire conditional loss function is regulated by the number of channels c, width w and height h.

(3) Take the trained network as the final restoration model.

As shown in FIG. 1, the method of the embodiment of the present invention adopts a generative adversarial network mechanism to achieve blurred image restoration. Figure 2 shows the structure of the Bi-Skip-Net network. Taking the network structure shown in Figure 2, a Bi-Skip-Net network is designed as the generator.

The discriminator parameters in the Bi-Skip-Net network structure diagram are shown in Table 1. Discriminator parameter table.

Table 1. Discriminator parameter table

###	层Floor	参数维度 Parameter dimension		步长Step
11	convconv	32x3x5x532x3x5x5	22
22	convconv	64x32x5x564x32x5x5	11
33	convconv	64x64x5x564x64x5x5	22
44	convconv	128x64x5x5128x64x5x5	11
55	convconv	128x128x5x5128x128x5x5	44
66	convconv	256x128x5x5256x128x5x5	11
77	convconv	256x256x5x5256x256x5x5	44
88	convconv	512x256x5x5512x256x5x5	11
99	convconv	512x512x4x4512x512x4x4	44
1010	fcfc	512x1x1x1512x1x1x1	--

As shown in FIG. 2, the Bi-Skip-Net network designed by the embodiment of the present invention is composed of three parts: a contract path (D), including D0, D1, D2, and D3; a Skip path (S), including S0, S1, S2 And S3; and the expand path (U), including U0, U1, U2, and U3. The contract layer performs downsampling to achieve feature compression, the Skip layer is used to connect deep and shallow features, and the expand layer performs upsampling. Among them, D * (D0, D1, D2, and D3), S * (S0, S1, S2, and S3), and U * (U0, U1, U2, and U3) are features under the corresponding downsampling scale.

Figure 3 is a feature operation at a sampling scale. As shown in Figure 3, in the contract path, that is, the compression path, the current feature obtains deep features through 3 residual blocks (3xResBlock), and uses pooling and volume. The residual mode of product addition is used to obtain the next-scale features; in the Skip path, that is, across the connection path, the shallow features are compressed by 1x1 convolution, and the deep features are compressed by 3x3 convolution; in the expand path, Through concate, namely concatenate, to achieve feature connection, and through 3x3 deconvolution to achieve feature upsampling.

Figure 4 Generator design: Bi-Skip-Net + residuals, as shown in Figure 4, and finally adopt Bi-Skip-Net plus residual mode as the generator.

The comparison results between the implementation of the present invention and other algorithms are shown in Table 2. The test comparison between the present invention and other algorithms on the GoPro dataset.

Table 2. Comparison between the present invention and other algorithms on GoPro dataset

Figures 5a-d are subjective comparisons between the present invention and other algorithms. Figure 5a is a blurred image, Figure 5b is the restoration effect of Nah et al., Figure 5c is the restoration effect of Kupyn et al., And Figure 5d is the restoration effect of Bi-Skip-Net of the present invention. The text "HARDWARE" in the lower left corner of the picture cannot be recognized or blurred in the other three pictures, and the invention can clearly restore and recognize. It can be seen from the subjective comparison of human beings that the present invention has obvious repairing effect on blurred images.

It should be noted that the purpose of the disclosed embodiments is to help further understand the present invention, but those skilled in the art can understand that various replacements and modifications are possible without departing from the spirit and scope of the present invention and the appended claims. of. Therefore, the present invention should not be limited to the contents disclosed in the embodiments, and the scope of protection claimed by the present invention is subject to the scope defined by the claims.

Industrial applicability

The invention is applied to the field of digital image processing, and a blurred image restoration method is realized through a Bi-Skip-Net-based image deblurring method.

Claims

An image deblurring method based on Bi-Skip-Net includes the following steps:

1) Input the blurred image, and obtain the shallow features through the convolution layer with a convolution kernel size of 7x7 and a step size of 1;

2) Pass the shallow features through 3 residual blocks to get the depth features at the current scale;

3) The deep feature is down-sampled and the residual mode is used to obtain the shallow feature at the next scale;

4) According to the specified number of downsampling times n, repeat steps 2 and 3 to obtain shallow features and depth features at different scales, and do not acquire depth features at the minimum scale;

5) Take the shallowest features of the smallest scale as basic features;

6) Pass the shallow features of the previous scale through the convolution kernel with a size of 1x1 and a step size of 1 to obtain shallow dimensionality reduction features; pass the corresponding depth features through the convolution kernel with a size of 3x3 and a step size of The convolutional layer of 2 obtains depth dimensionality reduction features and up-samples with the basic features in series; connects the up-sampled features and shallow layer dimensionality reduction features in series to obtain the basic features at the current scale;

7) Repeat step 6 until the sampling operation ends;

8) Pass the obtained basic features through a convolution layer with a convolution kernel size of 7x7 and a step size of 1 to obtain residual features;

9) Add the residual features to the input image to obtain the restored image;

10) Use Bi-Skip-Net plus residual mode as the generator.
The image deblurring method according to claim 1, characterized in that:

Step 4) According to the specified number of downsampling times is 5.
The image deblurring method according to claim 1, characterized in that:

The Bi-Skip-Net network is composed of three parts: contract path (D), Skip path (S) and expand path (U); Contract layer performs downsampling to achieve feature compression, and Skip layer is used to connect deep features and shallow For layer features, the expand layer performs upsampling; where D *, S *, and U * are features at the corresponding downsampling scale.
The image deblurring method according to claim 3, characterized in that:

In the feature operation at the sampling scale, in the contract path, the current feature obtains deep features through 3 residual blocks (3xResBlock), and uses the residual mode of pooling and convolution to obtain the next feature. Scale features; in the Skip path, shallow features are compressed by 1x1 convolution, and depth features are compressed by 3x3 convolution; in the expand path, feature connection is achieved by concat, and features are achieved by 3x3 deconvolution Upsampling.
The image deblurring method according to claim 1, characterized in that:

The generator described in step 10) is designed as follows,

①Use the following loss function to train the network,

among them
To fight the loss function,
Is the conditional loss function, λ is the weight of the conditional loss function;

By maximizing
To optimize discriminator D;

Optimize generator G by minimizing Equation 3;

among them
The design is as follows:

Among them, L and S respectively represent the output and true value of the model at different levels, and the value of α is 1 or 2, the entire conditional loss function is regulated by the number of channels c, width w and height h;

②Take the trained network as the final restoration model.