CN115564653A - Multi-factor fusion image super-resolution method - Google Patents
Multi-factor fusion image super-resolution method
- Publication number
- CN115564653A CN115564653A CN202211209153.8A CN202211209153A CN115564653A CN 115564653 A CN115564653 A CN 115564653A CN 202211209153 A CN202211209153 A CN 202211209153A CN 115564653 A CN115564653 A CN 115564653A
- Authority
- CN
- China
- Prior art keywords
- image
- convolution
- super
- data
- resolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4007—Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
Abstract
The invention discloses a multi-factor fusion image super-resolution method comprising the following steps: 1) acquiring an image to be processed; 2) up-sampling the image with an interpolation algorithm, the operation being denoted BI; 3) up-sampling the image with a convolution operation, denoted ZI; 4) up-sampling the image with a pixel-reorganization operation, denoted MI; 5) feeding the results of the three up-sampling operations into a weighting convolution layer and outputting the final super-resolution image. The invention discloses a deep-learning network that integrates several up-sampling modes, combining common interpolation, zero-insertion convolution and pixel reorganization, so that the strengths of the different methods can be exploited to produce a better output result.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a multi-factor fusion image super-resolution method.
Background
Image super-resolution is an important research problem in computer vision and image processing, widely applied in practical scenarios such as medical image analysis, biometric recognition, video surveillance and security. Traditional interpolation-based magnification can only increase an image's resolution from its own content; it brings no additional information and suffers from side effects such as noise amplification, increased computational complexity and blurred results. With the development of deep-learning technology, deep-learning-based super-resolution methods have achieved the currently best performance and results on many benchmark tasks.
Disclosure of Invention
Purpose of the invention: in view of the prior art, a multi-factor fusion image super-resolution method is provided.
A multi-factor fusion image super-resolution method comprises the following steps:
1) acquiring an image to be processed;
2) up-sampling the image with an interpolation algorithm, the operation being denoted BI;
3) up-sampling the image with a convolution operation, the operation being denoted ZI;
4) up-sampling the image with a pixel-reorganization operation, denoted MI;
5) feeding the results of the three up-sampling operations into a weighting convolution layer and outputting the final super-resolution image.
Preferably, in step 2), the image I is up-sampled with a bicubic interpolation algorithm; the operation is denoted BI and the interpolated data BI(I).
Preferably, in step 3), the convolution network is denoted O, the operation ZI, and the output data of the convolution network O(ZI(I)).
Preferably, in step 4), 4 groups of convolutions are applied to the image I to generate 4 channels of data c1, c2, c3, c4, each the same size as the image to be processed; the output up-sampled data c is:
c(2*i,2*j)=c1(i,j)
c(2*i+1,2*j)=c2(i,j)
c(2*i+1,2*j+1)=c3(i,j)
c(2*i,2*j+1)=c4(i,j)
wherein i, j are pixel coordinates; the pixel-reorganization operation is denoted MI, and the output data O(MI(I)).
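The index pattern above is a depth-to-space (pixel-shuffle) rearrangement. As an illustrative sketch, not part of the patent, with plain nested lists standing in for single-channel image tensors:

```python
def pixel_reorganize(c1, c2, c3, c4):
    """Interleave four HxW channel maps into one 2Hx2W image,
    following the index pattern in the text (a pixel-shuffle-style
    depth-to-space rearrangement)."""
    h, w = len(c1), len(c1[0])
    c = [[0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            c[2 * i][2 * j] = c1[i][j]          # c(2i,   2j)   = c1(i, j)
            c[2 * i + 1][2 * j] = c2[i][j]      # c(2i+1, 2j)   = c2(i, j)
            c[2 * i + 1][2 * j + 1] = c3[i][j]  # c(2i+1, 2j+1) = c3(i, j)
            c[2 * i][2 * j + 1] = c4[i][j]      # c(2i,   2j+1) = c4(i, j)
    return c
```

Each output 2x2 block thus draws one pixel from each of the four convolution outputs, which is why the four maps must match the input image's size.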
Preferably, the final result SR(I) is obtained by weighting and fusing the several outputs with dynamic weights α, β, γ:
SR(I)=αO(BI(I))+βO(ZI(I))+γO(MI(I))
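The weighted fusion can be sketched with a hypothetical helper (in the patent itself, the weights are realized implicitly by the weighting convolution layer rather than as explicit scalars):

```python
def fuse(bi_out, zi_out, mi_out, alpha, beta, gamma):
    """Per-pixel weighted fusion SR = alpha*BI + beta*ZI + gamma*MI
    of the three branch outputs (nested lists stand in for images)."""
    return [[alpha * b + beta * z + gamma * m
             for b, z, m in zip(row_b, row_z, row_m)]
            for row_b, row_z, row_m in zip(bi_out, zi_out, mi_out)]
```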
Preferably, in step 5), a 1x1 weighting convolution layer is added, which computes the final result SR(I) and outputs the final super-resolution image.
Advantageous effects: the invention designs a deep-learning network that integrates several up-sampling modes, combining common interpolation, zero-insertion convolution and pixel reorganization, and can exploit the strengths of the different methods to form a better output result.
Detailed Description
The present invention is further explained below.
The method is explained on the structure of a front-end up-sampling super-resolution network, but it is applicable to any super-resolution network containing up-sampling layers.
For the up-sampling layer in the network, so that the different up-sampling modes below all yield images of the same resolution, interpolation that doubles both the length and the width of the image (2x up-scaling) is adopted. The interpolated image is computed with several interpolation strategies.
Nearest-neighbor interpolation, bilinear interpolation, bicubic interpolation over a 4x4 pixel neighborhood and bicubic interpolation over an 8x8 pixel neighborhood are applied respectively, yielding 4 groups of up-sampled images; with three RGB channels each, this gives a data block of 12 channels in total. Then 64 convolution filters of size 3x3x12 are applied to obtain 64 channels of data, followed by a 1x1x64 convolution filter, giving an image up-sampling result that integrates several interpolation methods. The operation is denoted BI, the image I, the interpolated data BI(I), the subsequent convolution network O1, and the output data O1(BI(I)). The loss function uses pixel-difference loss to optimize the network model.
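A minimal sketch of building the multi-interpolation stack, under stated simplifications: only nearest-neighbor and a crude averaging stand-in for bilinear are shown rather than the four interpolators named above, and the 3x3x12 and 1x1x64 convolutions that follow are omitted:

```python
def upsample_nearest_2x(img):
    """2x nearest-neighbor upsampling of one channel (list of lists)."""
    return [[img[i // 2][j // 2] for j in range(2 * len(img[0]))]
            for i in range(2 * len(img))]

def upsample_avg_2x(img):
    """Crude bilinear-like 2x upsampling: average over a 2x2 source
    neighborhood, clamped at the image borders."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(2 * h):
        row = []
        for j in range(2 * w):
            y, x = i // 2, j // 2
            y2 = min(y + i % 2, h - 1)  # clamp lower neighbor at border
            x2 = min(x + j % 2, w - 1)  # clamp right neighbor at border
            row.append((img[y][x] + img[y][x2] + img[y2][x] + img[y2][x2]) / 4.0)
        out.append(row)
    return out

def multi_interp_stack(rgb):
    """Apply every interpolator to every channel and stack the results
    into one multi-channel block (2 interpolators x 3 channels = 6
    channels here; the patent uses 4 interpolators, i.e. 12 channels)."""
    interpolators = [upsample_nearest_2x, upsample_avg_2x]
    return [f(ch) for f in interpolators for ch in rgb]
```

In a real implementation the bicubic variants would come from an image library; the point of the sketch is only the channel-stacking structure fed to the 3x3x12 convolution.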
Zero-insertion convolution likewise doubles the length and width of the image: after the original image is enlarged by zero insertion, it passes through 64 3x3 convolution filters and a 1x1x64 channel-transformation filter to output the up-sampling result. The operation is denoted ZI; the same style of subsequent network is used, and the output data is denoted O2(ZI(I)). The loss function uses pixel-difference loss to optimize the network model.
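The zero-insertion step itself can be sketched as follows (illustration only; the 64 3x3 filters that fill in the inserted holes are omitted):

```python
def zero_insert_2x(img):
    """Double height and width by inserting zeros between pixels; the
    convolution layers that follow learn to fill in the inserted holes
    (the classic transposed-convolution view of upsampling)."""
    h, w = len(img), len(img[0])
    out = [[0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            out[2 * i][2 * j] = img[i][j]  # originals land on even coords
    return out
```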
For pixel-reorganization interpolation, 4 groups of convolutions, each comprising 64 3x3 convolution filters and a 1x1x64 channel-transformation filter, are applied to the image I to generate 4 channels of data c1, c2, c3, c4, each the same size as the original image. The output up-sampled data c is:
c(2*i,2*j)=c1(i,j)
c(2*i+1,2*j)=c2(i,j)
c(2*i+1,2*j+1)=c3(i,j)
c(2*i,2*j+1)=c4(i,j)
where i, j are pixel coordinates. The pixel-rearrangement operation is denoted MI and the output data O3(MI(I)). The loss function uses pixel-difference loss to optimize the network model.
For the network O, the three super-resolution branches above share the same parameter structure, but their parameters differ because each is optimized separately. To improve operating efficiency, the three groups of network parameters are fused and then optimized again: O = (O1 + O2 + O3)/3, i.e. the parameters of O are the averages of the parameters of O1, O2 and O3. The outputs of the new network are denoted O(BI(I)), O(ZI(I)) and O(MI(I)): three groups of super-resolved RGB data.
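The parameter-averaging step O = (O1 + O2 + O3)/3 can be sketched as follows (assumption: each network's parameters are held as a dict mapping layer names to flat lists of weights; real frameworks use tensors, but the arithmetic is the same):

```python
def average_params(p1, p2, p3):
    """Element-wise average of three networks' parameters, producing
    the shared network O = (O1 + O2 + O3) / 3 used for re-optimization."""
    return {name: [(a + b + c) / 3.0
                   for a, b, c in zip(p1[name], p2[name], p3[name])]
            for name in p1}
```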
Because the different up-sampling means recover details differently, each with its own strengths, the outputs of the different up-samplings are combined: 64 convolution filters of size 3x3x9 are appended to obtain 64 channels of data, followed by a 1x1x64 convolution filter, giving an image up-sampling result that integrates the three super-resolution methods. The loss function uses pixel-difference loss to optimize the entire network model.
The last 1x1x64 convolution filter is equivalent to a weighted-fusion calculation over the three differently up-sampled outputs using dynamic weights. This yields a final super-resolution image whose overall quality exceeds that of any single up-sampling mode. The loss function can be the traditional pixel-difference loss, and the network optimization method is likewise unchanged.
The method is also applicable to U-Net and its variant networks, such as the DRN network: at the key up-sampling layer, the single up-sampling operation is replaced by the multiple up-sampling operations, and the weighted-fusion operation is added at the final output.
The foregoing is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the invention, and such modifications and refinements should also be regarded as within the protection scope of the invention.
Claims (6)
1. A multi-factor fusion image super-resolution method, characterized by comprising the following steps:
1) acquiring an image to be processed;
2) up-sampling the image with an interpolation algorithm, the operation being denoted BI;
3) up-sampling the image with a zero-insertion convolution operation, the operation being denoted ZI;
4) up-sampling the image with a pixel-reorganization operation, denoted MI;
5) feeding the results of the three up-sampling operations into a weighting convolution layer and outputting the final super-resolution image.
2. The multi-factor fusion image super-resolution method according to claim 1, wherein in step 2) the image to be processed is up-sampled with a nearest-neighbor interpolation algorithm, a bilinear interpolation algorithm, a bicubic interpolation algorithm over a 4x4 pixel neighborhood and a bicubic interpolation algorithm over an 8x8 pixel neighborhood, respectively, to obtain 4 groups of up-sampled images; each group contains three RGB channels, giving a data block of 12 channels in total, which is followed by 64 convolution filters of size 3x3x12 to obtain a 64-channel data block; an image up-sampling result integrating several interpolation methods is then obtained through a 1x1x64 convolution filter; the operation is denoted BI, the interpolated data BI(I), the convolution network O1, and the output data O1(BI(I)).
3. The multi-factor fusion image super-resolution method according to claim 2, wherein in step 3), after the original image is enlarged by zero insertion, the up-sampled result is output through 64 3x3 convolution filters and a 1x1x64 channel-transformation filter; the operation is denoted ZI, the convolution network O2, and the output data O2(ZI(I)).
4. The multi-factor fusion image super-resolution method according to claim 3, wherein in step 4), 4 groups of convolutions are applied to the image I to generate 4 channels of data c1, c2, c3, c4, each the same size as the image to be processed, and the output up-sampled data c is:
c(2*i,2*j)=c1(i,j)
c(2*i+1,2*j)=c2(i,j)
c(2*i+1,2*j+1)=c3(i,j)
c(2*i,2*j+1)=c4(i,j)
wherein i, j are pixel coordinates; the pixel-reorganization operation is denoted MI and the output data O3(MI(I)).
5. The multi-factor fused image super resolution method according to claim 4, wherein the final output result SR (I) is obtained by weighting and fusing a plurality of results by dynamic weights α, β, γ:
SR(I) = αO1(BI(I)) + βO2(ZI(I)) + γO3(MI(I)).
6. The multi-factor fusion image super-resolution method according to claim 5, wherein in step 5), in order to combine the results of the different up-samplings, the three groups of up-sampled outputs are followed by 64 convolution filters of size 3x3x9 to obtain 64 channels of data; an image up-sampling result integrating the three super-resolution methods is then obtained through a 1x1x64 convolution filter.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211209153.8A CN115564653A (en) | 2022-09-30 | 2022-09-30 | Multi-factor fusion image super-resolution method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115564653A true CN115564653A (en) | 2023-01-03 |
Family
ID=84743884
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211209153.8A Pending CN115564653A (en) | 2022-09-30 | 2022-09-30 | Multi-factor fusion image super-resolution method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115564653A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150104116A1 (en) * | 2012-03-05 | 2015-04-16 | Thomson Licensing | Method and apparatus for performing super-resolution |
CN110942424A (en) * | 2019-11-07 | 2020-03-31 | 昆明理工大学 | Composite network single image super-resolution reconstruction method based on deep learning |
CN111161150A (en) * | 2019-12-30 | 2020-05-15 | 北京工业大学 | Image super-resolution reconstruction method based on multi-scale attention cascade network |
CN111951167A (en) * | 2020-08-25 | 2020-11-17 | 深圳思谋信息科技有限公司 | Super-resolution image reconstruction method, super-resolution image reconstruction device, computer equipment and storage medium |
CN113822803A (en) * | 2021-07-22 | 2021-12-21 | 腾讯科技(深圳)有限公司 | Image super-resolution processing method, device, equipment and computer readable storage medium |
CN114219738A (en) * | 2021-12-30 | 2022-03-22 | 中国电建集团中南勘测设计研究院有限公司 | Single-image multi-scale super-resolution reconstruction network structure and method |
WO2022057868A1 (en) * | 2020-09-21 | 2022-03-24 | 华为技术有限公司 | Image super-resolution method and electronic device |
CN114943643A (en) * | 2022-04-12 | 2022-08-26 | 浙江大华技术股份有限公司 | Image reconstruction method, image coding and decoding method and related equipment |
CN114998101A (en) * | 2022-05-26 | 2022-09-02 | 电子科技大学 | Satellite image super-resolution method based on deep learning |
Non-Patent Citations (1)
Title |
---|
HAN Liang; LI Chanfei; PU Xiujuan: "Fusing infrared and visible images via image segmentation and the stationary wavelet transform", Journal of Chongqing University, no. 06 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111161150B (en) | Image super-resolution reconstruction method based on multi-scale attention cascade network | |
CN108765296B (en) | Image super-resolution reconstruction method based on recursive residual attention network | |
CN107123089B (en) | Remote sensing image super-resolution reconstruction method and system based on depth convolution network | |
CN111028150B (en) | Rapid space-time residual attention video super-resolution reconstruction method | |
CN109903223B (en) | Image super-resolution method based on dense connection network and generation type countermeasure network | |
CN108537733A (en) | Super resolution ratio reconstruction method based on multipath depth convolutional neural networks | |
CN109934771B (en) | Unsupervised remote sensing image super-resolution reconstruction method based on recurrent neural network | |
CN112308803B (en) | Self-supervision low-illumination image enhancement and denoising method based on deep learning | |
CN110889895A (en) | Face video super-resolution reconstruction method fusing single-frame reconstruction network | |
CN112508794B (en) | Medical image super-resolution reconstruction method and system | |
CN112017116A (en) | Image super-resolution reconstruction network based on asymmetric convolution and construction method thereof | |
CN115358929B (en) | Compressed image super-resolution method, image compression method and system | |
CN117372564B (en) | Method, system and storage medium for reconstructing multispectral image | |
Guo et al. | Joint demosaicking and denoising benefits from a two-stage training strategy | |
JP2014042176A (en) | Image processing device and method, program and solid image pickup device | |
CN104853059A (en) | Super-resolution image processing method and device | |
CN115564653A (en) | Multi-factor fusion image super-resolution method | |
CN116823662A (en) | Image denoising and deblurring method fused with original features | |
Chung et al. | New joint demosaicing and arbitrary-ratio resizing algorithm for color filter array based on DCT approach | |
CN115760638A (en) | End-to-end deblurring super-resolution method based on deep learning | |
Saito et al. | Demosaicing approach based on extended color total-variation regularization | |
Wang et al. | Multi-scale detail enhancement network for image super-resolution | |
KR20070119482A (en) | Image resampling method | |
CN111242087A (en) | Object recognition method and device | |
CN115937018B (en) | Method for restoring image by using multichannel feedback network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20230103 |