Disclosure of Invention
The invention aims to propose a super-resolution image reconstruction method based on filter fusion, so as to solve the above problems in the prior art. A further object is to propose a system implementing the above method.
The technical scheme is as follows: a super-resolution image reconstruction method based on filter fusion comprises the following steps:
step 1, constructing a training sample set for fusion use of a post-filter;
step 2, generating a training convolutional network;
step 3, fusing the preset filters into a single filter to obtain a new deployment-stage model, wherein the fused filter has the capability of extracting all the features learned in the training stage;
step 4, performing super-resolution reconstruction on the low-resolution image to be reconstructed by using the new deployment-stage model, to obtain a reconstructed high-resolution image.
In a further embodiment, step 1 further comprises: constructing a low-resolution image set for training and a high-resolution image label set for comparison of the reconstruction results.
The low-resolution training set is constructed as follows. First, an existing high-resolution image is downsampled by a factor of N, that is, bicubic interpolation with a factor of N is performed, to obtain a low-resolution image set, where N is a natural number; N is preferably 4. The low-resolution images are then expanded by rotation transformations of 90, 180 and 270 degrees to obtain low-resolution images at different angles, and overlapping sampling is performed on these images to obtain low-resolution image blocks of size n×n, generating the final training set. This addresses the practical problem that the number of low-resolution image samples is otherwise not large enough.
The high-resolution image label set is constructed as follows: the same overlapping sampling is performed on the high-resolution images that were downsampled, to obtain high-resolution image blocks of size n×n. In supervised machine learning, the image blocks sampled from the high-resolution images serve as the reference set against which the low-resolution reconstruction results are compared, so in training the high-resolution images are used as label images.
In a further embodiment, step 2 further comprises: the super-resolution reconstruction problem is solved by designing an extremely lightweight convolutional network based on filter fusion. The implementation takes a low-resolution image LR as input, extracts shallow features through a convolution layer, learns the deep features of the image through stacked CACB modules, and finally fuses the extracted shallow and deep features and obtains a high-resolution image by upsampling through sub-pixel convolution. The CACB module consists of four fusion convolution layers, and one-fourth of the features of each fusion convolution layer are retained for the final feature fusion; the structural details of the fusion convolution layers in the module are divided into a training stage and a deployment stage.
In a further embodiment, step 3 further comprises: in the training stage, the lightweight design reduces the number of trunk modules as much as possible, and the feature-extraction capability of a single k×k square filter is limited, so a multi-branch asymmetric filter is designed for more powerful feature learning. In the fusion convolution layer of the training stage, the multi-branch asymmetric filter is divided into three asymmetric filters of different sizes, constructed as k×k, 1×k and k×1; the parameters of these filters are obtained after training, where k is a positive integer.
The deployment stage carries out the filter-fusion process: the three training-stage filters are fused by weighting into a single k×k filter. The fusion convolution layer in the deployment stage is this fused filter, which has all the feature-extraction capability learned by the three training-stage filters.
A super-resolution image reconstruction system based on filter fusion comprises:
A first module for constructing a neural network training sample set. The module obtains low-resolution images by performing N-fold downsampling on high-resolution images, expands the generated low-resolution images by rotation transformations of 90, 180 and 270 degrees to obtain low-resolution images at different angles, and then performs overlapping sampling on each low-resolution image to obtain a group of overlapping low-resolution image blocks serving as the low-resolution training set. The same overlapping sampling is performed on the corresponding high-resolution images, and the resulting picture set is used as the image labels against which the low-resolution reconstruction results are compared in supervised machine learning.
A second module for establishing a training convolutional network; the module establishes a light convolution network to learn the mapping from a low-resolution image to a high-resolution image, takes a low-resolution image LR as input, extracts shallow features through a convolution layer, learns deep features of the image through a stacked CACB module, fuses the extracted shallow features and deep features, and upsamples in a sub-pixel convolution mode to obtain the high-resolution image; the CACB module consists of four fusion convolution layers, the structural details of the fusion convolution layers are divided into a training stage and a deployment stage, and one-fourth of the features of each fusion convolution layer are reserved for final feature fusion.
A third module for fusing the filters into the deployment-stage model. In the training stage, the lightweight design reduces the number of trunk modules as much as possible, and the feature-extraction capability of a single k×k square filter is limited, so a multi-branch asymmetric filter is designed for stronger feature learning. In the fusion convolution layer of the training stage, the multi-branch asymmetric filter is divided into three asymmetric filters of different sizes, constructed as k×k, 1×k and k×1; the parameters of these filters are obtained after training, where k is a positive integer.
The deployment stage carries out the filter-fusion process: the three training-stage filters are fused by weighting into a single k×k filter. The fusion convolution layer in the deployment stage is this fused filter, which has all the feature-extraction capability learned by the three training-stage filters.
A fourth module for implementing low-resolution image reconstruction. The module inputs the low-resolution image to be reconstructed into the trained lightweight convolutional network and reconstructs the high-resolution image using the new deployment-stage model obtained by filter fusion.
Beneficial effects: the invention provides a super-resolution image reconstruction method and system based on filter fusion, which construct and train a convolutional network through supervised machine learning so that a low-resolution image can be reconstructed into a high-resolution image by the deployment model obtained through filter fusion.
Detailed Description
The applicant believes that, regarding methods for reconstructing a low-resolution image into a high-resolution image, the existing high-efficiency super-resolution techniques achieve good reconstruction performance but stack too many modules, which places a heavy burden on devices such as mobile phones and leads to problems such as slow running time, large computation and an excessive number of parameters.
In order to solve the problems in the prior art and deploy a super-resolution algorithm on devices such as mobile phones, the invention provides a super-resolution image reconstruction method based on filter fusion and a system for implementing the method.
The present invention will be described in more detail with reference to the following examples and the accompanying drawings.
In this application, we propose a super-resolution image reconstruction method based on filter fusion. As shown in fig. 1, a low-resolution training set and a high-resolution label set are established, and a convolutional network is trained to learn the mapping from low-resolution images to high-resolution images, thereby realizing the reconstruction of a low-resolution image into a high-resolution image. The method specifically comprises the following steps:
step 1, constructing a training sample set to be used with the later fusion of a plurality of filters. The training sample set comprises a low-resolution image training set and a high-resolution image label set, obtained by overlapping sampling of the images into image blocks of the corresponding resolutions.
The training sample set is divided into the low-resolution training set and the high-resolution image blocks. The low-resolution training set is obtained as follows: first, the high-resolution images are downsampled N-fold to obtain the corresponding low-resolution images; then the low-resolution images are expanded; and finally, overlapping sampling is performed on each low-resolution image to obtain a group of overlapping low-resolution image blocks, which serve as the low-resolution training set.
For example, given a high-resolution image of 2K size, 4-fold bicubic-interpolation downsampling is performed to obtain the corresponding low-resolution image; rotation transformations of 90, 180 and 270 degrees then yield four differently oriented images, and overlapping sampling of these images produces a group of 64x64 low-resolution image blocks forming the training samples of the invention.
The high-resolution image blocks are obtained as follows: the high-resolution images corresponding to the 4-fold downsampling operation are subjected to the same overlapping sampling, yielding a group of corresponding overlapped high-resolution image blocks used as the high-resolution label images.
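The expansion and overlapping-sampling procedure above can be sketched in plain Python. This is an illustrative toy, not the patent's implementation: the image is a nested list, and the patch size and stride are assumed values chosen so the overlap is visible (the patent uses 64x64 blocks).

```python
def rotate90(img):
    """Rotate a 2-D image (list of rows) by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*img)][::-1]

def overlap_sample(img, patch=4, stride=2):
    """Slide a patch x patch window with the given stride; stride < patch
    makes neighbouring blocks overlap, as in the patent's sampling step."""
    h, w = len(img), len(img[0])
    blocks = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            blocks.append([row[left:left + patch]
                           for row in img[top:top + patch]])
    return blocks

# Toy 8x8 "low-resolution image"
lr = [[r * 8 + c for c in range(8)] for r in range(8)]

# Expand with the three rotations (90, 180, 270 degrees), then sample each
variants = [lr]
for _ in range(3):
    variants.append(rotate90(variants[-1]))

training_set = [b for v in variants for b in overlap_sample(v)]
print(len(training_set))  # 4 orientations x 3x3 window positions = 36 blocks
```

The same `overlap_sample` applied to the corresponding high-resolution image (with patch and stride scaled by the upscaling factor) would produce the matching label blocks.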
In addition, 100 images from DIV2K are taken as test images and downsampled 4-fold to obtain a group of low-resolution images; the corresponding high-resolution images serve as labels, forming a validation set for the machine-learning results.
Step 2, generating a training convolutional network from which a plurality of filter parameters can be obtained after training; the training convolutional network is used to learn the mapping from low-resolution images to high-resolution images.
The training convolutional network structure is shown in fig. 2, and the construction process is as follows: firstly, taking a low-resolution image LR as input, extracting shallow features through a convolution layer, then learning deep features of the image through a stacked CACB module, finally fusing the extracted shallow features and deep features, and up-sampling in a sub-pixel convolution mode to obtain a high-resolution image.
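The final upsampling step named above, sub-pixel convolution, ends with a depth-to-space rearrangement: a convolution first produces r*r feature maps, which are then interleaved into one map r times larger in each dimension. A minimal sketch of that rearrangement, with toy shapes assumed for illustration:

```python
def pixel_shuffle(feats, r):
    """Rearrange r*r feature maps of size H x W into one (H*r) x (W*r) map:
    the depth-to-space step used by sub-pixel convolution."""
    c = len(feats)
    assert c == r * r, "need exactly r*r feature maps"
    h, w = len(feats[0]), len(feats[0][0])
    out = [[0] * (w * r) for _ in range(h * r)]
    for k in range(c):
        dy, dx = k // r, k % r          # sub-pixel offset of channel k
        for y in range(h):
            for x in range(w):
                out[y * r + dy][x * r + dx] = feats[k][y][x]
    return out

# Four constant 2x2 feature maps -> one 4x4 high-resolution map (r = 2)
feats = [[[k, k], [k, k]] for k in range(4)]
hr = pixel_shuffle(feats, 2)
print(hr)  # [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 0, 1], [2, 3, 2, 3]]
```

Each 2x2 neighbourhood of the output draws one pixel from each input channel, which is why sub-pixel convolution upsamples without the zero-insertion of transposed convolution.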
The structural details of the CACB module are shown in FIG. 3. The module is composed of four fusion convolution layers, and one-fourth of the features of each fusion convolution layer are retained for the final feature fusion; the structural details of the fusion convolution layers in the module are divided into a training stage and a deployment stage.
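The channel bookkeeping described above can be sketched as follows. This is a hypothetical illustration of the retention rule only (the real layers are convolutions); the per-layer channel count of 16 is an assumed value:

```python
CHANNELS = 16  # assumed per-layer channel count, for illustration

def fake_layer_output(layer_idx):
    """Stand-in for one fusion convolution layer: CHANNELS feature maps,
    tagged so the origin of each retained channel is visible."""
    return [f"layer{layer_idx}-ch{c}" for c in range(CHANNELS)]

retained = []
for i in range(4):                          # four fusion convolution layers
    out = fake_layer_output(i)
    retained.extend(out[: CHANNELS // 4])   # keep one-fourth of each layer

# The final fusion sees the same channel count as a single layer,
# but with contributions from every depth of the module.
print(len(retained))  # 4 * (16 // 4) = 16
```

Retaining a quarter of each layer keeps the fused feature width constant while still exposing shallow and deep features to the final fusion, consistent with the lightweight aim of the design.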
The data are put into the convolutional network for training. In the verification of this example, each image block is 64x64 and each batch contains 64 pictures; momentum is set to 0.9 and weight decay to 0.0001. The initial learning rate is set to 0.0001. The maximum number of iterations in this example is 400000; optimization uses gradient descent, and iteration stops when the maximum number is reached.
Step 3, fusing the plurality of filters into a single filter to obtain a new deployment-stage model; the fused filter has the capability of extracting all the features learned in the training stage.
In the training stage, the lightweight design reduces the number of trunk modules as much as possible, and the feature-extraction capability of a single k×k square filter is limited, so a multi-branch asymmetric filter is designed for more powerful feature learning. As shown in FIG. 4, the left side shows an example of the fusion convolution layer in the training stage, in which the multi-branch asymmetric filter is divided into three asymmetric filters of different sizes, constructed as k×k, 1×k and k×1; the parameters of these filters are obtained after training, where k is a positive integer.
The deployment stage carries out the filter-fusion process: the three training-stage filters are fused by weighting into a single k×k filter. The fusion convolution layer in the deployment stage is this fused filter, which has all the feature-extraction capability learned by the three training-stage filters.
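The fusion step relies on the linearity of convolution: zero-padding the 1×k and k×1 filters to k×k and summing all three kernels yields one filter whose output equals the sum of the three branch outputs. A minimal sketch with assumed k = 3 and example weights (the patent's actual trained weights and branch weighting are not reproduced here):

```python
def fuse(square, row, col):
    """Pad a 1 x k filter (centre row) and a k x 1 filter (centre column)
    to k x k and add them to the k x k filter."""
    k = len(square)
    mid = k // 2
    fused = [list(r) for r in square]     # copy the k x k branch
    for j in range(k):
        fused[mid][j] += row[j]           # 1 x k sits on the centre row
    for i in range(k):
        fused[i][mid] += col[i]           # k x 1 sits on the centre column
    return fused

def conv2d_valid(img, kern):
    """Plain 'valid' 2-D correlation, enough to check the fusion."""
    k = len(kern)
    h, w = len(img), len(img[0])
    return [[sum(kern[i][j] * img[y + i][x + j]
                 for i in range(k) for j in range(k))
             for x in range(w - k + 1)] for y in range(h - k + 1)]

k = 3
square = [[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]
row, col = [0.5, 1.0, 0.5], [-1.0, 0.0, 1.0]
zero = [[0.0] * k for _ in range(k)]

img = [[float((y * 7 + x * 3) % 5) for x in range(6)] for y in range(6)]

# Deployment view: one convolution with the fused k x k filter
out_fused = conv2d_valid(img, fuse(square, row, col))

# Training view: three branches, summed element-wise
out_branches = [conv2d_valid(img, f) for f in (
    square,
    fuse(zero, row, [0.0] * k),   # 1 x k branch padded to k x k
    fuse(zero, [0.0] * k, col),   # k x 1 branch padded to k x k
)]
summed = [[a + b + c for a, b, c in zip(*rows)]
          for rows in zip(*out_branches)]

print(summed == out_fused)  # True
```

Because the fused model runs one convolution where training ran three, the deployed network is faster and smaller while producing the same outputs the multi-branch network learned.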
Step 4, performing super-resolution reconstruction on the low-resolution image to be reconstructed by using the new deployment-stage model, to obtain the reconstructed high-resolution image; the new deployment-stage model is the model obtained by fusing the plurality of filters after machine learning.
Table 1 below shows a quantitative evaluation of peak signal-to-noise ratio (PSNR) and lightweight indices on the DIV2K validation set for the method of the invention and the IMDN method. To test network performance, the 100 low-resolution pictures in the constructed test data were reconstructed. The experiments show that the reconstruction quality of the method of the invention is almost the same as that of IMDN, while its lightweight indices are far smaller than those of IMDN.
TABLE 1

Method        | Parameter count | Run time | Computation | PSNR
The invention | 687056          | 0.030s   | 67G         | 29.00
IMDN          | 893936          | 0.040s   | 75G         | 29.01
Based on the above method, a system for implementing it may be constructed, comprising: a first module for constructing a neural network training sample set. The module obtains low-resolution images by downsampling the high-resolution images 4-fold, expands the generated low-resolution images by rotation transformations of 90, 180 and 270 degrees to obtain low-resolution images at different angles, and then performs overlapping sampling on each low-resolution image to obtain a group of overlapping low-resolution image blocks serving as the low-resolution training set. The same overlapping sampling is performed on the corresponding high-resolution images, and the resulting picture set is used as the image labels against which the low-resolution reconstruction results are compared in supervised machine learning.
A second module for establishing a training convolutional network; the module establishes a light convolution network to learn the mapping from a low-resolution image to a high-resolution image, takes a low-resolution image LR as input, extracts shallow features through a convolution layer, learns deep features of the image through a stacked CACB module, fuses the extracted shallow features and deep features, and upsamples in a sub-pixel convolution mode to obtain the high-resolution image; the CACB module consists of four fusion convolution layers, the structural details of the fusion convolution layers are divided into a training stage and a deployment stage, and one-fourth of the features of each fusion convolution layer are reserved for final feature fusion.
A third module for fusing the filters into the deployment-stage model. In the training stage, the lightweight design reduces the number of trunk modules as much as possible, and the feature-extraction capability of a single k×k square filter is limited, so a multi-branch asymmetric filter is designed for stronger feature learning. In the fusion convolution layer of the training stage, the multi-branch asymmetric filter is divided into three asymmetric filters of different sizes, constructed as k×k, 1×k and k×1; the parameters of these filters are obtained after training.
The deployment stage carries out the filter-fusion process: the three training-stage filters are fused by weighting into a single k×k filter. The fusion convolution layer in the deployment stage is this fused filter, which has all the feature-extraction capability learned by the three training-stage filters.
A fourth module for implementing low-resolution image reconstruction. The module inputs the low-resolution image to be reconstructed into the trained lightweight convolutional network and reconstructs the high-resolution image using the new deployment-stage model obtained by filter fusion.
As described above, although the present invention has been shown and described with reference to certain preferred embodiments, it is not to be construed as limiting the invention itself. Various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.