Disclosure of Invention
The invention provides a super-resolution image reconstruction method, a super-resolution image reconstruction device, an electronic device, and a storage medium based on wide activation. The invention overcomes two problems of the prior art: traditional interpolation algorithms are prone to producing blur and artifacts, and the very large computation and storage-resource consumption of neural network methods hinders practical deployment of such algorithms. By combining the strong nonlinear expression capability of deep learning with the efficient execution speed of interpolation, an efficient SR reconstruction model with accurate reconstruction results is constructed, thereby improving the balance between the efficiency and the effect of image SR reconstruction and promoting the practical application and deployment of the related technology.
Specifically, the embodiment of the invention provides the following technical scheme:
in a first aspect, an embodiment of the present invention provides a super-resolution image reconstruction method based on wide activation, including:
acquiring a low-resolution LR image to be reconstructed;
inputting the LR image to be reconstructed into a trained self-adaptive resampling super-resolution image reconstruction model based on a wide-activation residual structure to obtain a high-resolution HR image output by the self-adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure;
the adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure is used for reconstructing the LR image into the corresponding HR image at least based on an interpolation algorithm and the wide-activation residual structure.
Further, the wide-activation-based super-resolution image reconstruction method further comprises the following steps:
the self-adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure comprises a self-adaptive interpolation kernel estimation layer and a self-adaptive resampling layer;
the adaptive interpolation kernel estimation layer is used for generating an adaptive interpolation kernel for each spatial position of the HR image, to be used by the subsequent adaptive resampling layer for fine adjustment of the image;
the adaptive resampling layer is configured to adaptively apply the adaptive interpolation kernel to a corresponding location of the LR image to reconstruct the HR image after generating the adaptive interpolation kernel.
Further, the wide-activation-based super-resolution image reconstruction method further comprises the following steps:
the method further comprises the following steps:
based on the training sample images, training an adaptive resampling super-resolution image reconstruction model based on a wide-activation residual structure to optimize the model parameters.
Further, the wide-activation-based super-resolution image reconstruction method further comprises the following steps:
the self-adaptive interpolation kernel estimation layer further comprises a shallow feature extraction layer, a plurality of stacked wide activation residual modules, a deep feature extraction layer and a Pixel Shuffle upsampling layer,
wherein the shallow feature extraction layer, the plurality of stacked wide activation residual modules, and the deep feature extraction layer constitute the wide activation residual structure and are used to perform nonlinear reasoning of the adaptive interpolation kernel estimation layer to output a feature map of the LR image.
Further, the wide-activation-based super-resolution image reconstruction method further comprises the following steps:
the adaptive resampling layer is configured to, after generating the adaptive interpolation kernel, adaptively apply the adaptive interpolation kernel to a corresponding location of the LR image to reconstruct the HR image, including:
scaling the input LR image to the desired size of the HR image based on the interpolation algorithm to obtain X_int, where X_int represents a temporary image of the same size as the target HR image, obtained directly from the LR image by interpolation; and
applying the weights of the adaptive interpolation kernel, for each channel of the input LR image, to the image-block positions of X_int spaced at interval s, where s is 2, 3 or 4.
Further, the wide-activation-based super-resolution image reconstruction method further comprises the following steps:
wherein each of the plurality of stacked wide activation residual modules comprises two identical dilated convolution layers with a dilation rate r > 1, with a ReLU activation function employed between the two dilated convolution layers,
and the number of channels of the output feature map is increased in the first convolution layer before being input into the ReLU activation function, and reduced in the second convolution layer.
Further, the wide-activation-based super-resolution image reconstruction method further comprises the following steps:
the Pixel Shuffle upsampling layer employs a sub-Pixel convolution network and is used to spatially correspond the generated interpolation kernel to the HR image.
In a second aspect, an embodiment of the present invention further provides a wide-activation-based super-resolution image reconstruction apparatus, including:
an image acquisition unit for acquiring a low-resolution LR image to be reconstructed;
the reconstruction unit is used for inputting the low-resolution LR image to be reconstructed into a trained self-adaptive resampling super-resolution image reconstruction model based on a wide-activation residual structure to obtain a high-resolution HR image output by the self-adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure; and
the adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure is used for reconstructing the LR image into the corresponding HR image at least based on an interpolation algorithm and the wide-activation residual structure.
In a third aspect, an embodiment of the present invention further provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the steps of the wide-activation-based super-resolution image reconstruction method as described above.
In a fourth aspect, an embodiment of the present invention further provides a storage medium including a computer program stored thereon, wherein the computer program is configured to, when executed by a processor, implement the steps of the method for wide-activation-based super-resolution image reconstruction as described above.
According to the technical solutions above, the super-resolution image reconstruction method, device, electronic device, and storage medium based on wide activation provided by the embodiments of the invention overcome the problems that traditional interpolation algorithms are prone to producing blur and artifacts and that the very large computation and storage-resource consumption of neural network methods hinders practical deployment of such algorithms. By combining the strong nonlinear expression capability of deep learning with the efficient execution speed of interpolation, an efficient SR reconstruction model with accurate reconstruction results is constructed, thereby improving the balance between the efficiency and the effect of image SR reconstruction and promoting the practical application and deployment of the related technology.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The various terms or phrases used herein have the ordinary meaning as is known to those skilled in the art, and even then, it is intended that the present invention not be limited to the specific terms or phrases set forth herein. To the extent that the terms and phrases referred to herein have a meaning inconsistent with the known meaning, the meaning ascribed to the present invention controls; and have the meaning commonly understood by a person of ordinary skill in the art if not defined herein.
Although deep learning methods have greatly advanced image SR reconstruction in recent years, the strategy of improving reconstruction performance by enlarging the neural network has reached a bottleneck. Meanwhile, most applications require the SR reconstruction method to run within a certain time budget, and the computation and storage-resource consumption of large-scale neural networks hinders the practical deployment of such algorithms to some extent. On the other hand, conventional interpolation algorithms usually assume that the image signal is a continuous, band-limited signal, and their reconstruction results are prone to blurring, artifacts, and the like.
In view of the above, in a first aspect, an embodiment of the present invention provides a super-resolution image reconstruction method based on wide activation. The method overcomes the problems that traditional interpolation algorithms are prone to producing blur and artifacts and that the huge computation and storage-resource consumption of neural network methods hinders practical deployment, and it combines the powerful nonlinear expression capability of deep learning with the efficient execution speed of interpolation to construct an efficient SR reconstruction model with accurate reconstruction results, thereby improving the balance between the efficiency and the effect of image SR reconstruction and promoting the practical application and deployment of the related technologies.
The wide-activation based super-resolution image reconstruction method of the present invention is described below with reference to fig. 1.
Fig. 1 is a flowchart of a super-resolution image reconstruction method based on wide activation according to an embodiment of the present invention.
In this embodiment, it should be noted that the super-resolution image reconstruction method based on wide activation may include the following steps:
s1: acquiring a low-resolution LR image to be reconstructed;
s2: inputting a low-resolution LR image to be reconstructed into a trained self-adaptive resampling super-resolution image reconstruction model based on a wide-activation residual structure to obtain a high-resolution HR image output by the self-adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure;
the adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure is used for reconstructing an LR image into a corresponding HR image at least based on an interpolation algorithm and the wide-activation residual structure.
An adaptive resampling super-resolution image reconstruction model based on a wide-activation residual structure provided by an embodiment of the present invention is described below with reference to fig. 2.
Fig. 2 is a schematic structural diagram of an adaptive resampling super-resolution image reconstruction model based on a wide-activation residual structure according to an embodiment of the present invention.
In this embodiment, it should be noted that the method for reconstructing a super-resolution image based on wide activation may further include: the adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure comprises an adaptive interpolation kernel estimation layer and an adaptive resampling layer; the adaptive interpolation kernel estimation layer is used for generating an adaptive interpolation kernel for each spatial position of the HR image, to be used by the subsequent adaptive resampling layer for fine adjustment of the image; the adaptive resampling layer is used for adaptively applying the adaptive interpolation kernel, after it has been generated, to the corresponding positions of the LR image to reconstruct the HR image.
Specifically, the entire model consists of two parts, a first part that estimates the adaptive interpolation kernel and a second part that applies the estimated interpolation kernel to adjust the upsampled image.
In this embodiment, it should be noted that the method for reconstructing a super-resolution image based on wide activation may further include: the adaptive interpolation kernel estimation layer further comprises a shallow feature extraction layer, a plurality of stacked wide activation residual modules, a deep feature extraction layer, and a Pixel Shuffle upsampling layer, wherein the shallow feature extraction layer, the plurality of stacked wide activation residual modules, and the deep feature extraction layer form the wide activation residual structure and are used to perform the nonlinear reasoning of the adaptive interpolation kernel estimation layer to output the feature maps of the LR image.
Specifically, in the interpolation kernel estimation section, a content-based interpolation kernel is calculated for each position in the image using a data-driven method. For example, a Fully Convolutional Network (FCN) is used to calculate the weight values of the interpolation kernel; it comprises a shallow feature extraction layer (3 × 3 convolution), a set of stacked Wide Activation Residual Blocks (WARB), a deep feature extraction layer (3 × 3 convolution), and a Pixel Shuffle upsampling layer.
Specifically, before the upsampling layer, the network mainly performs the nonlinear reasoning of the adaptive interpolation kernel estimation, whose output is a set of feature maps x_L ∈ R^(h×w×(s²k²)) in the LR image space, where h and w represent the height and width of the feature map, respectively, k is the spatial size of the interpolation kernel (assuming the interpolation kernel is square), and s is the upsampling factor. In the spatial dimensions, x_L has the same resolution as the LR image. For the estimated interpolation kernels to correspond to the HR image, they need to be upsampled (implemented by the Pixel Shuffle layer). The upsampled interpolation kernels are denoted x_H ∈ R^((sh)×(sw)×k²), which is consistent with the HR image in the spatial dimensions. Each spatial position of x_H corresponds to a k²-dimensional vector, which can be reorganized into a square k × k interpolation kernel. On this basis, the adaptive interpolation kernel estimation process generates an adaptive interpolation kernel for each spatial position of the HR image, to be used for fine adjustment of the image in the subsequent stage.
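The Pixel Shuffle rearrangement of the kernel maps described above, from LR space x_L of shape (h, w, s²k²) to HR space x_H of shape (sh, sw, k²), can be sketched in NumPy. This is a minimal illustrative sketch; the function name `pixel_shuffle_kernels` and the channel ordering are assumptions, not taken from the original implementation.

```python
import numpy as np

def pixel_shuffle_kernels(x_l, s, k):
    """Rearrange x_L of shape (h, w, s*s*k*k) into x_H of shape (s*h, s*w, k*k).

    Each LR position holds s*s interpolation kernels; pixel shuffle spreads
    them over the corresponding s x s block of HR positions.
    """
    h, w, c = x_l.shape
    assert c == s * s * k * k
    # (h, w, s, s, k*k) -> (h, s, w, s, k*k) -> (s*h, s*w, k*k)
    x = x_l.reshape(h, w, s, s, k * k)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(h * s, w * s, k * k)
```

With s = 2 and k = 3, a 2 × 2 × 36 tensor becomes a 4 × 4 × 9 tensor: every HR position now carries exactly one flattened 3 × 3 kernel.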
The internal structure of the Wide Activation Residual Block (WARB) provided by an embodiment of the present invention is described below in conjunction with fig. 3.
Fig. 3 is a schematic diagram of an internal structure of a Wide Activation Residual Block (WARB) according to an embodiment of the present invention.
Specifically, in the adaptive interpolation kernel reasoning stage, the network comprises a shallow feature extraction layer, a group of stacked wide activation residual modules (WARB), a deep feature extraction layer, and an upsampling layer. The shallow and deep feature extraction layers both adopt 3 × 3 convolutions with a dilation rate r of 2; each WARB comprises two identical 3 × 3 convolution layers with a ReLU activation function between them; 4 WARB modules are used in total, and all convolutions are dense (stride-1) convolutions. The number of output channels of the whole model is 32, and the wide activation rate inside the WARB is set to 4, i.e., the wide-activation output channel dimension is 32 × 4 = 128. The Pixel Shuffle layer is implemented with an Efficient Sub-Pixel Convolutional Neural Network (ESPCNN). The model parameters are initialized by the Xavier method, and the mini-batch size is set to 16. In model training, the LR training image blocks have a fixed size of 48 × 48, and the HR image blocks of the corresponding scales have sizes of 96 × 96 (s = 2), 144 × 144 (s = 3), and 192 × 192 (s = 4), respectively. The optimizer is the Adam method, with β1 = 0.9, β2 = 0.999, ε = 1e-8. The learning rate starts at 1e-4 and is halved every 100,000 iterations, for a total of 500,000 iterations.
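The widen–ReLU–narrow structure of the WARB with dilated 3 × 3 convolutions can be sketched in plain NumPy. This is a minimal single-example sketch under stated assumptions: the helper names are illustrative, the convolution is a naive loop implementation, and only the channel widths (32 → 128 → 32) and dilation rate r = 2 are taken from the text.

```python
import numpy as np

def dilated_conv3x3(x, w, r):
    """'Same'-padded 3x3 convolution with dilation rate r.

    x: (H, W, Cin) feature map; w: (3, 3, Cin, Cout) kernel weights.
    """
    H, W, _ = x.shape
    xp = np.pad(x, ((r, r), (r, r), (0, 0)))
    y = np.zeros((H, W, w.shape[-1]))
    for i in range(3):
        for j in range(3):
            # shifted view times the per-tap weight matrix, summed over taps
            y += xp[i * r:i * r + H, j * r:j * r + W, :] @ w[i, j]
    return y

def warb(x, w_widen, w_narrow, r=2):
    """Wide-activation residual block: widen channels, ReLU, narrow, add skip."""
    h = np.maximum(dilated_conv3x3(x, w_widen, r), 0.0)   # 32 -> 128, wide ReLU
    return x + dilated_conv3x3(h, w_narrow, r)            # 128 -> 32, residual add
```

Because the block ends with a residual addition, its output has the same shape as its input, so several WARBs can be stacked directly, as in the described model.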
In this embodiment, it should be noted that the method for reconstructing a super-resolution image based on wide activation may further include: the Pixel Shuffle upsampling layer employs a sub-Pixel convolution network and is used to spatially correspond the generated interpolation kernel to the HR image.
In this embodiment, it should be noted that the method for reconstructing a super-resolution image based on wide activation may further include: the adaptive resampling layer is configured to, after generating the adaptive interpolation kernel, adaptively apply the adaptive interpolation kernel to the corresponding locations of the LR image to reconstruct the HR image, including: scaling the input LR image to the desired size of the HR image based on the interpolation algorithm to obtain X_int, where X_int represents a temporary image of the same size as the target HR image, obtained directly from the LR image by interpolation; and applying the weights of the adaptive interpolation kernel, for each channel of the input LR image, to the image-block positions of X_int spaced at interval s, where s is 2, 3 or 4.
Specifically, after the interpolation kernel weight parameters are estimated, they are adaptively applied to the corresponding positions of the LR input image to reconstruct the HR image. Neighboring pixels of the HR image can be resampled from the same set of pixels of the LR image, but the final intensity values are different because each pixel in the HR image space has a different respective interpolation kernel.
More specifically, the LR input image is scaled to the desired size of the HR image by using an interpolation algorithm (common polynomial interpolation techniques such as nearest-neighbor (NN), bilinear, or bicubic interpolation may be selected) to obtain X_int. At this point, X_int has the same spatial dimensions as the adaptive interpolation kernels x_H and the HR image, which facilitates the subsequent adaptive resampling operation.
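For instance, the nearest-neighbor variant of this pre-scaling step has a one-line NumPy sketch (single-channel; the name `nn_upscale` is illustrative):

```python
import numpy as np

def nn_upscale(lr, s):
    """Nearest-neighbor interpolation of a single-channel LR image by factor s,
    producing a temporary image with the target HR size (the role of X_int)."""
    return np.repeat(np.repeat(lr, s, axis=0), s, axis=1)
```

Bilinear or bicubic interpolation would replace the pixel replication with weighted neighbor averages, but the output size contract is the same.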
In this embodiment, each spatial position of x_H corresponds to a k × k interpolation kernel weight, and the adaptive resampling operation directly applies this interpolation kernel weight to X_int, as in the following formula (1):

y(i, j) = Σ_{p=0}^{k−1} Σ_{q=0}^{k−1} x_H(i, j, p·k + q) · X_int(i + s·(p − ⌊k/2⌋), j + s·(q − ⌊k/2⌋))    (1)

where y is the resampled HR image. That is, for each channel of the input image, a k × k interpolation kernel weight is applied to the image-block positions of X_int spaced at interval s, which is analogous to how a dilated convolution samples at positions spaced by the dilation rate r.
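The resampling described above (a per-position k × k kernel applied to X_int at positions spaced s apart) admits a direct, if slow, NumPy sketch. The edge padding, the kernel centering convention, and the function name are assumptions made for this sketch:

```python
import numpy as np

def adaptive_resample(x_int, kernels, s):
    """Apply a per-position k x k kernel to X_int at positions spaced s apart.

    x_int: (sH, sW) single-channel upscaled image; kernels: (sH, sW, k, k).
    """
    sH, sW, k, _ = kernels.shape
    pad = (k // 2) * s
    xp = np.pad(x_int, pad, mode="edge")   # edge padding is an assumption
    y = np.empty((sH, sW))
    for i in range(sH):
        for j in range(sW):
            # k x k samples centered at (i, j), stride s between taps
            patch = xp[i:i + (k - 1) * s + 1:s, j:j + (k - 1) * s + 1:s]
            y[i, j] = float(np.sum(patch * kernels[i, j]))
    return y
```

A sanity check on the sketch: when every kernel is the identity (1 at the center tap, 0 elsewhere), the operation must return X_int unchanged.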
In this embodiment, it should be noted that the method for reconstructing a super-resolution image based on wide activation may further include: the multiple stacked wide activation residual modules comprise two layers of same expansion convolutions with expansion rate r >1, ReLU activation functions are adopted between the two layers of expansion convolutions, the number of channels of output feature mapping is increased during the first layer of convolution and then input into the ReLU activation functions, and the number of channels of output feature mapping is reduced during the second layer of convolution.
Although deep learning models become more expressive as network depth increases, for low-level vision tasks the increase in depth may also lead to underutilization of shallow features. The traditional remedy is to introduce a skip connection or a concatenation operation to pass shallow features directly to deeper positions in the network. Another common approach for promoting the full utilization of shallow features is wide activation: the number of channels of the output feature map is increased in the first convolution layer before being input into the ReLU activation function, and reduced in the second convolution layer. In this way, the parameter size and the computation of the whole module remain unchanged, but more shallow-feature information flows through the ReLU activation function. The skip connection facilitates network training but makes the network resemble an ensemble of multiple shallow networks; the concatenation operation avoids this problem but does not aid network training as effectively.
In order to make an effective compromise between the two, the present invention employs the wide activation residual structure shown in fig. 3. In addition, in order to enlarge the model's receptive field, the ordinary 3 × 3 convolutions in fig. 2 may be replaced by dilated convolutions with a dilation rate r > 1; the experimental results verify the effectiveness of the proposed WARB module in estimating the adaptive interpolation kernel and improving the model's reconstruction performance.
The proposed model is a typical end-to-end mapping from the LR image to the HR image. The model parameters are estimated by minimizing the reconstruction error between the model output HR image F(x) and the true HR image y, where F(·) represents the mapping function of the entire network. In the field of image restoration, the L2 objective function has traditionally been more popular, because minimizing it directly maximizes the PSNR metric. However, recent studies have shown that the L1 objective function is more favorable for model convergence, so the L1 objective function is adopted here to train the model. Given a training data set D = {(x_i, y_i)}, i = 1, …, |D|, where |D| represents the total number of samples, the L1 objective function is expressed as shown in the following equation (2):

L(θ) = (1/|D|) Σ_{i=1}^{|D|} ‖F(x_i; θ) − y_i‖₁    (2)

where θ represents the parameter set of the entire network. It is noted that although the L1 objective function is not differentiable at 0, the model is trained using batch data; the probability that the error between the model output and the HR image is exactly 0 over a batch is small, so this has little effect on actual training.
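The batch L1 objective of equation (2) translates directly to NumPy; the per-sample L1 norm is taken here as the sum of absolute pixel differences, and the function name is illustrative:

```python
import numpy as np

def l1_objective(pred_batch, target_batch):
    """Mean L1 reconstruction error over a batch, as in equation (2):
    (1/|D|) * sum_i ||F(x_i) - y_i||_1."""
    diffs = np.abs(pred_batch - target_batch)              # (N, H, W, ...) residuals
    per_sample = diffs.reshape(diffs.shape[0], -1).sum(axis=1)
    return float(per_sample.mean())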
In this embodiment, it should be noted that the method for reconstructing a super-resolution image based on wide activation may further include: and training a self-adaptive resampling super-resolution image reconstruction model based on the wide-activation residual error structure based on the training sample image so as to optimize the model parameters.
Specifically, the model provided by the embodiment of the present invention is referred to as the Wide Activation Residual Resampling Network (WARRN), i.e., a model for super-resolution image reconstruction based on the wide-activation residual structure and adaptive resampling. The training process for the entire model is given below:
Input: training set D = {(x_i, y_i)}; model parameters θ;
Output: optimized model parameters θ;
Initialization: initialize the model parameters by the Xavier method; T = 5 × 10^5, lr = 1e-4, bSize = 16;
While t < T:
randomly extract bSize 48 × 48 small image blocks from the LR images;
extract the corresponding HR image blocks from the corresponding positions of the HR images;
input one batch of data to the model and calculate the model output;
calculate the model loss function according to equation (2), and execute the back-propagation algorithm to update the model parameters θ;
End
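The LR/HR patch-extraction step of the loop above can be sketched as follows. This is an illustrative helper under stated assumptions: uniform random top-left sampling and the function name are not from the original pipeline; only the 48 × 48 LR block size and the s-times-larger aligned HR block follow the text.

```python
import numpy as np

def sample_patch_pair(lr, hr, psize, s, rng):
    """Cut a psize x psize LR block and the aligned (s*psize) x (s*psize) HR block."""
    i = int(rng.integers(0, lr.shape[0] - psize + 1))
    j = int(rng.integers(0, lr.shape[1] - psize + 1))
    lr_patch = lr[i:i + psize, j:j + psize]
    # HR coordinates are the LR coordinates scaled by the upsampling factor s
    hr_patch = hr[s * i:s * (i + psize), s * j:s * (j + psize)]
    return lr_patch, hr_patch
```

For s = 2 this yields the 48 × 48 / 96 × 96 pairs described above; the key invariant is that the two blocks cover the same image region.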
The WARRN model provided by the embodiment of the invention is implemented in TensorFlow 1.11.0 and trained for a total of 500,000 iterations on one NVIDIA GeForce GTX 1080Ti GPU.
More specifically, embodiments of the present invention use the standard DIV2K training set, which contains 800 high-quality training images, with image degradation performed by typical bicubic downsampling. During model training, the LR image block size is fixed at 48 × 48, and HR image blocks of the corresponding sizes are sampled at the corresponding positions of the HR images. Model performance is evaluated on five common test data sets: Set5, Set14, B100, Urban100, and Manga109. These test data sets contain rich image content covering most natural scenes in daily life, such as people, animals, buildings, and natural landscapes, and can effectively evaluate the generalization performance of the model on different types of images.
As shown in fig. 4, when performing SR × 4 reconstruction on the Manga109 dataset, the advantage of WARRN in the performance/parameter trade-off over the compared methods can be clearly observed. The horizontal axis is the number of parameters (M) and the vertical axis is the PSNR peak signal-to-noise ratio (dB).
More specifically, the model performance evaluation of the embodiment of the invention adopts two typical indexes: peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). PSNR is determined by the maximum possible pixel value of an image and the mean square error (MSE) between the images. Given a reconstructed image y and a corresponding reference image x, PSNR is defined as shown in the following formula (3):

PSNR(x, y) = 10 · log₁₀((2^c − 1)² / MSE(x, y))    (3)

where c represents the number of bits of a binary pixel and 2^c − 1 is the peak pixel value. Natural image data generally represents pixels using unsigned 8-bit integers, so c = 8, and PSNR generally takes values between 20 and 40 dB. MSE is the mean square error between the predicted image and the reference image, calculated as shown in the following formula (4):

MSE(x, y) = (1/(HW)) Σ_{i=1}^{H} Σ_{j=1}^{W} (x_{i,j} − y_{i,j})²    (4)

where H and W are the image height and width, respectively. For a multi-channel color image, the calculation of formula (4) is applied to each channel.
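The MSE and PSNR definitions of formulas (3) and (4) translate directly to NumPy; this sketch handles single-channel images and uses illustrative function names:

```python
import numpy as np

def mse(x, y):
    """Mean squared error between two images, as in formula (4)."""
    return float(np.mean((np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) ** 2))

def psnr(x, y, c=8):
    """PSNR in dB, as in formula (3), with peak pixel value 2**c - 1."""
    peak = 2 ** c - 1
    return float(10.0 * np.log10(peak ** 2 / mse(x, y)))
```

For example, two 8-bit images that differ by 16 at every pixel have MSE = 256 and PSNR = 10·log₁₀(255²/256) ≈ 24.05 dB, within the typical 20–40 dB range mentioned above.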
In contrast, SSIM is able to better reflect differences in image structure details, the calculation of which is based primarily on image brightness, contrast, and structural similarity. Given a reconstructed image y and a corresponding reference image x, the definition of SSIM is shown in equation (5) below:
SSIM(x, y) = [L(x, y)]^α · [C(x, y)]^β · [S(x, y)]^γ    (5)
where L (-), C (-), and S (-), are the brightness comparison, contrast comparison, and structure comparison functions, respectively, and α, β, and γ are the three control parameters that are all greater than 0, used to adjust the relative importance of the three functions. The detailed definitions of these three functions are shown in the following formulas:
wherein, muxAnd muyRespectively representing the pixel mean, σxAnd σyIs the standard deviation of the pixel, σxyRepresenting the covariance between the two. C1、C2And C3Are three constants to prevent systematic errors with a denominator of 0. In practical applications, α ═ β ═ γ ═ 1 and C are generally given3=C2And/2, then equation (5) can be written as shown in equation (9) below:
it is obvious from the definition that both evaluation indexes have symmetry, i.e., PSNR (x, y) ═ PSNR (y, x) and SSIM (x, y) ═ SSIM (y, x). The larger the value of both is, the better, but the value of PSNR has no upper limit, and the value range of SSIM is 0 to 1.0.
Specifically, in terms of visual comparison, the visual effects of several typical image SR methods are compared, including bicubic interpolation (Bicubic), SRCNN, VDSR, DRRN, LapSRN, MemNet, and others. In the comparison, all methods magnify the Butterfly image of Set5 by a factor of 4. It can be seen that the reconstruction results of the other methods show obvious blurring at edge positions, whereas the edge-processing effect of the proposed WARRN model is comparatively good. Meanwhile, the quantitative indexes of each method on Butterfly are shown at the bottom of the locally magnified image; WARRN also achieves the best quantitative evaluation, indicating the highest reconstruction accuracy.
However, the above comparisons are results on a single sample only, so the following table gives the quantitative evaluation results of several typical SR methods on the standard test data sets above, covering three common SR reconstruction scales: SR × 2, SR × 3, and SR × 4. These quantitative data are more statistically meaningful.
TABLE 1
It is clear from the quantitative comparison of table 1 that the best value in every comparison cell is achieved by WARRN, i.e., the proposed WARRN shows a significant and stable performance advantage at all SR scales.
Based on the same inventive concept, in another aspect, an embodiment of the present invention provides a super-resolution image reconstruction apparatus based on wide activation.
The super-resolution image reconstruction device based on wide activation provided by the present invention is described below with reference to fig. 5, and the super-resolution image reconstruction device based on wide activation described below and the super-resolution image reconstruction method based on wide activation described above may be referred to correspondingly.
Fig. 5 is a schematic structural diagram of a super-resolution image reconstruction apparatus based on wide activation according to an embodiment of the present invention.
In this embodiment, it should be noted that the wide-activation-based super-resolution image reconstruction apparatus 1 includes: an image acquisition unit 10 for acquiring a low-resolution LR image to be reconstructed; the reconstruction unit 20 is configured to input the low-resolution LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure, so as to obtain a high-resolution HR image output by the adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure; and an adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure is used for reconstructing the LR image into a corresponding HR image based on at least an interpolation algorithm and the wide-activation residual structure.
Since the super-resolution image reconstruction device based on wide activation provided by the embodiment of the present invention can be used for executing the super-resolution image reconstruction method based on wide activation described in the above embodiment, the working principle and the beneficial effect are similar, and therefore detailed description is not provided herein, and specific contents can be referred to the description of the above embodiment.
It should be noted that the modules in the apparatus according to the embodiment of the present invention may be integrated into a whole or separately disposed. The units may be combined into one unit, or further divided into a plurality of sub-units.
Based on the same inventive concept, in another aspect, a further embodiment of the present invention provides an electronic device.
Fig. 6 is a schematic diagram of an electronic device according to an embodiment of the invention.
The electronic device may include: a processor 610, a communications interface 620, a memory 630, and a communication bus 640, wherein the processor 610, the communications interface 620, and the memory 630 communicate with one another via the communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform the wide-activation-based super-resolution image reconstruction method, which includes: acquiring a low-resolution LR image to be reconstructed; inputting the LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure to obtain a high-resolution HR image output by the model; wherein the adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure is used for reconstructing the LR image into the corresponding HR image at least based on an interpolation algorithm and the wide-activation residual structure.
In addition, the logic instructions in the memory 630 may be implemented in the form of software functional units and, when sold or used as independent products, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the wide-activation-based super-resolution image reconstruction method, the method including: acquiring a low-resolution LR image to be reconstructed; inputting the LR image to be reconstructed into a trained adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure to obtain a high-resolution HR image output by the model; wherein the adaptive resampling super-resolution image reconstruction model based on the wide-activation residual structure is used for reconstructing the LR image into the corresponding HR image at least based on an interpolation algorithm and the wide-activation residual structure.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Moreover, in the present invention, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a/an …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
Furthermore, in the present disclosure, reference to the terms "embodiment," "this embodiment," "yet another embodiment," or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, such schematic expressions do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the various embodiments or examples, and the features of different embodiments or examples, described in this specification, provided they do not contradict one another.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.