CN112288630A - Super-resolution image reconstruction method and system based on improved wide-depth neural network


Publication number
CN112288630A
Authority
CN
China
Prior art keywords: layer, output, feature, convolution, network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011161696.8A
Other languages
Chinese (zh)
Inventor
杜娟
魏文澜
范赐恩
邹炼
沈家蔚
周紫玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202011161696.8A
Publication of CN112288630A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformation in the plane of the image
    • G06T3/40: Scaling the whole image or part thereof
    • G06T3/4053: Super resolution, i.e. output image resolution higher than sensor resolution
    • G06T3/4046: Scaling the whole image or part thereof using neural networks

Abstract

The invention discloses a super-resolution image reconstruction method and system based on an improved wide-depth neural network. The method comprises three main modules: feature extraction, nonlinear mapping, and reconstruction. The feature extraction module uses the cascading block structure designed by the invention and introduces depthwise separable convolution and pointwise convolution, which simplifies the structure, greatly reduces network parameters and computation, and eases storage pressure and computational complexity. The nonlinear mapping module uses the adaptive weighted share-source module designed by the invention and introduces skip connections, improving network performance. On the basis of existing image super-resolution reconstruction networks, the invention provides a new network structure that effectively preserves the quality of the reconstructed image while offering few network parameters, low computation, high processing speed, and strong portability.

Description

Super-resolution image reconstruction method and system based on improved wide-depth neural network
Technical Field
The invention relates to the field of computer vision and image super-resolution reconstruction, in particular to a super-resolution image reconstruction method based on an improved wide-depth neural network.
Background
Image super-resolution reconstruction is a technology that effectively improves image resolution: given only a low-resolution, blurred image, it reconstructs the pixels using a mapping between the low- and high-resolution image spaces, producing a sharp image in the high-resolution space.
Compared with low-resolution images, high-resolution images typically offer greater pixel density, richer texture detail, and higher fidelity. They provide more content and detail, satisfying the demand for high-definition visual quality in daily life and supporting further research in other fields. In practice, however, constrained by factors such as hardware limitations, insufficient network bandwidth, and limited storage, an ideal high-resolution image with sharp edges and no blocking or blurring usually cannot be obtained directly.
The most straightforward way to improve image resolution is to improve the optical hardware of the acquisition system, but this is constrained by difficult and expensive manufacturing processes. Techniques that achieve super-resolution reconstruction through software and algorithms have therefore long been a hot research topic in image processing, computer vision, and related fields.
Super-resolution reconstruction based on traditional methods can no longer meet current requirements, so new high-accuracy reconstruction algorithms are urgently needed. Super-resolution networks built with deep learning can accurately extract and represent the information contained in an image thanks to their hierarchical structure and efficient convolutional computation, are highly robust and adaptable to diverse targets, and form the current mainstream research direction. However, most such networks are too bulky or demand too many computing resources, which has made lightweight super-resolution reconstruction a research hotspot.
Several existing patents (both granted and published) address lightweight super-resolution reconstruction:
1) Chinese invention patent CN201810638253.X, "Super-resolution image reconstruction method based on a recursive residual network", trains the neural network with local residual learning instead of the global residual learning used by VDSR and introduces a recursive structure into the residual unit. However, the method still relies on a residual network, which is deep and has many parameters. Moreover, residual networks suit high-level computer vision problems, whereas super-resolution is a low-level one.
2) Chinese invention patent CN201910272182.0, "Super-resolution image reconstruction method based on a lightweight network", lightens the network structure and quantizes the parameters. In the network structure, a ShuffleNet block replaces ordinary convolution; the computation remains large, and the channel-shuffle operation degrades performance.
With the development of deep learning, the computer vision field has advanced rapidly. Lightweight super-resolution networks are numerous, and many image super-resolution models based on deep residual networks have achieved great breakthroughs in reconstruction accuracy. However, as these models unilaterally pursue higher accuracy, their layer counts keep growing, exposing overly complex structures, excessive weight parameters, and high computational complexity, which fundamentally limits the application of super-resolution technology in real scenarios. These networks leave much room for improvement. The innovation of the invention is a redesigned network structure that uses depthwise separable convolution and adaptively weighted skip connections to shrink the network, reduce computational complexity, and remove redundancy, enabling deployment on mobile terminals and real-time processing.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a super-resolution image reconstruction method based on a lightweight network structure.
The invention provides a new network model: by introducing depthwise separable convolution and skip connections, the Mobile Share-Source Network (MSSN) performs faster super-resolution image reconstruction. The parameters and computation of the network are thereby greatly reduced, making it suitable for porting to mobile terminals:
a super-resolution image reconstruction method based on an improved wide-depth neural network is characterized by comprising the following steps:
Step 1: the MSSN extracts low-level features of the low-resolution image. The input of the MSSN is $I_{LR}$; a convolutional layer with kernel size m × m extracts the feature $F_0$ from the input $I_{LR}$:

$$F_0 = f_e(I_{LR})$$

where $f_e$ is the feature extraction function applied to the low-resolution image $I_{LR}$;
Step 2: the MSSN provides a deep feature extraction module, the cascading block. One cascading block consists of 4 Mobile Adaptive Weighted Residual Units (MAWRU); one MAWRU consists of a 1 × 1 pointwise convolution, a weight normalization layer, a ReLU6 activation, a 3 × 3 depthwise convolution, a weight normalization layer, a ReLU6 activation, a 1 × 1 pointwise convolution, and a weight normalization layer;
Step 3: the MSSN provides a nonlinear mapping module, the Adaptive Weighted Share-Source Module (AWSSM), which performs the nonlinear mapping;
Step 4: MSSN reconstruction. The low-resolution input image passes through a convolution layer with kernel size 3 × 3 and is then upsampled by a pixel-shuffle layer; the output of the nonlinear mapping module undergoes the same operation, a 3 × 3 convolution followed by pixel-shuffle upsampling. The two upsampled outputs are added to obtain the reconstructed super-resolution image. An L1-norm loss is computed between the reconstructed image and the ground-truth high-resolution image, and the model parameters are updated by back-propagation; the parameters are then updated continuously over different high/low-resolution image pairs;
Step 5: an image requiring super-resolution reconstruction is passed through the network model trained in step 4 to obtain the reconstructed, enlarged image; the trained network model is the model whose parameters were finalized through continuous updating.
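As a concrete illustration of steps 4 and 5, the following PyTorch sketch shows one training iteration with the L1 loss and back-propagation. The placeholder `model` and the single dummy image pair stand in for the full MSSN and a real training set, neither of which is fixed by the text above.

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the assembled MSSN (assumption).
model = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1),
                      nn.Upsample(scale_factor=2))
# One dummy (I_LR, I_HR) pair; in practice a dataloader yields many pairs.
pairs = [(torch.randn(1, 3, 48, 48), torch.randn(1, 3, 96, 96))]

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
l1 = nn.L1Loss()

for i_lr, i_hr in pairs:
    i_sr = model(i_lr)        # reconstructed super-resolution image
    loss = l1(i_sr, i_hr)     # L1-norm loss against the ground truth
    optimizer.zero_grad()
    loss.backward()           # back-propagation
    optimizer.step()          # parameter update
```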
In the above super-resolution image reconstruction method based on the improved wide-depth neural network, the specific processing procedure of the feature extraction cascading block of step 2 is as follows:
Step 2.1: suppose the previous layer outputs N feature maps, i.e. the number of channels is N. A 1 × 1 pointwise convolution produces a feature map with r × N output channels, where r is the channel expansion factor. Weight normalization is applied: the weight vector w is decomposed into a direction vector $\frac{v}{\|v\|}$ and a magnitude g:

$$w = \frac{g}{\|v\|} v$$

where v is a vector of the same dimension as w and $\|v\|$ is its Euclidean norm, so $\frac{v}{\|v\|}$ is a unit vector determining the direction of w, while the scalar g determines its length. Since $\|w\| = |g|$, this decomposition fixes the Euclidean norm of the weight vector, giving a regularization effect. The new feature map is then processed by the ReLU6 activation;
Step 2.2: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 3 × 3 depthwise convolution produces a feature map with r × N output channels; the new feature map then undergoes weight normalization and ReLU6 activation;
Step 2.3: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 1 × 1 pointwise convolution produces a feature map with N output channels; the new feature map then undergoes weight normalization;
Step 2.4: the feature map obtained in step 2.3, multiplied by one adaptive-weight scale factor, is added to the feature map input in step 2.1, multiplied by another adaptive-weight scale factor, to obtain a new feature map; the scale factor parameters are updated automatically during training;
Step 2.5: 4 consecutive MAWRUs are cascaded to obtain one cascading block.
In the above super-resolution image reconstruction method based on the improved wide-depth neural network, the nonlinear mapping module of step 3, the Adaptive Weighted Share-Source Module (hereinafter AWSSM), performs the efficient nonlinear mapping of image features as follows:
Using the cascading block (CB) of step 2, let the output feature map of step 1 be $x_0$ and the output of the n-th CB be $x_n$. Between $x_0$ and each $x_n$ there is a pair of adaptive-weight scale factors: $x_0$ multiplied by its scale factor $\lambda_n^{SS}$ is added to $x_n$ multiplied by its scale factor $\lambda_n$, and the resulting feature map is the input feature map of the next CB. The nonlinear mapping module computes

$$F_{NM} = f_{AWSSM}(F_0)$$

where $f_{AWSSM}$ denotes the AWSSM-based nonlinear mapping module, which contains several cascading blocks. Denoting the cascading block by the function $f_{CB}$, the module can be expressed as

$$x_n = f_{CB}(\lambda_{n-1}^{SS} x_0 + \lambda_{n-1} x_{n-1})$$

where $x_n$ is the output of the n-th cascading block and $\lambda_{n-1}^{SS}$, $\lambda_{n-1}$ are the adaptive-weight scale factors.
A super-resolution image reconstruction system based on an improved wide-depth neural network is characterized by comprising:
An image feature extraction unit: the unit extracts low-level features of the low-resolution image. The MSSN input is $I_{LR}$; a convolutional layer with kernel size m × m extracts the feature $F_0$ from the input $I_{LR}$:

$$F_0 = f_e(I_{LR})$$

where $f_e$ is the feature extraction function applied to the low-resolution image $I_{LR}$;
A cascading block extraction unit: the unit applies the deep feature extraction module, the cascading block. One cascading block consists of 4 Mobile Adaptive Weighted Residual Units (MAWRU); one MAWRU consists of a 1 × 1 pointwise convolution, a weight normalization layer, a ReLU6 activation, a 3 × 3 depthwise convolution, a weight normalization layer, a ReLU6 activation, a 1 × 1 pointwise convolution, and a weight normalization layer;
A nonlinear mapping module, the Adaptive Weighted Share-Source Module (AWSSM): the module performs the nonlinear mapping;
A reconstruction unit: the unit passes the low-resolution input image through a convolution layer with kernel size 3 × 3 and then upsamples it with a pixel-shuffle layer; the output of the nonlinear mapping module undergoes the same operation, a 3 × 3 convolution followed by pixel-shuffle upsampling. The two upsampled outputs are added to obtain the reconstructed super-resolution image. An L1-norm loss is computed between the reconstructed image and the ground-truth high-resolution image, and the model parameters are updated by back-propagation; the parameters are then updated continuously over different high/low-resolution image pairs;
An image reconstruction unit: the unit passes an image requiring super-resolution reconstruction through the trained network model to obtain the reconstructed, enlarged image; the trained network model is the model whose parameters were finalized through continuous updating.
In the above super-resolution image reconstruction system based on the improved wide-depth neural network, the specific processing procedure of the cascade block extraction unit includes:
Step 2.1: suppose the previous layer outputs N feature maps, i.e. the number of channels is N. A 1 × 1 pointwise convolution produces a feature map with r × N output channels, where r is the channel expansion factor. Weight normalization is applied: the weight vector w is decomposed into a direction vector $\frac{v}{\|v\|}$ and a magnitude g:

$$w = \frac{g}{\|v\|} v$$

where v is a vector of the same dimension as w and $\|v\|$ is its Euclidean norm, so $\frac{v}{\|v\|}$ is a unit vector determining the direction of w, while the scalar g determines its length. Since $\|w\| = |g|$, this decomposition fixes the Euclidean norm of the weight vector, giving a regularization effect. The new feature map is then processed by the ReLU6 activation;
Step 2.2: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 3 × 3 depthwise convolution produces a feature map with r × N output channels; the new feature map then undergoes weight normalization and ReLU6 activation;
Step 2.3: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 1 × 1 pointwise convolution produces a feature map with N output channels; the new feature map then undergoes weight normalization;
Step 2.4: the feature map obtained in step 2.3, multiplied by one adaptive-weight scale factor, is added to the feature map input in step 2.1, multiplied by another adaptive-weight scale factor, to obtain a new feature map; the scale factor parameters are updated automatically during training;
Step 2.5: 4 consecutive MAWRUs are cascaded to obtain one cascading block.
In the above super-resolution image reconstruction system based on the improved wide-depth neural network, the specific process by which the nonlinear mapping module (the adaptive weighted share-source module) performs the nonlinear mapping is as follows:
Using the cascading block (CB) produced by the cascading block extraction unit, let the output feature map of the image feature extraction unit be $x_0$ and the output of the n-th CB be $x_n$. Between $x_0$ and each $x_n$ there is a pair of adaptive-weight scale factors: $x_0$ multiplied by its scale factor $\lambda_n^{SS}$ is added to $x_n$ multiplied by its scale factor $\lambda_n$, and the resulting feature map is the input feature map of the next CB. The nonlinear mapping module computes

$$F_{NM} = f_{AWSSM}(F_0)$$

where $f_{AWSSM}$ denotes the AWSSM-based nonlinear mapping module, which contains several cascading blocks. Denoting the cascading block by the function $f_{CB}$, the module can be expressed as

$$x_n = f_{CB}(\lambda_{n-1}^{SS} x_0 + \lambda_{n-1} x_{n-1})$$

where $x_n$ is the output of the n-th cascading block and $\lambda_{n-1}^{SS}$, $\lambda_{n-1}$ are the adaptive-weight scale factors.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. By redesigning the network structure, the invention uses a newly designed deep feature extraction module (the cascading block), a nonlinear mapping module (the Adaptive Weighted Share-Source Module), and a reconstruction module to achieve faster and more efficient image reconstruction.
2. The invention uses depthwise separable convolution, which reduces the parameter count and accelerates computation without significantly affecting the final reconstruction quality.
3. The MAWRU uses adaptive weights, allowing more information to be extracted without increasing the number of parameters.
Drawings
FIG. 1 is a flow chart of the present invention.
Fig. 2 is a block diagram of the Mobile Adaptive Weighted Residual Unit (MAWRU).
Fig. 3 is an overall network configuration diagram.
FIG. 4 is a schematic of the process of the present invention.
Detailed Description
Example 1
The invention provides a super-resolution image reconstruction method based on a lightweight network structure, addressing the main defect of the prior art: super-resolution reconstruction networks are too large to be ported to resource-constrained devices. Without loss of image quality, the invention reduces the model parameters to under 1M and the computation to under 100G operations when reconstructing a 720p (1280 × 720) image.
The evaluation criterion for image quality is the peak signal-to-noise ratio (PSNR). At 2× magnification, a PSNR above 34 dB is considered acceptable; at 4× magnification, a PSNR above 28 dB is considered acceptable.
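For reference, PSNR can be computed as in the following minimal sketch; it assumes 8-bit images (peak value 255) and is an illustration, not part of the patent text.

```python
import numpy as np

def psnr(reference: np.ndarray, reconstructed: np.ndarray,
         max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((reference.astype(np.float64)
                   - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```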
Fig. 1 is a general flow framework of the present invention, and a specific description of the flow of the present invention will be provided below.
Step 1: the MSSN extracts shallow features from the low-resolution image. The MSSN input is $I_{LR}$; a convolutional layer with kernel size 3 × 3 extracts the feature $F_0$ from the input $I_{LR}$:

$$F_0 = f_e(I_{LR})$$

where $f_e$ is the feature extraction function applied to the low-resolution image $I_{LR}$; the extracted feature $F_0$ is used in the subsequent steps.
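A minimal PyTorch sketch of this feature extraction step; the 64-channel width is an illustrative assumption, since the text fixes only the 3 × 3 kernel size.

```python
import torch
import torch.nn as nn

# Shallow feature extraction f_e: a single 3x3 convolution (step 1).
# The output channel count (64) is assumed for illustration.
feature_extractor = nn.Conv2d(in_channels=3, out_channels=64,
                              kernel_size=3, padding=1)

i_lr = torch.randn(1, 3, 48, 48)   # dummy low-resolution RGB patch
f0 = feature_extractor(i_lr)       # F0 = f_e(I_LR), shape (1, 64, 48, 48)
```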
Fig. 1 shows the reconstruction process; step 4 covers the training process. Once training is finished, no parameters need to change, and actual use only requires the reconstruction steps shown in Fig. 1. The purpose of step 4 is simply to determine the specific parameters of each module.
In step 2, the MSSN provides the deep feature extraction module, the cascading block, whose specific processing procedure is as follows:
The cascading block introduces the idea of depthwise separable convolution, comprising depthwise convolution (DWconv) and pointwise convolution (PWconv), and proposes the Mobile Adaptive Weighted Residual Unit (MAWRU); see Fig. 2.
Step 2.1: assume the previous layer outputs N feature maps, i.e. the number of channels is N. A 1 × 1 pointwise convolution produces a feature map with r × N output channels, where r is the channel expansion factor. Weight normalization is applied: the weight vector w is decomposed into a direction vector $\frac{v}{\|v\|}$ and a magnitude g:

$$w = \frac{g}{\|v\|} v$$

where v is a vector of the same dimension as w and $\|v\|$ is its Euclidean norm, so $\frac{v}{\|v\|}$ is a unit vector determining the direction of w, while the scalar g determines its length. Since $\|w\| = |g|$, this decomposition fixes the Euclidean norm of the weight vector, giving a regularization effect. The new feature map is then processed by the ReLU6 activation (a ReLU whose maximum output is capped at 6).
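The decomposition above is standard weight normalization, which PyTorch provides directly. The sketch below applies it to the step 2.1 pointwise convolution; the channel width (64) and expansion factor r = 4 are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm

# Weight normalization on the 1x1 pointwise convolution of step 2.1:
# each filter's weight w is reparameterized as w = g * v / ||v||,
# so the norm of w is carried entirely by the learned magnitude g.
pw_conv = weight_norm(nn.Conv2d(64, 64 * 4, kernel_size=1), name="weight")

print(pw_conv.weight_g.shape)  # per-filter magnitudes g
print(pw_conv.weight_v.shape)  # direction parameters v

x = torch.randn(1, 64, 48, 48)
y = nn.ReLU6()(pw_conv(x))     # step 2.1: pointwise conv -> WN -> ReLU6
```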
Step 2.2: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 3 × 3 depthwise convolution produces a feature map with r × N output channels; the new feature map then undergoes weight normalization and ReLU6 activation.
Step 2.3: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 1 × 1 pointwise convolution produces a feature map with N output channels; the new feature map then undergoes weight normalization.
Step 2.4: the feature map obtained in step 2.3, multiplied by one adaptive-weight scale factor, is added to the feature map input in step 2.1, multiplied by another adaptive-weight scale factor, to obtain a new feature map. The scale factor parameters are updated automatically during training.
Step 2.5: 4 consecutive MAWRUs are cascaded to obtain one cascading block.
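Putting steps 2.1-2.5 together, here is a hedged PyTorch sketch of one MAWRU and of a cascading block built from four of them. The channel width (64) and expansion factor (r = 4) are assumptions, and the learned scalar adaptive weights are one plausible reading of step 2.4 (per-channel factors would also fit the text).

```python
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm

class MAWRU(nn.Module):
    """Mobile Adaptive Weighted Residual Unit (sketch).

    1x1 pointwise expand -> WN -> ReLU6 -> 3x3 depthwise -> WN -> ReLU6
    -> 1x1 pointwise project -> WN, with an adaptively weighted residual.
    """

    def __init__(self, channels: int = 64, r: int = 4):
        super().__init__()
        hidden = channels * r
        self.body = nn.Sequential(
            weight_norm(nn.Conv2d(channels, hidden, 1)),   # step 2.1
            nn.ReLU6(inplace=True),
            weight_norm(nn.Conv2d(hidden, hidden, 3, padding=1,
                                  groups=hidden)),         # step 2.2 (depthwise)
            nn.ReLU6(inplace=True),
            weight_norm(nn.Conv2d(hidden, channels, 1)),   # step 2.3
        )
        # Adaptive-weight scale factors of step 2.4, learned in training.
        self.lam_body = nn.Parameter(torch.ones(1))
        self.lam_skip = nn.Parameter(torch.ones(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.lam_body * self.body(x) + self.lam_skip * x

class CascadingBlock(nn.Sequential):
    """Step 2.5: four consecutive MAWRUs form one cascading block."""

    def __init__(self, channels: int = 64, r: int = 4):
        super().__init__(*[MAWRU(channels, r) for _ in range(4)])
```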
The specific process of the nonlinear mapping module AWSSM of step 3 is as follows:
Using the cascading block (CB) of step 2, let the output feature map of step 1 be $x_0$ and the output of the n-th CB be $x_n$. Between $x_0$ and each $x_n$ there is a pair of adaptive-weight scale factors: $x_0$ multiplied by its scale factor $\lambda_n^{SS}$ is added to $x_n$ multiplied by its scale factor $\lambda_n$, and the resulting feature map is the input feature map of the next CB. For a clearer description, the details are given with reference to Fig. 3 and the following formulas.
The nonlinear mapping module computes

$$F_{NM} = f_{AWSSM}(F_0)$$

where $f_{AWSSM}$ denotes the AWSSM-based nonlinear mapping module, which contains several cascading blocks. Denoting the cascading block by the function $f_{CB}$, the module can be expressed as

$$x_n = f_{CB}(\lambda_{n-1}^{SS} x_0 + \lambda_{n-1} x_{n-1})$$

where $x_n$ is the output of the n-th cascading block and $\lambda_{n-1}^{SS}$, $\lambda_{n-1}$ are the adaptive-weight scale factors.
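A sketch of the AWSSM recurrence $x_n = f_{CB}(\lambda_{n-1}^{SS} x_0 + \lambda_{n-1} x_{n-1})$, reusing the CascadingBlock class from the sketch above; the number of cascading blocks (4) is an assumption, as the text leaves it open.

```python
import torch
import torch.nn as nn

class AWSSM(nn.Module):
    """Adaptive Weighted Share-Source Module (sketch).

    Chains cascading blocks with share-source skip connections:
    x_n = f_CB(lam_ss * x_0 + lam_x * x_{n-1}).
    Assumes the CascadingBlock class defined earlier is in scope.
    """

    def __init__(self, channels: int = 64, num_blocks: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList(
            CascadingBlock(channels) for _ in range(num_blocks))
        # One pair of adaptive scale factors per share-source connection.
        self.lam_ss = nn.Parameter(torch.ones(num_blocks))
        self.lam_x = nn.Parameter(torch.ones(num_blocks))

    def forward(self, f0: torch.Tensor) -> torch.Tensor:
        x = f0
        for n, block in enumerate(self.blocks):
            x = block(self.lam_ss[n] * f0 + self.lam_x[n] * x)
        return x  # F_NM = f_AWSSM(F_0)
```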
Step 4: MSSN reconstruction. The low-resolution input image passes through a convolution layer with kernel size 3 × 3 and is then upsampled by a pixel-shuffle layer. The output of the nonlinear mapping module undergoes the same operation, a 3 × 3 convolution followed by pixel-shuffle upsampling. The two upsampled outputs are added to obtain the reconstructed super-resolution image.
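A sketch of the two reconstruction branches of step 4, assuming a 2× scale, RGB input, and a 64-channel nonlinear-mapping output; only the 3 × 3 kernels and the pixel-shuffle upsampling are fixed by the text.

```python
import torch
import torch.nn as nn

scale = 2  # magnification factor; 2x is assumed here for illustration

# Each branch: a 3x3 convolution producing 3 * scale^2 channels,
# followed by pixel shuffle; the two upsampled outputs are added.
branch_lr = nn.Sequential(
    nn.Conv2d(3, 3 * scale ** 2, 3, padding=1), nn.PixelShuffle(scale))
branch_nm = nn.Sequential(
    nn.Conv2d(64, 3 * scale ** 2, 3, padding=1), nn.PixelShuffle(scale))

i_lr = torch.randn(1, 3, 48, 48)    # dummy low-resolution input
f_nm = torch.randn(1, 64, 48, 48)   # dummy nonlinear-mapping output F_NM

i_sr = branch_lr(i_lr) + branch_nm(f_nm)   # reconstructed SR image, 96x96
```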
Step 5: the image requiring super-resolution reconstruction is passed through the trained network model to obtain the reconstructed, enlarged image.
The above is the detailed procedure of the invention. The invention provides a new network structure that reduces network size, computational complexity, and redundancy.
Example 2
The invention correspondingly provides a super-resolution image reconstruction system based on the improved wide-depth neural network, which is characterized by comprising the following components:
An image feature extraction unit: the unit extracts low-level features of the low-resolution image. The MSSN input is $I_{LR}$; a convolutional layer with kernel size m × m extracts the feature $F_0$ from the input $I_{LR}$:

$$F_0 = f_e(I_{LR})$$

where $f_e$ is the feature extraction function applied to the low-resolution image $I_{LR}$;
A cascading block extraction unit: the unit applies the deep feature extraction module, the cascading block. One cascading block consists of 4 Mobile Adaptive Weighted Residual Units (MAWRU); one MAWRU consists of a 1 × 1 pointwise convolution, a weight normalization layer, a ReLU6 activation, a 3 × 3 depthwise convolution, a weight normalization layer, a ReLU6 activation, a 1 × 1 pointwise convolution, and a weight normalization layer;
A nonlinear mapping module, the Adaptive Weighted Share-Source Module (AWSSM): the module performs the nonlinear mapping;
A reconstruction unit: the unit passes the low-resolution input image through a convolution layer with kernel size 3 × 3 and then upsamples it with a pixel-shuffle layer; the output of the nonlinear mapping module undergoes the same operation, a 3 × 3 convolution followed by pixel-shuffle upsampling. The two upsampled outputs are added to obtain the reconstructed super-resolution image. An L1-norm loss is computed between the reconstructed image and the ground-truth high-resolution image, and the model parameters are updated by back-propagation; the parameters are then updated continuously over different high/low-resolution image pairs;
An image reconstruction unit: the unit passes an image requiring super-resolution reconstruction through the trained network model to obtain the reconstructed, enlarged image; the trained network model is the model whose parameters were finalized through continuous updating.
The specific processing process of the cascade block extraction unit comprises the following steps:
Step 2.1: suppose the previous layer outputs N feature maps, i.e. the number of channels is N. A 1 × 1 pointwise convolution produces a feature map with r × N output channels, where r is the channel expansion factor. Weight normalization is applied: the weight vector w is decomposed into a direction vector $\frac{v}{\|v\|}$ and a magnitude g:

$$w = \frac{g}{\|v\|} v$$

where v is a vector of the same dimension as w and $\|v\|$ is its Euclidean norm, so $\frac{v}{\|v\|}$ is a unit vector determining the direction of w, while the scalar g determines its length. Since $\|w\| = |g|$, this decomposition fixes the Euclidean norm of the weight vector, giving a regularization effect. The new feature map is then processed by the ReLU6 activation;
Step 2.2: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 3 × 3 depthwise convolution produces a feature map with r × N output channels; the new feature map then undergoes weight normalization and ReLU6 activation;
Step 2.3: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 1 × 1 pointwise convolution produces a feature map with N output channels; the new feature map then undergoes weight normalization;
Step 2.4: the feature map obtained in step 2.3, multiplied by one adaptive-weight scale factor, is added to the feature map input in step 2.1, multiplied by another adaptive-weight scale factor, to obtain a new feature map; the scale factor parameters are updated automatically during training;
Step 2.5: 4 consecutive MAWRUs are cascaded to obtain one cascading block.
The specific process by which the nonlinear mapping module (the adaptive weighted share-source module) performs the nonlinear mapping is as follows:
Using the cascading block (CB) produced by the cascading block extraction unit, let the output feature map of the image feature extraction unit be $x_0$ and the output of the n-th CB be $x_n$. Between $x_0$ and each $x_n$ there is a pair of adaptive-weight scale factors: $x_0$ multiplied by its scale factor $\lambda_n^{SS}$ is added to $x_n$ multiplied by its scale factor $\lambda_n$, and the resulting feature map is the input feature map of the next CB. The nonlinear mapping module computes

$$F_{NM} = f_{AWSSM}(F_0)$$

where $f_{AWSSM}$ denotes the AWSSM-based nonlinear mapping module, which contains several cascading blocks. Denoting the cascading block by the function $f_{CB}$, the module can be expressed as

$$x_n = f_{CB}(\lambda_{n-1}^{SS} x_0 + \lambda_{n-1} x_{n-1})$$

where $x_n$ is the output of the n-th cascading block and $\lambda_{n-1}^{SS}$, $\lambda_{n-1}$ are the adaptive-weight scale factors.
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (6)

1. A super-resolution image reconstruction method based on an improved wide-depth neural network is characterized by comprising the following steps:
Step 1: the MSSN extracts low-level features of the low-resolution image. The input of the MSSN is $I_{LR}$; a convolutional layer with kernel size m × m extracts the feature $F_0$ from the input $I_{LR}$:

$$F_0 = f_e(I_{LR})$$

where $f_e$ is the feature extraction function applied to the low-resolution image $I_{LR}$;
Step 2: the MSSN provides a deep feature extraction module, the cascading block. One cascading block consists of 4 Mobile Adaptive Weighted Residual Units (MAWRU); one MAWRU consists of a 1 × 1 pointwise convolution, a weight normalization layer, a ReLU6 activation, a 3 × 3 depthwise convolution, a weight normalization layer, a ReLU6 activation, a 1 × 1 pointwise convolution, and a weight normalization layer;
Step 3: the MSSN provides a nonlinear mapping module, the Adaptive Weighted Share-Source Module (AWSSM), which performs the nonlinear mapping;
Step 4: MSSN reconstruction. The low-resolution input image passes through a convolution layer with kernel size 3 × 3 and is then upsampled by a pixel-shuffle layer; the output of the nonlinear mapping module undergoes the same operation, a 3 × 3 convolution followed by pixel-shuffle upsampling. The two upsampled outputs are added to obtain the reconstructed super-resolution image. An L1-norm loss is computed between the reconstructed image and the ground-truth high-resolution image, and the model parameters are updated by back-propagation; the parameters are then updated continuously over different high/low-resolution image pairs;
Step 5: an image requiring super-resolution reconstruction is passed through the network model trained in step 4 to obtain the reconstructed, enlarged image; the trained network model is the model whose parameters were finalized through continuous updating.
2. The super-resolution image reconstruction method based on an improved wide-depth neural network according to claim 1, characterized in that:
the specific processing procedure of the feature extraction cascading block of step 2 is as follows:
Step 2.1: suppose the previous layer outputs N feature maps, i.e. the number of channels is N. A 1 × 1 pointwise convolution produces a feature map with r × N output channels, where r is the channel expansion factor. Weight normalization is applied: the weight vector w is decomposed into a direction vector $\frac{v}{\|v\|}$ and a magnitude g:

$$w = \frac{g}{\|v\|} v$$

where v is a vector of the same dimension as w and $\|v\|$ is its Euclidean norm, so $\frac{v}{\|v\|}$ is a unit vector determining the direction of w, while the scalar g determines its length. Since $\|w\| = |g|$, this decomposition fixes the Euclidean norm of the weight vector, giving a regularization effect. The new feature map is then processed by the ReLU6 activation;
Step 2.2: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 3 × 3 depthwise convolution produces a feature map with r × N output channels; the new feature map then undergoes weight normalization and ReLU6 activation;
Step 2.3: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 1 × 1 pointwise convolution produces a feature map with N output channels; the new feature map then undergoes weight normalization;
Step 2.4: the feature map obtained in step 2.3, multiplied by one adaptive-weight scale factor, is added to the feature map input in step 2.1, multiplied by another adaptive-weight scale factor, to obtain a new feature map; the scale factor parameters are updated automatically during training;
Step 2.5: 4 consecutive MAWRUs are cascaded to obtain one cascading block.
3. The super-resolution image reconstruction method based on an improved wide-depth neural network according to claim 1, characterized in that:
the nonlinear mapping module of step 3, the Adaptive Weighted Share-Source Module (hereinafter AWSSM), performs the efficient nonlinear mapping of image features as follows:
Using the cascading block (CB) of step 2, let the output feature map of step 1 be $x_0$ and the output of the n-th CB be $x_n$. Between $x_0$ and each $x_n$ there is a pair of adaptive-weight scale factors: $x_0$ multiplied by its scale factor $\lambda_n^{SS}$ is added to $x_n$ multiplied by its scale factor $\lambda_n$, and the resulting feature map is the input feature map of the next CB. The nonlinear mapping module computes

$$F_{NM} = f_{AWSSM}(F_0)$$

where $f_{AWSSM}$ denotes the AWSSM-based nonlinear mapping module, which contains several cascading blocks. Denoting the cascading block by the function $f_{CB}$, the module can be expressed as

$$x_n = f_{CB}(\lambda_{n-1}^{SS} x_0 + \lambda_{n-1} x_{n-1})$$

where $x_n$ is the output of the n-th cascading block and $\lambda_{n-1}^{SS}$, $\lambda_{n-1}$ are the adaptive-weight scale factors.
4. A super-resolution image reconstruction system based on an improved wide-depth neural network, comprising:
An image feature extraction unit: the unit extracts low-level features of the low-resolution image. The MSSN input is $I_{LR}$; a convolutional layer with kernel size m × m extracts the feature $F_0$ from the input $I_{LR}$:

$$F_0 = f_e(I_{LR})$$

where $f_e$ is the feature extraction function applied to the low-resolution image $I_{LR}$;
A cascading block extraction unit: the unit applies the deep feature extraction module, the cascading block. One cascading block consists of 4 Mobile Adaptive Weighted Residual Units (MAWRU); one MAWRU consists of a 1 × 1 pointwise convolution, a weight normalization layer, a ReLU6 activation, a 3 × 3 depthwise convolution, a weight normalization layer, a ReLU6 activation, a 1 × 1 pointwise convolution, and a weight normalization layer;
A nonlinear mapping module, the Adaptive Weighted Share-Source Module (AWSSM): the module performs the nonlinear mapping;
A reconstruction unit: the unit passes the low-resolution input image through a convolution layer with kernel size 3 × 3 and then upsamples it with a pixel-shuffle layer; the output of the nonlinear mapping module undergoes the same operation, a 3 × 3 convolution followed by pixel-shuffle upsampling. The two upsampled outputs are added to obtain the reconstructed super-resolution image. An L1-norm loss is computed between the reconstructed image and the ground-truth high-resolution image, and the model parameters are updated by back-propagation; the parameters are then updated continuously over different high/low-resolution image pairs;
An image reconstruction unit: the unit passes an image requiring super-resolution reconstruction through the trained network model to obtain the reconstructed, enlarged image; the trained network model is the model whose parameters were finalized through continuous updating.
5. The super-resolution image reconstruction system based on the improved wide-depth neural network as claimed in claim 4, characterized in that: the specific processing procedure of the cascading block extraction unit comprises:
Step 2.1: suppose the previous layer outputs N feature maps, i.e. the number of channels is N. A 1 × 1 pointwise convolution produces a feature map with r × N output channels, where r is the channel expansion factor. Weight normalization is applied: the weight vector w is decomposed into a direction vector $\frac{v}{\|v\|}$ and a magnitude g:

$$w = \frac{g}{\|v\|} v$$

where v is a vector of the same dimension as w and $\|v\|$ is its Euclidean norm, so $\frac{v}{\|v\|}$ is a unit vector determining the direction of w, while the scalar g determines its length. Since $\|w\| = |g|$, this decomposition fixes the Euclidean norm of the weight vector, giving a regularization effect. The new feature map is then processed by the ReLU6 activation;
Step 2.2: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 3 × 3 depthwise convolution produces a feature map with r × N output channels; the new feature map then undergoes weight normalization and ReLU6 activation;
Step 2.3: the previous layer outputs r × N feature maps, i.e. the number of channels is r × N. A 1 × 1 pointwise convolution produces a feature map with N output channels; the new feature map then undergoes weight normalization;
Step 2.4: the feature map obtained in step 2.3, multiplied by one adaptive-weight scale factor, is added to the feature map input in step 2.1, multiplied by another adaptive-weight scale factor, to obtain a new feature map; the scale factor parameters are updated automatically during training;
Step 2.5: 4 consecutive MAWRUs are cascaded to obtain one cascading block.
6. The super-resolution image reconstruction system based on the improved wide-depth neural network as claimed in claim 4, characterized in that:
the specific process by which the nonlinear mapping module (the adaptive weighted share-source module) performs the nonlinear mapping is as follows:
Using the cascading block (CB) produced by the cascading block extraction unit, let the output feature map of the image feature extraction unit be $x_0$ and the output of the n-th CB be $x_n$. Between $x_0$ and each $x_n$ there is a pair of adaptive-weight scale factors: $x_0$ multiplied by its scale factor $\lambda_n^{SS}$ is added to $x_n$ multiplied by its scale factor $\lambda_n$, and the resulting feature map is the input feature map of the next CB. The nonlinear mapping module computes

$$F_{NM} = f_{AWSSM}(F_0)$$

where $f_{AWSSM}$ denotes the AWSSM-based nonlinear mapping module, which contains several cascading blocks. Denoting the cascading block by the function $f_{CB}$, the module can be expressed as

$$x_n = f_{CB}(\lambda_{n-1}^{SS} x_0 + \lambda_{n-1} x_{n-1})$$

where $x_n$ is the output of the n-th cascading block and $\lambda_{n-1}^{SS}$, $\lambda_{n-1}$ are the adaptive-weight scale factors.
CN202011161696.8A 2020-10-27 2020-10-27 Super-resolution image reconstruction method and system based on improved wide-depth neural network Pending CN112288630A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011161696.8A CN112288630A (en) 2020-10-27 2020-10-27 Super-resolution image reconstruction method and system based on improved wide-depth neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011161696.8A CN112288630A (en) 2020-10-27 2020-10-27 Super-resolution image reconstruction method and system based on improved wide-depth neural network

Publications (1)

Publication Number Publication Date
CN112288630A true CN112288630A (en) 2021-01-29

Family

ID=74372334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011161696.8A Pending CN112288630A (en) 2020-10-27 2020-10-27 Super-resolution image reconstruction method and system based on improved wide-depth neural network

Country Status (1)

Country Link
CN (1) CN112288630A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508137A (en) * 2021-02-09 2021-03-16 南方电网数字电网研究院有限公司 Transformer abnormality detection method and device, computer equipment and storage medium
CN112801919A (en) * 2021-03-22 2021-05-14 恒生电子股份有限公司 Image defogging model training method, defogging processing method and device and storage medium
CN113052189A (en) * 2021-03-30 2021-06-29 电子科技大学 Improved MobileNet V3 feature extraction network
CN113538527A (en) * 2021-07-08 2021-10-22 上海工程技术大学 Efficient lightweight optical flow estimation method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111754403A (en) * 2020-06-15 2020-10-09 南京邮电大学 Image super-resolution reconstruction method based on residual learning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111754403A (en) * 2020-06-15 2020-10-09 南京邮电大学 Image super-resolution reconstruction method based on residual learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JUAN DU: "Lightweight Image Super-Resolution With Mobile Share-Source Network", IEEE ACCESS *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508137A (en) * 2021-02-09 2021-03-16 南方电网数字电网研究院有限公司 Transformer abnormality detection method and device, computer equipment and storage medium
CN112801919A (en) * 2021-03-22 2021-05-14 恒生电子股份有限公司 Image defogging model training method, defogging processing method and device and storage medium
CN113052189A (en) * 2021-03-30 2021-06-29 电子科技大学 Improved MobileNet V3 feature extraction network
CN113052189B (en) * 2021-03-30 2022-04-29 电子科技大学 Improved MobileNet V3 feature extraction network
CN113538527A (en) * 2021-07-08 2021-10-22 上海工程技术大学 Efficient lightweight optical flow estimation method
CN113538527B (en) * 2021-07-08 2023-09-26 上海工程技术大学 Efficient lightweight optical flow estimation method, storage medium and device

Similar Documents

Publication Publication Date Title
CN108765296B (en) Image super-resolution reconstruction method based on recursive residual attention network
Wang et al. Esrgan: Enhanced super-resolution generative adversarial networks
CN112288630A (en) Super-resolution image reconstruction method and system based on improved wide-depth neural network
CN108537733B (en) Super-resolution reconstruction method based on multi-path deep convolutional neural network
CN110276721A (en) Image super-resolution rebuilding method based on cascade residual error convolutional neural networks
CN108259994B (en) Method for improving video spatial resolution
CN110232653A (en) The quick light-duty intensive residual error network of super-resolution rebuilding
CN109035146A (en) A kind of low-quality image oversubscription method based on deep learning
CN112001843B (en) Infrared image super-resolution reconstruction method based on deep learning
CN115100039B (en) Lightweight image super-resolution reconstruction method based on deep learning
CN115393191A (en) Method, device and equipment for reconstructing super-resolution of lightweight remote sensing image
CN116958534A (en) Image processing method, training method of image processing model and related device
CN115222614A (en) Priori-guided multi-degradation-characteristic night light remote sensing image quality improving method
CN113379606B (en) Face super-resolution method based on pre-training generation model
CN111461978A (en) Attention mechanism-based resolution-by-resolution enhanced image super-resolution restoration method
CN113096015B (en) Image super-resolution reconstruction method based on progressive perception and ultra-lightweight network
CN111951171A (en) HDR image generation method and device, readable storage medium and terminal equipment
CN114359039A (en) Knowledge distillation-based image super-resolution method
CN113724134A (en) Aerial image blind super-resolution reconstruction method based on residual distillation network
Yang et al. RSAMSR: A deep neural network based on residual self-encoding and attention mechanism for image super-resolution
Wang et al. Image quality enhancement using hybrid attention networks
CN113436198A (en) Remote sensing image semantic segmentation method for collaborative image super-resolution reconstruction
CN108596831B (en) Super-resolution reconstruction method based on AdaBoost example regression
CN110853040A (en) Image collaborative segmentation method based on super-resolution reconstruction
CN115601242B (en) Lightweight image super-resolution reconstruction method suitable for hardware deployment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210129)