CN112200724A - Single-image super-resolution reconstruction system and method based on feedback mechanism - Google Patents

Single-image super-resolution reconstruction system and method based on feedback mechanism

Info

Publication number
CN112200724A
CN112200724A
Authority
CN
China
Prior art keywords
image
resolution
super
features
deep
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011139130.5A
Other languages
Chinese (zh)
Other versions
CN112200724B (en)
Inventor
Wang Jin
Wu Yiming
Wang Liu
Chen Zeyu
Chen Yuantao
Zhang Jingyu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University of Science and Technology filed Critical Changsha University of Science and Technology
Priority to CN202011139130.5A priority Critical patent/CN112200724B/en
Publication of CN112200724A publication Critical patent/CN112200724A/en
Application granted granted Critical
Publication of CN112200724B publication Critical patent/CN112200724B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformation in the plane of the image
    • G06T 3/40 Scaling the whole image or part thereof
    • G06T 3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a single-image super-resolution reconstruction system and method based on a feedback mechanism. A feedback mechanism is adopted: the first iteration over the low-resolution image is formed by a shallow feature extraction module, a first deep feature extraction module, and a first reconstruction module; the second iteration is formed by a feature refinement module, a second deep feature extraction module, and a second reconstruction module. The feature refinement module refines the deep feature maps extracted in the first iteration into the shallow feature maps of the second iteration, so that deeper features of the low-resolution image can be extracted without deepening the network, thereby improving the training effect of the image network model.

Description

Single-image super-resolution reconstruction system and method based on feedback mechanism
Technical Field
The invention relates to the technical field of computer image super-resolution processing, in particular to a single image super-resolution reconstruction system and method based on a feedback mechanism.
Background
Single-image super-resolution (SISR) reconstruction algorithms aim to restore a low-resolution picture to a high-resolution image with good visual quality through a series of algorithms. In fact, single-image super-resolution is an ill-posed problem: for any low-resolution image, there may be an infinite number of corresponding high-resolution images. Conventional single-image super-resolution methods include interpolation-based methods, reconstruction-model-based methods, and learning-based methods. Interpolation-based methods use a basis function or interpolation kernel to approximate the lost high-frequency image information; common interpolation methods include nearest-neighbor interpolation, bilinear interpolation, and bicubic interpolation. Reconstruction-model-based methods make the ill-posed problem solvable by injecting prior knowledge of the image into the super-resolution reconstruction process; examples include kernel estimation methods and methods that use the sparsity of image gradients as prior knowledge. Learning-based methods learn the mapping between low-resolution and high-resolution images from a training image data set in order to predict the high-frequency information missing from the low-resolution image and thus reconstruct a high-resolution image; examples include machine learning, sparse representation, and coupled dictionary training.
Although traditional single-image super-resolution methods can realize high-resolution image reconstruction, as the magnification factor increases, artificially defined prior knowledge and observation models provide less and less of the high-frequency information needed for reconstruction, so it is difficult for traditional methods to improve the reconstruction quality further. The earliest deep-learning-based single-image super-resolution algorithm is the convolutional-neural-network-based super-resolution reconstruction (SRCNN) method proposed by Dong et al. in 2015, which uses a three-layer convolutional structure to map from low resolution to high resolution and achieves better reconstruction than traditional methods. New algorithms have been proposed continuously since then, but current deep-learning-based single-image super-resolution algorithms share the following shortcoming: feature learning considers only forward propagation; even in RNN-style networks such as DRCN, the transfer from shallow to deep features is feed-forward and does not exploit the feedback mechanisms that are ubiquitous in the human visual system.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art. Therefore, the invention provides a single-image super-resolution reconstruction system and method based on a feedback mechanism.
In a first aspect of the present invention, a single-image super-resolution reconstruction system based on a feedback mechanism is provided, which is used for training an image network model, and includes:
the shallow feature extraction module is used for extracting shallow features of the low-resolution image subjected to mean shift segmentation;
a first deep feature extraction module for extracting deep features of the low resolution image from the shallow features;
the first reconstruction module is used for reconstructing the deep features output by the first deep feature extraction module to obtain a first super-resolution image, and the first super-resolution image is used for calculating a loss function of the image network model;
the feature refinement module is used for performing concatenation and convolution operations on the shallow features and the deep features extracted by the first deep feature extraction module to obtain refined features;
a second deep feature extraction module for extracting deep features of the low resolution image from the refined features;
and the second reconstruction module is used for reconstructing the deep features output by the second deep feature extraction module to obtain a second super-resolution image, wherein the second super-resolution image is used for calculating a loss function of the image network model and is output as the super-resolution image of the image network model.
According to the embodiment of the invention, at least the following technical effects are achieved:
the system adopts a feedback mechanism, and forms the first iteration of the low-resolution image through a shallow layer feature extraction module, a first deep layer feature extraction module and a first reconstruction module; the system can refine the deep feature mapping extracted by the first iteration into the shallow feature mapping of the second iteration by the aid of the second iteration of the low-resolution image formed by the feature refining module, the second deep feature extraction module and the second reconstruction modeling module, and can extract deeper features of the low-resolution image under the condition of not deepening the network depth, so that the training effect of the image network model is improved.
In a second aspect of the present invention, a method for reconstructing a single-image super-resolution based on a feedback mechanism is provided, which includes the following steps:
the first iteration:
extracting shallow features of the low-resolution image subjected to mean shift segmentation through convolution operation;
extracting deep features of the low-resolution image from the shallow features, and performing first reconstruction on the deep features to obtain a first super-resolution image, wherein the first super-resolution image is used for calculating a loss function of the image network model;
and (3) second iteration:
performing concatenation and convolution operations on the deep features and the shallow features from the first iteration to obtain refined features;
and extracting deep features of the low-resolution image from the refined features, and performing a second reconstruction on the deep features to obtain a second super-resolution image, wherein the second super-resolution image is used for calculating a loss function of the image network model and is output as the super-resolution image of the image network model.
According to the embodiment of the invention, at least the following technical effects are achieved:
the method adopts a feedback mechanism, can refine the deep feature mapping extracted by the first iteration into the shallow feature mapping extracted by the second iteration, and can extract deeper features of the low-resolution image under the condition of not deepening the network depth, thereby improving the training effect of the image network model.
In a third aspect of the present invention, an image network model is provided, which uses the feedback mechanism-based single image super-resolution reconstruction system according to the first aspect of the present invention or the feedback mechanism-based single image super-resolution reconstruction method according to the second aspect of the present invention during training.
According to the embodiment of the invention, at least the following technical effects are achieved:
in the model, the deep-layer extracted feature mapping can refine the shallow feature mapping of the next iteration, and deeper features can be extracted under the condition of not deepening the network depth.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic structural diagram of a single-image super-resolution reconstruction system based on a feedback mechanism according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an implementation of the system of FIG. 1;
fig. 3 is a schematic structural diagram of a Ghost residual dense module according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a Ghost Module according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an attention module according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a feature refinement module provided in an embodiment of the present invention;
fig. 7 is a schematic diagram of an experimental result provided in an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
Referring to fig. 1 to 6, an embodiment of the present invention provides a single image super-resolution reconstruction system based on a feedback mechanism, for training an image network model, including: the shallow layer feature extraction module, the first deep layer feature extraction module, the first reconstruction module, the feature refining module, the second deep layer feature extraction module and the second reconstruction module are specifically as follows:
the shallow feature extraction module is used for extracting shallow features of the low-resolution image after mean shift segmentation.
The red, green and blue channels of the low-resolution image are separated into three feature channels through mean shift, and the separated feature channels are input into the shallow feature extraction module for shallow feature extraction. In addition, the mean-shifted feature channels are upsampled with a bilinear interpolation algorithm in preparation for the subsequent image reconstruction.
As an alternative implementation, the shallow feature extraction module performs two convolution operations: the convolution kernel of the first convolutional layer is 3×3, that of the second convolutional layer is 1×1, and both layers have 64 feature channels. This can be represented by the following formula:
F_0 = f_shallow(I_LR)  (1)
where I_LR denotes the input low-resolution picture, f_shallow(·) denotes the shallow feature extraction function, and F_0 denotes the output of the shallow feature extraction module.
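A minimal PyTorch sketch of this shallow feature extraction module follows (an illustration, not the patented implementation; the class name and the padding choice that preserves spatial size are assumptions):

```python
import torch
import torch.nn as nn

class ShallowFeatureExtractor(nn.Module):
    """F_0 = f_shallow(I_LR), formula (1): a 3x3 then a 1x1 convolution,
    both with 64 feature channels (padding assumed to preserve size)."""
    def __init__(self, in_channels=3, num_features=64):
        super().__init__()
        self.conv3 = nn.Conv2d(in_channels, num_features, kernel_size=3, padding=1)
        self.conv1 = nn.Conv2d(num_features, num_features, kernel_size=1)

    def forward(self, i_lr):
        return self.conv1(self.conv3(i_lr))
```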
The first deep feature extraction module is used for extracting deep features of the low-resolution image from the shallow features.
As an alternative implementation, the first deep feature extraction module may be composed of conventional RDB (residual dense block) modules from the deep feature extraction field.
As an optional implementation, the first deep feature extraction module comprises a plurality of interconnected Ghost residual dense modules, each of which contains a plurality of densely connected Ghost Modules, and extracts deep features of the low-resolution image from the shallow features. For ease of understanding, the Ghost residual dense module is denoted GRDB hereinafter and in the drawings. The GRDB is designed on the basis of the conventional RDB module; the GRDB provided in this embodiment replaces the 3×3 convolution structure in the RDB with a Ghost Module. As shown in fig. 3, the primary role of the GRDB is to extract edge and texture details in the feature maps, which can be represented by the following formula:
F_k^1 = f_GRDB_k(F_{k-1}^1), k = 1, …, m, with F_0^1 = F_0  (2)
where F_1^1, …, F_m^1 denote the outputs of the first through last GRDB in the first deep feature extraction module. In this embodiment m is set to 8 (8 GRDBs are taken as an example, but the number is not limited to 8), and the number of Ghost Modules in one GRDB is likewise set to 8 (also an example, not a limitation). Because the 8 Ghost Modules are densely connected, i.e. each module receives the concatenated feature maps of all preceding modules, the final output of the block is a 64-layer feature map.
As shown in fig. 4, each Ghost Module is composed of a 1×1 convolution and a 3×3 convolution. Its aim is to remove redundant channels from the feature maps, learn the discarded redundant channels from the retained feature maps by convolution, and finally concatenate the learned feature maps with the original ones to restore the input channel count. This not only discards similar feature-map channels produced by convolution and concentrates on the useful ones, but also reduces the parameter count and computation of the network, making it suitable for a lightweight network. For example: a 64-layer input feature map passes through a 1×1 convolution to obtain a 32-layer feature map with half of the redundancy removed (Feat1); this 32-layer feature map is then processed by a 3×3 grouped convolution with 32 groups, outputting another 32-layer feature map (Feat2); finally, the redundancy-removed 32-layer map (Feat1) and the 32-layer map from the 3×3 convolution (Feat2) are concatenated, so the output feature map keeps the same channel count as the input. The above process can be represented by the following formulas:
F_GM = concat(Feat1, Feat2)  (3)
Feat1 = W_1×1(I_F)  (4)
Feat2 = W_3×3(Feat1)  (5)
where Feat1 denotes the primary convolution (primary-conv), consisting of a convolution with kernel size 1×1 and a ReLU activation function; I_F denotes the input feature map of the Ghost Module; Feat2 denotes the cheap operation, consisting of a convolution with kernel size 3×3 and a ReLU activation function; and F_GM denotes the output of the Ghost Module.
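A sketch of this Ghost Module in PyTorch (equations (3)-(5)); the class name is illustrative, and the input width is parameterized as an assumption because the dense connections inside a GRDB widen each module's input:

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """A 1x1 primary conv halves the channels (Feat1); a 3x3 grouped conv
    generates the remaining maps (Feat2); concatenation restores the output
    channel count: F_GM = concat(Feat1, Feat2)."""
    def __init__(self, in_channels=64, out_channels=64):
        super().__init__()
        half = out_channels // 2
        self.primary = nn.Sequential(               # Feat1, formula (4)
            nn.Conv2d(in_channels, half, kernel_size=1),
            nn.ReLU(inplace=True),
        )
        self.cheap = nn.Sequential(                 # Feat2, formula (5)
            nn.Conv2d(half, half, kernel_size=3, padding=1, groups=half),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        feat1 = self.primary(x)
        feat2 = self.cheap(feat1)
        return torch.cat([feat1, feat2], dim=1)     # formula (3)
```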
As an alternative embodiment, an attention module (SCM) integrating spatial and channel attention mechanisms is added at the end of each GRDB. The purpose of adding the attention module is to let the network focus more on adjusting the useful information and high-frequency information in space and channels, enhance the expressive power of the feature maps, effectively recover more high-frequency details such as textures and contours, and obtain a better high-resolution image reconstruction effect. By scope of application, attention mechanisms can be divided into channel attention and spatial attention, and this embodiment combines both in the present system. The output of the 8th Ghost Module is compressed by a 1×1 convolution and sent to the attention module; the attention module's output feature map is multiplied point-wise by a residual weight factor, the product is added element-wise to the input of the GRDB, and the sum is taken as the output of the GRDB, which can be represented by the following formula:
F_Ghost = α · F_SCM(W_1×1(G_d)) + G_0  (6)
where G_0 denotes the input of the GRDB; W_1×1(G_d) denotes a convolution with kernel size 1×1, including a ReLU activation function; F_SCM denotes the attention module; α denotes the residual weight factor, with α = 0.2; and G_d denotes the output of the last Ghost Module, which can be represented by the following formula:
G_d = concat(F_GM(G_{d-1}), G_{d-1}, …, G_1, G_0)  (7)
where G_1 to G_{d-1} denote the outputs of the 1st to (d-1)-th Ghost Modules, F_GM denotes the Ghost Module operation, and concat denotes the concatenation operation.
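Combining formulas (6) and (7), a sketch of one GRDB using the GhostModule class above; the SCM attention module is sketched after the attention description below, and the dense-concatenation layout here is one plausible reading of formula (7):

```python
import torch
import torch.nn as nn

class GRDB(nn.Module):
    """d densely connected Ghost Modules, a 1x1 fusion convolution, an SCM
    attention module, and a residual connection scaled by alpha = 0.2."""
    def __init__(self, channels=64, num_ghost=8, alpha=0.2, attention=None):
        super().__init__()
        # Each Ghost Module sees the block input concatenated with all
        # previous Ghost Module outputs, so its input width grows.
        self.ghosts = nn.ModuleList(
            GhostModule(channels * (i + 1), channels) for i in range(num_ghost)
        )
        self.fuse = nn.Sequential(                  # W_1x1 in formula (6)
            nn.Conv2d(channels * (num_ghost + 1), channels, kernel_size=1),
            nn.ReLU(inplace=True),
        )
        self.attention = attention if attention is not None else nn.Identity()
        self.alpha = alpha

    def forward(self, g0):
        feats = [g0]
        for ghost in self.ghosts:
            # G_d = concat(F_GM(G_{d-1}), G_{d-1}, ..., G_1, G_0), formula (7)
            feats.append(ghost(torch.cat(feats, dim=1)))
        fused = self.fuse(torch.cat(feats, dim=1))
        # F_Ghost = alpha * F_SCM(W_1x1(G_d)) + G_0, formula (6)
        return self.alpha * self.attention(fused) + g0
```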
As shown in fig. 4 and 5, the GRDB integrates spatial attention with channel attention using 3×3 and 1×1 convolutions. Channel attention first applies global average pooling to the input feature maps, pooling maps of original size 48×48×64 into 64-channel single-pixel values of size 1×1×64; a 1×1 convolution then compresses the channel count from 64 layers to 4, which are passed through a linear rectification function (ReLU); finally another 1×1 convolution expands the compressed 4 layers back to the original 64, so the output of channel attention is a channel weight vector of size 1×1×64. Spatial attention first applies a grouped convolution with a 3×3 kernel and 64 groups to the incoming feature map, followed by a further 1×1 convolution. The spatial and channel attention outputs are added element-wise, and a sigmoid function adjusts the weight of each channel and each pixel into the range 0-1. The sigmoid output is then multiplied point-wise with the input of the attention module so as to retain the important channels and spatial pixels. The above process can be represented by the following formulas:
F_SCM = x_c · σ(F_SA + F_CA)  (8)
F_SA = W_p(W_g(x_c))  (9)
F_CA = W_U(δ(W_D(F_GAP(x_c))))  (10)
where F_SCM denotes the output of the attention module, x_c the input of the attention module, and σ the sigmoid function. F_SA denotes the output of the spatial attention operation, W_p a point-wise convolution with kernel size 1×1, and W_g a grouped convolution with kernel size 3×3 and 64 groups. F_CA denotes the output of the channel attention operation, W_D the 1×1 convolution that compresses the channel count to 4, W_U the 1×1 convolution that expands the channel count back to 64, and δ the ReLU activation function. F_GAP denotes the global average pooling operation, which can be represented by the following formula:
F_GAP(x_c) = (1 / (H × W)) · Σ_{i=1}^{H} Σ_{j=1}^{W} x_c(i, j)  (11)
where H and W denote the height and width of the input feature map, and i and j denote the horizontal and vertical coordinates of the pixels.
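A sketch of the SCM attention module of formulas (8)-(11), assuming a 64-channel input as in the example above (class and attribute names are illustrative):

```python
import torch
import torch.nn as nn

class SCM(nn.Module):
    """Channel attention (GAP -> 1x1 conv 64->4 -> ReLU -> 1x1 conv 4->64)
    plus spatial attention (3x3 grouped conv with 64 groups -> 1x1 conv),
    summed, gated by a sigmoid, and multiplied point-wise with the input."""
    def __init__(self, channels=64, reduced=4):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)                       # F_GAP, (11)
        self.w_d = nn.Conv2d(channels, reduced, kernel_size=1)   # 64 -> 4
        self.relu = nn.ReLU(inplace=True)
        self.w_u = nn.Conv2d(reduced, channels, kernel_size=1)   # 4 -> 64
        self.w_g = nn.Conv2d(channels, channels, kernel_size=3,
                             padding=1, groups=channels)         # grouped 3x3
        self.w_p = nn.Conv2d(channels, channels, kernel_size=1)  # point-wise

    def forward(self, x):
        f_ca = self.w_u(self.relu(self.w_d(self.gap(x))))        # formula (10)
        f_sa = self.w_p(self.w_g(x))                             # formula (9)
        # F_SCM = x * sigmoid(F_SA + F_CA), formula (8); the 1x1x64 channel
        # weights broadcast over the spatial map when added.
        return x * torch.sigmoid(f_sa + f_ca)
```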
The first reconstruction module is used for reconstructing the deep features output by the first deep feature extraction module to obtain a first super-resolution image, and the first super-resolution image is used for calculating a loss function of the image network model.
The first reconstruction module mainly combines the deep features output by the first deep feature extraction module with the low-resolution image upsampled by bilinear interpolation to reconstruct a first super-resolution image. The reconstruction process comprises one transposed convolution operation and one convolutional layer with a 3×3 kernel; the extracted features are then added to the interpolation-upsampled low-resolution picture, and the result is output as the reconstructed super-resolution picture, which can be represented by the following formula:
I_SR^t = f_RB(F_m^t) + f_UP(I_LR)  (12)
where I_LR denotes the input low-resolution picture, f_UP the interpolation upsampling operation, F_m^t the output of the deep feature extraction module, f_RB the image reconstruction function, and I_SR^t the super-resolution picture output by the t-th iteration, with t = 1 for the first reconstruction module and t = 2 for the second reconstruction module.
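A sketch of this reconstruction module for formula (12); choosing kernel size equal to stride for the transposed convolution is an assumption made so the output is exactly `scale` times larger:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReconstructionModule(nn.Module):
    """A transposed convolution upsamples the deep features, a 3x3 convolution
    maps them to RGB, and the bilinearly upsampled LR image is added:
    I_SR^t = f_RB(F^t) + f_UP(I_LR)."""
    def __init__(self, channels=64, out_channels=3, scale=2):
        super().__init__()
        self.scale = scale
        self.up = nn.ConvTranspose2d(channels, channels,
                                     kernel_size=scale, stride=scale)
        self.conv = nn.Conv2d(channels, out_channels, kernel_size=3, padding=1)

    def forward(self, deep_features, i_lr):
        features = self.conv(self.up(deep_features))               # f_RB
        upsampled = F.interpolate(i_lr, scale_factor=self.scale,
                                  mode='bilinear', align_corners=False)  # f_UP
        return features + upsampled                                # formula (12)
```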
Calculation of the loss function: the super-resolution picture outputs of the two iterations are each compared against the original high-resolution picture with an L1 loss, and the two losses are averaged, which can be represented by the following formula:
L(Θ) = (1/T) · Σ_{t=1}^{T} || I_HR − I_SR^t ||_1  (13)
where L(Θ) is the L1 loss function, Θ denotes the network parameters of the image network model, T = 2 is the total number of iterations, t is the iteration index, and I_HR and I_SR^t respectively denote the original high-resolution picture (the picture used for training the image network model) and the super-resolution reconstructed picture. The image network model and the loss function are not described in further detail here.
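A sketch of formula (13) with T = 2, as a plain function (the name is illustrative):

```python
import torch

def l1_feedback_loss(sr_outputs, i_hr):
    """sr_outputs: [I_SR^1, I_SR^2], the SR pictures of the two iterations;
    i_hr: the original high-resolution picture. Returns the averaged L1 loss."""
    return sum(torch.mean(torch.abs(sr - i_hr)) for sr in sr_outputs) / len(sr_outputs)
```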
The feature refinement module is used for performing concatenation and convolution operations on the shallow features and the deep features extracted by the first deep feature extraction module to obtain refined features.
As an alternative embodiment, as shown in fig. 6, the feature refinement module performs two concatenations and two convolutions: the first concatenation joins the deep features output by several GRDBs and feeds the result to the first convolutional layer; the second concatenation joins the output of the first convolutional layer with the shallow features and feeds the result to the second convolutional layer.
Based on the above embodiment, the outputs of the last 4 GRDBs of the previous iteration are fed into the concatenation operation (4 is taken as an example; the number is not limited to 4, nor to the last 4), and the concatenated feature map undergoes a 1×1 convolution that compresses the channel count from the 64×4 layers after concatenation to 64 layers. The convolved feature map is then passed into the current iteration and concatenated with the current iteration's shallow feature output; the concatenated map undergoes another 1×1 convolution that compresses the channel count from 64×2 layers to 64 layers, and the result is finally passed to the second deep feature extraction module of the current iteration. This can be represented by the following formula:
F_GFM = f_refine(F_{m-b}^1, …, F_m^1, F_0)  (14)
where F_{m-b}^1, …, F_m^1 denote the outputs of the (m-b)-th through last GRDB of the first deep feature extraction module in the first iteration, f_refine(·) denotes the feature refinement function, and F_GFM denotes the output of the feature refinement module.
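A sketch of this feature refinement module under the embodiment's numbers (4 fed-back GRDB outputs, 64 channels each; the class name is illustrative):

```python
import torch
import torch.nn as nn

class FeatureRefinement(nn.Module):
    """Concatenate the fed-back deep features and compress 64x4 -> 64 with a
    1x1 conv, then concatenate with the shallow features and compress
    64x2 -> 64 with another 1x1 conv, yielding F_GFM of formula (14)."""
    def __init__(self, channels=64, num_feedback=4):
        super().__init__()
        self.compress_deep = nn.Conv2d(channels * num_feedback, channels,
                                       kernel_size=1)
        self.compress_all = nn.Conv2d(channels * 2, channels, kernel_size=1)

    def forward(self, deep_feats, shallow):
        # deep_feats: list of the last `num_feedback` GRDB outputs (iteration 1)
        fused = self.compress_deep(torch.cat(deep_feats, dim=1))
        return self.compress_all(torch.cat([fused, shallow], dim=1))  # F_GFM
```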
The second deep feature extraction module is used for extracting deep features of the low-resolution image from the refined features.
As an optional implementation, the second deep feature extraction module has the same structure as the first: it comprises a plurality of interconnected GRDBs, each containing a plurality of densely connected Ghost Modules. This can be represented by the following formula:
F_k^2 = f_GRDB_k(F_{k-1}^2), k = 1, …, m, with F_0^2 = F_GFM  (15)
where F_1^2, …, F_m^2 denote the outputs of the first through last GRDB in the second deep feature extraction module.
Likewise, each GRDB incorporates an attention module (SCM) integrating spatial and channel attention, which is not described again here.
The second reconstruction module is used for reconstructing the deep features output by the second deep feature extraction module to obtain a second super-resolution image; the second super-resolution image is used for calculating the loss function of the image network model and is output as the super-resolution image of the image network model.
The loss function is calculated as in formula (13). Like the first reconstruction module, the second reconstruction module comprises one transposed convolution operation and one convolutional layer, both with 3×3 kernels; the extracted features are then added to the interpolation-upsampled low-resolution picture, and the result is output as the reconstructed super-resolution picture. The process and its advantages are not described again here.
The system of this embodiment improves on the trade-off among visual quality, parameter count, and running time of the reconstructed image that is ubiquitous in existing super-resolution reconstruction algorithms, with the following beneficial effects:
(1) The system adds a feedback mechanism: the first iteration over the low-resolution image is formed by the shallow feature extraction module, the first deep feature extraction module, and the first reconstruction module; the second iteration is formed by the feature refinement module, the second deep feature extraction module, and the second reconstruction module, which refines the deep feature maps extracted in the first iteration into the shallow feature maps of the second iteration. Deeper features of the low-resolution image can thus be extracted without deepening the network, improving the training effect of the image network model.
(2) The system designs a GRDB composed of densely connected Ghost Modules, and then builds the deep feature extraction modules from interconnected GRDBs, so the system achieves the feature extraction capability of ordinary convolution while removing redundant feature channels and reducing the parameter count of the network.
(3) The system adds an attention mechanism to each GRDB, so the network can concentrate on adjusting the useful information and high-frequency information in space and channels, enhancing the expressive power of the feature maps, effectively recovering more high-frequency details such as textures and contours, and obtaining a better high-resolution image reconstruction effect. Moreover, thanks to the Ghost Module, the parameter increase introduced by the attention mechanism is negligible.
(4) Compared with existing models such as VDSR, DRRN, CARN, and IMDN, the system not only achieves the best visual effect but also surpasses these super-resolution models on the objective evaluation metrics PSNR and SSIM.
The embodiment of the invention provides a single-image super-resolution reconstruction method based on a feedback mechanism, which is used for training an image network model and comprises the following steps:
s100, first iteration:
extracting shallow features of the low-resolution image subjected to mean shift segmentation through convolution operation;
extracting deep features of the low-resolution image from the shallow features, and performing first reconstruction on the deep features to obtain a first super-resolution image, wherein the first super-resolution image is used for calculating a loss function of the image network model;
s200, second iteration:
performing concatenation and convolution operations on the deep features and the shallow features from the first iteration to obtain refined features;
and extracting deep features of the low-resolution image from the refined features, and performing a second reconstruction on the deep features to obtain a second super-resolution image, wherein the second super-resolution image is used for calculating a loss function of the image network model and is output as the super-resolution image of the image network model.
As an alternative embodiment, the extracting deep features of the low-resolution image from the shallow features includes: and extracting deep features of the low-resolution image from the shallow features through a plurality of interconnected GRDBs, wherein each GRDB comprises a plurality of densely connected Ghost modules.
As an alternative embodiment, the extracting deep features of the low resolution image from the refined features includes: and extracting deep features of the low-resolution image from the refined features through a plurality of interconnected GRDBs, wherein each GRDB comprises a plurality of densely connected Ghost modules.
As an optional implementation, before the first reconstruction and the second reconstruction, the method further includes: applying an attention mechanism that integrates spatial and channel attention to the deep features.
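To make the two-iteration flow concrete, the following sketch composes the illustrative classes defined in the system embodiment above (ShallowFeatureExtractor, GhostModule, GRDB, SCM, ReconstructionModule, FeatureRefinement) into a full forward pass. Whether the two deep feature extraction modules share weights is not stated, so separate modules are instantiated here as the claims suggest; all names are illustrative:

```python
import torch
import torch.nn as nn

class FeedbackSRNet(nn.Module):
    def __init__(self, num_grdb=8, num_feedback=4, scale=2):
        super().__init__()
        self.shallow = ShallowFeatureExtractor()
        self.deep1 = nn.ModuleList(GRDB(attention=SCM()) for _ in range(num_grdb))
        self.deep2 = nn.ModuleList(GRDB(attention=SCM()) for _ in range(num_grdb))
        self.refine = FeatureRefinement(num_feedback=num_feedback)
        self.rebuild1 = ReconstructionModule(scale=scale)
        self.rebuild2 = ReconstructionModule(scale=scale)
        self.num_feedback = num_feedback

    @staticmethod
    def _run(blocks, feats):
        outputs = []
        for block in blocks:
            feats = block(feats)
            outputs.append(feats)
        return outputs

    def forward(self, i_lr):
        f0 = self.shallow(i_lr)
        outs1 = self._run(self.deep1, f0)                       # first iteration
        sr1 = self.rebuild1(outs1[-1], i_lr)
        refined = self.refine(outs1[-self.num_feedback:], f0)   # feedback
        outs2 = self._run(self.deep2, refined)                  # second iteration
        sr2 = self.rebuild2(outs2[-1], i_lr)
        return sr1, sr2   # both enter the loss; sr2 is the final output
```

During training, `l1_feedback_loss([sr1, sr2], i_hr)` from the sketch above would then implement formula (13).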
It should be noted that the present embodiment and the above system embodiment are based on the same inventive concept, and therefore, the related contents of the above system embodiment are also applicable to the present embodiment, and are not described herein again.
An embodiment of the present invention provides an image network model, and the model uses the feedback mechanism-based single image super-resolution reconstruction method described in the above method embodiment or the feedback mechanism-based single image super-resolution reconstruction system described in the above system embodiment during training. The training process and the testing process (including the processing of the training set) of the image network model provided by the embodiment are abstracted as follows:
1. training an image network model;
1.1, downsampling the high-resolution picture data set by an interpolation-based method to obtain a corresponding low-resolution picture data set.
1.2, setting the magnification factor (2 times, 3 times, 4 times and the like) of the image network model and the target path of the high-resolution and low-resolution picture data sets.
1.3. Input the processed picture data set into the image network model in blocks and perform nearest-neighbor interpolation upscaling.
And 1.4, inputting the picture data set into the system of the embodiment, and executing corresponding operation to obtain a super-resolution image.
1.5. Train the convolution operators with the loss function, using the corresponding high-resolution picture data set as supervision for the super-resolution images.
And 1.6, obtaining a network model with corresponding magnification through a plurality of rounds of training.
2. Using a network model;
and 2.1, reconstructing the low-resolution image into a high-resolution image by using the trained corresponding magnification image network model according to the magnification to be amplified.
As shown in fig. 7, based on the above embodiments, the embodiments of the present invention provide a set of simulation experiments, which are specifically as follows:
simulating an environment;
the platform used was the ubantu16.08 operating system, the memory size was 128GB, the CPU used intel to strong E5-2670, the GPU used intemada TITANX, and trained in the pytorch0.4.0 deep learning environment of the GPU version. In the embodiment, a network parameter weight is initialized by adopting a method of Hommin et al, an Adam algorithm is adopted to optimize network parameters, the batch processing size is set to be 16, the image block size is set to be 48x48, the initial learning rate is 10-4, the learning rate is reduced to half of the original rate every 200 times of iterative training, and 1000 times of total iteration is performed.
Simulation data set;
the adopted training set is DIV2K, the data set is published in NITRE challenge in 2017 and is used as a high-quality image data set for an image repairing task, and each image achieves the standard of 2K resolution ratio through 800 training sets, 100 verification sets and 100 test set pictures. The picture types of the DIV2K include characters, handmade products, environments (cities, villages), animals and plants, natural scenery, and the like. This example uses only 800 training set pictures for training and performs data enhancement before training. Data enhancement adopts three modes, namely 1, randomly rotating the picture by 90 degrees, 180 degrees and 270 degrees; 2. horizontally or vertically turning the picture; 3. the original image is reduced by a factor of reduction factors of 0.9, 0.8, 0.7 and 0.6. The training set after data enhancement is 10 times of the original picture, i.e. 8000 pictures. And finally, carrying out bilinear downsampling operation of different multiples (2 times, 3 times and 4 times) on the high-definition picture after data enhancement to obtain a low-resolution picture, and forming a training data pair with the original high-definition picture.
The test sets adopted are Set5, Set14, BSD100, Urban100, and Manga109, five widely used super-resolution benchmark test sets for model performance evaluation. Urban100 contains 100 challenging urban scene pictures with dense high-frequency details. Manga109 consists of 109 manga cover pictures containing high-frequency and low-frequency information as well as text, and tests the model's comprehensive ability to process text and pictures.
Experimental results;
the test set is adopted for testing, and common performance evaluation indexes are used: peak Signal to Noise Ratio (PSNR) and Structural Similarity (SSIM), and selecting Y channel for performance evaluation under YCbCr color coding format. In this embodiment, the models trained with different magnifications are used to perform the test of the corresponding magnification, and the test result is compared with the super-resolution model advanced in the coming year, including Bicubic (Bicubic), srncn, DRRN, IDN, cari, IMDN, and other models. Table 1 below shows the results of quantitative comparison of different super-resolution models using PSNR and SSIM at three different magnifications x2, x3, and x4, where bold and dash lines indicate the best results in the above algorithm:
[Table 1: quantitative PSNR/SSIM comparison of the super-resolution models at ×2, ×3, and ×4 magnification on the five benchmark test sets; rendered as an image in the original publication.]
As can be seen from Table 1, on most of the above test sets the PSNR and SSIM of the model of this embodiment exceed those of the other super-resolution models or reach second-best performance. The difficulty of image super-resolution reconstruction grows with the magnification factor, and the model of this embodiment is superior to most models at 3× and 4× magnification. At 4× magnification on the Manga109 test set, the PSNR of this embodiment is 2.99 dB higher than the classical SRCNN and 0.13 dB higher than the recent IMDN. The model can recover high-frequency details that are hard to learn at high magnification because the feedback mechanism introduces the deeply extracted features of the previous iteration into the current one, deepening the learning of high-frequency information and thus yielding a good reconstruction effect at high magnification. The gain on the Urban100 test set is clearly higher than on the other data sets because Urban100 contains pictures of urban buildings with more high-frequency details; the attention mechanism in this model can screen and retain the channels and spatial locations carrying more high-frequency information, giving a better reconstruction effect on Urban100. The GRDB in this model uses densely connected Ghost Modules, which remove redundant feature channels and reduce the parameter count of the network.
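The Y-channel evaluation protocol described above can be sketched as follows (BT.601 luma conversion as commonly used in super-resolution evaluation; SSIM omitted for brevity; inputs assumed to be float arrays in [0, 255]):

```python
import numpy as np

def rgb_to_y(img):
    # ITU-R BT.601 luma: the Y channel of YCbCr as used in SR evaluation.
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return 16.0 + (65.738 * r + 129.057 * g + 25.064 * b) / 256.0

def psnr_y(sr, hr):
    mse = np.mean((rgb_to_y(sr) - rgb_to_y(hr)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)
```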
Model                DRCN    IDN     SRMDNF  CARN    SRRAM   This example
Parameters at ×3 (K) 1774    553     1528    1592    1127    1197
PSNR (dB)            27.15   27.42   27.57   28.06   28.12   28.23

TABLE 2
As can be seen from Table 2, at 3× magnification on the Urban100 test set, the model of this embodiment obtains the best PSNR score while keeping the parameter count low, indicating that, on top of its better reconstruction effect, the model is well suited to mobile devices with limited storage space.
To further illustrate the experimental effect, this embodiment selects pictures from the Urban100 data set for comparison; the data set contains information about cities and buildings, is rich in high-frequency details, and is challenging for super-resolution reconstruction. The pictures are reconstructed with VDSR, DRCN, DRRN, LapSRN, MemNet, IDN, CARN, IMDN, and the model of this embodiment (named FGRDN in the figure); the results are shown in fig. 7.
As shown in fig. 7, the 2×-magnified img_67 picture shows a building with a glass frame structure containing many horizontal and oblique high-frequency details. Most reconstruction models fail to recover clear horizontal black lines; VDSR, DRCN, DRRN, LapSRN, MemNet, IDN, CARN, and IMDN can recover the horizontal black lines, but some lines look jagged and the details are insufficiently restored. The model of this embodiment reconstructs more details, and the horizontal black lines are clearer. The 3×-magnified img_76 picture shows a building combined with a face display; its vertical and oblique high-frequency details are denser, making high-frequency extraction more challenging. Most reconstructions cannot clearly recover the oblique display frame lines; the IDN model, for instance, mistakes low-frequency details for high-frequency ones and wrongly restores a light-colored region of the display as the display frame, seriously harming the visual impression. The model of this embodiment recovers more high-frequency details and shows a clear black border of the display. On both selected pictures, the model of this embodiment achieves good results on the PSNR and SSIM evaluation metrics as well as in visual comparison with existing models.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples" or the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A single-image super-resolution reconstruction system based on a feedback mechanism is used for training an image network model and is characterized by comprising the following components:
the shallow feature extraction module is used for extracting shallow features of the low-resolution image subjected to mean shift segmentation;
a first deep feature extraction module for extracting deep features of the low resolution image from the shallow features;
the first reconstruction module is used for reconstructing the deep features output by the first deep feature extraction module to obtain a first super-resolution image, and the first super-resolution image is used for calculating a loss function of the image network model;
the feature refinement module is used for performing concatenation and convolution operations on the shallow features and the deep features extracted by the first deep feature extraction module to obtain refined features;
a second deep feature extraction module for extracting deep features of the low resolution image from the refined features;
and the second reconstruction module is used for reconstructing the deep features output by the second deep feature extraction module to obtain a second super-resolution image, wherein the second super-resolution image is used for calculating a loss function of the image network model and is output as the super-resolution image of the image network model.
2. The single image super-resolution reconstruction system based on the feedback mechanism of claim 1, wherein the first deep feature extraction Module comprises a plurality of connected Ghost residual dense modules, each Ghost residual dense Module comprises a plurality of densely connected Ghost modules.
3. The single image super-resolution reconstruction system based on the feedback mechanism of claim 1, wherein the second deep feature extraction Module comprises a plurality of connected Ghost residual dense modules, each Ghost residual dense Module comprises a plurality of densely connected Ghost modules.
4. The single image super-resolution reconstruction system based on the feedback mechanism as claimed in claim 2 or 3, wherein each Ghost residual dense module incorporates an attention module integrating spatial and channel attention mechanisms.
5. The single image super-resolution reconstruction system based on the feedback mechanism as claimed in claim 2, wherein the feature refinement module performs two concatenations and two convolutions: the first concatenation joins the deep features output by a plurality of Ghost residual dense modules and inputs the result to a first convolutional layer; the second concatenation joins the output of the first convolutional layer with the shallow features and inputs the result to a second convolutional layer.
6. A single image super-resolution reconstruction method based on a feedback mechanism is used for training an image network model and is characterized by comprising the following steps:
the first iteration:
extracting shallow features of the low-resolution image subjected to mean shift segmentation through convolution operation;
extracting deep features of the low-resolution image from the shallow features, and performing first reconstruction on the deep features to obtain a first super-resolution image, wherein the first super-resolution image is used for calculating a loss function of the image network model;
and (3) second iteration:
performing concatenation and convolution operations on the deep features and the shallow features from the first iteration to obtain refined features;
and extracting deep features of the low-resolution image from the refined features, and performing a second reconstruction on the deep features to obtain a second super-resolution image, wherein the second super-resolution image is used for calculating a loss function of the image network model and is output as the super-resolution image of the image network model.
7. The method for single-image super-resolution reconstruction based on the feedback mechanism of claim 6, wherein the extracting deep features of the low-resolution image from the shallow features comprises: extracting deep features of the low-resolution image from the shallow features through a plurality of connected Ghost residual dense modules, wherein each Ghost residual dense module comprises a plurality of densely connected Ghost Modules.
8. The method for single-image super-resolution reconstruction based on the feedback mechanism of claim 7, wherein the extracting deep features of the low-resolution image from the refined features comprises: extracting deep features of the low-resolution image from the refined features through a plurality of connected Ghost residual dense modules, wherein each Ghost residual dense module comprises a plurality of densely connected Ghost Modules.
9. The method for single-image super-resolution reconstruction based on the feedback mechanism of claim 8, wherein before the first reconstruction and the second reconstruction, the method further comprises: an attention mechanism process that integrates spatial and channel attention is performed on the deep features.
10. An image network model, characterized in that the image network model uses the feedback mechanism-based single image super-resolution reconstruction system according to any one of claims 1 to 5 or the feedback mechanism-based single image super-resolution reconstruction method according to any one of claims 6 to 9 when training.
CN202011139130.5A 2020-10-22 2020-10-22 Single-image super-resolution reconstruction system and method based on feedback mechanism Active CN112200724B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011139130.5A CN112200724B (en) 2020-10-22 2020-10-22 Single-image super-resolution reconstruction system and method based on feedback mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011139130.5A CN112200724B (en) 2020-10-22 2020-10-22 Single-image super-resolution reconstruction system and method based on feedback mechanism

Publications (2)

Publication Number Publication Date
CN112200724A true CN112200724A (en) 2021-01-08
CN112200724B CN112200724B (en) 2023-04-07

Family

ID=74010824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011139130.5A Active CN112200724B (en) 2020-10-22 2020-10-22 Single-image super-resolution reconstruction system and method based on feedback mechanism

Country Status (1)

Country Link
CN (1) CN112200724B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819771A (en) * 2021-01-27 2021-05-18 东北林业大学 Wood defect detection method based on improved YOLOv3 model
CN113409191A (en) * 2021-06-02 2021-09-17 广东工业大学 Lightweight image super-resolution method and system based on attention feedback mechanism
CN113658046A (en) * 2021-08-18 2021-11-16 中科天网(广东)科技有限公司 Super-resolution image generation method, device, equipment and medium based on feature separation
CN113658044A (en) * 2021-08-03 2021-11-16 长沙理工大学 Method, system, device and storage medium for improving image resolution
CN116503506A (en) * 2023-06-25 2023-07-28 南方医科大学 Image reconstruction method, system, device and storage medium
WO2024021081A1 (en) * 2022-07-29 2024-02-01 宁德时代新能源科技股份有限公司 Method and apparatus for detecting defect on surface of product

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101485206A (en) * 2006-04-30 2009-07-15 惠普开发有限公司 Robust and efficient compression/decompression providing for adjustable division of computational complexity between encoding/compression and decoding/decompression
CN109272452A (en) * 2018-08-30 2019-01-25 北京大学 Learn the method for super-resolution network in wavelet field jointly based on bloc framework subband
CN109903228A (en) * 2019-02-28 2019-06-18 合肥工业大学 A kind of image super-resolution rebuilding method based on convolutional neural networks
CN110599401A (en) * 2019-08-19 2019-12-20 中国科学院电子学研究所 Remote sensing image super-resolution reconstruction method, processing device and readable storage medium
CN111192200A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image super-resolution reconstruction method based on fusion attention mechanism residual error network
CN111353940A (en) * 2020-03-31 2020-06-30 成都信息工程大学 Image super-resolution reconstruction method based on deep learning iterative up-down sampling
CN111583107A (en) * 2020-04-03 2020-08-25 长沙理工大学 Image super-resolution reconstruction method and system based on attention mechanism
CN111612695A (en) * 2020-05-19 2020-09-01 华侨大学 Super-resolution reconstruction method for low-resolution face image

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101485206A (en) * 2006-04-30 2009-07-15 惠普开发有限公司 Robust and efficient compression/decompression providing for adjustable division of computational complexity between encoding/compression and decoding/decompression
CN109272452A (en) * 2018-08-30 2019-01-25 北京大学 Learn the method for super-resolution network in wavelet field jointly based on bloc framework subband
CN109903228A (en) * 2019-02-28 2019-06-18 合肥工业大学 A kind of image super-resolution rebuilding method based on convolutional neural networks
CN110599401A (en) * 2019-08-19 2019-12-20 中国科学院电子学研究所 Remote sensing image super-resolution reconstruction method, processing device and readable storage medium
CN111192200A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image super-resolution reconstruction method based on fusion attention mechanism residual error network
CN111353940A (en) * 2020-03-31 2020-06-30 成都信息工程大学 Image super-resolution reconstruction method based on deep learning iterative up-down sampling
CN111583107A (en) * 2020-04-03 2020-08-25 长沙理工大学 Image super-resolution reconstruction method and system based on attention mechanism
CN111612695A (en) * 2020-05-19 2020-09-01 华侨大学 Super-resolution reconstruction method for low-resolution face image

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
STELLAR STREAM: "Ghost Module/GhostNet: a lightweight module/network for model compression (paper reading)", https://blog.csdn.net/qq_34923437/article/details/106248103 *
Li Zheng; Zhang Tong; Zhu Guotao; Wang Xin; Wang Wei: "An image super-resolution reconstruction method based on deep learning"
Wu Yukun; Chen Yuantao: "An image matching algorithm applying super-resolution reconstruction"

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819771A (en) * 2021-01-27 2021-05-18 东北林业大学 Wood defect detection method based on improved YOLOv3 model
CN113409191A (en) * 2021-06-02 2021-09-17 广东工业大学 Lightweight image super-resolution method and system based on attention feedback mechanism
CN113658044A (en) * 2021-08-03 2021-11-16 长沙理工大学 Method, system, device and storage medium for improving image resolution
WO2023010831A1 (en) * 2021-08-03 2023-02-09 长沙理工大学 Method, system and apparatus for improving image resolution, and storage medium
CN113658044B (en) * 2021-08-03 2024-02-27 长沙理工大学 Method, system, device and storage medium for improving image resolution
CN113658046A (en) * 2021-08-18 2021-11-16 中科天网(广东)科技有限公司 Super-resolution image generation method, device, equipment and medium based on feature separation
WO2024021081A1 (en) * 2022-07-29 2024-02-01 宁德时代新能源科技股份有限公司 Method and apparatus for detecting defect on surface of product
CN116503506A (en) * 2023-06-25 2023-07-28 南方医科大学 Image reconstruction method, system, device and storage medium
CN116503506B (en) * 2023-06-25 2024-02-06 南方医科大学 Image reconstruction method, system, device and storage medium

Also Published As

Publication number Publication date
CN112200724B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN112200724B (en) Single-image super-resolution reconstruction system and method based on feedback mechanism
CN110033410B (en) Image reconstruction model training method, image super-resolution reconstruction method and device
CN109410239B (en) Text image super-resolution reconstruction method based on condition generation countermeasure network
CN108537733B (en) Super-resolution reconstruction method based on multi-path deep convolutional neural network
CN106683067B (en) Deep learning super-resolution reconstruction method based on residual sub-images
CN111915487B (en) Face super-resolution method and device based on hierarchical multi-scale residual fusion network
CN110232653A (en) The quick light-duty intensive residual error network of super-resolution rebuilding
CN111325165B (en) Urban remote sensing image scene classification method considering spatial relationship information
CN109035146A (en) A kind of low-quality image oversubscription method based on deep learning
CN111768340B (en) Super-resolution image reconstruction method and system based on dense multipath network
CN111696033B (en) Real image super-resolution model and method based on angular point guided cascade hourglass network structure learning
Shen et al. Convolutional neural pyramid for image processing
TWI719512B (en) Method and system for algorithm using pixel-channel shuffle convolution neural network
CN112017116B (en) Image super-resolution reconstruction network based on asymmetric convolution and construction method thereof
CN112381722A (en) Single-image hyper-segmentation and perception image enhancement joint task learning method
CN115358932A (en) Multi-scale feature fusion face super-resolution reconstruction method and system
CN115953294A (en) Single-image super-resolution reconstruction method based on shallow channel separation and aggregation
CN115713462A (en) Super-resolution model training method, image recognition method, device and equipment
Shen et al. RSHAN: Image super-resolution network based on residual separation hybrid attention module
CN117237190A (en) Lightweight image super-resolution reconstruction system and method for edge mobile equipment
Li et al. RGSR: A two-step lossy JPG image super-resolution based on noise reduction
CN116029905A (en) Face super-resolution reconstruction method and system based on progressive difference complementation
CN111539434B (en) Infrared weak and small target detection method based on similarity
CN113538505A (en) Motion estimation system and method of single picture based on deep learning
Li et al. Adversarial feature hybrid framework for steganography with shifted window local loss

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant