CN111861886A - Image super-resolution reconstruction method based on multi-scale feedback network

Info

Publication number: CN111861886A (application); CN111861886B (grant)
Application number: CN202010682515.XA
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 陈晓 (Chen Xiao), 孙超文 (Sun Chaowen)
Applicant and current assignee: Nanjing University of Information Science and Technology
Priority and filing date: 2020-07-15
Publication date: 2020-10-30 (CN111861886A); grant date: 2023-08-08 (CN111861886B)
Family ID: 72983037
Legal status: Granted; Active

Classifications

    • G06T 3/4053: Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution (G: Physics; G06T: Image data processing or generation, in general)
    • G06F 18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting (G06F: Electric digital data processing; G06F 18/00: Pattern recognition)
    • G06N 3/045: Combinations of networks (G06N: Computing arrangements based on specific computational models; G06N 3/04: Neural network architecture)
    • G06T 3/4007: Scaling of whole images or parts thereof based on interpolation, e.g. bilinear interpolation
    • Y02T 10/40: Engine management systems (Y02T: Climate change mitigation technologies related to transportation)


Abstract

The invention relates to an image super-resolution reconstruction method based on a multi-scale feedback network, comprising the following steps: (1) establishing an image data set; (2) extracting features from the input image, then recursively realizing the low-resolution to high-resolution feature mapping with multi-scale up-projection and down-projection units to obtain high-resolution feature maps at different depths, computing a residual image from the high-resolution feature maps by convolution, and finally adding the residual image to the interpolated low-resolution image to obtain the output image; (3) training the multi-scale feedback network on the data set to generate a trained network model; (4) inputting the low-resolution image to be processed into the trained network to obtain the output high-resolution image. The method can train networks of different depths and extend them to other magnification factors with only small parameter adjustments, which saves training cost; it can realize magnification by larger factors and improves the peak signal-to-noise ratio and structural similarity of the reconstructed image.

Description

Image super-resolution reconstruction method based on multi-scale feedback network
Technical Field
The invention relates to an image super-resolution reconstruction method based on a multi-scale feedback network, and belongs to the fields of computer vision and deep learning.
Background
Super-resolution (SR) reconstruction is an important image processing technique in the field of computer vision, widely applied in medical imaging, security monitoring, remote sensing image quality improvement, image compression and target detection. Image super-resolution reconstruction aims to establish a suitable model that converts a low-resolution (LR) image into the corresponding high-resolution (HR) image. Since a given LR input corresponds to multiple possible HR images, SR reconstruction is a challenging ill-posed inverse problem.
Currently, the proposed SR reconstruction methods fall into three major categories: interpolation-based methods, reconstruction-based methods and learning-based methods. Among them, deep-learning-based SR methods have attracted much attention in recent years with their superior reconstruction performance. SRCNN, the pioneering work of deep-learning-based SR, fully demonstrated the superiority of convolutional neural networks, and many subsequent works proposed a series of CNN-based SR methods built on the SRCNN architecture. Depth is an important factor, as it provides a network with a larger receptive field and more context information; however, increasing depth easily causes two problems: gradient vanishing/explosion and a sharp growth in the number of network parameters.
To address the gradient problem, researchers proposed residual learning, which successfully trains deeper networks, and some networks introduce dense connections to alleviate gradient vanishing and encourage feature reuse; to reduce the parameters, researchers proposed recursive learning with weight sharing. Thanks to these mechanisms, many networks construct ever deeper and more complex structures to obtain higher evaluation indexes. However, investigation reveals that many networks have the following problems:
First, many SR methods achieve high performance with deep networks but neglect the training difficulty of such networks, so they require a huge training set and considerable training skill and time.
Second, most SR methods learn hierarchical feature representations directly from the LR input and map them to the output space in a feed-forward manner; such one-way mapping relies only on the limited features in the LR image. Moreover, many feed-forward networks that require preprocessing operations fit only a single magnification factor, and the complicated operations needed to shift to other magnification factors make them extremely inflexible.
Disclosure of Invention
To solve the problems in the prior art, the invention provides an image super-resolution reconstruction method based on a multi-scale feedback network, comprising the following steps:
Step one, establishing a data set using an image degradation model;
Step two, constructing a multi-scale feedback network, wherein the multi-scale feedback network comprises an image feature extraction module, an image feature mapping module and a high-resolution image calculation module;
Step 2.1, image feature extraction;
The LR image $I_{LR}$ input to the network is fed into the feature extraction module $f_0$, which generates an initial LR feature map $L_0$:

$$L_0 = f_0(I_{LR})$$

Let conv(f, n) denote a convolution layer with kernel size f and n channels. In the above formula, $f_0$ consists of two convolution layers, conv(3, $n_0$) and conv(1, n), where $n_0$ denotes the number of channels of the initial low-resolution feature extraction layer and n denotes the number of input channels of the feature mapping module. First, conv(3, $n_0$) generates shallow features containing low-resolution image information from the input; then conv(1, n) reduces the number of channels from $n_0$ to n;
Step 2.2, image feature mapping;
The low-resolution feature map $L_{g-1}$ is fed into the recursive feedback module, which generates the high-resolution feature map $H_g$:

$$H_g = f_{MPG}^{g}(L_{g-1}), \quad g = 1, 2, \ldots, G$$

where G denotes the number of multi-scale projection groups, i.e. the number of recursions, and $f_{MPG}^{g}$ denotes the feature mapping of the multi-scale projection group in the g-th recursion. When g = 1, the initial feature map $L_0$ is the input of the first multi-scale projection group; when g > 1, the LR feature map $L_{g-1}$ produced by the previous multi-scale projection group is the current input;
Step 2.3, high-resolution image calculation;
A residual image is computed by depth-concatenating the multiple HR feature maps, according to the following formula:

$$I_{Res} = f_{RM}([H_1, H_2, \ldots, H_G])$$

where $[H_1, H_2, \ldots, H_G]$ denotes the depth concatenation of the HR feature maps, $f_{RM}$ denotes a conv(3, 3) operation, and $I_{Res}$ is the residual image.
The image obtained by interpolating the LR image is added to the residual image $I_{Res}$ to obtain the reconstructed high-resolution image $I_{SR}$:

$$I_{SR} = I_{Res} + f_{US}(I_{LR})$$

where $f_{US}$ denotes the interpolation operation.
Step three, training a multi-scale feedback network;
Step four, reconstructing the image.
The technical scheme is further designed as follows: the data set in step one is established with the image degradation model as follows. Given that $I_{LR}$ denotes an LR image and $I_{HR}$ the corresponding HR image, the degradation process is expressed as:

$$I_{LR} = D(I_{HR}; s)$$

where D denotes the degradation mapping that generates an LR image from an HR image; the degradation is modeled as a single downsampling operation:

$$D(I_{HR}; s) = (I_{HR}) \downarrow_s$$

where $\downarrow_s$ denotes the downsampling operation with magnification s, s being the scale factor.
The interpolation algorithm is a bilinear interpolation algorithm or a bicubic interpolation algorithm.
The loss function for training the multi-scale feedback network in step three is:

$$L(x) = \frac{1}{m} \sum_{i=1}^{m} \left\| I_{HR}^{(i)} - I_{SR}^{(i)} \right\|_1$$

where x is the set of weight parameters and bias parameters, i denotes the index of a training image over the whole training process, and m denotes the number of training images.
The invention has the following beneficial effects:
The modular end-to-end architecture can flexibly train networks of different depths and extend them to other magnification factors with only small parameter adjustments, which greatly saves training cost; it can successfully realize magnification by a larger factor (8x) and improves the peak signal-to-noise ratio and structural similarity of the reconstructed image. Compared with other convolutional-neural-network-based methods, the method can also alleviate the influence of ringing and checkerboard artifacts, predict more high-frequency details and suppress over-smoothed components, so that the reconstructed image has clearer and sharper edge features and is closer to the real high-resolution image.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a block diagram of a multi-scale feedback network;
FIG. 3 is a block diagram of the multi-scale up-projection unit in the network;
FIG. 4 is a block diagram of the multi-scale down-projection unit in the network.
Detailed Description
The invention is described in detail below with reference to the figures and the specific embodiments.
Examples
As shown in fig. 1, the image super-resolution reconstruction method based on the multi-scale feedback network of the present embodiment includes the following steps:
Step 1, establishing a data set using an image degradation model.
Let $I_{LR}$ denote an LR image and $I_{HR}$ the corresponding HR image; the degradation process is expressed as:

$$I_{LR} = D(I_{HR}; s) \qquad (1)$$

The degradation mapping D generates an LR image from an HR image and is modeled as a single downsampling operation:

$$D(I_{HR}; s) = (I_{HR}) \downarrow_s \qquad (2)$$

where $\downarrow_s$ denotes the downsampling operation with magnification s, s being the scale factor.
This embodiment uses bicubic interpolation with antialiasing as the downsampling operation, taking m training images from DIV2K as the training set. Set5, Set14, Urban100, BSD100 and Manga109 are chosen as standard test sets and are downsampled by factors of 2, 3, 4 and 8 using the bicubic interpolation algorithm.
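As a concrete illustration of this data-preparation step, the following is a minimal sketch using Pillow's antialiased bicubic resize; the file name and the dictionary layout are illustrative assumptions, not details fixed by the patent.

```python
from PIL import Image

def degrade(hr, s):
    """I_LR = (I_HR) downsampled by scale factor s, as in equation (2)."""
    w, h = hr.size
    return hr.resize((w // s, h // s), resample=Image.BICUBIC)

# Build x2/x3/x4/x8 LR inputs from one HR training image
hr = Image.open("0001.png")  # one DIV2K image; path illustrative
lr_by_scale = {s: degrade(hr, s) for s in (2, 3, 4, 8)}
```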
Step 2, constructing a multi-scale feedback network; the network structure is shown in fig. 2, and includes the following steps:
Step 2.1, extracting image features.
The initial LR image $I_{LR}$ is fed into the feature extraction module $f_0$, which generates an initial LR feature map $L_0$:

$$L_0 = f_0(I_{LR}) \qquad (3)$$

Let conv(f, n) denote a convolution layer with kernel size f and n channels. Here $f_0$ consists of two convolution layers, conv(3, $n_0$) and conv(1, n), where $n_0$ denotes the number of channels of the initial LR image feature extraction layer and n denotes the number of input channels of the feature mapping module. First, conv(3, $n_0$) generates shallow features containing LR image information from the input; then conv(1, n) reduces the number of channels from $n_0$ to n.
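A minimal PyTorch sketch of the extraction module $f_0$ follows; the channel counts n0 = 256 and n = 64 are illustrative assumptions, since this passage does not fix them, and the PReLU activations follow the convention stated in step 3 below.

```python
import torch
import torch.nn as nn

class FeatureExtraction(nn.Module):
    """f0: conv(3, n0) for shallow LR features, then conv(1, n) to reduce channels."""
    def __init__(self, in_channels=3, n0=256, n=64):
        super().__init__()
        self.conv3 = nn.Conv2d(in_channels, n0, kernel_size=3, padding=1)
        self.conv1 = nn.Conv2d(n0, n, kernel_size=1)  # n0 -> n channels
        self.act1, self.act2 = nn.PReLU(), nn.PReLU()

    def forward(self, x):                      # x: I_LR, shape (B, 3, H, W)
        shallow = self.act1(self.conv3(x))
        return self.act2(self.conv1(shallow))  # L_0, shape (B, n, H, W)
```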
Step 2.2, image feature mapping;
using multi-scale up-projection unitsAnd forming a projection group by the projection units under multiple scales to realize low-resolution and high-resolution feature mapping in a recursion manner, so as to obtain high-resolution feature maps at different depths. Low resolution feature map Lg-1Input recursive feedback module to generate high resolution feature map Hg
Figure BDA0002586350550000041
Wherein G represents the number of multi-scale projection groups, i.e. the number of recursions;
Figure BDA0002586350550000042
representing the feature mapping process for the set of multi-scale projections in the g-th recursion. When g is equal to 1, the initial feature map L is represented0As an input to the first multi-scale projection group, when g is greater than 1, it indicates the LR feature map L to be produced by the previous multi-scale projection groupg-1As the current input.
Figure BDA0002586350550000043
The operations include two operations, mapping the LR characteristic to the HR characteristic and the HR characteristic to the LR characteristic, the structures of which are shown in fig. 3 and 4.
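The following PyTorch sketch expresses equation (4) as one projection group, assuming UpProjection and DownProjection modules like the ones sketched after the six-step descriptions below; all class names are illustrative.

```python
import torch.nn as nn

class ProjectionGroup(nn.Module):
    """One multi-scale projection group: LR -> HR (up-projection),
    then HR -> LR (down-projection), as in equation (4)."""
    def __init__(self, n=64):
        super().__init__()
        self.up = UpProjection(n)      # produces H_g from L_{g-1}; sketched below
        self.down = DownProjection(n)  # produces L_g from H_g; sketched below

    def forward(self, L_prev):
        H_g = self.up(L_prev)
        return H_g, self.down(H_g)     # L_g is fed back as the next input
```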
The multi-scale up-projection unit maps the LR feature into the HR feature (the structure is shown in fig. 3) through the following six steps:

(1): The LR feature map $L_{g-1}$ calculated in the previous cycle is taken as input, and deconvolutions with different kernel sizes, $\mathrm{Deconv}_1(k_1, n)$ and $\mathrm{Deconv}_2(k_2, n)$, perform an upsampling operation on the two branches to obtain two HR feature maps $H_1^g$ and $H_2^g$:

$$H_1^g = \mathrm{Deconv}_1(L_{g-1}) \qquad (5)$$

$$H_2^g = \mathrm{Deconv}_2(L_{g-1}) \qquad (6)$$

where $k_1$ and $k_2$ denote the deconvolution kernel sizes and n denotes the number of channels.

(2): The HR feature maps $H_1^g$ and $H_2^g$ are concatenated, and convolutions with different kernel sizes, $\mathrm{Conv}_1(k_1, 2n)$ and $\mathrm{Conv}_2(k_2, 2n)$, perform a downsampling operation on the two branches and generate two LR feature maps $L_1^g$ and $L_2^g$:

$$L_1^g = \mathrm{Conv}_1([H_1^g, H_2^g]) \qquad (7)$$

$$L_2^g = \mathrm{Conv}_2([H_1^g, H_2^g]) \qquad (8)$$

The number of channels of each branch is changed from n to 2n.

(3): The LR feature maps $L_1^g$ and $L_2^g$ are concatenated, and a 1 × 1 convolution $C_u$ is used for pooling and dimensionality reduction, mapping $L_1^g$ and $L_2^g$ to an LR feature map $\tilde{L}^g$:

$$\tilde{L}^g = C_u([L_1^g, L_2^g]) \qquad (9)$$

where $C_u$ denotes Conv(1, n), and the number of channels of each branch is changed from 2n to n. All 1 × 1 convolutions add a nonlinear excitation to the representation learned by the previous layer.

(4): The residual $e_L^g$ between the input LR feature map $L_{g-1}$ and the reconstructed LR feature map $\tilde{L}^g$ is computed:

$$e_L^g = L_{g-1} - \tilde{L}^g \qquad (10)$$

(5): Deconvolutions with different kernel sizes, $\mathrm{Deconv}_1(k_1, n)$ and $\mathrm{Deconv}_2(k_2, n)$, each perform an upsampling operation on the residual $e_L^g$, mapping the residual of the LR features into the HR space and generating two new HR residual features $R_1^g$ and $R_2^g$:

$$R_1^g = \mathrm{Deconv}_1(e_L^g) \qquad (11)$$

$$R_2^g = \mathrm{Deconv}_2(e_L^g) \qquad (12)$$

The number of channels of each branch remains n.

(6): The HR residual features $R_1^g$ and $R_2^g$ are concatenated, superposed on the concatenated HR features from step (2), and the final HR feature map $H_g$ of the up-projection unit is output through a 1 × 1 convolution $C_h$:

$$H_g = C_h([R_1^g, R_2^g] + [H_1^g, H_2^g]) \qquad (13)$$

where $C_h$ denotes Conv(1, n); the total number of channels after the addition is 2n, and Conv(1, n) reduces the number of output channels to n, keeping it the same as the number of input channels.
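A hedged PyTorch sketch of the six steps follows. The kernel sizes ($k_1$ = 6, $k_2$ = 8), the stride of 2 and the paddings are illustrative values suited to x2 magnification; the patent adjusts kernel sizes and strides per magnification (see step 3 of the embodiment), and the 4n-to-n reduction in $C_u$ is an assumption needed to keep the channel counts consistent.

```python
import torch
import torch.nn as nn

def deconv(cin, cout, k, stride=2):
    # Padding chosen so the output is exactly `stride` times larger.
    return nn.Sequential(
        nn.ConvTranspose2d(cin, cout, k, stride=stride, padding=(k - stride) // 2),
        nn.PReLU())

def conv(cin, cout, k, stride=2):
    # Padding chosen so the output is exactly `stride` times smaller.
    return nn.Sequential(
        nn.Conv2d(cin, cout, k, stride=stride, padding=(k - stride) // 2),
        nn.PReLU())

class UpProjection(nn.Module):
    def __init__(self, n=64, k1=6, k2=8):
        super().__init__()
        self.deconv1, self.deconv2 = deconv(n, n, k1), deconv(n, n, k2)          # step (1)
        self.conv1, self.conv2 = conv(2 * n, 2 * n, k1), conv(2 * n, 2 * n, k2)  # step (2)
        self.c_u = nn.Sequential(nn.Conv2d(4 * n, n, 1), nn.PReLU())             # step (3)
        self.res_deconv1 = deconv(n, n, k1)                                      # step (5)
        self.res_deconv2 = deconv(n, n, k2)
        self.c_h = nn.Sequential(nn.Conv2d(2 * n, n, 1), nn.PReLU())             # step (6)

    def forward(self, L_prev):
        H1, H2 = self.deconv1(L_prev), self.deconv2(L_prev)    # (5), (6)
        H_cat = torch.cat([H1, H2], dim=1)                     # 2n channels
        L1, L2 = self.conv1(H_cat), self.conv2(H_cat)          # (7), (8)
        L_tilde = self.c_u(torch.cat([L1, L2], dim=1))         # (9): fuse to n channels
        e_L = L_prev - L_tilde                                 # (10)
        R1, R2 = self.res_deconv1(e_L), self.res_deconv2(e_L)  # (11), (12)
        return self.c_h(torch.cat([R1, R2], dim=1) + H_cat)    # (13)
```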
The multi-scale down-projection unit maps the HR feature into the LR feature (the structure is shown in fig. 4) through the following six steps:

(1): The HR feature map $H_g$ output by the multi-scale up-projection unit is taken as input, and convolutions with different kernel sizes, $\mathrm{Conv}_1(k_1, n)$ and $\mathrm{Conv}_2(k_2, n)$, perform a downsampling operation on the two branches to obtain two LR feature maps $L_1^g$ and $L_2^g$:

$$L_1^g = \mathrm{Conv}_1(H_g) \qquad (14)$$

$$L_2^g = \mathrm{Conv}_2(H_g) \qquad (15)$$

(2): The LR feature maps $L_1^g$ and $L_2^g$ are concatenated, and deconvolutions with different kernel sizes, $\mathrm{Deconv}_1(k_1, 2n)$ and $\mathrm{Deconv}_2(k_2, 2n)$, perform an upsampling operation on the two branches and generate two HR feature maps $H_1^g$ and $H_2^g$:

$$H_1^g = \mathrm{Deconv}_1([L_1^g, L_2^g]) \qquad (16)$$

$$H_2^g = \mathrm{Deconv}_2([L_1^g, L_2^g]) \qquad (17)$$

The number of channels of each branch is changed from n to 2n.

(3): The HR feature maps $H_1^g$ and $H_2^g$ are concatenated, and an HR feature map $\tilde{H}^g$ is obtained through a 1 × 1 convolution $C_d$:

$$\tilde{H}^g = C_d([H_1^g, H_2^g]) \qquad (18)$$

where $C_d$ denotes Conv(1, n), and the number of channels of each branch is changed from 2n to n.

(4): The residual $e_H^g$ between the input HR feature map $H_g$ and the reconstructed HR feature map $\tilde{H}^g$ is computed:

$$e_H^g = H_g - \tilde{H}^g \qquad (19)$$

(5): Convolutions with different kernel sizes, $\mathrm{Conv}_1(k_1, n)$ and $\mathrm{Conv}_2(k_2, n)$, each perform a downsampling operation on the residual $e_H^g$, mapping the residual of the HR features into the LR space and generating two new LR residual features $R_1^g$ and $R_2^g$:

$$R_1^g = \mathrm{Conv}_1(e_H^g) \qquad (20)$$

$$R_2^g = \mathrm{Conv}_2(e_H^g) \qquad (21)$$

The number of channels of each branch remains n.

(6): The LR residual features $R_1^g$ and $R_2^g$ are concatenated, superposed on the concatenated LR features from step (2), and the final LR feature map $L_g$ of the down-projection unit is output through a 1 × 1 convolution $C_l$:

$$L_g = C_l([R_1^g, R_2^g] + [L_1^g, L_2^g]) \qquad (22)$$

where $C_l$ denotes Conv(1, n); the total number of channels after the addition is 2n, and Conv(1, n) reduces the number of output channels to n, keeping it the same as the number of input channels.
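The down-projection unit mirrors the up-projection unit; the sketch below reuses the conv/deconv helpers from the previous sketch, with the same illustrative kernel sizes and the same assumed 4n-to-n reduction in $C_d$.

```python
class DownProjection(nn.Module):
    def __init__(self, n=64, k1=6, k2=8):
        super().__init__()
        self.conv1, self.conv2 = conv(n, n, k1), conv(n, n, k2)          # step (1)
        self.deconv1 = deconv(2 * n, 2 * n, k1)                          # step (2)
        self.deconv2 = deconv(2 * n, 2 * n, k2)
        self.c_d = nn.Sequential(nn.Conv2d(4 * n, n, 1), nn.PReLU())     # step (3)
        self.res_conv1, self.res_conv2 = conv(n, n, k1), conv(n, n, k2)  # step (5)
        self.c_l = nn.Sequential(nn.Conv2d(2 * n, n, 1), nn.PReLU())     # step (6)

    def forward(self, H_g):
        L1, L2 = self.conv1(H_g), self.conv2(H_g)            # (14), (15)
        L_cat = torch.cat([L1, L2], dim=1)                   # 2n channels
        H1, H2 = self.deconv1(L_cat), self.deconv2(L_cat)    # (16), (17)
        H_tilde = self.c_d(torch.cat([H1, H2], dim=1))       # (18): fuse to n channels
        e_H = H_g - H_tilde                                  # (19)
        R1, R2 = self.res_conv1(e_H), self.res_conv2(e_H)    # (20), (21)
        return self.c_l(torch.cat([R1, R2], dim=1) + L_cat)  # (22)
```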
Step 2.3, calculating a high-resolution image;
calculating a residual image by the following formula through depth cascading of the multiple high-resolution feature maps;
IRes=fRM([H1,H2,…,Hg]) (23)
wherein [ H ]1,H2,…,Hg]Representing a deep cascade of multiple HR feature maps, fRMRepresenting the conv (3,3) operation, a series of cascaded HR feature maps are input into conv (3,3) to generate a residual image IRes
Interpolating the low resolution image to obtain an image and a residual image IResAdding to generate a reconstructed high resolution image ISR
ISR=IRes+fUS(ILR) (24)
Wherein f isUSThe interpolation upsampling operation is represented by a bilinear interpolation algorithm, and a bicubic interpolation algorithm or other interpolation algorithms can also be used.
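Putting the pieces together, the following sketch implements equations (23) and (24) on top of the x2-oriented modules sketched above; the number of projection groups G = 6 is an illustrative assumption.

```python
import torch.nn.functional as F

class MultiScaleFeedbackNet(nn.Module):
    def __init__(self, n=64, G=6, scale=2):
        super().__init__()
        self.scale = scale
        self.f0 = FeatureExtraction(n=n)
        self.groups = nn.ModuleList(ProjectionGroup(n) for _ in range(G))
        # f_RM: the reconstruction layer, the only layer without PReLU
        self.f_rm = nn.Conv2d(G * n, 3, kernel_size=3, padding=1)

    def forward(self, I_lr):
        L = self.f0(I_lr)
        hr_maps = []
        for group in self.groups:                      # g = 1 .. G
            H, L = group(L)
            hr_maps.append(H)
        I_res = self.f_rm(torch.cat(hr_maps, dim=1))   # equation (23)
        I_up = F.interpolate(I_lr, scale_factor=self.scale,
                             mode="bilinear", align_corners=False)  # f_US
        return I_res + I_up                            # equation (24)
```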
Step 3, training a multi-scale feedback network;
The batch size of the network is set to 16, and data augmentation is performed using rotation and flipping. LR images and the corresponding HR images of different sizes are input according to the magnification factor. Adam is used to optimize the network parameters, with a momentum factor of 0.9 and a weight decay of 0.0001. The initial learning rate is set to 0.0001, and the learning rate is halved every 200 iterations.
Different kernel sizes and paddings are designed in each branch of the multi-scale projection units, and the kernel sizes and strides are adjusted according to the corresponding magnification. Both input and output use the RGB channels of the color image. PReLU is used as the activation function after all convolution and deconvolution layers, except the reconstruction layer at the end of the network. The network is trained with the image data set of step 1 according to the procedure of step 2 until the cost loss falls to a set value and the training reaches the maximum number of iterations. The L1 function is used as the loss function, with the expression:

$$L(x) = \frac{1}{m} \sum_{i=1}^{m} \left\| I_{HR}^{(i)} - I_{SR}^{(i)} \right\|_1 \qquad (25)$$

where x is the set of weight parameters and bias parameters, i denotes the index of a training image over the whole training process, and m denotes the number of training images.
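A sketch of this training setup follows; the data loader is assumed to yield batches of 16 augmented LR/HR pairs.

```python
import torch

model = MultiScaleFeedbackNet(scale=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                             betas=(0.9, 0.999), weight_decay=1e-4)
# Halve the learning rate every 200 iterations
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)
l1_loss = torch.nn.L1Loss()

for lr_batch, hr_batch in loader:   # loader: assumed LR/HR pair iterator
    sr = model(lr_batch)
    loss = l1_loss(sr, hr_batch)    # equation (25)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```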
Step 4, reconstructing an image;
The low-resolution image to be processed is input into the trained network to obtain the output high-resolution image.
The peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) are used as evaluation indexes to evaluate the model performance on the 5 standard test sets Set5, Set14, Urban100, BSD100 and Manga109; all tests use the Y channel.
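For reference, a minimal sketch of Y-channel PSNR follows; the BT.601 luma conversion is the common convention for such evaluations, and the border crop is an assumption (often set to the scale factor), not a detail fixed by the patent.

```python
import numpy as np

def rgb_to_y(img):
    """BT.601 luma from an RGB array with values in [0, 1], shape (H, W, 3)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return (16.0 + 65.481 * r + 128.553 * g + 24.966 * b) / 255.0

def psnr_y(sr, hr, border=2):
    y_sr = rgb_to_y(sr)[border:-border, border:-border]
    y_hr = rgb_to_y(hr)[border:-border, border:-border]
    mse = np.mean((y_sr - y_hr) ** 2)
    return 10.0 * np.log10(1.0 / mse)
```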
In order to verify the effectiveness and reliability of the method, it is compared with a number of existing reconstruction methods at different magnification factors: with 21 currently available state-of-the-art methods at low magnification (x2, x3, x4), and, since many models are not suitable for high magnification (x8), with 12 state-of-the-art methods at x8. For x2 magnification, the method achieves the best peak signal-to-noise results on the five benchmark data sets; for x3, x4 and x8 magnification, both the peak signal-to-noise ratio and the structural similarity of the method are superior to all the other models. The advantage becomes relatively more pronounced as the magnification factor increases, especially at x8, demonstrating the effectiveness of the method in dealing with high magnification. On the five data sets the method attains higher objective evaluation indexes in terms of peak signal-to-noise ratio and structural similarity, which shows that the method is good not only at reconstructing regular artificial patterns but also at reconstructing irregular natural patterns. The method adapts to varied scene characteristics and yields remarkable super-resolution reconstruction results for images with different characteristics.
The multi-scale feedback network of this embodiment uses only m (800) training images from DIV2K and, with this relatively small training set, still achieves superior reconstruction performance at 8x magnification compared with other existing methods. By combining multi-scale convolution with a feedback mechanism, the method can learn rich hierarchical feature representations at multiple context scales, capture image features of different scales, and refine low-level representations with high-level features, so as to better express the interrelation between HR and LR images. Besides combining high-level and low-level information, local and global information are fused through global residual learning and local residual feedback, further improving the quality of the reconstructed image. In addition, the modular end-to-end architecture allows networks of different depths to be trained flexibly and extended arbitrarily to other magnification factors with only small parameter adjustments. The method effectively alleviates the influence of ringing and checkerboard artifacts and shows excellent reconstruction performance compared with many current state-of-the-art methods; its advantage is particularly evident at the high magnifications that many methods handle poorly.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several improvements and modifications without departing from the technical principle of the present invention, and these improvements and modifications should also be considered within the protection scope of the present invention, including but not limited to applying the method and its improvements and modifications to other image processing tasks such as image classification, detection, denoising and enhancement.

Claims (6)

1. An image super-resolution reconstruction method based on a multi-scale feedback network, characterized by comprising the following steps:
Step one, establishing an image data set using an image degradation model;
Step two, constructing a multi-scale feedback network, wherein the multi-scale feedback network comprises an image feature extraction module, an image feature mapping module and a high-resolution image calculation module;
Step 2.1, image feature extraction;
The low-resolution image $I_{LR}$ input to the network is fed into the feature extraction module $f_0$, which generates an initial low-resolution feature map $L_0$:

$$L_0 = f_0(I_{LR})$$

Let conv(f, n) denote a convolution layer with kernel size f and n channels; in the above formula, $f_0$ consists of two convolution layers, conv(3, $n_0$) and conv(1, n), where $n_0$ denotes the number of channels of the initial low-resolution feature extraction layer and n denotes the number of input channels of the feature mapping module; first conv(3, $n_0$) generates shallow features containing low-resolution image information from the input, then conv(1, n) reduces the number of channels from $n_0$ to n;
Step 2.2, image feature mapping;
A projection group is formed from a multi-scale up-projection unit and a multi-scale down-projection unit to recursively realize the low-resolution to high-resolution feature mapping, obtaining high-resolution feature maps at different depths; the low-resolution feature map $L_{g-1}$ is fed into the recursive feedback module, which generates the high-resolution feature map $H_g$:

$$H_g = f_{MPG}^{g}(L_{g-1}), \quad g = 1, 2, \ldots, G$$

where G denotes the number of multi-scale projection groups, i.e. the number of recursions, and $f_{MPG}^{g}$ denotes the feature mapping of the multi-scale projection group in the g-th recursion; when g = 1, the initial feature map $L_0$ is the input of the first multi-scale projection group; when g > 1, the LR feature map $L_{g-1}$ produced by the previous multi-scale projection group is the current input; $f_{MPG}^{g}$ comprises two operations, mapping the LR feature to the HR feature and mapping the HR feature to the LR feature;
Step 2.3, high-resolution image calculation;
A residual image is computed by depth-concatenating the multiple high-resolution feature maps, according to the following formula:

$$I_{Res} = f_{RM}([H_1, H_2, \ldots, H_G])$$

where $[H_1, H_2, \ldots, H_G]$ denotes the depth concatenation of the high-resolution feature maps, $f_{RM}$ denotes a conv(3, 3) operation, and $I_{Res}$ is the residual image;
The image obtained by interpolating the low-resolution image is added to the residual image $I_{Res}$ to generate the reconstructed high-resolution image $I_{SR}$:

$$I_{SR} = I_{Res} + f_{US}(I_{LR})$$

where $f_{US}$ denotes the interpolation operation;
Step three, training the multi-scale feedback network;
Step four, image reconstruction;
The low-resolution image to be processed is input into the trained network to obtain the output high-resolution image.
2. The image super-resolution reconstruction method based on the multi-scale feedback network as claimed in claim 1, wherein the data set in step one is established with the image degradation model as follows:
given that $I_{LR}$ denotes a low-resolution image and $I_{HR}$ the corresponding high-resolution image, the degradation process is expressed as:

$$I_{LR} = D(I_{HR}; s)$$

the degradation mapping D, which generates a low-resolution image from a high-resolution image, is modeled as a single downsampling operation:

$$D(I_{HR}; s) = (I_{HR}) \downarrow_s$$

where $\downarrow_s$ denotes the downsampling operation with magnification s, s being the scale factor.
3. The image super-resolution reconstruction method based on the multi-scale feedback network as claimed in claim 1, wherein in the image feature mapping process of step 2.2, the process of mapping the LR feature into the HR feature is:

(1): the LR feature map $L_{g-1}$ calculated in the previous cycle is taken as input, and deconvolutions with different kernel sizes, $\mathrm{Deconv}_1(k_1, n)$ and $\mathrm{Deconv}_2(k_2, n)$, perform an upsampling operation on the two branches to obtain two HR feature maps $H_1^g$ and $H_2^g$:

$$H_1^g = \mathrm{Deconv}_1(L_{g-1})$$

$$H_2^g = \mathrm{Deconv}_2(L_{g-1})$$

where $k_1$ and $k_2$ denote the deconvolution kernel sizes and n denotes the number of channels;

(2): the HR feature maps $H_1^g$ and $H_2^g$ are concatenated, and convolutions with different kernel sizes, $\mathrm{Conv}_1(k_1, 2n)$ and $\mathrm{Conv}_2(k_2, 2n)$, perform a downsampling operation on the two branches and generate two LR feature maps $L_1^g$ and $L_2^g$:

$$L_1^g = \mathrm{Conv}_1([H_1^g, H_2^g])$$

$$L_2^g = \mathrm{Conv}_2([H_1^g, H_2^g])$$

the number of channels of each branch is changed from n to 2n;

(3): the LR feature maps $L_1^g$ and $L_2^g$ are concatenated, and a 1 × 1 convolution $C_u$ is used for pooling and dimensionality reduction, mapping $L_1^g$ and $L_2^g$ to an LR feature map $\tilde{L}^g$:

$$\tilde{L}^g = C_u([L_1^g, L_2^g])$$

where $C_u$ denotes Conv(1, n), and the number of channels of each branch is changed from 2n to n; all 1 × 1 convolutions add a nonlinear excitation to the representation learned by the previous layer;

(4): the residual $e_L^g$ between the input LR feature map $L_{g-1}$ and the reconstructed LR feature map $\tilde{L}^g$ is computed:

$$e_L^g = L_{g-1} - \tilde{L}^g$$

(5): deconvolutions with different kernel sizes, $\mathrm{Deconv}_1(k_1, n)$ and $\mathrm{Deconv}_2(k_2, n)$, each perform an upsampling operation on the residual $e_L^g$, mapping the residual of the LR features into the HR space and generating two new HR residual features $R_1^g$ and $R_2^g$:

$$R_1^g = \mathrm{Deconv}_1(e_L^g)$$

$$R_2^g = \mathrm{Deconv}_2(e_L^g)$$

the number of channels of each branch remains n;

(6): the HR residual features $R_1^g$ and $R_2^g$ are concatenated, superposed on the concatenated HR features from step (2), and the final HR feature map $H_g$ of the up-projection unit is output through a 1 × 1 convolution $C_h$:

$$H_g = C_h([R_1^g, R_2^g] + [H_1^g, H_2^g])$$

where $C_h$ denotes Conv(1, n); the total number of channels after the addition is 2n, and Conv(1, n) reduces the number of output channels to n, keeping it the same as the number of input channels.
4. The image super-resolution reconstruction method based on the multi-scale feedback network as claimed in claim 1, wherein in the image feature mapping process of step 2.2, the process of mapping the HR feature into the LR feature is:

(1): the HR feature map $H_g$ output by the multi-scale up-projection unit is taken as input, and convolutions with different kernel sizes, $\mathrm{Conv}_1(k_1, n)$ and $\mathrm{Conv}_2(k_2, n)$, perform a downsampling operation on the two branches to obtain two LR feature maps $L_1^g$ and $L_2^g$:

$$L_1^g = \mathrm{Conv}_1(H_g)$$

$$L_2^g = \mathrm{Conv}_2(H_g)$$

(2): the LR feature maps $L_1^g$ and $L_2^g$ are concatenated, and deconvolutions with different kernel sizes, $\mathrm{Deconv}_1(k_1, 2n)$ and $\mathrm{Deconv}_2(k_2, 2n)$, perform an upsampling operation on the two branches and generate two HR feature maps $H_1^g$ and $H_2^g$:

$$H_1^g = \mathrm{Deconv}_1([L_1^g, L_2^g])$$

$$H_2^g = \mathrm{Deconv}_2([L_1^g, L_2^g])$$

the number of channels of each branch is changed from n to 2n;

(3): the HR feature maps $H_1^g$ and $H_2^g$ are concatenated, and an HR feature map $\tilde{H}^g$ is obtained through a 1 × 1 convolution $C_d$:

$$\tilde{H}^g = C_d([H_1^g, H_2^g])$$

where $C_d$ denotes Conv(1, n), and the number of channels of each branch is changed from 2n to n;

(4): the residual $e_H^g$ between the input HR feature map $H_g$ and the reconstructed HR feature map $\tilde{H}^g$ is computed:

$$e_H^g = H_g - \tilde{H}^g$$

(5): convolutions with different kernel sizes, $\mathrm{Conv}_1(k_1, n)$ and $\mathrm{Conv}_2(k_2, n)$, each perform a downsampling operation on the residual $e_H^g$, mapping the residual of the HR features into the LR space and generating two new LR residual features $R_1^g$ and $R_2^g$:

$$R_1^g = \mathrm{Conv}_1(e_H^g)$$

$$R_2^g = \mathrm{Conv}_2(e_H^g)$$

the number of channels of each branch remains n;

(6): the LR residual features $R_1^g$ and $R_2^g$ are concatenated, superposed on the concatenated LR features from step (2), and the final LR feature map $L_g$ of the down-projection unit is output through a 1 × 1 convolution $C_l$:

$$L_g = C_l([R_1^g, R_2^g] + [L_1^g, L_2^g])$$

where $C_l$ denotes Conv(1, n); the total number of channels after the addition is 2n, and Conv(1, n) reduces the number of output channels to n, keeping it the same as the number of input channels.
5. The image super-resolution reconstruction method based on the multi-scale feedback network as claimed in claim 3, wherein: the interpolation algorithm is a bilinear interpolation algorithm or a bicubic interpolation algorithm, and other interpolation algorithms can also be used.
6. The image super-resolution reconstruction method based on the multi-scale feedback network as claimed in claim 1, wherein: the loss function for training the multi-scale feedback network in the third step is as follows:
$$L(x) = \frac{1}{m} \sum_{i=1}^{m} \left\| I_{HR}^{(i)} - I_{SR}^{(i)} \right\|_1$$

where x is the set of weight parameters and bias parameters, i denotes the index of a training image over the whole training process, and m denotes the number of training images.




Legal Events

    • PB01: Publication
    • SE01: Entry into force of request for substantive examination
    • GR01: Patent grant