CN114862679A - Single-image super-resolution reconstruction method based on residual error generation countermeasure network - Google Patents


Publication number
CN114862679A
CN114862679A
Authority
CN
China
Prior art keywords: layer, network, image, resolution, resolution image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210499131.3A
Other languages
Chinese (zh)
Inventor
杨旭广
杨欣
李恒锐
朱义天
樊江锋
周大可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics

Classifications

    • G06T 3/4053 — Super resolution, i.e. output image resolution higher than sensor resolution
    • G06N 3/045 — Combinations of networks
    • G06N 3/084 — Backpropagation, e.g. using gradient descent
    • G06T 3/4046 — Scaling the whole image or part thereof using neural networks

Abstract

The invention discloses a single-image super-resolution reconstruction method based on a residual generation countermeasure network. First, a generation network is established, and a low-resolution image is input into the generation network to obtain a generated high-resolution image. The generated high-resolution image and the real high-resolution image corresponding to the low-resolution image are then input together into a discrimination network model, and the difference between the two images is calculated through a perceptual loss function. The generation network and the discrimination network are then trained simultaneously until the loss of the generated high-resolution image relative to the real high-resolution image is less than or equal to a preset threshold, yielding the trained generation network. Finally, a low-resolution image whose resolution needs to be improved is input into the trained generation network to obtain the reconstructed high-resolution image. The invention overcomes the defect of the prior art that the difference between the input image and the output image cannot be reflected perceptually, and improves the accuracy of the model and the quality of the generated images.

Description

Single-image super-resolution reconstruction method based on residual error generation countermeasure network
Technical Field
The invention relates to the technical field of image enhancement, in particular to a single-image super-resolution reconstruction method based on a residual error generation countermeasure network.
Background
Super-resolution reconstruction is a classic application in the field of computer vision, with wide use in monitoring equipment, microscopic imaging, video coding and communication, video restoration, satellite remote sensing, digital high-definition imaging, medical image processing and more. In super-resolution reconstruction, one or more frames are reconstructed into a higher-resolution image or video by analysing the digital image signals and applying software algorithms. Image style migration renders a content image as a painting in the style of a style image: given a style image and a content image as inputs, the output picture must stay close to the input image in its semantics while approaching the target picture in style, colour and texture. Style migration is important for understanding both images and pictorial representation.
Both image super-resolution reconstruction and stylization can be regarded as image processing and transformation problems. One approach trains a feedforward convolutional neural network in a supervised manner, with a loss function representing the difference between the output and input images; the network uses a pixel-by-pixel difference as the loss function. A trained network then needs only a single feedforward pass, but the drawback is that a pixel-wise difference loss cannot reflect the difference between the input and output images perceptually. A second approach establishes a perceptual loss function: high-level image features are extracted from a trained CNN to compute the difference, and super-resolution reconstruction is realised by minimising this loss; the synthesised and stylised images are of high quality. The drawback is that training is very slow and requires a long iterative optimisation process.
Current super-resolution reconstruction still faces a problem that is difficult to solve: a one-to-many relationship can exist between a low-resolution image (LR) and its converted high-resolution images (HR), and this uncertainty grows as the super-resolution factor increases.
Disclosure of Invention
The invention aims to solve the technical problem of providing a single-image super-resolution reconstruction method for generating a countermeasure network based on residual errors aiming at the defects involved in the background technology.
The invention adopts the following technical scheme for solving the technical problems:
the single-image super-resolution reconstruction method for generating the countermeasure network based on the residual error comprises the following steps:
step 1), establishing a generating network;
step 2), inputting the low-resolution image into a generation network to obtain a generated high-resolution image;
step 3), inputting the generated high-resolution image and the real high-resolution image corresponding to the low-resolution image into a discrimination network model together, and calculating the difference between the two images through a perception loss function;
step 4), training the generation network and the discrimination network simultaneously, so that the loss of the generated high-resolution image relative to the real high-resolution image is less than or equal to a preset threshold value, and obtaining the generated network after training;
and 5), inputting the low-resolution image of which the resolution needs to be improved into the trained generation network to obtain a reconstructed high-resolution image.
As a further optimization scheme of the single-image super-resolution reconstruction method for generating the countermeasure network based on the residual error, the generation network in the step 1) comprises a preprocessing layer, a core residual error network and an up-sampling layer;
the pretreatment layer comprises a first convolution layer, a second convolution layer and a relu activation layer, wherein the first convolution layer and the second convolution layer are alternated, and the depths of the first convolution layer and the second convolution layer are 64 and 256 respectively; the size of the first convolution layer convolution kernel is 9 multiplied by 9, and the size of the first convolution layer convolution kernel is 3 multiplied by 3;
the core residual error network comprises 16 residual error blocks, and the residual error blocks adopt a structure of a BN (batch normalization) layer + a convolutional layer of 64 feature maps + a relu active layer + a BN layer + a convolutional layer of 64 feature maps + a relu active layer;
the up-sampling layer performs up-sampling by using two sub-pixel convolution layers, so that the resolution of an input image is improved; and a convolution layer with the depth of 256 is respectively added before the two sub-pixel convolution layers, a relu activation layer is respectively added after the two sub-pixel convolution layers, and the image is reconstructed and amplified step by step so as to avoid the loss of image details caused by continuous amplification of the image.
As a further optimization scheme of the single-image super-resolution reconstruction method based on the residual generation countermeasure network, the discrimination network described in step 3) stacks, in order, convolutional layers of 32, 32, 64, 64, 128, 128, 256, 256, 512 and 512 feature maps, the stride alternating between 1 and 2 (stride 1 for the first layer of each pair, stride 2 for the second), with each convolutional layer followed by a LeakyReLU activation layer.
As a further optimization scheme of the single-image super-resolution reconstruction method for generating the countermeasure network based on the residual error, the perception loss function in the step 3) comprises pixel-level MAE loss, VGG loss and discriminator loss;
the pixel level MAE loss is calculated directly using L1-loss;
the VGG loss is

$$ l_{VGG} = \frac{1}{W_{i,j}H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \Big( \phi_{i,j}\big(I^{HR}\big)_{x,y} - \phi_{i,j}\big(G_{\theta_G}\big(I^{LR}\big)\big)_{x,y} \Big)^2 $$

where $G_{\theta_G}$ denotes the generation network, $I^{HR}$ and $I^{LR}$ denote the high-resolution image and the low-resolution image respectively, $\phi_{i,j}$ denotes the feature map of the $j$-th convolutional layer before the $i$-th pooling layer of the VGG19 network, and $W_{i,j}$ and $H_{i,j}$ are the width and height of that feature map;

the discriminator loss is

$$ l_{D} = \frac{1}{N} \sum_{n=1}^{N} -\log D_{\theta_D}\big(G_{\theta_G}\big(I^{LR}\big)\big) $$

where $D_{\theta_D}$ denotes the discriminator and $N$ is the batch size.
As a further optimization scheme of the single-image super-resolution reconstruction method based on the residual generation countermeasure network, in step 4) the training set uses the DIV2K data set; each image in the training set is transformed to 384×384 as the high-resolution input and to 96×96 as the low-resolution input; the training set comprises 800 training images; during training the batch size is 16 and the number of iterations is 100000, with the network updated by back-propagation every 1000 training iterations; the perceptual loss is reduced by RMSprop optimization with a learning rate of 1e-4 and no dropout.
Compared with the prior art, the technical scheme adopted by the invention has the following technical effects:
the invention overcomes the defect that the difference between the input image and the output image cannot be perceptually reflected in the prior art, and improves the precision of the model and the quality of the generated image.
Drawings
Fig. 1 is a schematic view of the general structure of the present invention.
Fig. 2 is a schematic diagram of a residual error generation network according to the present invention.
Fig. 3 is a schematic diagram of the residual block structure according to the present invention.
Fig. 4 is a schematic diagram of a discrimination network structure according to the present invention.
Detailed Description
The technical scheme of the invention is further explained in detail by combining the attached drawings:
the present invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. In the drawings, components are exaggerated for clarity.
As shown in fig. 1, the invention discloses a single-image super-resolution reconstruction method for generating a countermeasure network based on residual errors, which comprises the following steps:
step 1), establishing a generating network;
the generation network comprises a preprocessing layer, a core residual error network and an up-sampling layer;
the pretreatment layer comprises a first convolution layer, a second convolution layer and a relu activation layer, wherein the first convolution layer and the second convolution layer are alternated, and the depths of the first convolution layer and the second convolution layer are 64 and 256 respectively; the size of the first convolution layer convolution kernel is 9 multiplied by 9, and the size of the first convolution layer convolution kernel is 3 multiplied by 3;
the core residual network comprises 16 residual blocks, each residual block adopting the structure BN (batch normalization) layer + convolutional layer of 64 feature maps + ReLU activation layer + BN layer + convolutional layer of 64 feature maps + ReLU activation layer;
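The residual block described above can be sketched in PyTorch as follows. This is a minimal illustration, not the patented implementation: the 3×3 kernel size and the identity skip connection are assumptions, since the text lists only the layer ordering.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One of the 16 residual blocks: BN + 64-map conv + ReLU, twice,
    followed by an identity skip connection (skip is an assumption)."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.BatchNorm2d(channels),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        # Residual connection: output keeps the input's shape.
        return x + self.body(x)
```

Stacking 16 such blocks with shared channel depth keeps feature-map shapes constant, so the blocks can be chained with `nn.Sequential`.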
the up-sampling layer performs up-sampling with two sub-pixel convolutional layers to raise the resolution of the input image; a convolutional layer of depth 256 is added before each of the two sub-pixel convolutional layers and a ReLU activation layer after each, so that the image is reconstructed and enlarged step by step, avoiding the loss of image detail that continuous one-step enlargement would cause;
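One ×2 up-sampling stage of the kind described above — a depth-256 convolution before the sub-pixel layer, a ReLU after it — can be sketched as follows; two such stages give the overall ×4 enlargement. PyTorch's `PixelShuffle` is the usual realisation of sub-pixel convolution; the kernel size and padding are assumptions.

```python
import torch
import torch.nn as nn

def upsample_stage(in_channels=64):
    """One x2 stage: 256-deep conv -> sub-pixel (pixel shuffle) -> ReLU.
    PixelShuffle(2) turns 256 maps into 64 maps at twice the spatial size."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 256, kernel_size=3, padding=1),
        nn.PixelShuffle(2),
        nn.ReLU(inplace=True),
    )
```

Enlarging in two ×2 stages rather than one ×4 step matches the patent's stated aim of avoiding detail loss from continuous enlargement.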
step 2), inputting the low-resolution image into a generation network to obtain a generated high-resolution image;
step 3), inputting the generated high-resolution image and the real high-resolution image corresponding to the low-resolution image into a discrimination network model together, and calculating the difference between the two images through a perception loss function;
the discrimination network stacks, in order, convolutional layers of 32, 32, 64, 64, 128, 128, 256, 256, 512 and 512 feature maps, the stride alternating between 1 and 2 (stride 1 for the first layer of each pair, stride 2 for the second), with each convolutional layer followed by a LeakyReLU activation layer;
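The discrimination network above can be sketched as follows. The conv stack follows the stated feature-map progression and stride pattern; the final pooling and dense head producing a real/fake probability is an assumption, since the patent text stops at the convolutional layers.

```python
import torch
import torch.nn as nn

def make_discriminator():
    """Conv layers of 32,32,64,64,128,128,256,256,512,512 feature maps,
    stride alternating 1 and 2, each followed by LeakyReLU; the
    pool + linear + sigmoid head is an assumed classification head."""
    cfg = [32, 32, 64, 64, 128, 128, 256, 256, 512, 512]
    layers, in_ch = [], 3
    for i, out_ch in enumerate(cfg):
        stride = 1 if i % 2 == 0 else 2   # pairs: stride 1 then stride 2
        layers += [
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
        ]
        in_ch = out_ch
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(),
               nn.Linear(512, 1), nn.Sigmoid()]
    return nn.Sequential(*layers)
```

With five stride-2 layers the spatial size shrinks by 32×, so a 96×96 LR-scale input reaches the head as a 3×3 map before pooling.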
the perception loss function comprises pixel-level MAE loss, VGG loss and discriminator loss;
the pixel level MAE loss is calculated directly using L1-loss;
the VGG loss is

$$ l_{VGG} = \frac{1}{W_{i,j}H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \Big( \phi_{i,j}\big(I^{HR}\big)_{x,y} - \phi_{i,j}\big(G_{\theta_G}\big(I^{LR}\big)\big)_{x,y} \Big)^2 $$

where $G_{\theta_G}$ denotes the generation network, $I^{HR}$ and $I^{LR}$ denote the high-resolution image and the low-resolution image respectively, $\phi_{i,j}$ denotes the feature map of the $j$-th convolutional layer before the $i$-th pooling layer of the VGG19 network, and $W_{i,j}$ and $H_{i,j}$ are the width and height of that feature map;

the discriminator loss is

$$ l_{D} = \frac{1}{N} \sum_{n=1}^{N} -\log D_{\theta_D}\big(G_{\theta_G}\big(I^{LR}\big)\big) $$

where $D_{\theta_D}$ denotes the discriminator and $N$ is the batch size;
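Combining the three terms above gives a perceptual loss of the following shape. This is a sketch: the relative weights `w_vgg` and `w_adv` are assumptions not given in the patent, and `features` stands in for $\phi_{i,j}$ (in practice a truncated pretrained VGG19).

```python
import torch
import torch.nn.functional as F

def perceptual_loss(sr, hr, disc_out, features, w_vgg=1.0, w_adv=1e-3):
    """Pixel-level MAE (L1) + VGG feature-space squared error normalised
    by the feature-map size + the -log D adversarial term averaged over
    the batch. `features`, `w_vgg` and `w_adv` are illustrative stand-ins."""
    l_mae = F.l1_loss(sr, hr)                 # pixel-level MAE via L1-loss
    l_vgg = F.mse_loss(features(sr), features(hr))  # ~ (1/WH) sum of squares
    l_adv = (-torch.log(disc_out + 1e-8)).mean()    # -log D(G(I_LR)), batch mean
    return l_mae + w_vgg * l_vgg + w_adv * l_adv
```

`F.mse_loss` averages over all feature-map elements, which matches the $1/(W_{i,j}H_{i,j})$ normalisation up to the batch and channel dimensions.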
step 4), training the generation network and the discrimination network simultaneously, so that the loss of the generated high-resolution image relative to the real high-resolution image is less than or equal to a preset threshold value, and obtaining the generated network after training;
the training set uses the DIV2K data set; each image in the training set is transformed to 384×384 as the high-resolution input and to 96×96 as the low-resolution input; the training set comprises 800 training images; during training the batch size is 16 and the number of iterations is 100000, with the network updated by back-propagation every 1000 training iterations; the perceptual loss is reduced by RMSprop optimization with a learning rate of 1e-4 and no dropout;
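The optimiser configuration above can be set up as follows. Using one RMSprop instance per network is an assumption; the patent only states that the two networks are trained simultaneously.

```python
import torch

def make_optimizers(gen, disc, lr=1e-4):
    """RMSprop with learning rate 1e-4, as stated in the training recipe;
    no dropout appears anywhere in either model."""
    opt_g = torch.optim.RMSprop(gen.parameters(), lr=lr)
    opt_d = torch.optim.RMSprop(disc.parameters(), lr=lr)
    return opt_g, opt_d
```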
and 5), inputting the low-resolution image of which the resolution needs to be improved into the trained generation network to obtain a reconstructed high-resolution image.
FIG. 1 shows the overall structure of the invention, which comprises a generation network and a countermeasure network (used to define a series of loss functions); the main structure of the generation network combines a deep residual network with an SRGAN network. The weights of the countermeasure network are the parameters that map the input picture to the output image, and each loss function computes a scalar value measuring the difference between the output image and the target image. The image network is trained with Adam so that the weighted sum of the series of loss functions keeps decreasing.
Fig. 2 is a schematic diagram of the generation network; down-sampling and up-sampling inside the network are performed with strided convolutions. The first and last convolutional layers use 9×9 kernels, while all remaining layers use 3×3 kernels. The network down-samples with two stride-2 convolutional layers, then passes through five residual blocks, and finally up-samples with two stride-1/2 (fractionally strided) deconvolution layers.
Fig. 3 is a schematic diagram of the residual block structure, which is motivated by the "degradation" problem: ResNet can learn to turn redundant blocks into identity mappings without hurting performance and has a certain depth-adaptive capacity, making it possible to train networks with more layers and improving network performance. In the invention, 15 ResNet-based residual modules are used in the feedforward convolutional neural network.
Fig. 4 is a schematic diagram of the discrimination network, whose core is based on the VGG19 network; it is used to discriminate whether a picture is a true high-resolution image or a super-resolution image, and the difference between the two is measured through the perceptual loss function and an MSE loss function.
Examples are as follows: the training set uses the DIV2K data set; each image in the training set is transformed to 384×384 as the HR input and to 96×96 as the LR input, for a total of 800 training images. Set5, Set14 and BSD100 are adopted as the verification sets. The ×4 super-resolution reconstruction is completed with the trained model, and the feature loss is minimised on features extracted at the conv2_2 layer of VGG19; during training the batch size is 16, the number of iterations is 100000, RMSprop optimisation is used with a learning rate of 1e-4 and no dropout. The super-resolution performance of the trained generation network is verified on the verification sets.
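The 384×384 HR / 96×96 LR pairing above can be sketched as a toy data-preparation step. The centre crop and the average-pool down-scaling are assumptions kept dependency-free for illustration; a real pipeline would typically use bicubic resampling, and the input is assumed to be an H×W×C array.

```python
import numpy as np

def make_lr_hr_pair(img, hr_size=384, scale=4):
    """Centre-crop to hr_size x hr_size as the HR target, then reduce by
    `scale` via block averaging to get the LR input (96x96 for x4)."""
    h, w = img.shape[:2]
    top, left = (h - hr_size) // 2, (w - hr_size) // 2
    hr = img[top:top + hr_size, left:left + hr_size]
    # Block-average each scale x scale patch down to one LR pixel.
    lr = hr.reshape(hr_size // scale, scale,
                    hr_size // scale, scale, -1).mean(axis=(1, 3))
    return lr, hr
```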
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The above-mentioned embodiments, objects, technical solutions and advantages of the present invention are further described in detail, it should be understood that the above-mentioned embodiments are only illustrative of the present invention and are not intended to limit the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (5)

1. The single-image super-resolution reconstruction method for generating the countermeasure network based on the residual error is characterized by comprising the following steps of:
step 1), establishing a generating network;
step 2), inputting the low-resolution image into a generation network to obtain a generated high-resolution image;
step 3), inputting the generated high-resolution image and the real high-resolution image corresponding to the low-resolution image into a discrimination network model together, and calculating the difference between the two images through a perception loss function;
step 4), training the generation network and the discrimination network simultaneously, so that the loss of the generated high-resolution image relative to the real high-resolution image is less than or equal to a preset threshold value, and obtaining the generated network after training;
and 5), inputting the low-resolution image of which the resolution needs to be improved into the trained generation network to obtain a reconstructed high-resolution image.
2. The single-picture super-resolution reconstruction method based on residual generation countermeasure network of claim 1, wherein the generation network in step 1) comprises a preprocessing layer, a core residual network and an upsampling layer;
the pretreatment layer comprises a first convolution layer, a second convolution layer and a relu activation layer, wherein the first convolution layer and the second convolution layer are alternated, and the depths of the first convolution layer and the second convolution layer are 64 and 256 respectively; the size of the first convolution layer convolution kernel is 9 multiplied by 9, and the size of the first convolution layer convolution kernel is 3 multiplied by 3;
the core residual error network comprises 16 residual error blocks, and the residual error blocks adopt a structure of a BN (batch normalization) layer + a convolutional layer of 64 feature maps + a relu active layer + a BN layer + a convolutional layer of 64 feature maps + a relu active layer;
the up-sampling layer performs up-sampling by using two sub-pixel convolution layers, so that the resolution of an input image is improved; and a convolution layer with the depth of 256 is respectively added before the two sub-pixel convolution layers, a relu activation layer is respectively added after the two sub-pixel convolution layers, and the image is reconstructed and amplified step by step so as to avoid the loss of image details caused by continuous amplification of the image.
3. The single-image super-resolution reconstruction method based on residual generation countermeasure network of claim 2, wherein the discrimination network described in step 3) stacks, in order, convolutional layers of 32, 32, 64, 64, 128, 128, 256, 256, 512 and 512 feature maps, the stride alternating between 1 and 2 (stride 1 for the first layer of each pair, stride 2 for the second), with each convolutional layer followed by a LeakyReLU activation layer.
4. The single-map super-resolution reconstruction method based on residual generation countermeasure network of claim 3, wherein the perceptual loss function in step 3) comprises pixel-level MAE loss, VGG loss, discriminator loss;
the pixel level MAE loss is calculated directly using L1-loss;
the VGG loss is

$$ l_{VGG} = \frac{1}{W_{i,j}H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \Big( \phi_{i,j}\big(I^{HR}\big)_{x,y} - \phi_{i,j}\big(G_{\theta_G}\big(I^{LR}\big)\big)_{x,y} \Big)^2 $$

where $G_{\theta_G}$ denotes the generation network, $I^{HR}$ and $I^{LR}$ denote the high-resolution image and the low-resolution image respectively, $\phi_{i,j}$ denotes the feature map of the $j$-th convolutional layer before the $i$-th pooling layer of the VGG19 network, and $W_{i,j}$ and $H_{i,j}$ are the width and height of that feature map;

the discriminator loss is

$$ l_{D} = \frac{1}{N} \sum_{n=1}^{N} -\log D_{\theta_D}\big(G_{\theta_G}\big(I^{LR}\big)\big) $$

where $D_{\theta_D}$ denotes the discriminator and $N$ is the batch size.
5. The single-image super-resolution reconstruction method based on residual generation countermeasure network of claim 4, wherein in step 4) the training set uses the DIV2K data set; each image in the training set is transformed to 384×384 as the high-resolution input and to 96×96 as the low-resolution input; the training set comprises 800 training images; during training the batch size is 16 and the number of iterations is 100000, with the network updated by back-propagation every 1000 training iterations; the perceptual loss is reduced by RMSprop optimization with a learning rate of 1e-4 and no dropout.
CN202210499131.3A 2022-05-09 2022-05-09 Single-image super-resolution reconstruction method based on residual error generation countermeasure network Pending CN114862679A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210499131.3A CN114862679A (en) 2022-05-09 2022-05-09 Single-image super-resolution reconstruction method based on residual error generation countermeasure network


Publications (1)

Publication Number Publication Date
CN114862679A 2022-08-05

Family

ID=82636754


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115170399A (en) * 2022-09-08 2022-10-11 中国人民解放军国防科技大学 Multi-target scene image resolution improving method, device, equipment and medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination