CN114049261A - Image super-resolution reconstruction method focusing on foreground information - Google Patents

Image super-resolution reconstruction method focusing on foreground information

Info

Publication number
CN114049261A
Authority
CN
China
Prior art keywords
image
attention
convolution
loss
channel
Prior art date
Legal status
Granted
Application number
CN202210035833.6A
Other languages
Chinese (zh)
Other versions
CN114049261B (en)
Inventor
何凡
彭丽薇
邓靖凛
吴家俊
程艳芬
李辰皓
Current Assignee
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date
Filing date
Publication date
Application filed by Wuhan University of Technology WUT filed Critical Wuhan University of Technology WUT
Priority to CN202210035833.6A priority Critical patent/CN114049261B/en
Publication of CN114049261A publication Critical patent/CN114049261A/en
Application granted granted Critical
Publication of CN114049261B publication Critical patent/CN114049261B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image super-resolution reconstruction method that focuses on foreground information. A general-purpose gated attention module, PAM, is proposed to extract foreground information and high-frequency features of the image. The invention further provides PAMNet, in which a plurality of PAM modules are connected in series and skip connections are introduced to make full use of the shallow features of the image; after training the designed network, super-resolution reconstruction is completed. The method concentrates on extracting the foreground information and discriminative features of the image while preserving its color and texture features and improving the utilization of shallow features; it reduces the number of parameters and achieves better objective scores; the invention strikes a good balance between performance and model complexity, and the PAM module is general-purpose and can be embedded into various network structures.

Description

Image super-resolution reconstruction method focusing on foreground information
Technical Field
The invention relates to the technical field of image processing and recognition, and in particular to an image super-resolution reconstruction method focusing on foreground information.
Background
Super-resolution (SR) reconstruction is an image processing technique that restores a high-resolution image from a low-resolution one. SR aims to improve image resolution through signal processing and software methods without changing the physical limits of the imaging device. Besides its academic research value, it has practical applications in many fields, such as medical imaging, video security monitoring, and remote sensing image processing. In addition to improving image quality, SR can also benefit many computer vision tasks, and raising image resolution to obtain high-quality images has become a problem the research community urgently needs to solve.
Single Image Super-Resolution (SISR) is a research hotspot in the field of image super-resolution reconstruction. SISR uses only a single low-resolution image as input to reconstruct a high-resolution (HR) image with rich detail and clear texture, and has high practical value. SISR algorithms fall mainly into three categories: interpolation-based, reconstruction-based, and learning-based methods. Interpolation-based methods such as bicubic interpolation lose a large amount of image detail, and the reconstructed images are severely distorted and blurry. Reconstruction-based methods built on traditional machine learning algorithms, such as neighborhood embedding and sparse representation, produce relatively high-quality reconstructions but scale poorly. With the development of deep learning and generative adversarial networks, learning-based methods have continually produced new network designs that perform well in recovering image detail and texture, generate more realistic HR images, and have made great progress in image super-resolution.
Residual networks use residual learning, with skip connections providing the identity mapping inside each residual unit. The residual-learning-based deep model VDSR improves performance by introducing a residual structure, but suffers from a large number of training parameters and an unclear background in the reconstructed image. EDSR removes the BN layers, and by saving the memory they consume it can stack more layers and improve the quality of the reconstructed image; however, because it is trained with only the L1 loss, the objective metrics of its reconstructed images remain low.
At present, image super-resolution still suffers from under-utilization of shallow image features, an inconspicuous image foreground, and a lack of visual emphasis.
Disclosure of Invention
To address the problems that the super-resolution task in image processing still under-utilizes shallow image features, leaves the image foreground inconspicuous, and lacks visual emphasis, the invention provides an image super-resolution reconstruction method that focuses on foreground information. It reduces the number of training parameters while attending to the image foreground information and detail features, improves the utilization of shallow image features and the model performance, and greatly improves the visual quality of image super-resolution.
In order to achieve the above object, the present invention provides a method for reconstructing super-resolution of an image focusing on foreground information, the method comprising the following steps:
1) acquire an image to be trained and preprocess the image data to obtain a feature map X ∈ R^(C×H×W), where R denotes the set of real numbers, C the number of channels, and H, W the image size;
2) extract features from the feature map X ∈ R^(C×H×W) to obtain a feature output map S_LF ∈ R^(C×H×W);
3) reconstruct the image from the feature output map S_LF ∈ R^(C×H×W);
4) train with a loss function on the reconstructed image to obtain the super-resolution image SR;
the method is characterized in that the specific steps of the step 2) comprise:
2.1) input the feature map X into the feature extraction layer of the PAMNet network; through N serial PAM basic units, the residual shallow features of the image are computed and propagated repeatedly, the shallow features of all the preceding N-2 PAM modules are fed to the end of the (N-1)-th PAM module via skip connections, and concatenation along the channel dimension yields the image residual shallow feature map S ∈ R^(10C×H×W);
2.2) apply a 1×1 convolution to the image residual shallow feature map S ∈ R^(10C×H×W) to reduce its dimension and aggregate the shallow features, obtaining a feature map S_L ∈ R^(C×H×W) of dimensions (C, H, W);
2.3) pass the feature map S_L ∈ R^(C×H×W) through the N-th PAM module to obtain the extracted feature output map S_LF ∈ R^(C×H×W).
Preferably, the specific steps of step 1) include:
1.1) select training sample images to form a training sample image set;
1.2) randomly select n original images from the training sample image set for cropping and mirror flipping, randomly partition each original image into m×m sub-images denoted I_SR(x), x = 1, 2, ..., n, and then randomly select q sub-images as training images;
1.3) input the training images I_SR(x) into the downsampling layer of the PAMNet network and convolve them with two serial L×L convolution kernels to preliminarily extract image color, contour, and texture features and increase the number of feature-map channels, obtaining n images to be trained;
each L×L convolution block consists of three layers: an L×L convolutional layer, followed by an L×L BN layer and an L×L ReLU layer; the 1st convolution yields an m×m feature map D, and the 2nd convolution yields an m×m feature map X;
1.4) the downsampling layer of the PAMNet network finally outputs an m×m image feature map X ∈ R^(C×H×W), where R denotes the set of real numbers, C the number of channels, and H, W the image size, with H = W = m.
Preferably, each L×L convolution block used in step 1.3) consists of three layers: an L×L convolutional layer, an L×L BN layer, and an L×L ReLU layer; the 1st convolution yields an m×m feature map D, and the 2nd convolution yields an m×m feature map X.
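For illustration, a minimal PyTorch sketch of such a two-stage downsampling layer is given below. The choices L = 5 and C = 64 follow the embodiment described later, and the padding that keeps H = W = m is an assumption; this is an illustrative reading of the description, not the authors' released code.

```python
import torch
import torch.nn as nn

class DownsamplingLayer(nn.Module):
    """Two serial LxL conv blocks (Conv -> BN -> ReLU) that lift an RGB image
    to a C-channel feature map, as in steps 1.3)-1.4). L=5 and C=64 follow the
    embodiment; they are assumptions, not fixed by the claims."""
    def __init__(self, in_ch=3, out_ch=64, L=5):
        super().__init__()
        pad = L // 2  # keep the m x m spatial size
        self.block1 = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, L, padding=pad),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )  # yields feature map D
        self.block2 = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, L, padding=pad),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )  # yields feature map X
    def forward(self, img):
        D = self.block1(img)
        X = self.block2(D)
        return X

# X = DownsamplingLayer()(torch.randn(1, 3, 128, 128))  # -> (1, 64, 128, 128)
```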
Preferably, the specific steps of step 2.1) include:
2.1.1) when the feature map X is input to the t-th PAM module at time t, then for the 1st PAM module, X ∈ R^(C×H×W) is fed into the Residual module for convolution, yielding the output feature map X_R ∈ R^(C×H×W);
in the Residual module, the input feature map X_IN ∈ R^(C×H×W) = X ∈ R^(C×H×W) is convolved with two convolution kernels F ∈ R^(C×K×K) of size K×K using grouped convolution with g groups together with a 1×1 convolution, so the parameter count P_G is:
P_G = (K×K×C×C×1/g + 1×1×C×C)×2
    = 2×C×C×(K×K×1/g + 1)
where C denotes the number of channels and K×K the size of the convolution kernel F; the Residual module yields the output feature map X_R ∈ R^(C×H×W).
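A minimal sketch of such a Residual module follows, assuming each of the two stages is a bias-free grouped K×K convolution followed by a 1×1 convolution (matching the parameter count P_G above); the placement of the activation is also an assumption.

```python
import torch.nn as nn

class GroupedResidual(nn.Module):
    """Residual module with two (grouped KxK conv + 1x1 conv) stages.
    Parameters per stage: K*K*C*C/g + C*C, i.e. P_G = 2*C*C*(K*K/g + 1)."""
    def __init__(self, C=64, K=3, g=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(C, C, K, padding=K // 2, groups=g, bias=False),
            nn.Conv2d(C, C, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(C, C, K, padding=K // 2, groups=g, bias=False),
            nn.Conv2d(C, C, 1, bias=False),
        )
    def forward(self, x):
        # returns X_R; the skip additions happen outside (Y_C = X + CA(X_R), etc.)
        return self.body(x)

# Parameter check for C=64, K=3, g=16:
# sum(p.numel() for p in GroupedResidual().parameters()) == 12800
```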
2.1.2) feed the output feature map X_R ∈ R^(C×H×W) simultaneously into the Channel Attention and Spatial Attention modules, which compute the channel-domain attention Y_C and the spatial-domain attention Y_S in parallel;
Preferably, the channel-domain attention Y_C is computed with a SENet-style structure in which the fully connected layers of SENet are replaced by 1×1 convolutions, so that the spatial features of the image are preserved; the channel-domain attention is computed as:
Y_C = X + CA(X_R)
where X ∈ R^(C×H×W) is the input of the residual block, X_R ∈ R^(C×H×W) is the output after the residual computation, CA(·) denotes the channel-domain attention computation, and Y_C ∈ R^(C×H×W) is the final output of the channel-domain attention;
in parallel, the spatial-domain attention Y_S is computed using a three-layer cascade of dilated convolutions with dilation rates 1, 2, and 3: first, a 1×1 convolution reduces the dimensionality, converting the input feature map of dimensions (C, H, W) into a feature map of dimensions (C/K, H, W), where K is the dimensionality-reduction coefficient;
next, the reduced feature map undergoes three dilated convolutions with different dilation rates, which enlarges the receptive field with a minimal number of parameters in a limited number of steps, keeps the receptive field contiguous, and avoids the information loss caused by pooling;
finally, a 1×1 convolution fuses the information of the different channels of the feature map and a Sigmoid activation yields the feature-map weight Φ of dimensions (1, H, W); broadcasting this weight over the (H, W) dimensions and multiplying it with the input feature map X_R ∈ R^(C×H×W) achieves the goal of attending to the image foreground information; the spatial-domain attention is computed as:
Y_S = X + SA(X_R)
where X ∈ R^(C×H×W) is the input of the residual block, X_R ∈ R^(C×H×W) is the output after the residual computation, SA(·) denotes the spatial-domain attention computation, and Y_S ∈ R^(C×H×W) is the final output of the spatial-domain attention.
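As an illustration, the two attention branches could be sketched as follows in PyTorch. The 3×3 kernel size of the dilated convolutions, the reduction ratio r of the channel branch, and the reading of CA(·)/SA(·) as re-weighted feature maps added to X are assumptions consistent with the formulas above, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SENet-style channel attention with the fully connected layers replaced
    by 1x1 convolutions (full convolution), so spatial information is kept.
    The reduction ratio r is an assumed hyper-parameter."""
    def __init__(self, C=64, r=16):
        super().__init__()
        self.ca = nn.Sequential(
            nn.Conv2d(C, C // r, 1), nn.ReLU(inplace=True),
            nn.Conv2d(C // r, C, 1), nn.Sigmoid(),
        )
    def forward(self, x, x_r):
        # Y_C = X + CA(X_R): re-weight X_R channel-wise, then add the block input X
        return x + x_r * self.ca(x_r)

class SpatialAttention(nn.Module):
    """Spatial attention: 1x1 reduction (C -> C/K), cascaded dilated convs with
    dilation rates 1, 2, 3, then 1x1 + Sigmoid giving a (1, H, W) weight map Phi."""
    def __init__(self, C=64, K=4):
        super().__init__()
        Cr = C // K
        self.reduce = nn.Conv2d(C, Cr, 1)
        self.dilated = nn.Sequential(
            nn.Conv2d(Cr, Cr, 3, padding=1, dilation=1), nn.ReLU(inplace=True),
            nn.Conv2d(Cr, Cr, 3, padding=2, dilation=2), nn.ReLU(inplace=True),
            nn.Conv2d(Cr, Cr, 3, padding=3, dilation=3), nn.ReLU(inplace=True),
        )
        self.fuse = nn.Sequential(nn.Conv2d(Cr, 1, 1), nn.Sigmoid())
    def forward(self, x, x_r):
        phi = self.fuse(self.dilated(self.reduce(x_r)))   # (N, 1, H, W)
        # Y_S = X + SA(X_R): broadcast Phi over the channel dimension
        return x + x_r * phi
```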
2.1.3) concatenate the channel-domain attention Y_C and the spatial-domain attention Y_S along the channel dimension to obtain the input G_IN ∈ R^(2C×H×W) of the gating network Gate;
2.1.4) fuse the information in the gating-network input G_IN ∈ R^(2C×H×W) with a 1×1 convolution, reducing its dimensions to (C, H, W); then perform feature extraction with two 3×3 convolutions and a Sigmoid activation to obtain the activation output σ ∈ R^(C×H×W) with values in (0, 1);
2.1.5) use the activation output σ as linear combination coefficients for Y_C and Y_S to obtain the output G_OUT ∈ R^(C×H×W);
2.1.6) continuously update the activation output σ during back-propagation, dynamically allocating the weights of the channel-domain and spatial-domain attention through learning and concentrating on the attention domain with the higher weight to extract image foreground information;
Preferably, the activation output σ that is continuously updated during back-propagation is applied as:
G_OUT = (1 - σ)·Y_C + σ·Y_S
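A minimal sketch of the gating network Gate, assuming no extra activation between the two 3×3 convolutions:

```python
import torch
import torch.nn as nn

class Gate(nn.Module):
    """Gating network: concat(Y_C, Y_S) -> 1x1 fusion -> two 3x3 convs -> Sigmoid
    gives sigma in (0, 1); output G_OUT = (1 - sigma) * Y_C + sigma * Y_S."""
    def __init__(self, C=64):
        super().__init__()
        self.fuse = nn.Conv2d(2 * C, C, 1)
        self.gate = nn.Sequential(
            nn.Conv2d(C, C, 3, padding=1),
            nn.Conv2d(C, C, 3, padding=1),
            nn.Sigmoid(),
        )
    def forward(self, y_c, y_s):
        g_in = torch.cat([y_c, y_s], dim=1)      # (N, 2C, H, W)
        sigma = self.gate(self.fuse(g_in))       # (N, C, H, W), values in (0, 1)
        return (1 - sigma) * y_c + sigma * y_s   # G_OUT
```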
2.1.7) add the gating-network output G_OUT ∈ R^(C×H×W) to the initial input X_IN ∈ R^(C×H×W) of the current t-th PAM module to obtain X_OUT ∈ R^(C×H×W);
2.1.8) repeat the operations of steps 2.1.1)-2.1.7) N-2 times, passing through N-1 serial PAM modules in total; the input of the first PAM module is X ∈ R^(C×H×W), and for the other PAM modules of PAMNet the input-output relation is as follows:
the output of the PAM module at the previous time t-1, denoted X_OUT(t-1) ∈ R^(C×H×W), serves as the input X_IN(t) ∈ R^(C×H×W) of the t-th PAM module at the current time, with X_IN(1) ∈ R^(C×H×W) = X ∈ R^(C×H×W); the output X_OUT(t) ∈ R^(C×H×W) of the t-th PAM module at the current time serves as the input X_IN(t+1) ∈ R^(C×H×W) of the PAM module at the next time t+1, where t lies in the interval [1, 10], and the other quantities X_R, Y_C, Y_S, G_IN, G_OUT are indexed in the same way.
2.1.9) feed the shallow features of all the preceding N-2 PAM modules to the end of the (N-1)-th PAM module of the feature extraction layer via skip connections and concatenate them along the channel dimension, obtaining the channel-concatenated image residual shallow feature map S ∈ R^(10C×H×W) of dimensions (10C, H, W), where H = W = m.
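Putting the pieces together, a PAM module and the feature extraction layer can be sketched as follows, reusing the hypothetical GroupedResidual, ChannelAttention, SpatialAttention, and Gate classes above; this is an illustrative assembly under those assumptions, not the patented implementation itself.

```python
import torch
import torch.nn as nn

class PAM(nn.Module):
    """One Parallel Attention Module: grouped residual -> parallel channel /
    spatial attention -> gate -> add the module input (step 2.1.7))."""
    def __init__(self, C=64):
        super().__init__()
        self.res = GroupedResidual(C)
        self.ca = ChannelAttention(C)
        self.sa = SpatialAttention(C)
        self.gate = Gate(C)
    def forward(self, x_in):
        x_r = self.res(x_in)
        y_c = self.ca(x_in, x_r)
        y_s = self.sa(x_in, x_r)
        return self.gate(y_c, y_s) + x_in   # X_OUT

class FeatureExtraction(nn.Module):
    """N serial PAMs; the outputs of the first N-1 PAMs are concatenated via
    skip connections, reduced by a 1x1 conv, and refined by the N-th PAM."""
    def __init__(self, C=64, N=11):
        super().__init__()
        self.pams = nn.ModuleList([PAM(C) for _ in range(N - 1)])
        self.aggregate = nn.Conv2d((N - 1) * C, C, 1)
        self.last = PAM(C)
    def forward(self, x):
        feats, h = [], x
        for pam in self.pams:
            h = pam(h)
            feats.append(h)
        s = torch.cat(feats, dim=1)   # S, (N-1)*C = 10C channels for N = 11
        s_l = self.aggregate(s)       # S_L
        return self.last(s_l)         # S_LF
```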
Preferably, the specific steps of step 3) include:
3.1) upsample the feature map S_LF obtained in step 2) by a factor of a/2 using the sub-pixel convolution (Pixel-Shuffle) method; then apply a convolution with a b×b kernel to the image matrix; then activate the image matrix with a Leaky-ReLU activation function and output the activated image matrix S_N1;
3.2) upsample the corresponding feature map S_LF by a factor of a/2 using bicubic interpolation to obtain an image matrix S_P1 with the same size and number of channels as S_N1; then sum S_N1 and S_P1 to obtain the image matrix S_NP1:
S_NP1 = S_N1 + S_P1
3.3) apply a convolution with a c×c kernel to the image matrix S_NP1 and output a 128-channel image matrix; activate it with a Leaky-ReLU activation function and upsample the activated image matrix by a factor of a/2 using the Pixel-Shuffle method; then apply a convolution with a b×b kernel, activate with a Leaky-ReLU activation function, and output the activated image matrix S_N2;
3.4) enlarge the image matrix S_NP1 by a factor of a/2 using bicubic interpolation and output an image matrix S_P2; sum S_N2 and S_P2 to obtain the image matrix S_NP2:
S_NP2 = S_N2 + S_P2
3.5) apply a convolution with a c×c kernel to the image matrix S_NP2 and output a 128-channel image matrix; then activate with a Leaky-ReLU activation function and apply a convolution with a d×d kernel to the activated image matrix to finally obtain the reconstructed image, where a, b, c, and d are nonzero natural numbers.
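A minimal sketch of this reconstruction path follows, taking a = 4 (two ×2 stages), b = 1, c = 3, and d = 9 from the embodiment below. The channel widths that make the Pixel-Shuffle branch and the bicubic skip branch summable, as well as the Leaky-ReLU negative slope, are assumptions, since the description leaves that bookkeeping implicit.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Reconstruction(nn.Module):
    """Upsampling / reconstruction path of step 3): PixelShuffle branch plus a
    bicubic skip branch, applied twice, then a final d x d convolution."""
    def __init__(self, C=64, out_ch=3):
        super().__init__()
        self.expand1 = nn.Conv2d(C, C * 4, 3, padding=1)          # feeds PixelShuffle(2)
        self.conv_b1 = nn.Sequential(nn.Conv2d(C, C, 1), nn.LeakyReLU(0.2, True))
        self.conv_c1 = nn.Sequential(nn.Conv2d(C, 128, 3, padding=1), nn.LeakyReLU(0.2, True))
        self.conv_b2 = nn.Sequential(nn.Conv2d(128 // 4, C, 1), nn.LeakyReLU(0.2, True))
        self.conv_c2 = nn.Sequential(nn.Conv2d(C, 128, 3, padding=1), nn.LeakyReLU(0.2, True))
        self.conv_d = nn.Conv2d(128, out_ch, 9, padding=4)
        self.shuffle = nn.PixelShuffle(2)
    def forward(self, s_lf):
        s_n1 = self.conv_b1(self.shuffle(self.expand1(s_lf)))                       # 3.1)
        s_p1 = F.interpolate(s_lf, scale_factor=2, mode='bicubic',
                             align_corners=False)                                   # 3.2)
        s_np1 = s_n1 + s_p1
        s_n2 = self.conv_b2(self.shuffle(self.conv_c1(s_np1)))                      # 3.3)
        s_p2 = F.interpolate(s_np1, scale_factor=2, mode='bicubic',
                             align_corners=False)                                   # 3.4)
        s_np2 = s_n2 + s_p2
        return self.conv_d(self.conv_c2(s_np2))                                     # 3.5)
```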
Preferably, in step 4) the reconstructed image and the corresponding original image are fed into a pre-trained VGG-19 network for training: a loss function formed by weighting the non-uniform joint loss L_U, the adversarial loss L_G, and the content loss L_C constrains the network to learn the color and texture features of the image while extracting more discriminative features and detail information and paying more attention to the reconstruction of the image foreground, yielding the super-resolution image SR.
Preferably, the loss function L of the PAMNet network is:
L = γ·L_G + λ·L_U + η·L_C
where γ, λ, and η denote the weights of the adversarial loss, the non-uniform joint loss, and the content loss, respectively, with γ = 0.05, λ = 1, and η = 0.1;
the discriminator loss L_D is:
L_D = -E_xr[log(D(x_r, x_f))] - E_xf[log(1 - D(x_f, x_r))]
and the adversarial loss L_G is:
L_G = -E_xr[log(1 - D(x_r, x_f))] - E_xf[log(D(x_f, x_r))]
where x_r is the real image, x_f is the reconstructed image, D(x_r, x_f) computes the difference between the real image and the reconstructed image and is limited to D(x_r, x_f) ∈ (0, 1) by a Sigmoid, and E[·] denotes the mathematical expectation;
the non-uniform joint loss L_U is based on the L1 loss: the L1 loss L_VGG1 before the first pooling layer and the L1 loss L_VGG2 before the last pooling layer are computed separately, and by adjusting the weights of L_VGG1 and L_VGG2 the generator is constrained to extract low-level features while learning more detail information and discriminative features:
L_U = α·L_VGG1 + β·L_VGG2
where α is the weight of L_VGG1 and β is the weight of L_VGG2, with α = 0.2 and β = 0.1;
L_C = μ·L1(x_r, x_f) + θ·L2(x_r, x_f)
where x_r is the real image, x_f is the reconstructed image, μ and θ denote the weights of the L1 loss and the L2 loss, respectively, L1 and L2 denote the L1 and L2 losses, and μ = 0.75, θ = 0.25.
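A minimal sketch of the loss terms follows, assuming D(a, b) = Sigmoid(C(a) - C(b)) for raw discriminator logits C(·), and using torchvision's VGG-19 feature indices 4 (before the first pooling layer) and 36 (before the last pooling layer); these indices and the reading of D are assumptions consistent with the formulas above.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

def adversarial_losses(c_real, c_fake):
    """c_real, c_fake: raw discriminator logits C(x_r), C(x_f).
    D(a, b) = sigmoid(C(a) - C(b)), the 'difference limited by Sigmoid' reading."""
    d_rf = torch.sigmoid(c_real - c_fake)
    d_fr = torch.sigmoid(c_fake - c_real)
    l_d = -torch.log(d_rf + 1e-8).mean() - torch.log(1 - d_fr + 1e-8).mean()
    l_g = -torch.log(1 - d_rf + 1e-8).mean() - torch.log(d_fr + 1e-8).mean()
    return l_d, l_g

vgg = vgg19(pretrained=True).features.eval()  # expects ImageNet-normalized 3-channel input
for p in vgg.parameters():
    p.requires_grad_(False)

def non_uniform_joint_loss(sr, hr, alpha=0.2, beta=0.1, shallow_end=4, deep_end=36):
    """L_U = alpha * L1(features before first pooling) + beta * L1(before last pooling)."""
    f_s_sr, f_s_hr = vgg[:shallow_end](sr), vgg[:shallow_end](hr)
    f_d_sr, f_d_hr = vgg[:deep_end](sr), vgg[:deep_end](hr)
    return alpha * F.l1_loss(f_s_sr, f_s_hr) + beta * F.l1_loss(f_d_sr, f_d_hr)

def content_loss(sr, hr, mu=0.75, theta=0.25):
    """L_C = mu * L1 + theta * L2."""
    return mu * F.l1_loss(sr, hr) + theta * F.mse_loss(sr, hr)

def total_loss(sr, hr, c_real, c_fake, gamma=0.05, lam=1.0, eta=0.1):
    """L = gamma * L_G + lam * L_U + eta * L_C."""
    _, l_g = adversarial_losses(c_real, c_fake)
    return gamma * l_g + lam * non_uniform_joint_loss(sr, hr) + eta * content_loss(sr, hr)
```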
The invention also proposes a computer-readable storage medium in which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the above method.
The image processing method adopts a generative adversarial network (GAN), a game-theoretic approach whose model consists of a generator and a discriminator. In image super-resolution, the generator is responsible for producing the reconstructed image, while the discriminator judges the difference between the generated image and the real image according to its own criteria; by trying to deceive the discriminator, the generator further restores the image. Thanks to this special mechanism, the method can generate high-resolution images with a good visual effect.
In most deep-learning-based super-resolution methods, the neuron activations at different positions and in different channels of a given feature layer learned by the network carry the same weight; a super-resolution model combined with an attention mechanism can select the activations that matter more for the super-resolution task and assign them greater weight, thereby improving the reconstruction.
The invention provides a general-purpose gated attention module, PAM (Parallel Attention Model), which computes the channel-domain attention Y_C and the spatial-domain attention Y_S in parallel on the residual branch of a Residual Block. The channel-domain attention uses full convolution to preserve the spatial features of the image, while the spatial-domain attention uses cascaded dilated convolutions to enlarge the receptive field and keep it intact. The resulting Y_C and Y_S are fed into a cascaded gating network Gate; the gating weights are adjusted dynamically together with the non-uniform joint loss, the weight σ is continuously updated during back-propagation, the weights of the channel-domain and spatial-domain attention are allocated by learning, and the weight coefficients are multiplied onto Y_C and Y_S to obtain the final output G_OUT. The PAM module thus focuses on the attention domain with the higher weight, which gives it the ability to extract foreground information in depth and improves the sharpness of the foreground in the reconstructed image.
With the PAM module as the core unit, several PAM modules are connected in series and combined with sampling to form the basic structure, and PAMNet is built using skip connections, grouped convolution, feature fusion, and related techniques. PAMNet consists of a downsampling layer, a feature extraction layer, and an upsampling layer. The downsampling layer preliminarily extracts the low-level features of the image through two L×L convolutions and increases the number of feature-map channels. The feature extraction layer uses the PAM module as its basic structural unit; skip connections feed the shallow features into the last PAM module of the feature extraction layer, so that the shallow residual information of the image is fully used and the high-level semantic information and discriminative features of the image are mined. The upsampling layer uses Pixel-Shuffle to enlarge the image. In addition, the PAMNet feature extraction layer uses grouped convolution to reduce the number of parameters and adds a 1×1 convolution for dimensionality reduction and aggregation of shallow features.
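Assembled from the hypothetical sketches above, the overall network could look like this; it is an illustration of the described composition rather than the authors' implementation.

```python
import torch.nn as nn

class PAMNet(nn.Module):
    """Downsampling -> feature extraction (serial PAMs with skips) -> upsampling,
    assembled from the hypothetical DownsamplingLayer, FeatureExtraction, and
    Reconstruction sketches above."""
    def __init__(self, C=64, N=11):
        super().__init__()
        self.down = DownsamplingLayer(out_ch=C)
        self.features = FeatureExtraction(C, N)
        self.up = Reconstruction(C)
    def forward(self, lr):
        return self.up(self.features(self.down(lr)))
```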
The invention further provides a non-uniform joint loss L_U which, while constraining the network to learn the color and texture features of the image, focuses more on extracting discriminative features and detail information, emphasizes the reconstruction of image foreground information, and highlights the visual focal points.
Compared with the prior art, the invention has the following advantages:
1. The invention provides a general-purpose gated attention module, PAM, which computes the channel-domain attention Y_C and the spatial-domain attention Y_S in parallel on the residual branch of a Residual Block and dynamically allocates their weights by learning, so that the PAM module focuses on the attention domain with the higher weight, gains the ability to extract foreground information in depth, and improves the sharpness of the foreground in the reconstructed image.
2. PAMNet reduces the number of parameters while attending to the image foreground information; it fully extracts detail features and high-frequency information, and the details and edges of its reconstructed images are clear, with better objective scores.
3. The invention generates clearer foreground information than the prior art; the detail and texture of the reconstructed image are closer to the real image, the color and texture features of the image are preserved, and the utilization of shallow features is improved.
4. The invention strikes a good balance between performance and model complexity, and the PAM module is general-purpose and can be embedded into various network structures.
Drawings
Fig. 1 is a flowchart of an image super-resolution reconstruction method focusing on foreground information according to the present invention.
Fig. 2 is a schematic structural diagram of a PAMNet proposed in the image super-resolution reconstruction method focusing on foreground information.
Fig. 3 is a schematic diagram of a specific structure of PAM in fig. 2.
Fig. 4 is a schematic diagram of the specific structure of the gating network Gate in Fig. 3.
FIG. 5 is a schematic diagram showing the experimental effect of the method of the present invention compared with other methods.
Fig. 6 is a diagram comparing the average PSNR value and the parameter count of the PAMNet according to the present invention with those of other networks.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
As shown in fig. 1, the image super-resolution reconstruction method for focusing on foreground information according to the embodiment of the present invention includes the following steps:
Step 1) acquire an image to be trained and preprocess the image data to obtain a feature map X ∈ R^(C×H×W), where R denotes the set of real numbers, C the number of channels, and H, W the image size.
1.1) select training sample images to form the training sample image set; 3450 images from DIV2K and Flickr2K are chosen as the training data set;
1.2) randomly select 3450 original images from the training sample image set for cropping and mirror flipping, randomly partition each original image into 128×128 sub-images denoted I_SR(x), x = 1, 2, ..., n, and then randomly select 160000 sub-images as training images;
1.3) input the training images I_SR(x) into the downsampling layer of the PAMNet network and convolve them with two serial 5×5 convolution kernels to preliminarily extract image color, contour, and texture features and increase the number of feature-map channels, obtaining n images to be trained;
each 5×5 convolution block consists of three layers: a 5×5 convolutional layer, a 5×5 BN layer, and a 5×5 ReLU layer; the 1st convolution yields a 128×128 feature map D, and the 2nd convolution yields a 128×128 feature map X. The structure of the PAMNet network is shown in Fig. 2.
1.4) the downsampling layer of the PAMNet network finally outputs a 128×128 image feature map X ∈ R^(C×H×W), where R denotes the set of real numbers, C the number of channels with C = 64, and H, W the image size with H = W = 128.
Step 2) extract features from the feature map X ∈ R^(C×H×W) to obtain the feature output map S_LF ∈ R^(C×H×W).
2.1) input the feature map X into the feature extraction layer of the PAMNet network; through 11 serial PAM basic units, the residual shallow features of the image are computed and propagated repeatedly, the shallow features of all 9 preceding PAM modules are fed to the end of the 10th PAM module via skip connections, and concatenation along the channel dimension yields the image residual shallow feature map S ∈ R^(10C×H×W), where H = W = 128. The structure of the PAM module is shown in Fig. 3.
The specific steps of step 2.1) include:
2.1.1) when the feature map X is input to the t-th PAM module at time t, then for the 1st PAM module, X ∈ R^(64×128×128) is fed into the Residual module for convolution, yielding the output feature map X_R ∈ R^(64×128×128);
in the Residual module, the input feature map X_IN ∈ R^(64×128×128) = X ∈ R^(64×128×128) is convolved with two convolution kernels F ∈ R^(64×3×3) of size 3×3 using grouped convolution with 16 groups together with a 1×1 convolution, so the parameter count P_G is:
P_G = (3×3×64×64×1/16 + 1×1×64×64)×2
    = 2×64×64×(3×3×1/16 + 1) = 12800    (1)
the Residual module yields the output feature map X_R ∈ R^(64×128×128).
2.1.2) feed the output feature map X_R ∈ R^(64×128×128) simultaneously into the Channel Attention and Spatial Attention modules, which compute the channel-domain attention Y_C and the spatial-domain attention Y_S in parallel.
The channel-domain attention Y_C is computed with a SENet-style structure in which the fully connected layers of SENet are replaced by 1×1 convolutions, so that the spatial features of the image are preserved; the channel-domain attention is computed as:
Y_C = X + CA(X_R)    (2)
where X ∈ R^(64×128×128) is the input of the residual block, X_R ∈ R^(64×128×128) is the output after the residual computation, CA(·) denotes the channel-domain attention computation, and Y_C ∈ R^(64×128×128) is the final output of the channel-domain attention;
in parallel, the spatial-domain attention Y_S is computed using a three-layer cascade of dilated convolutions with dilation rates 1, 2, and 3: first, a 1×1 convolution reduces the dimensionality, converting the input feature map of dimensions (64, 128, 128) into a feature map of dimensions (64/K, 128, 128), where K is the dimensionality-reduction coefficient and K = 4;
next, the reduced feature map undergoes three dilated convolutions with different dilation rates, which enlarges the receptive field with a minimal number of parameters in a limited number of steps, keeps the receptive field contiguous, and avoids the information loss caused by pooling;
finally, a 1×1 convolution fuses the information of the different channels of the feature map and a Sigmoid activation yields the feature-map weight Φ of dimensions (1, 128, 128); broadcasting this weight over the (128, 128) dimensions and multiplying it with the input feature map X_R ∈ R^(64×128×128) achieves the goal of attending to the image foreground information; the spatial-domain attention is computed as:
Y_S = X + SA(X_R)    (3)
where X ∈ R^(64×128×128) is the input of the residual block, X_R ∈ R^(64×128×128) is the output after the residual computation, SA(·) denotes the spatial-domain attention computation, and Y_S ∈ R^(64×128×128) is the final output of the spatial-domain attention.
2.1.3) concatenate the channel-domain attention Y_C and the spatial-domain attention Y_S along the channel dimension to obtain the input G_IN ∈ R^(128×128×128) of the gating network Gate. The structure of the gating network Gate is shown in Fig. 4.
2.1.4) fuse the information in the gating-network input G_IN ∈ R^(128×128×128) with a 1×1 convolution, reducing its dimensions to (64, 128, 128); then perform feature extraction with two 3×3 convolutions and a Sigmoid activation to obtain the activation output σ ∈ R^(64×128×128) with values in (0, 1);
2.1.5) use the activation output σ as linear combination coefficients for Y_C and Y_S to obtain the output G_OUT ∈ R^(64×128×128);
2.1.6) continuously update the activation output σ during back-propagation, dynamically allocating the weights of the channel-domain and spatial-domain attention through learning and concentrating on the attention domain with the higher weight to extract image foreground information; the specific calculation is:
G_OUT = (1 - σ)·Y_C + σ·Y_S    (4)
2.1.7) add the gating-network output G_OUT ∈ R^(64×128×128) to the initial input X_IN ∈ R^(64×128×128) of the current t-th PAM module to obtain X_OUT ∈ R^(64×128×128);
2.1.8) repeat the operations of steps 2.1.1)-2.1.7) 9 times, passing through N-1 serial PAM modules in total; the input of the first PAM module is X ∈ R^(64×128×128), and for the other PAM modules of PAMNet the input-output relation is as follows:
the output of the PAM module at the previous time t-1, denoted X_OUT(t-1) ∈ R^(64×128×128), serves as the input X_IN(t) ∈ R^(64×128×128) of the t-th PAM module at the current time, with X_IN(1) ∈ R^(64×128×128) = X ∈ R^(64×128×128); the output X_OUT(t) ∈ R^(64×128×128) of the t-th PAM module at the current time serves as the input X_IN(t+1) ∈ R^(64×128×128) of the PAM module at the next time t+1, where t lies in the interval [1, 10], and the other quantities X_R, Y_C, Y_S, G_IN, G_OUT are indexed in the same way.
2.1.9) feed the shallow features of all 9 preceding PAM modules to the end of the 10th PAM module of the feature extraction layer via skip connections and concatenate them along the channel dimension, obtaining the channel-concatenated image residual shallow feature map S ∈ R^(640×128×128) of dimensions (640, 128, 128).
2.2) apply a 1×1 convolution to the image residual shallow feature map S ∈ R^(640×128×128) obtained in step 2.1) to reduce its dimension and aggregate the shallow features, obtaining a feature map S_L ∈ R^(64×128×128) of dimensions (64, 128, 128).
2.3) pass the feature map S_L ∈ R^(64×128×128) through the last PAM module (the 11th PAM module) to obtain the extracted feature output map S_LF ∈ R^(64×128×128).
Step 3) reconstruct the image from the feature output map S_LF ∈ R^(64×128×128):
3.1) upsample the feature map S_LF obtained in step 2) by a factor of 2 using the sub-pixel convolution (Pixel-Shuffle) method; then apply a convolution with a 1×1 kernel to the image matrix; activate the image matrix with a Leaky-ReLU activation function and output the activated image matrix S_N1;
3.2) upsample the corresponding feature map S_LF by a factor of 2 using bicubic interpolation to obtain an image matrix S_P1 with the same size and number of channels as S_N1; then sum S_N1 and S_P1 to obtain the image matrix S_NP1:
S_NP1 = S_N1 + S_P1    (5)
3.3) apply a convolution with a 3×3 kernel to the image matrix S_NP1 and output a 128-channel image matrix; activate it with a Leaky-ReLU activation function and upsample the activated image matrix by a factor of 2 using the Pixel-Shuffle method; then apply a convolution with a 1×1 kernel, activate with a Leaky-ReLU activation function, and output the activated image matrix S_N2;
3.4) enlarge the image matrix S_NP1 by a factor of 2 using bicubic interpolation and output an image matrix S_P2; sum S_N2 and S_P2 to obtain the image matrix S_NP2:
S_NP2 = S_N2 + S_P2    (6)
3.5) apply a convolution with a 3×3 kernel to the image matrix S_NP2 and output a 128-channel image matrix; then activate with a Leaky-ReLU activation function and apply a convolution with a 9×9 kernel to the activated image matrix, finally obtaining the reconstructed image.
Step 4) train with a loss function on the reconstructed image to obtain the super-resolution image SR.
The reconstructed image and the corresponding original image are fed into a pre-trained VGG-19 network for training: a loss function formed by weighting the non-uniform joint loss L_U, the adversarial loss L_G, and the content loss L_C constrains the network to learn the color and texture features of the image while extracting more discriminative features and detail information and paying more attention to the reconstruction of the image foreground, yielding the super-resolution image SR.
The loss function L for the entire PAMNet network is:
L = γ·L_G + λ·L_U + η·L_C    (7)
where γ, λ, and η denote the weights of the adversarial loss, the non-uniform joint loss, and the content loss, respectively, with γ = 0.05, λ = 1, and η = 0.1.
The method trains the network model based on a generative adversarial structure and optimizes the model parameters through the combination of the discriminator loss and the generator loss.
The discriminator loss L_D is:
L_D = -E_xr[log(D(x_r, x_f))] - E_xf[log(1 - D(x_f, x_r))]    (8)
and the adversarial loss L_G is:
L_G = -E_xr[log(1 - D(x_r, x_f))] - E_xf[log(D(x_f, x_r))]    (9)
in formulas (8) and (9), x_r is the real image, x_f is the reconstructed image, D(x_r, x_f) computes the difference between the real image and the reconstructed image and is limited to D(x_r, x_f) ∈ (0, 1) by a Sigmoid, and E[·] denotes the mathematical expectation.
The non-uniform joint loss L_U is based on the L1 loss: the L1 loss L_VGG1 before the first pooling layer and the L1 loss L_VGG2 before the last pooling layer are computed separately, and by adjusting the weights of L_VGG1 and L_VGG2 the generator is constrained to extract low-level features while learning more detail information and discriminative features:
L_U = α·L_VGG1 + β·L_VGG2    (10)
where α is the weight of L_VGG1 and β is the weight of L_VGG2, with α = 0.2 and β = 1;
L_C = μ·L1(x_r, x_f) + θ·L2(x_r, x_f)    (11)
where x_r is the real image, x_f is the reconstructed image, μ and θ denote the weights of the L1 loss and the L2 loss, respectively, with μ = 0.75 and θ = 0.25, and L1 and L2 denote the L1 and L2 losses.
Comparative experiment:
a total of 3450 images of DIV2K and Flickr2K were selected as the training data Set, and as shown in FIG. 5, Set5, Set14, BSD100 and Urban100 were selected as the test data Set. Compared with the existing image super-resolution method in the aspects of subjectivity and objectivity, the PAM module is respectively embedded into backbone networks of SRGAN and ESRGAN to verify the effectiveness and the universality of the PAM module, and PSNR and SSIM are used as quantization standards for reconstructed image quality on objective indexes. The last picture in fig. 5 shows that PAMNet pays attention to image foreground information while reducing the number of parameters, can fully extract detail features and high-frequency information, and the details and edges of the reconstructed image are clear and have better objective scores.
And replacing the basic residual block of the SRGAN and the RRDB structure in the ESRGAN by the PAM module, keeping other structures and loss functions in the original network unchanged, and verifying the effectiveness and the universality of the PAM module, as shown in FIG. 6. PSNR (Peak Signal to noise ratio) of the PAM-SRGAN on 4 test sets is improved by the accuracy value in an interval [0.54dB, 1.91dB ] compared with the SRGAN; compared with the ESRGAN, the PSNR of the PAM-ESRGAN improves the precision value in the interval of 0.03dB and 0.21dB on 4 test sets. The technical result shows that the PAM module improves the performance of the SRGAN and the ESRGAN network and has good universality.
When the PAM module number in the feature extraction layer is N =11, the PAMNet comprehensively shows that PSNR on 4 data sets exceeds that of the SOTA method RFB-ESRGAN and RFANet. In addition, the PSNR value increases with the increase of the number N of the PAM modules, and when N =11, the PSNR can reach 26.93dB without significant increase, so that the model performance and the complexity are well balanced, and the peak signal-to-noise ratio and the visual quality of a reconstructed image are improved.
The invention can generate clearer foreground information than the prior art, the detail texture characteristics of the reconstructed image are closer to the real image, the color and texture characteristics of the image are reserved, and the integral definition of the image is basically equal to that of the SOTA methods such as RFB-ESRGAN, RFANet and the like.
The average PSNR value of the PAMNet reconstructed image is superior to that of SRCNN, SRGAN, VDSR, EDSR, DBPN, ESRGAN, PRANet, RFB-ESRGAN and other technical methods, and the PSNR average value index is increased by the range of [0.01dB, 0.07dB ]. The average SSIM value is slightly lower in the Set5 and Urban100 data sets than in the RFB-ESRGAN technology, and higher in all other data sets than in other technical approaches.
The jump connection has an important influence on the PAMNet, and the utilization of shallow features by the PAMNet can be improved by using the jump connection, so that the comprehensive performance of a model is improved.
The PAMNet parameters are fewer and have better performance than DBPN, RFANet, SAN, ESRGAN. Compared with RFB-ESRGAN, the PAMNet parameter amount is slightly larger but the overall performance is slightly better than that of RFB-ESRGAN. The results show that PAMNet achieves a good balance between performance and model complexity.
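For reference, a minimal sketch of the PSNR metric used for the objective comparison, assuming images scaled to [0, 1]:

```python
import torch

def psnr(sr, hr, max_val=1.0):
    """Peak signal-to-noise ratio in dB between a reconstructed image sr and
    the reference hr; both tensors are assumed to lie in [0, max_val]."""
    mse = torch.mean((sr - hr) ** 2)
    return 10 * torch.log10(max_val ** 2 / mse)
```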
The image super-resolution reconstruction method focusing on foreground information according to the invention provides a general-purpose PAM module to extract foreground information and high-frequency features of the image, uses a gating network to extract the channel-domain and spatial-domain attention weight coefficients, and, together with the non-uniform joint loss, dynamically adjusts the weights of the channel-domain and spatial-domain attention during back-propagation. It further provides PAMNet, in which a plurality of PAM modules are connected in series and skip connections are introduced to make full use of the shallow features of the image; after training the designed network, super-resolution reconstruction is completed. The method concentrates on extracting the foreground information and discriminative features of the image while preserving its color and texture features and improving the utilization of shallow features; it reduces the number of parameters and achieves better objective scores; the invention strikes a good balance between performance and model complexity, and the PAM module is general-purpose and can be embedded into various network structures.
Although the preferred embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive; those skilled in the art can make various changes and modifications within the spirit of the present invention without departing from the scope of the appended claims.

Claims (10)

1. A method for super-resolution reconstruction of an image focusing on foreground information, the method comprising the steps of:
1) acquiring an image to be trained and preprocessing the image data to obtain a feature map X ∈ R^(C×H×W), where R denotes the set of real numbers, C the number of channels, and H, W the image size;
2) extracting features from the feature map X ∈ R^(C×H×W) to obtain a feature output map S_LF ∈ R^(C×H×W);
3) reconstructing the image from the feature output map S_LF ∈ R^(C×H×W);
4) training with a loss function on the reconstructed image to obtain a super-resolution image SR;
the method is characterized in that: the specific steps of the step 2) comprise:
2.1) inputting the feature map X into the feature extraction layer of the PAMNet network, computing and propagating the residual shallow features of the image repeatedly through N serial PAM basic units, feeding the shallow features of all the preceding N-2 PAM modules to the end of the (N-1)-th PAM module via skip connections, and concatenating along the channel dimension to obtain the image residual shallow feature map S ∈ R^(10C×H×W);
2.2) applying a 1×1 convolution to the image residual shallow feature map S ∈ R^(10C×H×W) to reduce its dimension and aggregate the shallow features, obtaining a feature map S_L ∈ R^(C×H×W) of dimensions (C, H, W);
2.3) passing the feature map S_L ∈ R^(C×H×W) through the N-th PAM module to obtain the extracted feature output map S_LF ∈ R^(C×H×W).
2. The image super-resolution reconstruction method focusing on foreground information according to claim 1, wherein: the specific steps of step 1) comprise:
1.1) selecting training sample images to form a training sample image set;
1.2) randomly selecting n original images from the training sample image set for cropping and mirror flipping, randomly partitioning each original image into m×m sub-images denoted I_SR(x), x = 1, 2, ..., n, and then randomly selecting q sub-images as training images;
1.3) inputting the training images I_SR(x) into the downsampling layer of the PAMNet network and convolving them with two serial L×L convolution kernels to preliminarily extract image color, contour, and texture features and increase the number of feature-map channels, obtaining n images to be trained;
1.4) the downsampling layer of the PAMNet network finally outputs an m×m image feature map X ∈ R^(C×H×W), where C denotes the number of channels and H, W the image size, with H = W = m.
3. The image super-resolution reconstruction method focusing on foreground information according to claim 2, wherein: each L×L convolution block used in step 1.3) consists of three layers: an L×L convolutional layer, an L×L BN layer, and an L×L ReLU layer; the 1st convolution yields an m×m feature map D, and the 2nd convolution yields an m×m feature map X.
4. The image super-resolution reconstruction method focusing on foreground information according to claim 2, wherein: the specific steps of step 2.1) comprise:
2.1.1) when the feature map X is input to the t-th PAM module at time t, then for the 1st PAM module, X ∈ R^(C×H×W) is fed into the Residual module for convolution, yielding the output feature map X_R ∈ R^(C×H×W);
2.1.2) feeding the output feature map X_R ∈ R^(C×H×W) simultaneously into the Channel Attention and Spatial Attention modules, which compute the channel-domain attention Y_C and the spatial-domain attention Y_S in parallel;
2.1.3) concatenating the channel-domain attention Y_C and the spatial-domain attention Y_S along the channel dimension to obtain the input G_IN ∈ R^(2C×H×W) of the gating network Gate;
2.1.4) fusing the information in the gating-network input G_IN ∈ R^(2C×H×W) with a 1×1 convolution, reducing its dimensions to (C, H, W); then performing feature extraction with two 3×3 convolutions and a Sigmoid activation to obtain the activation output σ ∈ R^(C×H×W) with values in (0, 1);
2.1.5) using the activation output σ as linear combination coefficients for Y_C and Y_S to obtain the output G_OUT ∈ R^(C×H×W);
2.1.6) continuously updating the activation output σ during back-propagation, dynamically allocating the weights of the channel-domain and spatial-domain attention through learning and concentrating on the attention domain with the higher weight to extract image foreground information;
2.1.7) adding the gating-network output G_OUT ∈ R^(C×H×W) to the initial input X_IN ∈ R^(C×H×W) of the current t-th PAM module to obtain X_OUT ∈ R^(C×H×W);
2.1.8) repeating the operations of steps 2.1.1)-2.1.7) N-2 times;
2.1.9) feeding the shallow features of all the preceding N-2 PAM modules to the end of the (N-1)-th PAM module of the feature extraction layer via skip connections and concatenating them along the channel dimension, obtaining the channel-concatenated image residual shallow feature map S ∈ R^(10C×H×W) of dimensions (10C, H, W), where H = W = m.
5. The image super-resolution reconstruction method focusing on foreground information according to claim 1, wherein: the specific steps of step 3) comprise:
3.1) upsampling the feature map S_LF obtained in step 2) by a factor of a/2 using the sub-pixel convolution (Pixel-Shuffle) method; then applying a convolution with a b×b kernel to the image matrix; activating the image matrix with a Leaky-ReLU activation function and outputting the activated image matrix S_N1;
3.2) upsampling the corresponding feature map S_LF by a factor of a/2 using bicubic interpolation to obtain an image matrix S_P1 with the same size and number of channels as S_N1; then summing S_N1 and S_P1 to obtain the image matrix S_NP1;
3.3) applying a convolution with a c×c kernel to the image matrix S_NP1 and outputting a 128-channel image matrix; activating it with a Leaky-ReLU activation function and upsampling the activated image matrix by a factor of a/2 using the Pixel-Shuffle method; then applying a convolution with a b×b kernel, activating the image matrix with a Leaky-ReLU activation function, and outputting the activated image matrix S_N2;
3.4) enlarging the image matrix S_NP1 by a factor of a/2 using bicubic interpolation and outputting an image matrix S_P2; summing S_N2 and S_P2 to obtain the image matrix S_NP2;
3.5) applying a convolution with a c×c kernel to the image matrix S_NP2 and outputting a 128-channel image matrix; then activating with a Leaky-ReLU activation function and applying a convolution with a d×d kernel to the activated image matrix to finally obtain the reconstructed image, where a, b, c, and d are nonzero natural numbers.
6. The image super-resolution reconstruction method focusing on foreground information according to claim 1, wherein: in step 4) the reconstructed image and the corresponding original image are fed into a pre-trained VGG-19 network for training, that is, a loss function formed by weighting the non-uniform joint loss L_U, the adversarial loss L_G, and the content loss L_C constrains the network to learn the color and texture features of the image while extracting more discriminative features and detail information and paying more attention to the reconstruction of the image foreground, obtaining the super-resolution image SR.
7. The image super-resolution reconstruction method focusing on foreground information according to claim 6, wherein: the loss function L of the PAMNet network is:
L = γ·L_G + λ·L_U + η·L_C
where γ, λ, and η denote the weights of the adversarial loss, the non-uniform joint loss, and the content loss, respectively;
the discriminator loss L_D is:
L_D = -E_xr[log(D(x_r, x_f))] - E_xf[log(1 - D(x_f, x_r))]
and the adversarial loss L_G is:
L_G = -E_xr[log(1 - D(x_r, x_f))] - E_xf[log(D(x_f, x_r))]
where x_r is the real image, x_f is the reconstructed image, D(x_r, x_f) computes the difference between the real image and the reconstructed image and is limited to D(x_r, x_f) ∈ (0, 1) by a Sigmoid, and E[·] denotes the mathematical expectation;
the non-uniform joint loss L_U is based on the L1 loss: the L1 loss L_VGG1 before the first pooling layer and the L1 loss L_VGG2 before the last pooling layer are computed separately, and by adjusting the weights of L_VGG1 and L_VGG2 the generator is constrained to extract low-level features while learning more detail information and discriminative features:
L_U = α·L_VGG1 + β·L_VGG2
where α is the weight of L_VGG1 and β is the weight of L_VGG2;
L_C = μ·L1(x_r, x_f) + θ·L2(x_r, x_f)
where x_r is the real image, x_f is the reconstructed image, μ and θ denote the weights of the L1 loss and the L2 loss, respectively, and L1 and L2 denote the L1 and L2 losses.
8. The image super-resolution reconstruction method of the foreground information of interest according to claim 4, wherein: said 2.1.2) calculating the channel Domain attention YCIn the process, a SENet structure is adopted, the full connection layer in the SENet is replaced by 1 × 1 convolution, the space characteristics of the image are reserved, and the specific calculation of the attention of the channel domain is as follows:
YC=X+CA(XR)
in the formula, X ϵ RC×H×WRepresenting the input of a residual block, XRϵRC×H×WRepresenting the output after computation of the residue, CA () representing the compute channel domain attention, YCϵRC×H×WIndicating channel domain attentionFinal output of force;
simultaneous parallel computation of spatial domain attention YSFirstly, three-layer cascade expansion convolution with expansion rates of 1,2 and 3 is used for calculating spatial domain attention, firstly, 1 multiplied by 1 convolution is used for dimensionality reduction, and an input feature graph with dimensionality (C, H and W) is converted into a feature graph with dimensionality (C/K, H and W), wherein K is a dimensionality reduction coefficient;
secondly, performing three times of expansion convolution with different expansion rates on the feature map after dimension reduction, and expanding the receptive field by the minimum parameter number in a limited step number, thereby ensuring the continuity of the receptive field and avoiding information loss caused by pooling;
and finally, fusing information of different channels of the feature diagram by using 1 × 1 convolution and activating by Sigmoid to obtain the feature diagram weight phi of (1, H, W) dimension, and multiplying the weight to the input feature diagram X in the (H, W) dimension distributionRϵRC×H×WThe purpose of paying attention to the foreground information of the image is achieved, and the specific calculation of the attention of the spatial domain is as follows:
YS=X+SA(XR)
in the formula, X ϵ RC×H×WRepresenting the input of a residual block, XRϵRC×H×WRepresenting the output after computation of the residual, SA () representing the computation of spatial domain attention, YSϵRC×H×WRepresenting the final output of spatial domain attention.
9. The image super-resolution reconstruction method focusing on foreground information according to claim 4, wherein in step 2.1.6) the activation output σ is continuously updated during back-propagation, and the specific calculation is:
G_OUT = (1 - σ)·Y_C + σ·Y_S
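For illustration only, a minimal sketch of this gated combination is given below, assuming σ is parameterized as a single learnable scalar kept in (0, 1) by a Sigmoid and updated by back-propagation together with the rest of the network; this scalar parameterization is an assumption, not a limitation of the claim.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Learnable gate combining the two attention outputs:
    G_OUT = (1 - sigma) * Y_C + sigma * Y_S (scalar sigma is an assumed parameterization)."""
    def __init__(self):
        super().__init__()
        self.logit = nn.Parameter(torch.zeros(1))    # updated during back-propagation

    def forward(self, y_c, y_s):
        sigma = torch.sigmoid(self.logit)            # keep sigma in (0, 1)
        return (1.0 - sigma) * y_c + sigma * y_s
```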
10. A computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, carrying out the method of any one of claims 1 to 8.
CN202210035833.6A 2022-01-13 2022-01-13 Image super-resolution reconstruction method focusing on foreground information Active CN114049261B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210035833.6A CN114049261B (en) 2022-01-13 2022-01-13 Image super-resolution reconstruction method focusing on foreground information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210035833.6A CN114049261B (en) 2022-01-13 2022-01-13 Image super-resolution reconstruction method focusing on foreground information

Publications (2)

Publication Number Publication Date
CN114049261A true CN114049261A (en) 2022-02-15
CN114049261B CN114049261B (en) 2022-04-01

Family

ID=80196532

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210035833.6A Active CN114049261B (en) 2022-01-13 2022-01-13 Image super-resolution reconstruction method focusing on foreground information

Country Status (1)

Country Link
CN (1) CN114049261B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150093015A1 (en) * 2013-09-26 2015-04-02 Hong Kong Applied Science & Technology Research Institute Company Limited Visual-Experience-Optimized Super-Resolution Frame Generator
DE102020122844A1 (en) * 2019-10-29 2021-04-29 Samsung Electronics Co., Ltd. SYSTEM AND PROCEDURE FOR DEEP MACHINE LEARNING FOR COMPUTER VISION APPLICATIONS
CN111192200A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image super-resolution reconstruction method based on fusion attention mechanism residual error network
CN112270697A (en) * 2020-10-13 2021-01-26 清华大学 Satellite sequence image moving target detection method combined with super-resolution reconstruction
CN112270646A (en) * 2020-11-05 2021-01-26 浙江传媒学院 Super-resolution enhancement method based on residual error dense jump network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
NAN MENG et al.: "High-Dimensional Dense Residual Convolutional Neural Network for Light Field Reconstruction", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 43, Issue 3, March 2021 *
谢堂鑫 et al.: "Single-image super-resolution reconstruction based on the Dirac residual module", Journal of Yunnan Minzu University (Natural Sciences Edition) *
陶状 et al.: "Image super-resolution reconstruction algorithm with a dual-path feedback network", Computer Systems & Applications *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612408A (en) * 2022-03-04 2022-06-10 拓微摹心数据科技(南京)有限公司 Heart image processing method based on federal deep learning
CN114881861A (en) * 2022-05-25 2022-08-09 厦门大学 Unbalanced image over-resolution method based on double-sampling texture perception distillation learning
CN114881861B (en) * 2022-05-25 2024-06-04 厦门大学 Unbalanced image super-division method based on double-sampling texture perception distillation learning
CN115861684A (en) * 2022-11-18 2023-03-28 百度在线网络技术(北京)有限公司 Training method of image classification model, and image classification method and device
CN115861684B (en) * 2022-11-18 2024-04-09 百度在线网络技术(北京)有限公司 Training method of image classification model, image classification method and device
CN116485652A (en) * 2023-04-26 2023-07-25 北京卫星信息工程研究所 Super-resolution reconstruction method for remote sensing image vehicle target detection
CN116485652B (en) * 2023-04-26 2024-03-01 北京卫星信息工程研究所 Super-resolution reconstruction method for remote sensing image vehicle target detection
CN116645716A (en) * 2023-05-31 2023-08-25 南京林业大学 Expression Recognition Method Based on Local Features and Global Features
CN116645716B (en) * 2023-05-31 2024-01-19 南京林业大学 Expression recognition method based on local features and global features
CN117078516A (en) * 2023-08-11 2023-11-17 济宁安泰矿山设备制造有限公司 Mine image super-resolution reconstruction method based on residual mixed attention
CN117078516B (en) * 2023-08-11 2024-03-12 济宁安泰矿山设备制造有限公司 Mine image super-resolution reconstruction method based on residual mixed attention

Also Published As

Publication number Publication date
CN114049261B (en) 2022-04-01

Similar Documents

Publication Publication Date Title
CN114049261B (en) Image super-resolution reconstruction method focusing on foreground information
Mo et al. Fake faces identification via convolutional neural network
CN110415170B (en) Image super-resolution method based on multi-scale attention convolution neural network
CN111275618B (en) Depth map super-resolution reconstruction network construction method based on double-branch perception
CN109509152B (en) Image super-resolution reconstruction method for generating countermeasure network based on feature fusion
CN111047515B (en) Attention mechanism-based cavity convolutional neural network image super-resolution reconstruction method
CN111754438B (en) Underwater image restoration model based on multi-branch gating fusion and restoration method thereof
CN110992275A (en) Refined single image rain removing method based on generation countermeasure network
CN109146944B (en) Visual depth estimation method based on depth separable convolutional neural network
CN113362223A (en) Image super-resolution reconstruction method based on attention mechanism and two-channel network
CN110675321A (en) Super-resolution image reconstruction method based on progressive depth residual error network
CN113962893A (en) Face image restoration method based on multi-scale local self-attention generation countermeasure network
CN112489164B (en) Image coloring method based on improved depth separable convolutional neural network
Chen et al. Single image super-resolution using deep CNN with dense skip connections and inception-resnet
Luo et al. Lattice network for lightweight image restoration
CN111986108A (en) Complex sea-air scene image defogging method based on generation countermeasure network
CN111833261A (en) Image super-resolution restoration method for generating countermeasure network based on attention
CN113284100A (en) Image quality evaluation method based on recovery image to mixed domain attention mechanism
CN116309648A (en) Medical image segmentation model construction method based on multi-attention fusion
CN111861884A (en) Satellite cloud image super-resolution reconstruction method based on deep learning
CN115546032A (en) Single-frame image super-resolution method based on feature fusion and attention mechanism
CN116188274A (en) Image super-resolution reconstruction method
CN115526779A (en) Infrared image super-resolution reconstruction method based on dynamic attention mechanism
CN111260585A (en) Image recovery method based on similar convex set projection algorithm
CN113627487B (en) Super-resolution reconstruction method based on deep attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant