CN113627487A - Super-resolution reconstruction method based on deep attention mechanism - Google Patents

Super-resolution reconstruction method based on deep attention mechanism

Info

Publication number
CN113627487A
CN113627487A
Authority
CN
China
Prior art keywords
feature map
resolution
deep
image
attention mechanism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110790131.4A
Other languages
Chinese (zh)
Other versions
CN113627487B (en)
Inventor
Liu Jing (刘晶)
Yang Hui (杨慧)
Xue Yuxin (薛雨馨)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology
Priority to CN202110790131.4A
Publication of CN113627487A
Application granted
Publication of CN113627487B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4046 Scaling the whole image or part thereof using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a super-resolution reconstruction method based on a deep attention mechanism, which specifically comprises the following steps: step 1, acquiring a low-resolution LR image; step 2, inputting the low-resolution LR image into a deep attention mechanism network to obtain a shallow feature map; step 3, inputting the shallow feature map from step 2 into a deep feature extraction module to obtain a deep feature map, cascading the shallow feature map and the deep feature map to obtain a cascade feature map, assigning weights to the cascade feature map, and reducing the dimensionality of the feature map to obtain a dimension-reduced feature map; step 4, adding the dimension-reduced feature map obtained in step 3 to the LR image obtained in step 1 and learning a feature residual to obtain a global feature map; and step 5, inputting the global feature map obtained in step 4 into an up-sampling module, magnifying the low-resolution feature map to the output scale, and finally performing super-resolution reconstruction of the image in a reconstruction module.

Description

Super-resolution reconstruction method based on deep attention mechanism
Technical Field
The invention belongs to the technical field of image processing methods, and relates to a super-resolution reconstruction method based on a deep attention mechanism.
Background
The concept of an attention mechanism can be explained with everyday examples: when a person reads a book, they attend to the text rather than the blank areas of the page, and when a page contains a color illustration, their gaze is drawn to it. The human visual system tends to focus on the information in an image that assists judgment and to ignore irrelevant information. In computer vision tasks, the attention mechanism likewise ensures that a deep network learns the relatively important information while ignoring the irrelevant. A Google team first applied attention to an image classification task in 2014. In the same year, Bahdanau et al. applied their attention model to the machine translation task. Attention mechanisms are now widely applied in many fields: Xu et al. applied attention to the image captioning task in 2015 and proposed the concepts of hard and soft attention; Hu et al. applied an attention mechanism to a target detection task in 2017, improving the recognition performance of the model. Attention mechanisms can be divided into hard and soft attention mechanisms.
The hard attention mechanism (hard attention) is a 0/1 problem: it preserves the important features in the input information and discards those that are unimportant. For example, in some fine-grained object classification tasks, hard attention can locate the key information of the image input and take it as the next input. Hard attention effectively reduces the computation of the network model: by selecting only the feature information useful for the classification result, it markedly improves the computational efficiency and classification accuracy of the network.
The soft attention mechanism (soft attention) deals with a continuous distribution over [0,1]: each region of interest is weighted between 0 and 1 according to its degree of interest. Because the weighting is differentiable, its training can be attached directly to the neural network, and the more important channel regions receive more attention; every attended region contributes to the result, at the cost of more computation than a hard attention mechanism.
According to the difference of the focus of the soft attention mechanism, the soft attention mechanism is generally divided into a channel attention mechanism and a space attention mechanism.
Disclosure of Invention
The invention aims to provide a super-resolution reconstruction method based on a deep attention mechanism.
The invention adopts the technical scheme that a super-resolution reconstruction method based on a deep attention mechanism specifically comprises the following steps:
step 1, acquiring a low-resolution LR image;
step 2, inputting the low-resolution LR image in the step 1 into a deep attention mechanism network, and extracting the low-resolution LR image through a shallow feature extraction module to obtain a shallow feature map;
step 3, inputting the low-resolution LR image in the step 1 into a deep attention mechanism network, and performing deep feature extraction through a deep feature extraction module to obtain a deep feature map; cascading the shallow feature map and the deep feature map to obtain a cascading feature map, performing weight distribution on the cascading feature map, and reducing the dimension of the feature map to obtain a dimension-reduced feature map;
step 4, adding the dimension reduction feature map obtained in the step 3 and the LR image obtained in the step 1, and learning a feature residual error to obtain a global feature map;
and 5, inputting the global feature map obtained in the step 4 into an up-sampling module, amplifying the low-resolution feature map to an output scale, and finally performing super-resolution reconstruction on the image in a reconstruction module.
The invention is also characterized in that:
the specific operation of the step 1 is as follows:
step 1.1, respectively downloading Set5, Set14, BSD100, URBAN100 and MANGA109 data sets on the network;
and step 1.2, performing 4× downsampling preprocessing on each data set in step 1.1 to obtain the low-resolution LR images corresponding to each data set.
The step 2 comprises the following specific steps:
inputting the low-resolution LR image from step 1 into the deep attention mechanism network, where the shallow feature extraction module transforms the input low-resolution image into feature-map space through two convolutional layers with ReLU activation functions; the shallow feature extraction process is given by equations (1) and (2):

F_{-1} = H_{SFEB}(I_{LR})   (1);

F_0 = H_{SFEB}(F_{-1})   (2);

where I_{LR} denotes the low-resolution image, H_{SFEB}(\cdot) denotes the shallow feature extraction operation, F_{-1} denotes the result of the first shallow feature extraction, and F_0 the result of the second.
The specific process of the step 3 is as follows:
step 3.1, performing depth feature extraction on the feature map by utilizing a plurality of DFEB modules containing a channel attention mechanism;
step 3.2, cascading the shallow feature maps F_{-1}, F_0 extracted by the shallow feature extraction module and the deep feature maps F_1, \ldots, F_d, \ldots, F_D extracted by the deep feature extraction module to obtain a cascade feature map, denoted F_{CON}, as expressed in equation (3):

F_{CON} = [F_{-1}, F_0, F_1, \ldots, F_d, \ldots, F_D]   (3);

and step 3.3, using the DAB module to assign weights to the cascade feature map along the depth dimension of the network model, and reducing the dimensionality of the cascade feature map to obtain a dimension-reduced feature map.
The specific process of the step 4 is as follows: the global feature map is obtained using equation (4):

F_{GF} = F_{DF} + I_{LR}   (4);

where I_{LR} denotes the low-resolution image LR, F_{DF} denotes the weighted feature map after dimensionality reduction, and F_{GF} denotes the global feature map.
The specific process of the step 5 is as follows:
step 5.1, inputting the global feature map F_{GF} obtained in step 4 into the up-sampling module, and magnifying the low-resolution feature image to the output scale by deconvolution;
and 5.2, performing super-resolution reconstruction on the image obtained in the step 5.1 by adopting a convolution layer in a reconstruction module to finally obtain a reconstructed high-resolution picture.
Compared with traditional attention-based super-resolution reconstruction methods, the deep attention module (DAB) proposed by the method performs weight optimization across the shallow and deep parts of the network. The traditional attention mechanism optimizes only channel-dimension weights and performs no weight optimization between shallow and deep layers. The method converges faster during training and, at the same number of iterations, improves the PSNR and SSIM evaluation indexes compared with a method without the DAB module.
Drawings
FIG. 1 is a network structure diagram of a super-resolution method based on a deep attention mechanism according to the present invention;
FIG. 2 is a block diagram of a deep layer feature extraction module (DFEB) of a super-resolution method network structure diagram based on a deep layer attention mechanism according to the present invention;
FIG. 3 is a diagram of a CA module in a deep layer feature extraction module (DFEB) module in a network structure of a super-resolution method based on a deep layer attention mechanism according to the present invention;
FIG. 4 is a DAB module structure diagram of a super-resolution method network structure diagram based on a deep attention mechanism of the present invention;
FIGS. 5(a), (b) are graphs comparing the results obtained by using the super-resolution method based on the deep attention mechanism of the present invention;
fig. 6(a) and (b) are experimental results of validity verification of the DAB module by using the super-resolution method based on the deep attention mechanism of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention relates to a super-resolution method based on a deep attention mechanism, as shown in FIG. 1. I_{LR} denotes the input low-resolution picture; SFEB denotes the shallow feature extraction module, and F_{-1}, F_0 are the shallow feature maps it produces. DFEB denotes the deep feature extraction module, and F_1 to F_D are the deep feature maps it produces. Concat denotes the cascade operation: the shallow feature maps F_{-1}, F_0 and the deep feature maps F_1, \ldots, F_d, \ldots, F_D are cascaded to obtain the cascade feature map F_{CON}. The DAB (Deep Attention Block) module then assigns new weights to these feature maps, establishing an attention mechanism in the depth dimension and yielding the weighted feature map F_{DAB}. Two convolution operations follow: the first performs a dimensionality-reduction operation on the weighted feature map from the DAB module, reducing it to 64 dimensions, and the second reduces the feature map to 3 channels, giving the feature map F_{DF}. The learned feature residual is added to LR to obtain the global feature map F_{GF}, which is then input to the up-sampling module; the up-sampling module magnifies the low-resolution feature image to the output scale by deconvolution. Finally, one convolutional layer in the reconstruction module performs super-resolution reconstruction of the image to obtain the high-resolution picture I_{HR}.
The method is implemented according to the following steps:
step 1, acquiring a low-resolution LR image;
step 1.1, Set5, Set14, BSD100, URBAN100, MANGA109 data sets are downloaded over the network.
And step 1.2, performing 4× downsampling preprocessing on each data set in step 1.1 to obtain the low-resolution LR pictures corresponding to each data set.
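By way of illustration, a small sketch of the 4× downsampling preprocessing is given below. The patent states the factor but not the resampling kernel, so bicubic interpolation is an assumption (it is the common choice for these benchmarks); the directory layout and function name are hypothetical.

```python
from pathlib import Path
from PIL import Image

def make_lr_images(hr_dir, lr_dir, scale=4):
    """Downsample every HR image in hr_dir by `scale` and save the LR copy."""
    Path(lr_dir).mkdir(parents=True, exist_ok=True)
    for hr_path in Path(hr_dir).glob("*.png"):
        hr = Image.open(hr_path).convert("RGB")
        w, h = hr.size
        lr = hr.resize((w // scale, h // scale), Image.BICUBIC)  # 4x downsampling
        lr.save(Path(lr_dir) / hr_path.name)

# usage with a hypothetical layout: make_lr_images("Set5/HR", "Set5/LR_x4")
```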
Step 2, inputting the low-resolution LR image in the step 1 into a deep attention mechanism network, and extracting the low-resolution LR image through a shallow feature extraction module to obtain a shallow feature map; the method specifically comprises the following steps:
and (3) inputting the low-resolution LR image in the step (1) into a deep attention mechanism network, wherein the deep attention mechanism network can be divided into a shallow feature extraction module, a deep feature extraction module, an up-sampling module and a reconstruction module. The low-resolution LR image is first processed by a Shallow Feature Extraction Block (SFEB), and the low-resolution image of the input network is transformed into a Feature map space by two convolutional layers with ReLU activation functions, where the number of channels of the transformed Feature map is 64. The shallow feature extraction module is mainly responsible for extracting low-frequency information of the image and transmitting the low-frequency information on the network. Inputting the low-resolution LR image into a shallow feature extraction module (SFEB) to obtain a shallow feature map F-1,F0The formula is expressed by the following mathematical formula (1) and (2):
F-1=HSFEB(ILR) (1);
F0=HSFEB(F-1) (2);
in the formula ILRRepresenting a low resolution image, HSFEB() Representing a latent layer feature extraction operation, F-1,F0Representing the resulting shallow feature map after shallow feature extraction.
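A minimal PyTorch sketch of the SFEB of equations (1) and (2) follows, assuming a 3-channel RGB input and 3×3 kernels (the text fixes the 64-channel feature space but not the kernel size); layer names are illustrative, not from the patent.

```python
import torch
import torch.nn as nn

class SFEB(nn.Module):
    """Shallow feature extraction: two conv+ReLU layers (eqs. (1)-(2))."""
    def __init__(self, in_channels=3, channels=64):
        super().__init__()
        self.conv1 = nn.Sequential(  # H_SFEB applied to I_LR, producing F_-1
            nn.Conv2d(in_channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True))
        self.conv2 = nn.Sequential(  # H_SFEB applied to F_-1, producing F_0
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True))

    def forward(self, i_lr):
        f_m1 = self.conv1(i_lr)   # F_-1
        f_0 = self.conv2(f_m1)    # F_0
        return f_m1, f_0

# usage: f_m1, f_0 = SFEB()(torch.randn(1, 3, 48, 48))
```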
Step 3, inputting the shallow feature map in the step 2 into a deep feature extraction module, performing depth feature extraction on the shallow feature map to obtain a deep feature map, cascading the shallow feature map and the deep feature map to obtain a cascading feature map, performing weight distribution on the cascading feature map, and reducing the dimension of the feature map to obtain a dimension-reduced feature map;
and 3.1, performing deep feature extraction on the feature map by using a plurality of DFEB modules containing a channel attention mechanism, wherein a sub-module DFEB structure of the deep extraction module is shown in figure 2. Upper layer characteristic diagram Fd-1Inputting the DFEB module, wherein the CA module is a channel attention mechanism, and endowing different weights to the characteristic diagram of the previous layer through the channel attention mechanism; training residual components of the CA module through 4 layers of convolution and a ReLU activation function; finally, the output of the CA module, namely the characteristic graphs with different weights, and the residual error of the CA moduleThe components are added to obtain the output Fd of the DFEB block. Wherein d represents the d-th DFEB module. The mathematical expression of a DFEB module is as in equation (3):
Fd=HDFEB,d(Fd-1)
=HDFEB,d(HDFEB,d-1(…(HDFEB,1(F0))…)) (3);
in the formula, Fd-1Is the profile output of the d-1 level DFEB module, HDFEBRepresenting a deep feature extraction operation, d representing the d-th DFEB module, F0Feature map output, F, for a layer 0 DFEB ModuledAnd (3) representing the characteristic diagram output of the d-th layer DFEB module.
Step 3.1.1, as shown in FIG. 3, the structure diagram of the CA module is a schematic diagram of the channel attention module. N denotes a feature map of size h' × w' × c'; after the corresponding convolution operation, a feature map U of size h × w × c is obtained. F_{sq}, the Squeeze operation, performs global average pooling on the feature map U to obtain a feature map of size 1 × 1 × c. F_{ex}, the Excitation operation, uses a fully connected network to perform a nonlinear transformation on the result of the Squeeze. F_{scale} takes the result of the Excitation as weights and multiplies them onto the input features of the corresponding channels, i.e. channel weighting; finally, the feature map weighted per channel is obtained.

The specific operation of global average pooling is as follows: average pooling over the global range is performed on the feature map of each channel, converting a two-dimensional feature map into a real number, denoted z_c, as expressed in equation (4):

z_c = F_{sq}(u_c) = \frac{1}{h \times w} \sum_{i=1}^{h} \sum_{j=1}^{w} u_c(i,j)   (4);

where h and w denote the height and width of the input feature map, u_c denotes the feature map of channel c, F_{sq} denotes the global average pooling operation on the feature map U, and u_c(i,j) denotes the pixel in row i, column j.
The specific operation of the Excitation is as follows: two fully connected layers generate a different excitation for each real number. W_1 denotes the first fully connected operation, which reduces the number of channels from C_2 dimensions to C_2/r dimensions; δ denotes the ReLU activation layer, where r is a hyperparameter set to 16. W_2 denotes the second fully connected operation, which restores the number of channels to C_2 dimensions. Finally, a Sigmoid function mapping yields S, as expressed in equation (5):

S = F_{ex}(z, W) = \sigma(g(z, W)) = \sigma(W_2\,\delta(W_1 z))   (5);

where W_1, W_2 denote the first and second fully connected operations respectively, δ denotes the ReLU activation layer, σ denotes the Sigmoid function mapping, z denotes the pooled real numbers, and S denotes the learned weights.
The specific operation of F_{scale} is as follows: the weights S output in (5) are applied per channel, achieving the purpose of the channel attention mechanism, as expressed in equation (6):

x_c = F_{scale}(u_c, s_c) = s_c \cdot u_c   (6);

where u_c denotes the feature map of channel c (of a feature map with C_2 channels), and s_c denotes the learned weight for channel c, a component of the weight vector S.
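A minimal PyTorch sketch of the CA module of equations (4)–(6) and the DFEB block of step 3.1 follows, assuming 64 channels and 3×3 kernels. The exact wiring of FIG. 2 — whether the 4-layer residual branch takes the CA output or the block input — is not fixed by the text, so the wiring below is an assumption, as are all layer names.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CA module: Squeeze (eq. 4), Excitation (eq. 5), Scale (eq. 6)."""
    def __init__(self, channels=64, r=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)   # F_sq: global average pooling
        self.excite = nn.Sequential(             # F_ex: sigma(W_2 delta(W_1 z))
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
            nn.Sigmoid())

    def forward(self, u):
        b, c, _, _ = u.shape
        s = self.excite(self.squeeze(u).view(b, c))  # weights S
        return u * s.view(b, c, 1, 1)                # F_scale: channel weighting

class DFEB(nn.Module):
    """Deep feature extraction block: CA output plus a 4-conv residual branch."""
    def __init__(self, channels=64):
        super().__init__()
        self.ca = ChannelAttention(channels)
        layers = []
        for _ in range(4):                           # 4 conv layers with ReLU
            layers += [nn.Conv2d(channels, channels, 3, padding=1),
                       nn.ReLU(inplace=True)]
        self.residual = nn.Sequential(*layers)

    def forward(self, f_prev):
        weighted = self.ca(f_prev)                   # differently weighted maps
        return weighted + self.residual(weighted)    # F_d (assumed wiring)

# usage: f_d = DFEB()(torch.randn(1, 64, 48, 48))
```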
Step 3.2, after shallow and deep feature extraction, several groups of feature maps are obtained: the shallow feature maps F_{-1}, F_0 and the deep feature maps F_1, \ldots, F_d, \ldots, F_D. These feature maps are cascaded together to obtain the cascade feature map, denoted F_{CON}, as in equation (7):

F_{CON} = [F_{-1}, F_0, F_1, \ldots, F_d, \ldots, F_D]   (7);
step 3.3, under the initiation of an Attention mechanism, utilizing the depth dimension of the proposed DAB (deep Attention Block) module in a network model to carry out cascade connection on a plurality of groupsThe deep profile is given a weight assignment. The cascade feature maps are the image feature information extracted from different depth convolution layers, and when they have the same weight in the depth direction, we establish the Attention mechanism in the depth dimension by giving new weight to the feature maps through a DAB (deep Attention Block) module, wherein the DAB module is schematically shown in FIG. 4. Through FsqI.e. the Squeeze operation vs. channel number C1 cascade feature diagram FCONCarrying out average pooling to obtain a characteristic diagram Z consisting of c1 real numbers; then obtaining a weight S through two full-connection layers of W1 and W2; through FscaleMultiplying the weight S to be learned by the cascade characteristic diagram to obtain the final output F of the depth dimension DAB moduleDAB
Pixel-by-pixel loss is combined with perceptual loss as the objective function for optimizing the network. The pixel-by-pixel loss function is the mean squared error (MSE) loss function, the mean of the squared differences between predicted and true values; it is minimized when the prediction equals the truth, and its value grows steeply as the error increases. With the success of image style-transfer tasks, it was found that the feature maps produced inside a convolutional neural network can be used as part of the target loss. The perceptual loss function optimizes the difference between the feature map of the generated image and the feature map of the ground-truth image passed through the same network, which makes the generated image semantically closer to the ground truth. The significance of the perceptual loss is this: if a predicted image and a ground-truth image look alike but are offset from each other by one row of pixels, the pixel-by-pixel loss reports a large error that does not reflect the true reconstruction quality and slows network convergence; perceptual loss was proposed as part of the objective function against this background, helping reconstruction obtain better predicted images. The loss function of the method is equation (8):

L = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 + \frac{1}{C_j H_j W_j}\left\lVert V_{DAB}(\hat{Y}) - V_{DAB}(Y)\right\rVert_2^2   (8);

where y_i denotes the predicted value, \hat{y}_i the true value, and n the dimensionality of the data; V_{DAB}(\hat{Y}) denotes the predicted feature map and V_{DAB}(Y) the ground-truth feature map output by the DAB module; and C_j, H_j, W_j are the shape of the output feature values. The MSE term has a smooth, everywhere-differentiable curve, convenient for forward and backward propagation of the network. The first term is the pixel-by-pixel loss, the second the perceptual loss between the predicted and ground-truth feature maps output by the DAB module; optimizing the sum of the two continuously updates the network structure until training is complete.
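A hedged PyTorch sketch of this combined objective follows. The text leaves open exactly how V_{DAB} is evaluated on the two images, so `dab_features` below stands in for whatever callable produces the DAB-output feature map; its name and the batched-MSE normalization are assumptions.

```python
import torch.nn.functional as F

def combined_loss(sr, hr, dab_features):
    """Pixel-wise MSE plus a perceptual term on DAB feature maps (sketch of eq. (8)).

    sr, hr:       predicted and ground-truth image batches, shape (B, 3, H, W)
    dab_features: callable mapping an image batch to its DAB-output feature map
    """
    pixel_loss = F.mse_loss(sr, hr)        # first term: pixel-by-pixel MSE
    f_sr = dab_features(sr)                # V_DAB(Y_hat), predicted features
    f_hr = dab_features(hr)                # V_DAB(Y), ground-truth features
    perceptual = F.mse_loss(f_sr, f_hr)    # second term, normalized over C_j*H_j*W_j
    return pixel_loss + perceptual
```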
Step 3.3.1, the cascade feature map F_{CON} with channel number C_1 is average-pooled; the result Z is a feature map consisting of C_1 real numbers, as expressed in equation (9):

z_c = F_{sq}(u_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i,j)   (9);

where H and W denote the height and width of the input feature map, u_c denotes channel c of the feature map with C_1 channels, F_{sq} denotes the global average pooling operation on the feature map U, and u_c(i,j) denotes the pixel in row i, column j.
Step 3.3.2, new weights for the C_1 channels are generated by two fully connected layers: W_1 denotes the first fully connected layer and δ the ReLU activation function; the C_1-dimensional feature map is nonlinearly mapped to C_2 dimensions. W_2 denotes the second fully connected layer, which maps the C_2-dimensional feature map back to C_1 dimensions; σ denotes the sigmoid activation function. The result is denoted S, as in equation (10):

S = F_{ex}(z, W) = \sigma(g(z, W)) = \sigma(W_2\,\delta(W_1 z))   (10);

where W_1, W_2 denote the first and second fully connected operations respectively, δ denotes the ReLU activation layer, σ denotes the Sigmoid function mapping, and z denotes the real numbers from step 3.3.1. F_{ex}, the Excitation operation, uses a fully connected network to perform a nonlinear transformation on the result of the Squeeze.
Step 3.3.3, multiplying the learned weights onto the cascade feature map gives the final output F_{DAB} of the depth-dimension DAB module, as in equation (11):

F_{DAB} = F_{scale}(u_c, s) = u_c \cdot s   (11);

where u_c denotes the feature map with C_1 channels and s denotes the weights learned in step 3.3.2.
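A minimal PyTorch sketch of steps 3.3.1–3.3.3 follows, treating the DAB as a squeeze-and-excitation applied over the C_1 channels of the cascade feature map, as the text describes; the reduced dimension C_2 and the example channel counts are assumptions (the patent does not fix their values here).

```python
import torch
import torch.nn as nn

class DeepAttentionBlock(nn.Module):
    """DAB: depth-dimension attention over the cascade feature map F_CON."""
    def __init__(self, c1, c2):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)  # eq. (9): C1 real numbers Z
        self.w1 = nn.Linear(c1, c2)             # first fully connected layer W_1
        self.w2 = nn.Linear(c2, c1)             # second fully connected layer W_2

    def forward(self, f_con):
        b, c, _, _ = f_con.shape
        z = self.squeeze(f_con).view(b, c)
        s = torch.sigmoid(self.w2(torch.relu(self.w1(z))))  # eq. (10): weights S
        return f_con * s.view(b, c, 1, 1)                   # eq. (11): F_DAB

# usage sketch, assuming D = 8 DFEB maps plus F_-1 and F_0, each 64 channels:
# f_con = torch.cat([f_m1, f_0, *deep_maps], dim=1)         # eq. (7), C1 = 64 * 10
# f_dab = DeepAttentionBlock(c1=640, c2=64)(f_con)
```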
Step 4, adding the dimension reduction feature map obtained in the step 3 and the LR image obtained in the step 1, and learning a feature residual error to obtain a global feature map;
and 4.1, after the network model is processed by the DAB module, giving different importance degrees to the information extracted by the networks with different depths. And (4) performing dimensionality reduction operation on the weighted feature map obtained by the DAB module to reduce the feature map to 64 dimensions.
Step 4.2, performing a layer of convolution operation on the 64-dimensional feature map obtained in the step 4.1, reducing the dimension of the feature map to three channels, adding the learned feature residual error to LR to obtain a global feature map FGFThe formula can be expressed as formula (12):
FGF=FDF+ILR
(12);
in the formula ILRRepresenting a low resolution picture, FDFFeature map representing 3 channels, FGFRepresenting a global feature map.
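A small sketch of step 4 in PyTorch under the same assumptions as above; the kernel sizes (1×1 for the first reduction, 3×3 for the second) and the example 640-channel cascade width are assumptions, not stated in the patent.

```python
import torch
import torch.nn as nn

# dimensionality reduction after the DAB (steps 4.1 and 4.2)
reduce_to_64 = nn.Conv2d(640, 64, kernel_size=1)           # F_DAB -> 64 channels
reduce_to_3 = nn.Conv2d(64, 3, kernel_size=3, padding=1)   # 64 -> 3 channels, F_DF

f_dab = torch.randn(1, 640, 48, 48)   # weighted cascade map (illustrative shape)
i_lr = torch.randn(1, 3, 48, 48)      # low-resolution input

f_df = reduce_to_3(reduce_to_64(f_dab))
f_gf = f_df + i_lr                    # eq. (12): global residual learning
```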
And 5, inputting the global feature map obtained in the step 4 into an up-sampling module, amplifying the low-resolution feature map to an output scale, and finally performing super-resolution reconstruction on the image in a reconstruction module.
Step 5.1, the global feature map F_{GF} obtained in step 4.2 is input into the up-sampling module, which magnifies the low-resolution feature image to the output scale by deconvolution.
And step 5.2, a convolutional layer in the reconstruction module performs super-resolution reconstruction on the image obtained in step 5.1; a comparison of reconstruction results is shown in FIG. 5, where FIG. 5(a) is the low-resolution image and FIG. 5(b) is the high-resolution image.
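A sketch of step 5 in PyTorch, assuming a single 4× transposed convolution for the deconvolution (kernel 8, stride 4, padding 2 — an assumption consistent with the 4× factor of step 1.2) followed by the one reconstruction convolution:

```python
import torch
import torch.nn as nn

upsample = nn.ConvTranspose2d(3, 64, kernel_size=8, stride=4, padding=2)  # deconvolution, x4
reconstruct = nn.Conv2d(64, 3, kernel_size=3, padding=1)                  # final conv layer

f_gf = torch.randn(1, 3, 48, 48)      # global feature map from step 4
i_hr = reconstruct(upsample(f_gf))    # reconstructed high-resolution image
print(i_hr.shape)                     # torch.Size([1, 3, 192, 192])
```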
Verifying the validity of the DAB module:
The method proposes a deep attention module (DAB); the traditional attention mechanism optimizes only channel-dimension weights and performs no weight optimization across the shallow and deep parts of the network. The DAB effectiveness experiment compares the method with an algorithm of the same network structure but without the DAB module; the results are shown in FIG. 6, where FIG. 6(a) is the PSNR curve and FIG. 6(b) is the SSIM curve. The experimental results show that the algorithm containing the DAB module converges faster during training and, at the same number of iterations, improves the PSNR and SSIM evaluation indexes compared with the algorithm without the DAB module.

Claims (6)

1. A super-resolution reconstruction method based on a deep attention mechanism, characterized in that the method specifically comprises the following steps:
step 1, acquiring a low-resolution LR image;
step 2, inputting the low-resolution LR image in the step 1 into a deep attention mechanism network, and extracting the low-resolution LR image through a shallow feature extraction module to obtain a shallow feature map;
step 3, inputting the low-resolution LR image in the step 1 into a deep attention mechanism network, and performing deep feature extraction through a deep feature extraction module to obtain a deep feature map; cascading the shallow feature map and the deep feature map to obtain a cascading feature map, performing weight distribution on the cascading feature map, and reducing the dimension of the feature map to obtain a dimension-reduced feature map;
step 4, adding the dimension reduction feature map obtained in the step 3 and the LR image obtained in the step 1, and learning a feature residual error to obtain a global feature map;
and 5, inputting the global feature map obtained in the step 4 into an up-sampling module, amplifying the low-resolution feature map to an output scale, and finally performing super-resolution reconstruction on the image in a reconstruction module.
2. The super-resolution reconstruction method based on the deep attention mechanism as claimed in claim 1, wherein: the specific operation of the step 1 is as follows:
step 1.1, respectively downloading Set5, Set14, BSD100, URBAN100 and MANGA109 data sets on the network;
and step 1.2, performing 4× downsampling preprocessing on each data set in step 1.1 to obtain the low-resolution LR images corresponding to each data set.
3. The super-resolution reconstruction method based on the deep attention mechanism as claimed in claim 1, wherein: the step 2 comprises the following specific steps:
inputting the low-resolution LR image from step 1 into the deep attention mechanism network, where the shallow feature extraction module transforms the input low-resolution image into feature-map space through two convolutional layers with ReLU activation functions; the shallow feature extraction process is given by equations (1) and (2):

F_{-1} = H_{SFEB}(I_{LR})   (1);

F_0 = H_{SFEB}(F_{-1})   (2);

where I_{LR} denotes the low-resolution image, H_{SFEB}(\cdot) denotes the shallow feature extraction operation, F_{-1} denotes the result of the first shallow feature extraction, and F_0 the result of the second.
4. The super-resolution reconstruction method based on the deep attention mechanism as claimed in claim 1, wherein: the specific process of the step 3 is as follows:
step 3.1, performing depth feature extraction on the feature map by utilizing a plurality of DFEB modules containing a channel attention mechanism;
step 3.2, cascading the shallow feature maps F_{-1}, F_0 extracted by the shallow feature extraction module and the deep feature maps F_1, \ldots, F_d, \ldots, F_D extracted by the deep feature extraction module to obtain a cascade feature map, denoted F_{CON}, as expressed in equation (3):

F_{CON} = [F_{-1}, F_0, F_1, \ldots, F_d, \ldots, F_D]   (3);
and step 3.3, based on the attention mechanism, using the DAB module to assign weights to the cascade feature map along the depth dimension of the network model, and reducing the dimensionality of the cascade feature map to obtain a dimension-reduced feature map.
5. The deep attention mechanism-based super-resolution reconstruction method according to claim 4, wherein: the specific process of the step 4 is as follows: obtaining a global feature map by using the following formula (4):
F_{GF} = F_{DF} + I_{LR}   (4);

where I_{LR} denotes the low-resolution image LR, F_{DF} denotes the weighted feature map after dimensionality reduction, and F_{GF} denotes the global feature map.
6. The deep attention mechanism-based super-resolution reconstruction method according to claim 5, wherein: the specific process of the step 5 is as follows:
step 5.1, inputting the global feature map F_{GF} obtained in step 4 into the up-sampling module, and magnifying the low-resolution feature image to the output scale by deconvolution;
and 5.2, performing super-resolution reconstruction on the image obtained in the step 5.1 by adopting a convolution layer in a reconstruction module to finally obtain a reconstructed high-resolution picture.
CN202110790131.4A 2021-07-13 2021-07-13 Super-resolution reconstruction method based on deep attention mechanism Active CN113627487B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110790131.4A CN113627487B (en) 2021-07-13 2021-07-13 Super-resolution reconstruction method based on deep attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110790131.4A CN113627487B (en) 2021-07-13 2021-07-13 Super-resolution reconstruction method based on deep attention mechanism

Publications (2)

Publication Number Publication Date
CN113627487A true CN113627487A (en) 2021-11-09
CN113627487B CN113627487B (en) 2023-09-05

Family

ID=78379709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110790131.4A Active CN113627487B (en) 2021-07-13 2021-07-13 Super-resolution reconstruction method based on deep attention mechanism

Country Status (1)

Country Link
CN (1) CN113627487B (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200065646A1 (en) * 2018-08-23 2020-02-27 Samsung Electronics Co., Ltd. Method and device with convolution neural network processing
CN110717856A (en) * 2019-09-03 2020-01-21 天津大学 Super-resolution reconstruction algorithm for medical imaging
AU2020100200A4 (en) * 2020-02-08 2020-06-11 Huang, Shuying DR Content-guide Residual Network for Image Super-Resolution
CN112329912A (en) * 2020-10-21 2021-02-05 广州工程技术职业学院 Convolutional neural network training method, image reconstruction method, device and medium
CN112330542A (en) * 2020-11-18 2021-02-05 重庆邮电大学 Image reconstruction system and method based on CRCSAN network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Deng Mengdi; Jia Ruisheng; Tian Yu; Liu Qingming: "Super-resolution reconstruction of seismic profile images based on deep learning", Computer Engineering and Design, no. 08 *
Lei Pengcheng; Liu Cong; Tang Jiangang; Peng Dunlu: "Image super-resolution reconstruction via hierarchical feature fusion attention network", Journal of Image and Graphics, no. 09 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116402682A (en) * 2023-03-29 2023-07-07 Liaoning University of Technology Image reconstruction method and system based on differential value dense residual super-resolution
CN116402682B (en) * 2023-03-29 2024-02-09 Liaoning University of Technology Image reconstruction method and system based on differential value dense residual super-resolution
CN117173025A (en) * 2023-11-01 2023-12-05 Huaqiao University Single-frame image super-resolution method and system based on cross-layer mixed attention Transformer
CN117173025B (en) * 2023-11-01 2024-03-01 Huaqiao University Single-frame image super-resolution method and system based on cross-layer mixed attention Transformer

Also Published As

Publication number Publication date
CN113627487B (en) 2023-09-05

Similar Documents

Publication Publication Date Title
CN113240580B (en) Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
CN112651973B (en) Semantic segmentation method based on cascade of feature pyramid attention and mixed attention
CN110020989B (en) Depth image super-resolution reconstruction method based on deep learning
CN111275618B (en) Depth map super-resolution reconstruction network construction method based on double-branch perception
CN110119780B Hyper-spectral image super-resolution reconstruction method based on generative adversarial network
CN111242288B (en) Multi-scale parallel deep neural network model construction method for lesion image segmentation
Wen et al. Image recovery via transform learning and low-rank modeling: The power of complementary regularizers
CN113627487B (en) Super-resolution reconstruction method based on deep attention mechanism
CN114972746B (en) Medical image segmentation method based on multi-resolution overlapping attention mechanism
CN112991350B (en) RGB-T image semantic segmentation method based on modal difference reduction
CN113554032B (en) Remote sensing image segmentation method based on multi-path parallel network of high perception
CN113240683B (en) Attention mechanism-based lightweight semantic segmentation model construction method
CN114862731B (en) Multi-hyperspectral image fusion method guided by low-rank priori and spatial spectrum information
CN111986085A (en) Image super-resolution method based on depth feedback attention network system
Gendy et al. Lightweight image super-resolution based on deep learning: State-of-the-art and future directions
CN115641285A (en) Binocular vision stereo matching method based on dense multi-scale information fusion
CN116563682A (en) Attention scheme and strip convolution semantic line detection method based on depth Hough network
Zhu et al. Stereoscopic image super-resolution with interactive memory learning
CN116883679B (en) Ground object target extraction method and device based on deep learning
CN111275076B (en) Image significance detection method based on feature selection and feature fusion
CN115797181A (en) Image super-resolution reconstruction method for mine fuzzy environment
Zhuge et al. Single image denoising with a feature-enhanced network
Jin et al. Fusion of remote sensing images based on pyramid decomposition with Baldwinian Clonal Selection Optimization
CN115512393A (en) Human body posture estimation method based on improved HigherHRNet
CN115660979A (en) Attention mechanism-based double-discriminator image restoration method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant