CN110992270A - Multi-scale residual attention network image super-resolution reconstruction method based on attention - Google Patents
- Publication number
- CN110992270A (application number CN201911319741.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention belongs to the technical field of image super-resolution reconstruction and discloses an attention-based multi-scale residual attention network method for image super-resolution reconstruction. The method comprises: selecting a public image data set as the image set to be tested, dividing it into an image training set and an image test set, and performing image preprocessing; designing a multi-scale residual structural unit module, introducing a channel attention mechanism, and building a channel-attention-based multi-scale residual attention neural network model; inputting the preprocessed image training set into the model for training; and inputting the preprocessed image test set into the trained model for testing to obtain the final reconstructed high-resolution image. The method makes the basic unit concentrate on extracting high-frequency information, better highlights the important feature-map information within each channel, better extracts the important information in an image, and reduces the reconstruction error.
Description
Technical Field
The invention belongs to the technical field of image super-resolution reconstruction, and particularly relates to an attention-based multi-scale residual attention network image super-resolution reconstruction method.
Background
Image super-resolution reconstruction is a technique that generates a high-resolution output image from a low-resolution input image. It belongs to the field of image processing and has important application prospects in military affairs, computer vision, medical diagnosis, public safety, satellite imagery, and other areas.
The reconstruction algorithms currently applied to image super-resolution can be mainly divided into interpolation-based, reconstruction-based, and learning-based methods.
Among interpolation-based methods, the common interpolation schemes are nearest-neighbor, bilinear, and bicubic interpolation. These methods exploit the correlation between adjacent pixels of a single low-resolution input image and solve for the values of unknown pixels by mathematical interpolation, thereby reconstructing a high-resolution image. However, interpolation-based methods do not fully consider the global information of the image: the reconstructed high-resolution image is overly smooth and loses most image details, exhibits ringing where the gray level changes sharply, recovers detail poorly, and suffers from severe edge artifacts; in particular, the loss of high-frequency information is serious.
Reconstruction-based methods derive the dependency between pixels of the high-resolution and low-resolution images from the registration correspondence between them, use it as prior knowledge about the image to be reconstructed, and reconstruct the target high-resolution image from that prior. These methods assume that the low-resolution input signal can predict the original high-resolution signal well: an LR image is obtained from a known degradation model, key pixel-level feature information is extracted from it, a prior constraint is imposed on the HR image to be generated, and the prior knowledge is combined to obtain the corresponding reconstructed high-resolution image. However, the image prior knowledge obtainable by reconstruction-based methods is limited, so more high-frequency detail cannot be recovered for images with complex content.
Learning-based methods guide the reconstruction of the high-resolution image by learning the mapping between high and low resolution and exploiting the learned image priors. They mainly comprise neighbor-embedding methods, sparse-representation methods, and deep-learning-based methods. Yang et al. proposed a super-resolution reconstruction method based on sparse representation and dictionary learning, which reconstructs the image from learned over-complete dictionary pairs of LR image blocks and their corresponding HR image blocks. However, the learning requirements on the high/low-resolution over-complete dictionary pairs are demanding, and the practicality of the reconstructed images is poor. Timofte et al. combined neighbor embedding with sparse dictionaries and proposed the anchored neighborhood regression (ANR) algorithm and its improved version (A+); although these improve computational efficiency during reconstruction, the recovery of high-frequency detail remains poor and the reconstructed high-resolution image is not much improved. In recent years, with the vigorous development of neural networks, the ability of convolutional neural networks to efficiently extract feature information has been widely applied to image super-resolution reconstruction. In 2014, Dong et al. first adopted a three-layer convolutional neural network (SRCNN), consisting of an input layer, a feature-extraction layer, and a reconstruction layer, to extract the high-frequency features of an image. The network is simple in structure, shallow, and easy to implement, and its results surpass the traditional interpolation-based and reconstruction-based methods.
However, owing to its small number of layers, small receptive field, poor generalization ability, and the limited amount of high-frequency information it can extract, SRCNN does not capture the deep high-frequency features of the image, and its reconstruction quality is mediocre.
It is known that as the number of network layers increases, deeper feature information of the image can be extracted. The 20-layer convolutional VDSR network model extracts image features by increasing the network depth and enlarging the receptive field. However, as depth grows, phenomena such as vanishing and exploding gradients appear, making the network difficult to train and slow to converge. Inspired by the residual network structure, introducing residual structures while deepening the network alleviates this problem. Yet simply stacking more layers increases the computational burden, and the network still converges with difficulty. Meanwhile, a single-scale residual structure cannot extract features at different scales, and such networks treat the feature information of every channel equally, handling channels rich in high-frequency information the same as those dominated by low-frequency information; this wastes limited network computing resources and loses high-frequency information.
Disclosure of Invention
The invention aims to provide an image super-resolution reconstruction method using a multi-scale residual network based on a channel attention mechanism, so as to solve the problems in the prior art.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a multi-scale residual attention network image super-resolution reconstruction method based on attention comprises the following steps:
selecting a public image data set as an image set to be tested, dividing the image set to be tested into an image training set and an image testing set according to a certain proportion, and performing image preprocessing operation;
designing a multi-scale residual structural unit module, introducing a channel attention mechanism, and building a channel-attention-based multi-scale residual attention neural network model;
inputting the preprocessed image training set into the channel-attention-based multi-scale residual attention neural network model for model training, to obtain the trained model;
and inputting the preprocessed image test set into the trained channel-attention-based multi-scale residual attention neural network model for testing, to obtain the final reconstructed high-resolution image.
Further, the method for selecting the public image data set as the image set to be tested, dividing the image set to be tested into an image training set and an image testing set according to a certain proportion, and performing image preprocessing operation comprises the following steps:
adopting the DIV2K data set as the experimental image set, randomly selecting N images from the high-resolution images as the experimental training set, and taking the remaining M images as the experimental test set; down-sampling the original high-resolution images of the experimental training and test sets by bicubic interpolation with down-sampling factor k to obtain the corresponding LR experimental training and test sets, where k = 2, 3, 4, indicating that the images are reduced by factors of 2, 3, and 4;
cutting the LR experimental training set into blocks of size I_LR × I_LR and the high-resolution images corresponding to the LR experimental training set into blocks of size I_HR × I_HR, where the cut LR and HR sizes satisfy I_HR × I_HR = kI_LR × kI_LR; the actual image tensor sizes of the LR and HR images are H × W × C and kH × kW × C, respectively;
and taking the cut LR experimental training set as the input of the training network and the cut HR image-block data set as the data labels of the training network.
Further, the method for designing the multi-scale residual structural unit module, introducing a channel attention mechanism, and building the channel-attention-based multi-scale residual attention neural network model comprises:
constructing a multi-scale residual structural unit module from residual structures with convolution kernel sizes of 3 × 3 and 5 × 5, and introducing an attention mechanism at the output of the multi-scale residual structural unit module;
and constructing the channel-attention-based multi-scale residual attention network from several convolution layers, the multi-scale residual structural unit modules, and sub-pixel convolution, and optimizing the network with a least absolute deviation loss function.
Further, the attention mechanism consists of three processes: squeeze, excitation, and matrix product, wherein:
the extrusion process comprises the following steps: performing global average pooling on input image features with tensor H W C, so that the tensor size of the input image features is 1W 1C, wherein a squeezing function corresponding to the global average pooling is as follows:
where H × W denotes the tensor size, fsq(uc) The function represents the global average pooling operation, uc(i, j) denotes the c-th feature ucValue at (i, j), ucRepresenting the original input tensor of the channel attention block;
the excitation process is as follows: adaptively calibrating the weight of each channel by using an excitation function, wherein the excitation function is as follows:
ec=fex(z,W)=σ(g(z,W))=σ(W2δ(W1z)
wherein the delta function represents a ReLU activation function, sigma is a sigmoid activation function, W1、W2Are respectively expressed asAnd
the matrix multiplication process is as follows: the tensor with the weight score of 1 × 1 × C, which is subjected to the squeezing and exciting part, is multiplied by the original input tensor, and is expressed as:
Xc=f(uc,ec)=ecuc
wherein, XcRepresenting the output of the entire channel attention Block, ecRepresenting the output tensor, u, of the excited partcRepresenting the original input tensor of the channel attention block.
Further, the least absolute deviation loss function is:

L_LAD = (1/N) Σ_{i=1}^{N} ‖F(I_i^LR) − I_i^HR‖_1

where L_LAD denotes the least absolute deviation loss, i indexes the sample blocks in the training set, F(I_i^LR) denotes the high-resolution image block reconstructed from the i-th LR block, N denotes the total number of samples in the training set, and I_i^HR denotes the original true high-resolution image block.
Compared with the prior art, the technical scheme provided by the invention has the following beneficial effects or advantages:
according to the attention-based multi-scale residual attention network image super-resolution reconstruction method, a single-scale residual structure is designed to be a residual structure with different scales, so that image characteristic information under different scales is extracted, image characteristic information under different scales is fused, a channel attention mechanism is introduced to the tail end of a basic unit to be constructed, the basic unit is more focused on extracting high-frequency information, meanwhile, the channel attention mechanism is introduced, important characteristic diagram information in a channel can be better highlighted, the important information in the image can be better extracted, and reconstruction errors are reduced.
Specific embodiments of the present invention are disclosed in detail with reference to the following description and drawings, indicating the manner in which the principles of the invention may be employed. It should be understood that the embodiments of the invention are not so limited in scope. The embodiments of the invention include many variations, modifications and equivalents within the spirit and scope of the appended claims.
Features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments, in combination with or instead of the features of the other embodiments.
It should be emphasized that the term "comprises/comprising" when used herein, is taken to specify the presence of stated features, integers, steps or components but does not preclude the presence or addition of one or more other features, integers, steps or components.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without inventive exercise.
Fig. 1 is a flowchart of the attention-based multi-scale residual attention network image super-resolution reconstruction method according to an embodiment of the present invention;
FIG. 2 is a block diagram of the multi-scale residual structural unit module in an embodiment of the present invention;
FIG. 3 is a block diagram of the network structure of the attention mechanism in an embodiment of the present invention;
fig. 4 is a structural block diagram of the channel-attention-based multi-scale residual attention neural network model built in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Examples
As shown in fig. 1, an embodiment of the present invention provides a super-resolution reconstruction method for a multi-scale residual attention network image based on attention, which includes the following steps:
step S1: selecting a public image data set as an image set to be tested, dividing the image set to be tested into an image training set and an image testing set according to a certain proportion, and performing image preprocessing operation.
In a specific implementation process, a public image data set is selected as an image set to be tested, the image set to be tested is divided into an image training set and an image testing set according to a certain proportion, and the method for performing image preprocessing operation comprises the following steps:
firstly, a DIV2K data set is used as an image set of an experiment, N pictures are randomly selected from a plurality of high-resolution images to be used as an experiment training set, and M pictures are left to be used as an experiment testing set. For example, 900 pictures are randomly selected from 1000 high-resolution images to be used as an experimental training set, and the rest 100 pictures are used as an experimental testing set.
Then, the original high-resolution images of the experimental training and test sets are down-sampled by bicubic interpolation with down-sampling factor k to obtain the corresponding LR experimental training and test sets, where k = 2, 3, 4, indicating that the images are reduced by factors of 2, 3, and 4.
Then, the LR experimental training set is cut into blocks of size I_LR × I_LR, and the high-resolution images corresponding to the LR experimental training set are cut into blocks of size I_HR × I_HR, where the cut LR and HR sizes satisfy I_HR × I_HR = kI_LR × kI_LR; the actual image tensor sizes of the LR and HR images are H × W × C and kH × kW × C, respectively.
Finally, the cut LR experimental training set is taken as the input of the training network and the cut HR image-block data set as its data labels. The training set after image preprocessing is represented as {(I_i^LR, I_i^HR)}, i = 1, …, N.
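The cutting step above produces aligned LR/HR patch pairs satisfying I_HR = k·I_LR. A minimal sketch, assuming H × W × C numpy arrays and a random-crop strategy; the function name, patch size, and crop policy are illustrative, not the patent's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def crop_pair(lr, hr, i_lr, k):
    """Cut one aligned patch pair: an I_LR x I_LR block from the LR image and
    the corresponding I_HR x I_HR (= k*I_LR x k*I_LR) block from the HR image."""
    h, w, _ = lr.shape
    y = int(rng.integers(0, h - i_lr + 1))   # top-left corner in LR coordinates
    x = int(rng.integers(0, w - i_lr + 1))
    lr_patch = lr[y:y + i_lr, x:x + i_lr]
    # The HR crop is the same region scaled by k, keeping the pair registered.
    hr_patch = hr[k * y:k * (y + i_lr), k * x:k * (x + i_lr)]
    return lr_patch, hr_patch

# Example: down-sampling factor k = 2, 48x48 LR patches
lr = rng.random((100, 120, 3))    # stand-in LR image, H x W x C
hr = rng.random((200, 240, 3))    # stand-in HR image, kH x kW x C
lr_p, hr_p = crop_pair(lr, hr, i_lr=48, k=2)
```

The HR slice indices are simply the LR indices multiplied by k, which is exactly the I_HR × I_HR = kI_LR × kI_LR relation from the text.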
After the image preprocessing is completed, step S2 is executed: designing the multi-scale residual structural unit module, introducing the channel attention mechanism, and building the channel-attention-based multi-scale residual attention neural network model.
In a specific implementation, the method of designing the multi-scale residual structural unit module and introducing the channel attention mechanism to build the channel-attention-based multi-scale residual attention neural network model comprises the following steps:
first, a multi-scale residual structure unit module is constructed with residual structures of convolution sizes of 3 × 3 and 5 × 5, as shown in fig. 2, and a mechanism of attention is introduced at an output portion of the multi-scale residual structure unit module. Specifically, the attention mechanism in the embodiment of the present invention is specifically an SE module (squeeze-and-excitation blocks), and the SE module specifically includes three processes of squeezing, exciting, and matrix product, as shown in fig. 3, where:
the extrusion process comprises the following steps: performing global average pooling on input image features with tensor H W C, so that the tensor size of the input image features is 1W 1C, wherein a squeezing function corresponding to the global average pooling is as follows:
where H × W denotes the tensor size, fsq(uc) The function represents the global average pooling operation, uc(i, j) denotes the c-th feature ucValue at (i, j), ucRepresenting the original input tensor of the channel attention block.
The squeeze function above performs a global averaging operation on the input tensor: it sums all feature values of each channel and then takes the average.
The squeeze process prepares for the subsequent excitation process. When a convolution kernel operates on a feature map, it convolves the original feature map through a local receptive field, so feature-map information outside that receptive field cannot be used in the convolution, and the global information of the feature map is not fully exploited. This problem is especially pronounced in the early stages of the network. The squeeze operation of the SE structure solves it well: by globally pooling each channel's feature map, the information from all positions of that map is fused, avoiding the situation in which, because convolution only sees a local receptive field, the extracted information cannot represent the whole channel's feature map during channel-weight evaluation, leaving the reference information insufficient and the evaluation inaccurate.
To make full use of the information after global average pooling, excitation is required. The excitation process in the embodiment of the present invention is as follows: the weight of each channel is adaptively calibrated with an excitation function:

e_c = f_ex(z, W) = σ(g(z, W)) = σ(W_2 δ(W_1 z))

where δ denotes the ReLU activation function, σ the sigmoid activation function, and W_1 and W_2 the weights of the two fully-connected layers (the dimension-reducing and dimension-restoring layers, respectively).
taking the feature graph after the full-play average pooling as the input of an excitation part, achieving the purpose of reducing the dimension through a full-connection layer and a ReLU activation function, achieving the original dimension through the full-connection layer and a sigmoid activation function, and simultaneously evaluating the weight score of the channel, wherein the weight score of each channel is in the range of [0,1], and the more the channel weight score is close to 1, the more important the channel information is represented.
After excitation, the 1 × 1 × C tensor of weight scores produced by the squeeze and excitation parts is multiplied with the original input tensor, i.e., the matrix product process, expressed as:

X_c = f(u_c, e_c) = e_c · u_c

where X_c denotes the output of the whole channel attention block, e_c the output tensor of the excitation part, and u_c the original input tensor of the channel attention block.
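The squeeze-excite-scale pipeline above can be sketched in a few lines of numpy. This is a minimal illustration of the forward pass only, assuming given fully-connected weights `w1` and `w2` with a reduction ratio `r` (the ratio and weight shapes are standard SE-style assumptions, not taken from the patent):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(u, w1, w2):
    """SE-style channel attention for one feature tensor u of shape (C, H, W).
    w1: (C//r, C) dimension-reducing FC weights; w2: (C, C//r) restoring FC."""
    z = u.mean(axis=(1, 2))                    # squeeze: z_c = average over H x W
    e = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))  # excite: sigma(W2 * ReLU(W1 * z))
    return e[:, None, None] * u                # matrix product: X_c = e_c * u_c

C, H, W, r = 8, 4, 4, 2
rng = np.random.default_rng(1)
u = rng.random((C, H, W))                      # non-negative stand-in features
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
x = channel_attention(u, w1, w2)
```

Because each e_c lies in (0, 1), every channel of the output is the input channel scaled down by its learned importance, which is exactly the recalibration the text describes.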
After the channel-attention-based multi-scale residual attention neural network model is built, a least absolute deviation loss function is adopted to optimize the network. The least absolute deviation loss function in the embodiment of the present invention is:

L_LAD = (1/N) Σ_{i=1}^{N} ‖F(I_i^LR) − I_i^HR‖_1

where L_LAD denotes the least absolute deviation loss, i indexes the sample blocks in the training set, F(I_i^LR) denotes the high-resolution image block reconstructed from the i-th LR block, N denotes the total number of samples in the training set, and I_i^HR denotes the original true high-resolution image block.
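The loss above is the per-block 1-norm of the reconstruction error, averaged over the N training samples. A direct numpy transcription (the function name is ours):

```python
import numpy as np

def lad_loss(reconstructed, target):
    """Least absolute deviation loss: the 1-norm of the difference of each
    reconstructed/true block pair, averaged over the N sample blocks."""
    return float(np.mean([np.sum(np.abs(r - t))
                          for r, t in zip(reconstructed, target)]))

# Two toy "image blocks": the 1-norms are 3.0 and 2.0, so the loss is 2.5
loss = lad_loss([np.array([1.0, 2.0]), np.array([0.0, 0.0])],
                [np.array([0.0, 0.0]), np.array([1.0, 1.0])])
```

Compared with a squared-error (L2) loss, this L1 form penalizes large residuals less aggressively, which is commonly credited with producing sharper super-resolved images.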
The constructed channel-attention-based multi-scale residual attention neural network model is shown in fig. 4 and comprises an image feature information extraction part, a convergence layer part, and a reconstruction part.
The image feature information extraction part: to keep the input and output sizes of every layer in the network equal, padding is used; the multi-scale residual structural unit consists of residual structures built from 3 × 3 and 5 × 5 convolutions, the corresponding 3 × 3 and 5 × 5 convolution layers, and activation layers.
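A minimal PyTorch sketch of one possible multi-scale residual unit matching this description (parallel 3 × 3 and 5 × 5 branches with size-preserving padding, activation layers, and a residual skip). The channel count and the 1 × 1 fusion of the two branches are our assumptions, not the patent's exact configuration:

```python
import torch
import torch.nn as nn

class MultiScaleResidualUnit(nn.Module):
    """Two parallel convolution branches (3x3 and 5x5, padded so the spatial
    size is preserved), 1x1 fusion of their features, and a residual skip."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.branch3 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, kernel_size=5, padding=2)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f3 = self.act(self.branch3(x))            # 3x3-scale features
        f5 = self.act(self.branch5(x))            # 5x5-scale features
        fused = self.fuse(torch.cat([f3, f5], dim=1))
        return x + fused                          # residual skip keeps sizes equal

x = torch.randn(1, 64, 16, 16)
y = MultiScaleResidualUnit(64)(x)
```

In the full model the channel attention block from fig. 3 would be applied to `fused` before the skip connection; it is omitted here to keep the sketch focused on the multi-scale structure.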
The convergence layer part: the feature information extracted by each multi-scale residual attention structural unit is fused through a convergence layer, which performs feature fusion with 1 × 1 convolutions.
The reconstruction part: to magnify the image by a factor of k, a sub-pixel convolution layer is used to up-sample the output of the feature extraction.
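Sub-pixel convolution ends with a pixel-shuffle rearrangement: a convolution first produces C·k² feature channels at LR resolution, then each group of k² channels is folded into the k × k sub-pixel grid of one output channel. A numpy sketch of just the rearrangement, following the channel ordering used by common deep-learning frameworks (an assumption here):

```python
import numpy as np

def pixel_shuffle(x, k):
    """Rearrange a (C*k*k, H, W) tensor into (C, k*H, k*W): each group of
    k*k channels supplies the k x k sub-pixel grid of one output channel."""
    ck2, h, w = x.shape
    c = ck2 // (k * k)
    x = x.reshape(c, k, k, h, w)          # split channels into (C, k, k)
    x = x.transpose(0, 3, 1, 4, 2)        # interleave: (C, H, k, W, k)
    return x.reshape(c, h * k, w * k)

# Four 1x1 channels become one 2x2 output plane
out = pixel_shuffle(np.arange(4).reshape(4, 1, 1), k=2)
```

This is the operation PyTorch exposes as `nn.PixelShuffle`; no pixels are computed here, only rearranged, which is why sub-pixel convolution up-samples cheaply at the very end of the network.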
After the channel-attention-based multi-scale residual attention neural network model is built, step S3 is executed: the preprocessed image training set is input into the model for training, yielding the trained channel-attention-based multi-scale residual attention neural network model.
After training is completed, step S4 is executed: the preprocessed image test set is input into the trained channel-attention-based multi-scale residual attention neural network model for testing, yielding the final reconstructed high-resolution image.
According to the attention-based multi-scale residual attention network image super-resolution reconstruction method provided by the embodiment of the invention, a single residual structure unit is designed to contain residual branches of different scales, so that image feature information at different scales is extracted and then fused. A channel attention mechanism is introduced at the end of each basic unit, so that the basic unit focuses more on extracting high-frequency information. The channel attention mechanism also better highlights the important feature map information within the channels, so that the important information in the image is better extracted and the reconstruction error is reduced.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (5)
1. A multi-scale residual attention network image super-resolution reconstruction method based on attention is characterized by comprising the following steps:
selecting a public image data set as an image set to be tested, dividing the image set to be tested into an image training set and an image testing set according to a certain proportion, and performing image preprocessing operation;
designing a multi-scale residual structure unit module, introducing a channel attention mechanism, and building a multi-scale residual attention neural network model based on channel attention;
inputting the preprocessed image training set into the channel attention-based multi-scale residual attention neural network model for model training to obtain a trained channel attention-based multi-scale residual attention neural network model;
and inputting the preprocessed image test set into the trained channel attention-based multi-scale residual attention neural network model for testing to obtain a finally reconstructed high-resolution image.
2. The attention-based multi-scale residual attention network image super-resolution reconstruction method of claim 1, wherein selecting a public image data set as the image set to be tested, dividing the image set to be tested into an image training set and an image test set according to a certain proportion, and performing the image preprocessing operation comprises the following steps:
adopting the DIV2K data set as the image set of the experiment, randomly selecting N images from its high-resolution images as the experimental training set, and taking the remaining M images as the experimental test set; respectively down-sampling the original high-resolution images of the experimental training set and the test set by a bicubic interpolation method with down-sampling factor k to obtain the corresponding LR experimental training set and LR experimental test set, wherein k = 2, 3, 4, indicating that the images are reduced by factors of 2, 3 and 4 respectively;
cutting the LR experimental training set into image blocks of size I_LR × I_LR, and cutting the high-resolution images corresponding to the LR experimental training set into image blocks of size I_HR × I_HR, wherein the sizes of the cut LR and HR image blocks satisfy the relation I_HR × I_HR = kI_LR × kI_LR, and the actual image tensor sizes of the LR and HR images are H × W × C and kH × kW × C, respectively;
and taking the LR image block data set obtained by cutting as the input of the training network, and taking the HR image block data set obtained by cutting as the labels of the training network.
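The block-cutting step of this claim can be sketched as follows (illustrative helper names, grayscale images as nested lists): aligned LR/HR block pairs are produced that satisfy the k-times size relation above.

```python
def crop_pairs(lr, hr, size_lr, k):
    """Cut an LR image and its HR counterpart into aligned block pairs.

    LR blocks are size_lr x size_lr; the matching HR blocks are
    (k * size_lr) x (k * size_lr), taken from the corresponding region.
    """
    size_hr = k * size_lr
    pairs = []
    for y in range(0, len(lr) - size_lr + 1, size_lr):
        for x in range(0, len(lr[0]) - size_lr + 1, size_lr):
            lr_blk = [row[x:x + size_lr] for row in lr[y:y + size_lr]]
            hr_blk = [row[k * x:k * x + size_hr]
                      for row in hr[k * y:k * y + size_hr]]
            pairs.append((lr_blk, hr_blk))
    return pairs
```

The LR blocks then serve as network inputs and the HR blocks as training labels, matching the input/label split described in the claim.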
3. The attention-based multi-scale residual attention network image super-resolution reconstruction method according to claim 2, wherein designing the multi-scale residual structure unit module and introducing the channel attention mechanism to build the channel attention-based multi-scale residual attention neural network model comprises the following steps:
constructing the multi-scale residual structure unit module from residual branches with convolution kernel sizes of 3 × 3 and 5 × 5, and introducing an attention mechanism at the output of the multi-scale residual structure unit module;
and constructing a multi-scale residual attention network based on the channel attention mechanism from a plurality of convolution layers, the multi-scale residual structure unit modules and sub-pixel convolution, and optimizing the multi-scale residual attention network based on the channel attention mechanism by adopting a least absolute deviation loss function.
4. The attention-based multi-scale residual attention network image super-resolution reconstruction method according to claim 3, wherein the attention mechanism consists of three processes: squeezing, excitation and matrix multiplication, wherein,
the squeezing process comprises the following steps: performing global average pooling on input image features with tensor size H × W × C, so that the resulting tensor size is 1 × 1 × C, wherein the squeezing function corresponding to the global average pooling is:

$$z_c = f_{sq}(u_c) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} u_c(i, j)$$

wherein H × W denotes the spatial size of the tensor, $f_{sq}(u_c)$ denotes the global average pooling operation, $u_c(i, j)$ denotes the value of the c-th feature map $u_c$ at position (i, j), and $u_c$ denotes the original input tensor of the channel attention block;
the excitation process comprises the following steps: adaptively calibrating the weight of each channel by using an excitation function, wherein the excitation function is:

$$e_c = f_{ex}(z, W) = \sigma(g(z, W)) = \sigma\left(W_2\,\delta(W_1 z)\right)$$

wherein δ denotes the ReLU activation function, σ denotes the sigmoid activation function, and $W_1$ and $W_2$ denote the weight matrices of the two fully connected layers of the excitation step, respectively;
the matrix multiplication process is as follows: the tensor with the weight score of 1 × 1 × C, which is subjected to the squeezing and exciting part, is multiplied by the original input tensor, and is expressed as:
Xc=f(uc,ec)=ecuc
wherein, XcRepresenting the output of the entire channel attention Block, ecRepresenting the output tensor, u, of the excited partcRepresenting the original input tensor of the channel attention block.
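The three processes of this claim (squeeze, excitation, channel-wise product) can be sketched in plain Python as follows; the weight matrices `w1` and `w2` are illustrative stand-ins for the learned fully connected layers:

```python
import math

def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(u, w1, w2):
    """Squeeze -> excitation -> channel-wise product on a [C][H][W] input.

    w1 (hdim x C) and w2 (C x hdim) stand in for the weights of the two
    fully connected layers; a real network learns them during training.
    """
    C, H, W = len(u), len(u[0]), len(u[0][0])
    # squeeze: global average pooling gives one scalar z_c per channel
    z = [sum(sum(row) for row in u[c]) / (H * W) for c in range(C)]
    # excitation: e = sigmoid(W2 . relu(W1 . z))
    hdim = len(w1)
    h = [max(0.0, sum(w1[i][c] * z[c] for c in range(C))) for i in range(hdim)]
    e = [sigmoid(sum(w2[c][i] * h[i] for i in range(hdim))) for c in range(C)]
    # matrix product: X_c = e_c * u_c rescales each channel by its weight
    return [[[e[c] * v for v in row] for row in u[c]] for c in range(C)]
```

Channels with strong pooled responses receive weights near 1 and pass through almost unchanged, while weak channels are suppressed, which is how the block highlights the important feature map information.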
5. The attention-based multi-scale residual attention network image super-resolution reconstruction method of claim 4, wherein the least absolute deviation loss function is:

$$L_{LAD} = \frac{1}{N}\sum_{i=1}^{N}\left\|F\left(I_i^{LR}\right) - I_i^{HR}\right\|_1$$

wherein $L_{LAD}$ represents the least absolute deviation loss function, $i$ represents the $i$-th sample block in the training set, $F(I_i^{LR})$ represents the high-resolution image block reconstructed from the $i$-th LR image, $N$ represents the total number of samples in the training set, and $I_i^{HR}$ represents the original ground-truth high-resolution image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911319741.5A CN110992270A (en) | 2019-12-19 | 2019-12-19 | Multi-scale residual attention network image super-resolution reconstruction method based on attention |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110992270A (en) | 2020-04-10 |
Family
ID=70065762
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911319741.5A Pending CN110992270A (en) | 2019-12-19 | 2019-12-19 | Multi-scale residual attention network image super-resolution reconstruction method based on attention |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110992270A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109903228A (en) * | 2019-02-28 | 2019-06-18 | 合肥工业大学 | A kind of image super-resolution rebuilding method based on convolutional neural networks |
Non-Patent Citations (4)
Title |
---|
JUNCHENG LI ET AL: "Multi-scale Residual Network for Image Super-Resolution", ECCV *
YULUN ZHANG: "Image Super-Resolution Using Very Deep Residual Channel Attention Networks", ECCV 2018: Computer Vision *
WANG DONGFEI: "Application of Channel Attention-Based Convolutional Neural Networks in Image Super-Resolution Reconstruction", Radio & TV Broadcast Engineering *
CHEN SHUZHEN ET AL: "Image Super-Resolution Algorithm Using Multi-Scale Convolutional Neural Networks", Journal of Signal Processing *
Cited By (52)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111738055A (en) * | 2020-04-24 | 2020-10-02 | 浙江大学城市学院 | Multi-class text detection system and bill form detection method based on same |
CN111738055B (en) * | 2020-04-24 | 2023-07-18 | 浙江大学城市学院 | Multi-category text detection system and bill form detection method based on same |
CN111598778A (en) * | 2020-05-13 | 2020-08-28 | 云南电网有限责任公司电力科学研究院 | Insulator image super-resolution reconstruction method |
CN111598778B (en) * | 2020-05-13 | 2023-11-21 | 云南电网有限责任公司电力科学研究院 | Super-resolution reconstruction method for insulator image |
CN111667445A (en) * | 2020-05-29 | 2020-09-15 | 湖北工业大学 | Image compressed sensing reconstruction method based on Attention multi-feature fusion |
CN111783792B (en) * | 2020-05-31 | 2023-11-28 | 浙江大学 | Method for extracting significant texture features of B-ultrasonic image and application thereof |
CN111783792A (en) * | 2020-05-31 | 2020-10-16 | 浙江大学 | Method for extracting significant texture features of B-ultrasonic image and application thereof |
CN111814863A (en) * | 2020-07-03 | 2020-10-23 | 南京信息工程大学 | Detection method for light-weight vehicles and pedestrians |
CN111915487A (en) * | 2020-08-04 | 2020-11-10 | 武汉工程大学 | Face super-resolution method and device based on hierarchical multi-scale residual fusion network |
CN111915487B (en) * | 2020-08-04 | 2022-05-10 | 武汉工程大学 | Face super-resolution method and device based on hierarchical multi-scale residual fusion network |
CN112085756A (en) * | 2020-09-23 | 2020-12-15 | 清华大学苏州汽车研究院(相城) | Road image multi-scale edge detection model and method based on residual error network |
CN112085756B (en) * | 2020-09-23 | 2023-11-07 | 清华大学苏州汽车研究院(相城) | Road image multi-scale edge detection model and method based on residual error network |
CN112215755A (en) * | 2020-10-28 | 2021-01-12 | 南京信息工程大学 | Image super-resolution reconstruction method based on back projection attention network |
CN112381164A (en) * | 2020-11-20 | 2021-02-19 | 北京航空航天大学杭州创新研究院 | Ultrasound image classification method and device based on multi-branch attention mechanism |
CN112347977B (en) * | 2020-11-23 | 2021-07-20 | 深圳大学 | Automatic detection method, storage medium and device for induced pluripotent stem cells |
CN112347977A (en) * | 2020-11-23 | 2021-02-09 | 深圳大学 | Automatic detection method, storage medium and device for induced pluripotent stem cells |
CN112419155A (en) * | 2020-11-26 | 2021-02-26 | 武汉大学 | Super-resolution reconstruction method for fully-polarized synthetic aperture radar image |
CN112419155B (en) * | 2020-11-26 | 2022-04-15 | 武汉大学 | Super-resolution reconstruction method for fully-polarized synthetic aperture radar image |
CN112686297A (en) * | 2020-12-29 | 2021-04-20 | 中国人民解放军海军航空大学 | Radar target motion state classification method and system |
CN112950570A (en) * | 2021-02-25 | 2021-06-11 | 昆明理工大学 | Crack detection method combining deep learning and dense continuous central point |
CN112950570B (en) * | 2021-02-25 | 2022-05-17 | 昆明理工大学 | Crack detection method combining deep learning and dense continuous central point |
CN112862688A (en) * | 2021-03-08 | 2021-05-28 | 西华大学 | Cross-scale attention network-based image super-resolution reconstruction model and method |
CN112862688B (en) * | 2021-03-08 | 2021-11-23 | 西华大学 | Image super-resolution reconstruction system and method based on cross-scale attention network |
CN113095398A (en) * | 2021-04-08 | 2021-07-09 | 西南石油大学 | Fracturing data cleaning method of BP neural network based on genetic algorithm optimization |
CN113095398B (en) * | 2021-04-08 | 2022-07-12 | 西南石油大学 | Fracturing data cleaning method of BP neural network based on genetic algorithm optimization |
CN113066013A (en) * | 2021-05-18 | 2021-07-02 | 广东奥普特科技股份有限公司 | Method, system, device and storage medium for generating visual image enhancement |
CN113379598B (en) * | 2021-05-20 | 2023-07-14 | 山东省科学院自动化研究所 | Terahertz image reconstruction method and system based on residual channel attention network |
CN113379598A (en) * | 2021-05-20 | 2021-09-10 | 山东省科学院自动化研究所 | Terahertz image reconstruction method and system based on residual channel attention network |
CN113298717A (en) * | 2021-06-08 | 2021-08-24 | 浙江工业大学 | Medical image super-resolution reconstruction method based on multi-attention residual error feature fusion |
CN113421187B (en) * | 2021-06-10 | 2023-01-03 | 山东师范大学 | Super-resolution reconstruction method, system, storage medium and equipment |
CN113421187A (en) * | 2021-06-10 | 2021-09-21 | 山东师范大学 | Super-resolution reconstruction method, system, storage medium and equipment |
CN113436155A (en) * | 2021-06-16 | 2021-09-24 | 复旦大学附属华山医院 | Ultrasonic brachial plexus image identification method based on deep learning |
CN113436155B (en) * | 2021-06-16 | 2023-12-19 | 复旦大学附属华山医院 | Deep learning-based ultrasonic brachial plexus image recognition method |
CN113538235A (en) * | 2021-06-30 | 2021-10-22 | 北京百度网讯科技有限公司 | Training method and device of image processing model, electronic equipment and storage medium |
CN113538235B (en) * | 2021-06-30 | 2024-01-09 | 北京百度网讯科技有限公司 | Training method and device for image processing model, electronic equipment and storage medium |
CN114066727A (en) * | 2021-07-28 | 2022-02-18 | 华侨大学 | Multi-stage progressive image super-resolution method |
CN113616209A (en) * | 2021-08-25 | 2021-11-09 | 西南石油大学 | Schizophrenia patient discrimination method based on space-time attention mechanism |
CN113616209B (en) * | 2021-08-25 | 2023-08-04 | 西南石油大学 | Method for screening schizophrenic patients based on space-time attention mechanism |
CN113793267B (en) * | 2021-09-18 | 2023-08-25 | 中国石油大学(华东) | Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism |
CN113793267A (en) * | 2021-09-18 | 2021-12-14 | 中国石油大学(华东) | Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism |
CN113902617A (en) * | 2021-09-27 | 2022-01-07 | 中山大学·深圳 | Super-resolution method, device, equipment and medium based on reference image |
CN114331830A (en) * | 2021-11-04 | 2022-04-12 | 西安理工大学 | Super-resolution reconstruction method based on multi-scale residual attention |
CN114066873A (en) * | 2021-11-24 | 2022-02-18 | 袁兰 | Method and device for detecting osteoporosis by utilizing CT (computed tomography) image |
CN114494022A (en) * | 2022-03-31 | 2022-05-13 | 苏州浪潮智能科技有限公司 | Model training method, super-resolution reconstruction method, device, equipment and medium |
CN114494022B (en) * | 2022-03-31 | 2022-07-29 | 苏州浪潮智能科技有限公司 | Model training method, super-resolution reconstruction method, device, equipment and medium |
CN114821449A (en) * | 2022-06-27 | 2022-07-29 | 松立控股集团股份有限公司 | License plate image processing method based on attention mechanism |
CN115330635B (en) * | 2022-08-25 | 2023-08-15 | 苏州大学 | Image compression artifact removing method, device and storage medium |
CN115330635A (en) * | 2022-08-25 | 2022-11-11 | 苏州大学 | Image compression artifact removing method and device and storage medium |
CN117151990A (en) * | 2023-06-28 | 2023-12-01 | 西南石油大学 | Image defogging method based on self-attention coding and decoding |
CN117151990B (en) * | 2023-06-28 | 2024-03-22 | 西南石油大学 | Image defogging method based on self-attention coding and decoding |
CN118052811A (en) * | 2024-04-10 | 2024-05-17 | 南京航空航天大学 | NAM-DSSD model-based aircraft skin defect detection method |
CN118052811B (en) * | 2024-04-10 | 2024-06-11 | 南京航空航天大学 | NAM-DSSD model-based aircraft skin defect detection method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110992270A (en) | Multi-scale residual attention network image super-resolution reconstruction method based on attention | |
CN111369440B (en) | Model training and image super-resolution processing method, device, terminal and storage medium | |
CN111260586B (en) | Correction method and device for distorted document image | |
CN110059768B (en) | Semantic segmentation method and system for fusion point and region feature for street view understanding | |
CN115482241A (en) | Cross-modal double-branch complementary fusion image segmentation method and device | |
CN111476719B (en) | Image processing method, device, computer equipment and storage medium | |
CN112507997A (en) | Face super-resolution system based on multi-scale convolution and receptive field feature fusion | |
CN112215755B (en) | Image super-resolution reconstruction method based on back projection attention network | |
Zhang et al. | Image super-resolution reconstruction based on sparse representation and deep learning | |
CN112541864A (en) | Image restoration method based on multi-scale generation type confrontation network model | |
CN112258526A (en) | CT (computed tomography) kidney region cascade segmentation method based on dual attention mechanism | |
CN113706388B (en) | Image super-resolution reconstruction method and device | |
CN111768340B (en) | Super-resolution image reconstruction method and system based on dense multipath network | |
CN111914654B (en) | Text layout analysis method, device, equipment and medium | |
CN111402128A (en) | Image super-resolution reconstruction method based on multi-scale pyramid network | |
CN115564649B (en) | Image super-resolution reconstruction method, device and equipment | |
Cao et al. | New architecture of deep recursive convolution networks for super-resolution | |
CN112001928A (en) | Retinal vessel segmentation method and system | |
Li et al. | Deep recursive up-down sampling networks for single image super-resolution | |
Chen et al. | RBPNET: An asymptotic Residual Back-Projection Network for super-resolution of very low-resolution face image | |
CN113689517A (en) | Image texture synthesis method and system of multi-scale channel attention network | |
CN117576402B (en) | Deep learning-based multi-scale aggregation transducer remote sensing image semantic segmentation method | |
CN114998671A (en) | Visual feature learning device based on convolution mask, acquisition device and storage medium | |
CN117593187A (en) | Remote sensing image super-resolution reconstruction method based on meta-learning and transducer | |
CN111428809B (en) | Crowd counting method based on spatial information fusion and convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20200410 |