CN111461973A - Super-resolution reconstruction method and system for image - Google Patents
- Publication number
- CN111461973A (application CN202010051767.2A)
- Authority
- CN
- China
- Prior art keywords
- attention
- image
- feature
- channel
- super
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a super-resolution reconstruction method and system for an image. Features are extracted from image blocks at different depths based on a channel attention mechanism and a position attention mechanism of the image and then fused, so that the importance of feature channel information and position information is effectively identified and the features at different depths all contribute to the super-resolution of the image. In addition, the invention fully mines the relations between features, alleviates the performance bottleneck caused by excessive network depth, and gives the image better feature characterization capability and robustness.
Description
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a super-resolution reconstruction method and system for an image.
Background
Although early interpolation-based image super-resolution methods are simple and efficient, their effect in practical applications is greatly limited. More recently, methods based on deep convolutional neural networks have achieved performance surpassing traditional image super-resolution methods. The earliest deep convolutional method proposed an image super-resolution network of only three layers, comprising four parts: shallow feature extraction, nonlinear mapping, reconstruction and upsampling; it introduced deep learning into the image super-resolution task for the first time. Building on this structure, image super-resolution networks were gradually deepened; introducing a residual learning structure allowed the super-resolution network to be deepened while effectively guaranteeing training convergence, solving the training difficulties caused by deepening the network. However, these networks were still too shallow, and the reconstruction results remained poor.
After classification tasks in computer vision demonstrated that deeper networks effectively improve performance on vision tasks, this effective residual learning strategy was introduced into other convolution-based image super-resolution methods. Mainstream image super-resolution networks are characterized by being very deep and wide: the residual modules of classification networks are used as basic modules, and the batch normalization layers inside them are removed to save parameters and computation so that the network can be deepened, which greatly improves the image super-resolution effect. However, although stacking residual modules makes the network very deep, the relations among the extracted features are not fully mined, which limits the feature characterization capability of the network; continuing to deepen the network then no longer improves performance, creating a performance bottleneck. In addition, performing super-resolution reconstruction with a feature map of only a single scale suffers from poor robustness when the super-resolution factor is large.
Therefore, to improve execution speed and performance, in existing image super-resolution reconstruction methods the network has become deep and compact and adopts more effective skip connections. Although such connections help the model perform the image super-resolution task accurately and effectively, feature information at different depths is not fully utilized and fused, the receptive field during feature extraction is insufficient, and the super-resolution effect is poor. In addition, generative adversarial networks with perceptual losses have been applied to the image super-resolution task to improve the visual realism of the super-resolution result, but they can introduce more false details, which limits their application in practical scenes.
In summary, existing image super-resolution methods do not fully utilize and fuse feature information at different depths, so the super-resolution effect is poor; in addition, existing methods mainly focus on increasing the depth and width of the network without fully mining the relations between features, which limits the feature characterization capability of the network and leads to performance bottlenecks and poor robustness when the network is too deep.
Disclosure of Invention
In view of the above defects or improvement needs of the prior art, the present invention provides a super-resolution reconstruction method for an image, which aims to solve the problem in the prior art of a poor super-resolution effect caused by insufficient utilization and fusion of feature information at different depths.
In a first aspect, the present invention provides a super-resolution reconstruction method for an image, comprising the following steps:
S1, partitioning the low-resolution image to be processed to obtain a plurality of image blocks;
S2, extracting features from each image block at different depths based on a channel attention mechanism and a position attention mechanism of the image, to obtain feature maps at different depths corresponding to each image block;
S3, fusing the feature maps at different depths corresponding to each image block, and performing image reconstruction on the fused feature map according to the image super-resolution multiple to obtain super-resolution image blocks;
S4, recombining the super-resolution image blocks to obtain a high-resolution image.
The invention has the beneficial effects that: based on a channel attention mechanism and a position attention mechanism of the image, features are extracted from the image blocks at different depths and then fused, so that the importance of feature channel information and position information is effectively identified and the features at different depths all contribute to the super-resolution of the image. Because the features at different depths are fully utilized and fused, the obtained image features retain a larger receptive field together with better detail features, and the image reconstructed from these features has a better super-resolution effect.
The invention has the further beneficial effects that: by effectively identifying the importance of feature channels and positions, the method fully mines the relations between features, so the characterization capability of the network can still improve as the network is deepened; this alleviates the performance bottleneck caused by excessive network depth and gives the image better feature characterization capability and robustness.
On the basis of the technical scheme, the invention can be further improved as follows.
Further preferably, the above S2 includes the following steps:
S21, extracting shallow features of the image block with a convolutional neural network to obtain a shallow feature map;
S22, extracting the channel attention features and position attention features of the shallow feature map at different depths with a trained deep feature extraction network, and fusing them to obtain the feature maps at different depths corresponding to the image block;
the deep feature extraction network comprises N cascaded dual attention convolutional neural networks based on attention models; the N networks correspond one-to-one to the N depths, and N is a positive integer greater than or equal to 2.
Further preferably, the above dual attention convolutional neural network includes a channel attention model, a position attention model and a fusion layer;
the output end of the channel attention model is connected with the input end of the fusion layer, and the output end of the position attention model is connected with the input end of the fusion layer;
the channel attention model is used for obtaining the initial weight of the channel attention by calculating the pixel average value of each channel of the input feature map based on a channel attention mechanism, adjusting the initial weight and performing point multiplication on the initial weight and the input feature map to obtain the channel attention feature of the input feature map;
the position attention model is used for calculating the average value of each channel pixel at each pixel position point of the input feature map based on a position attention mechanism to obtain a position attention initial weight, and after the position attention initial weight is adjusted, the position attention initial weight is subjected to point multiplication with the input feature map to obtain a position attention feature of the input feature map;
the fusion layer is used for fusing the obtained channel attention feature and the position attention feature.
The invention has the further beneficial effects that: before the deep features are extracted, the shallow features of the image are extracted first; the shallow features preliminarily capture the detail features of the image and provide a more refined input for the subsequent extraction of features at each depth, so that the feature maps obtained at the different depths characterize the image better.
Further preferably, the channel attention initial weight of the c-th channel is:

Z_c = (1 / (H × W)) · Σ_{i=1..H} Σ_{j=1..W} X_c(i, j)

where c = 1, 2, …, L; L is the number of channels of the input feature map X; H is the height of the input feature map X; W is the width of the input feature map X; and X_c(i, j) is the feature value at row i, column j of the c-th channel of the input feature map X.
The invention has the further beneficial effects that: by introducing the channel attention feature of the image, the importance of the different channels of the shallow feature map can be effectively distinguished; more important feature channels are given more weight, so that the image is better characterized.
Further preferably, the position attention initial weight at row i, column j is:

P_{i,j} = σ( (1 / L) · Σ_{c=1..L} X_{i,j,c} )

where i = 1, 2, …, H; j = 1, 2, …, W; H is the height of the input feature map X; W is the width of the input feature map X; σ(·) is the activation function; L is the number of channels of the input feature map X; and X_{i,j,c} is the feature value at row i, column j of the c-th channel of the input feature map X.
The invention has the further beneficial effects that: by introducing the position attention feature, the degree to which the shallow features at different positions of the feature map contribute to the super-resolution of the image can be obtained; more important feature positions are given more weight, so that the image is better characterized.
Further preferably, the method for fusing the feature maps at different depths corresponding to the image blocks in S3 includes: splicing the feature maps at different depths corresponding to the image block along the feature channels, and then performing convolution dimensionality reduction to obtain a fused feature map.
The invention has the further beneficial effects that: because low-level features carry more detailed information while high-level features carry better semantic information, features at different depths affect the reconstruction effect differently; by utilizing and fusing features at different depths, the complementary feature information enhances the richness of the features and the robustness of the method provided by the invention.
In a second aspect, the present invention provides a super-resolution reconstruction system for an image, comprising: an image interception module, a feature extraction module, an image reconstruction module and an image recombination module;
the image interception module is used for blocking the low-resolution image to be processed to obtain a plurality of image blocks and outputting the image blocks to the feature extraction module;
the feature extraction module is used for extracting, based on a channel attention mechanism and a position attention mechanism of the image, features of each image block input by the image interception module at different depths, to obtain the feature maps at different depths corresponding to each image block, and outputting them to the image reconstruction module;
the image reconstruction module is used for fusing the feature maps at different depths corresponding to each image block, reconstructing the fused feature map according to the required image super-resolution multiple to obtain super-resolution image blocks, and outputting the super-resolution image blocks to the image recombination module;
the image recombination module is used for recombining the super-resolution image blocks input by the image reconstruction module to obtain a high-resolution image.
The invention has the beneficial effects that: the system provided by the invention comprises a feature extraction module which, based on a channel attention mechanism and a position attention mechanism of the image, fully utilizes and fuses features at different depths, so that the obtained image features retain a larger receptive field together with better detail features, and the image reconstructed from these features has a better super-resolution effect.
Further preferably, the feature extraction module includes a shallow network element and a deep network element;
the shallow network unit is used for extracting shallow features of the image blocks by adopting a convolutional neural network to obtain a shallow feature map and outputting the shallow feature map to the deep network unit;
the deep network unit adopts a trained deep feature extraction network to respectively extract the channel attention feature and the position attention feature of the shallow feature map at different depths, and the channel attention feature and the position attention feature are fused to obtain feature maps at different depths corresponding to the image block;
the deep feature extraction network comprises N cascaded dual attention convolutional neural networks based on attention models; the N networks correspond one-to-one to the N depths, and N is a positive integer greater than or equal to 2.
Further preferably, the above dual attention convolutional neural network includes a channel attention model, a position attention model and a fusion layer;
the output end of the channel attention model is connected with the input end of the fusion layer, and the output end of the position attention model is connected with the input end of the fusion layer;
the channel attention model is used for obtaining the initial weight of the channel attention by calculating the pixel average value of each channel of the input feature map based on a channel attention mechanism, adjusting the initial weight and performing point multiplication on the initial weight and the input feature map to obtain the channel attention feature of the input feature map;
the position attention model is used for calculating the average value of each channel pixel at each pixel position point of the input feature map based on a position attention mechanism to obtain a position attention initial weight, and after the position attention initial weight is adjusted, the position attention initial weight is subjected to point multiplication with the input feature map to obtain a position attention feature of the input feature map;
the fusion layer is used for fusing the obtained channel attention feature and the position attention feature.
In a third aspect, the present invention also provides a storage medium, which when read by a computer, causes the computer to execute the super-resolution reconstruction method for an image provided by the first aspect of the present invention.
Drawings
Fig. 1 is a flowchart of a super-resolution reconstruction method for an image according to embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of a dual attention convolutional neural network structure provided in embodiment 1 of the present invention;
fig. 3 is a schematic diagram of an image super-resolution system provided in embodiment 2 of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Embodiment 1
A super-resolution reconstruction method of an image, as shown in fig. 1, includes the following steps:
S1, partitioning the low-resolution image to be processed to obtain a plurality of image blocks;
Specifically, in this embodiment, image blocks of size H × W are repeatedly cropped from the low-resolution RGB image to be processed in a sliding-window manner, in order from left to right and from top to bottom.
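As an illustrative sketch of this sliding-window blocking (the block size, stride and overlap below are assumptions for demonstration, not values fixed by the invention):

```python
import numpy as np

def extract_blocks(image, block_h, block_w, stride):
    """Cut (possibly overlapping) blocks from a low-resolution image,
    scanning left-to-right, top-to-bottom, and record each block's
    top-left position for later recombination."""
    H, W = image.shape[:2]
    blocks, positions = [], []
    for top in range(0, H - block_h + 1, stride):
        for left in range(0, W - block_w + 1, stride):
            blocks.append(image[top:top + block_h, left:left + block_w])
            positions.append((top, left))
    return blocks, positions

# Example: a 6x8 single-channel image cut into 4x4 blocks with stride 2.
img = np.arange(48, dtype=np.float32).reshape(6, 8)
blocks, positions = extract_blocks(img, 4, 4, 2)
```

A real implementation would also pad the image so that the blocks tile it exactly; that detail is omitted here.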
S2, extracting features on each image block at different depths based on a channel attention mechanism and a position attention mechanism of the image to obtain feature maps at different depths corresponding to each image block;
Preferably, the method comprises the following steps:
S21, extracting shallow features of the image block by adopting a convolutional neural network to obtain a shallow feature map; specifically, the convolutional neural network is a shallow convolutional neural network with 1 or 2 convolutional layers.
S22, extracting the channel attention features and the position attention features of the shallow feature map at different depths with a trained deep feature extraction network, and fusing them to obtain the feature maps at different depths corresponding to the image block; the deep feature extraction network comprises N cascaded dual attention convolutional neural networks based on attention models, which correspond one-to-one to the N depths, N being a positive integer greater than or equal to 2.
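The cascade structure can be sketched as follows; `dual_attention_block` here is a stand-in placeholder (identity) for the trained dual attention network, so the sketch only shows how one feature map per depth is collected:

```python
import numpy as np

def dual_attention_block(x):
    # Placeholder for one trained dual attention convolutional network;
    # the identity mapping stands in for its learned transformation.
    return x

def extract_depth_features(shallow_feat, n_depths):
    """Run N cascaded blocks on the shallow feature map; the output of
    the d-th block is the feature map at depth d (d = 1, ..., N)."""
    feats, x = [], shallow_feat
    for _ in range(n_depths):
        x = dual_attention_block(x)
        feats.append(x)
    return feats

shallow = np.ones((3, 4, 5), dtype=np.float32)  # (L channels, H, W)
feats = extract_depth_features(shallow, n_depths=3)
```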
In addition, the channel attention and the position attention enable the invention to effectively focus on the more important features and greatly improve the characterization capability of the feature maps at different depths. In this embodiment, the 800 images of the DIV2K dataset are used as the training sample set, and the training images are horizontally flipped and rotated by 90 degrees to increase the number of training samples and improve the robustness of the system. The deep feature extraction network is trained with these samples, and the ADAM optimizer and the L1 loss function are used to optimize the network.
Preferably, as shown in fig. 2, the above dual attention convolutional neural network includes a channel attention model, a position attention model and a fusion layer;
the output end of the channel attention model is connected with the input end of the fusion layer, and the output end of the position attention model is connected with the input end of the fusion layer. The channel attention model obtains the channel attention initial weight by calculating, based on a channel attention mechanism, the pixel average value of each channel of the input feature map; the initial weight is adjusted and then point-multiplied with the input feature map to obtain the channel attention feature of the input feature map. Specifically, the obtained channel attention initial weight is adjusted with a convolutional neural network so that it is closer to the true value; in this embodiment, this convolutional neural network has 2 layers. The position attention model obtains the position attention initial weight by calculating, based on a position attention mechanism, the average value over the channels at each pixel position of the input feature map; after the initial weight is adjusted, it is point-multiplied with the input feature map to obtain the position attention feature of the input feature map. The position attention initial weight is likewise adjusted with a 2-layer convolutional neural network. The fusion layer fuses the obtained channel attention feature and position attention feature.
Specifically, in order to effectively distinguish the importance of different channels of the feature map, more important feature channels are given more weight, and a channel attention mechanism is introduced. Preferably, the channel attention initial weight for the c-th channel is:
Z_c = (1 / (H × W)) · Σ_{i=1..H} Σ_{j=1..W} X_c(i, j)

where c = 1, 2, …, L; L is the number of channels of the input feature map X; H is the height of the input feature map X; W is the width of the input feature map X; and X_c(i, j) is the feature value at row i, column j of the c-th channel of the input feature map X.
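This per-channel pixel average is a global average pooling over each channel of the input feature map; a minimal sketch:

```python
import numpy as np

def channel_attention_init_weights(x):
    """Channel attention initial weight Z_c: the mean pixel value of
    each of the L channels of the input feature map X (shape L x H x W)."""
    return x.mean(axis=(1, 2))  # one scalar weight per channel

# Example: channel 0 is constant 2, channel 1 is constant 4.
X = np.stack([np.full((3, 5), 2.0), np.full((3, 5), 4.0)])
Z = channel_attention_init_weights(X)
```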
In order to obtain the degree to which features at different positions of the feature map contribute to the super-resolution of the image, more important feature positions are given more weight; a position attention mechanism is therefore introduced. Preferably, the position attention initial weight at row i, column j is:
P_{i,j} = σ( (1 / L) · Σ_{c=1..L} X_{i,j,c} )

where i = 1, 2, …, H; j = 1, 2, …, W; H is the height of the input feature map X; W is the width of the input feature map X; σ(·) is the activation function; L is the number of channels of the input feature map X; and X_{i,j,c} is the feature value at row i, column j of the c-th channel of the input feature map X.
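A minimal sketch of this per-position average over channels; using a sigmoid for the activation function is an assumption here, since the text only specifies "the activation function":

```python
import numpy as np

def position_attention_init_weights(x):
    """Position attention initial weight P[i, j]: the mean over the L
    channels at each pixel position (i, j) of the input feature map X
    (shape L x H x W), passed through an activation (sigmoid assumed)."""
    mean_over_channels = x.mean(axis=0)               # shape (H, W)
    return 1.0 / (1.0 + np.exp(-mean_over_channels))  # sigmoid

X = np.zeros((4, 2, 3))  # L=4 channels, H=2, W=3, all-zero features
P = position_attention_init_weights(X)
```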
The obtained channel attention feature and position attention feature are fused to obtain the feature map at each depth corresponding to the image block:
Y_d = C_d + A_d,  d = 1, 2, …, N

where N is the number of cascaded dual attention convolutional neural networks, i.e. the number of depths; C_d is the image channel attention feature map at the d-th depth, and A_d is the image position attention feature map at the d-th depth. Specifically, for the image channel attention feature map at each depth, the channel attention feature map of the c-th channel is C_c = Z'_c · X_c, where c = 1, 2, …, L; L is the number of channels of the shallow feature map X; Z'_c is the adjusted channel attention weight of the c-th channel of the input feature map X; and X_c is the c-th channel of the input feature map X. For the image position attention feature map at each depth, the position attention feature value at row i, column j is A_{i,j} = P'_{i,j} · X_{i,j}, where i = 1, 2, …, H; j = 1, 2, …, W; H is the height of the input feature map X; W is the width of the input feature map X; P'_{i,j} is the adjusted position attention weight at row i, column j of the input feature map X; and X_{i,j} is the feature value at row i, column j of the input feature map.
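Assuming the adjusted weights Z'_c and P'_{i,j} have already been produced by their 2-layer adjustment networks, applying them to X and fusing by addition can be sketched as:

```python
import numpy as np

def apply_dual_attention(x, z_adj, p_adj):
    """One depth of dual attention fusion:
    C_c  = Z'_c  * X_c   (channel weights broadcast over H, W)
    A_ij = P'_ij * X_ij  (position weights broadcast over channels)
    Y    = C + A         (the fusion Y_d = C_d + A_d)."""
    C = z_adj[:, None, None] * x   # channel attention feature map
    A = p_adj[None, :, :] * x      # position attention feature map
    return C + A

L_ch, H, W = 2, 3, 4
X = np.ones((L_ch, H, W))
z_adj = np.array([0.25, 0.75])   # adjusted channel weights (illustrative)
p_adj = np.full((H, W), 0.5)     # adjusted position weights (illustrative)
Y = apply_dual_attention(X, z_adj, p_adj)
```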
It should be noted that features at different depths make different contributions to the image super-resolution task: shallow features have a small receptive field, with rich detail features but insufficient semantic information, while deep features have a larger receptive field, with complete semantic information but insufficient detail features. By fusing feature maps at different depths, better detail features are obtained while the larger receptive field is retained, which greatly benefits the subsequent reconstruction of the super-resolution image.
S3, fusing the feature maps of different depths corresponding to the image blocks, and performing image reconstruction on the fused feature maps according to the image super-resolution multiple to obtain super-resolution image blocks;
Preferably, the method comprises the following steps:
S31, splicing the feature maps at different depths corresponding to the image block along the feature channels, and then performing convolution dimensionality reduction to obtain a fused feature map;
Specifically, for the feature maps X_1, X_2, …, X_N corresponding to the N different depths, the N feature maps are first spliced along the channel dimension to obtain a spliced feature map of size H × W × (N × L); a one-layer convolutional neural network then performs convolution dimensionality reduction on the spliced feature map, reducing the number of output channels from N × L to L, so that the fused feature map has size H × W × L.
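The splice-then-reduce step can be sketched by expressing the 1 × 1 convolution as a matrix product over the channel axis (the weight values below are illustrative, not trained):

```python
import numpy as np

def fuse_depth_features(feats, w):
    """Concatenate N depth feature maps (each L x H x W) along the
    channel axis into an (N*L) x H x W map, then apply a 1x1 convolution
    with weight w of shape (L, N*L) to reduce the channels back to L."""
    stacked = np.concatenate(feats, axis=0)       # (N*L, H, W)
    return np.einsum('oc,chw->ohw', w, stacked)   # (L, H, W)

L_ch, H, W, N = 2, 3, 3, 3
feats = [np.full((L_ch, H, W), float(d + 1)) for d in range(N)]
w = np.full((L_ch, N * L_ch), 1.0 / (N * L_ch))   # averaging 1x1 conv
fused = fuse_depth_features(feats, w)
```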
S32, according to the required image super-resolution multiple S, up-sampling the fused feature map to enlarge it by a factor of S; the super-resolution multiple S is set according to actual requirements. Specifically, a deconvolution method or the PixelShuffle method can be adopted to up-sample the fused feature map.
In order to ensure the efficiency of the algorithm, in this embodiment the reconstruction network comprises two convolutional layers. Specifically, the enlarged feature map has size H′ × W′ × L, and after fitting by the reconstruction network, an H′ × W′ × 3 RGB image Y is output. When the reconstruction network is trained, the L1-norm distance between the output image and the image label is computed through the L1 loss function, and the weight parameters of the convolutional neural network are adjusted accordingly.
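A minimal numpy sketch of the PixelShuffle rearrangement mentioned above, which enlarges a (C·s², H, W) feature map into a (C, H·s, W·s) one (deconvolution being the stated alternative):

```python
import numpy as np

def pixel_shuffle(x, s):
    """Rearrange a (C*s*s, H, W) feature map into (C, H*s, W*s),
    enlarging the spatial size by the super-resolution multiple s."""
    c2, H, W = x.shape
    C = c2 // (s * s)
    x = x.reshape(C, s, s, H, W)
    x = x.transpose(0, 3, 1, 4, 2)   # -> (C, H, s, W, s)
    return x.reshape(C, H * s, W * s)

x = np.arange(16, dtype=np.float32).reshape(4, 2, 2)  # C=1, s=2
y = pixel_shuffle(x, 2)
```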
S4, recombining the super-resolution image blocks to obtain a high-resolution image.
Specifically, the obtained super-resolution image blocks are stitched according to the interception order and positions from step S1 to obtain the high-resolution image.
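A sketch of this recombination; averaging any overlapping pixels is an illustrative choice, since the text specifies only stitching by interception order and position:

```python
import numpy as np

def stitch_blocks(blocks, positions, out_h, out_w, scale):
    """Reassemble super-resolved blocks into one image: each block is
    placed at its low-resolution position multiplied by the
    super-resolution multiple, and overlapping pixels are averaged."""
    out = np.zeros((out_h, out_w))
    count = np.zeros((out_h, out_w))
    for blk, (top, left) in zip(blocks, positions):
        bh, bw = blk.shape
        t, l = top * scale, left * scale
        out[t:t + bh, l:l + bw] += blk
        count[t:t + bh, l:l + bw] += 1
    return out / np.maximum(count, 1)

# Two non-overlapping 2x2 super-resolved blocks forming a 2x4 image.
blocks = [np.full((2, 2), 1.0), np.full((2, 2), 3.0)]
positions = [(0, 0), (0, 2)]
hr = stitch_blocks(blocks, positions, 2, 4, scale=1)
```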
In order to prove the super-resolution effect of the images obtained by the method provided by the invention, 2× super-resolution experiments were performed against the existing Bicubic, SRCNN, FSRCNN, VDSR, EDSR, RDN and RCAN algorithms on standard image super-resolution datasets including Set5, BSD100 and Manga109, and the obtained results are shown in the following table:
the higher these two values (PSNR and SSIM) are, the closer the details of the obtained high-resolution image are to those of the original reference image, and the better the super-resolution effect. Compared with the other algorithms, the method provided by the invention achieves higher PSNR and SSIM values on the different standard data sets, i.e. a better super-resolution effect.
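Of the two metrics, PSNR is straightforward to compute from the mean squared error between reconstruction and reference (SSIM is more involved and omitted here); a sketch assuming 8-bit pixel range:

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means the
    reconstruction is closer to the reference image."""
    mse = np.mean((ref.astype(float) - img.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10 * np.log10(peak ** 2 / mse)

ref = np.full((4, 4), 100.0)
noisy = ref + 5.0                      # constant error of 5 -> MSE = 25
print(round(psnr(ref, noisy), 2))      # 34.15
```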
Example 2
An image super-resolution system, as shown in fig. 3, comprises: an image interception module 1, a feature extraction module 2, an image reconstruction module 3 and an image recombination module 4;
the image interception module 1 is used for blocking a low-resolution image to be processed to obtain a plurality of image blocks and outputting the image blocks to the feature extraction module 2;
the feature extraction module 2 is used for extracting features of each image block input by the image interception module at different depths based on a channel attention mechanism and a position attention mechanism of the image to obtain feature maps of different depths corresponding to each image block and outputting the feature maps to the image reconstruction module 3; preferably, the feature extraction module includes a shallow network unit 21 and a deep network unit 22; the shallow network unit 21 is configured to extract a shallow feature of the image block by using a convolutional neural network to obtain a shallow feature map, and output the shallow feature map to the deep network unit 22; in this embodiment, the convolutional neural network is a shallow convolutional neural network, and is a 1-layer or 2-layer convolutional neural network.
The deep network unit 22 adopts a trained deep feature extraction network to respectively extract the channel attention feature and the position attention feature of the shallow feature map at different depths, and the channel attention feature and the position attention feature are fused to obtain feature maps at different depths corresponding to the image blocks; the deep feature extraction network comprises N cascaded attention model-based double-attention convolutional neural networks, the N cascaded attention model-based double-attention convolutional neural networks correspond to N depths one by one, and N is a positive integer greater than or equal to 2. Preferably, the above-mentioned dual attention convolutional neural network includes a channel attention model, a location attention model and a fusion layer; the output end of the channel attention model is connected with the input end of the fusion layer, and the output end of the position attention model is connected with the input end of the fusion layer; the channel attention model is used for obtaining the initial weight of the channel attention by calculating the pixel average value of each channel of the input feature map based on a channel attention mechanism, adjusting the initial weight and performing point multiplication on the initial weight and the input feature map to obtain the channel attention feature of the input feature map; the position attention model is used for calculating the average value of each channel pixel at each pixel position point of the input feature map based on a position attention mechanism to obtain a position attention initial weight, and after the position attention initial weight is adjusted, the position attention initial weight is subjected to point multiplication with the input feature map to obtain a position attention feature of the input feature map; the fusion layer is used for fusing the obtained channel attention feature and the position attention feature.
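The channel-attention and position-attention computations described above can be sketched in NumPy: the channel model averages each channel over all pixel positions, the position model averages across channels at each pixel, and each initial weight is point-multiplied back onto the input. The sigmoid activation, the omission of the learned weight-adjustment layers, and the simple additive fusion are illustrative assumptions, not the patent's exact design:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x):
    """Initial channel-attention weight: the pixel average of each
    channel of an H x W x L map, point-multiplied with the input."""
    v = x.mean(axis=(0, 1))            # one scalar per channel, shape (L,)
    return x * sigmoid(v)              # broadcasts over H and W

def position_attention(x):
    """Initial position-attention weight: the across-channel average
    at each pixel position, activated and point-multiplied with the input."""
    u = x.mean(axis=2, keepdims=True)  # H x W x 1
    return x * sigmoid(u)              # broadcasts over channels

x = np.random.rand(4, 4, 8)
fused = channel_attention(x) + position_attention(x)   # assumed additive fusion
print(fused.shape)                     # (4, 4, 8)
```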
The image reconstruction module 3 is used for fusing the feature maps of different depths corresponding to the image blocks, reconstructing the fused feature maps according to the required image super-division multiple to obtain super-resolution image blocks, and outputting the super-resolution image blocks to the image recombination module 4;
further preferably, the image reconstruction module 3 includes a multi-layer feature fusion unit 31, an upsampling unit 32 and an image reconstruction unit 33; the multi-layer feature fusion unit 31 is configured to perform convolution dimensionality reduction on feature maps of different depths corresponding to the image blocks after the feature maps are spliced on the feature channels, obtain a fused feature map, and output the fused feature map to the upsampling unit 32; the up-sampling unit 32 is configured to up-sample the fused feature map input by the multi-layer feature fusion unit according to the required image hyper-division multiple S, amplify the feature map by S times, and output the amplified feature map to the image reconstruction unit 33; the image reconstruction unit 33 is configured to reconstruct the amplified feature map input by the upsampling unit to obtain a super-resolution image block.
The image recombination module 4 is configured to recombine the super-resolution image blocks input by the image reconstruction module 3 to obtain a high-resolution image.
Specifically, in this embodiment, the low-resolution image to be processed is input into the image interception module 1 for blocking, and a plurality of image blocks of size 48 × 48 are obtained and output to the feature extraction module 2. The feature extraction module 2 comprises a shallow network unit 21 and a deep network unit 22: the shallow network unit 21 comprises one layer of convolutional neural network with convolution kernels of size 3 × 3, and the deep network unit 22 comprises 10 cascaded double-attention convolutional neural networks. An image block of size 48 × 48 first passes through the shallow network unit 21 to obtain a shallow feature map of size 48 × 48 × 64, and then passes through the deep network unit 22 to obtain the feature maps x1, x2, …, x10. The multi-layer feature fusion unit 31 in the image reconstruction module 3 fuses the obtained 10 feature maps into a fused feature map of size 48 × 48 × 64; the up-sampling unit 32 up-samples the fused feature map according to the required super-division multiple, and the image reconstruction unit 33 reconstructs the amplified feature map to obtain super-resolution image blocks. Finally, the image recombination module 4 recombines the super-resolution image blocks output by the image reconstruction module 3 to obtain the high-resolution image.
Example 3
A storage medium that, when read by a computer, causes the computer to execute the super-resolution reconstruction method of an image provided in embodiment 1 of the present invention.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (10)
1. A super-resolution reconstruction method of an image is characterized by comprising the following steps:
s1, partitioning the low-resolution image to be processed to obtain a plurality of image blocks;
s2, extracting the features of each image block at different depths based on the channel attention mechanism and the position attention mechanism of the image to obtain feature maps at different depths corresponding to each image block;
s3, fusing the feature maps of different depths corresponding to the image blocks, and performing image reconstruction on the fused feature maps according to the image super-resolution multiple to obtain super-resolution image blocks;
and S4, recombining the super-resolution image blocks to obtain a high-resolution image.
2. The method for super-resolution reconstruction of images according to claim 1, wherein said S2 comprises the steps of:
s21, extracting shallow features of the image block by adopting a convolutional neural network to obtain a shallow feature map;
s22, extracting the channel attention feature and the position attention feature of the shallow feature map at different depths respectively by adopting a trained deep feature extraction network, and fusing the channel attention feature and the position attention feature to obtain feature maps at different depths corresponding to the image block;
the deep feature extraction network comprises N cascaded attention model-based double-attention convolutional neural networks, the N cascaded attention model-based double-attention convolutional neural networks correspond to N depths one by one, and N is a positive integer greater than or equal to 2.
3. The super-resolution reconstruction method of images according to claim 2, wherein the dual attention convolutional neural network comprises a channel attention model, a position attention model and a fusion layer;
the output end of the channel attention model is connected with the input end of the fusion layer, and the output end of the position attention model is connected with the input end of the fusion layer;
the channel attention model is used for obtaining the initial weight of the channel attention by calculating the pixel average value of each channel of the input feature map based on a channel attention mechanism, adjusting the initial weight and performing point multiplication on the initial weight and the input feature map to obtain the channel attention feature of the input feature map;
the position attention model is used for calculating the average value of each channel pixel at each pixel position point of the input feature map based on a position attention mechanism to obtain a position attention initial weight, and after the position attention initial weight is adjusted, the position attention initial weight is subjected to point multiplication with the input feature map to obtain a position attention feature of the input feature map;
the fusion layer is used for fusing the obtained channel attention feature and the position attention feature.
4. The method for super-resolution reconstruction of images according to claim 3, wherein the channel attention initial weight of the c-th channel is:

v_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} X_c(i, j)

where c = 1, 2, …, L, L is the number of channels of the input feature map X, H is the height of the input feature map X, W is the width of the input feature map X, and X_c(i, j) is the feature value at the i-th row and j-th column of the c-th channel of the input feature map X.
5. The method for super-resolution reconstruction of images according to claim 3, wherein the position attention initial weight at the i-th row and j-th column is:

u_{i,j} = f\left( \frac{1}{L} \sum_{c=1}^{L} X_{i,j,c} \right)

where i = 1, 2, …, H, j = 1, 2, …, W, H is the height of the input feature map X, W is the width of the input feature map X, f(·) is the activation function, L is the number of channels of the input feature map X, and X_{i,j,c} is the feature value at the i-th row and j-th column of the c-th channel of the input feature map X.
6. The method for super-resolution reconstruction of images according to claim 1, wherein the method for fusing the feature maps of different depths corresponding to the image blocks in S3 includes: and after splicing the feature maps of different depths corresponding to the image blocks on the feature channels, performing convolution dimensionality reduction to obtain a fused feature map.
7. A super-resolution reconstruction system for images, comprising: the device comprises an image intercepting module, a feature extraction module, an image reconstruction module and an image recombination module;
the image intercepting module is used for partitioning the low-resolution image to be processed to obtain a plurality of image blocks and outputting the image blocks to the feature extraction module;
the feature extraction module is used for extracting features of each image block input by the image interception module at different depths based on a channel attention mechanism and a position attention mechanism of an image to obtain feature maps of different depths corresponding to each image block and outputting the feature maps to the image reconstruction module;
the image reconstruction module is used for fusing the feature maps of different depths corresponding to the image blocks, reconstructing the fused feature maps according to the required image super-division multiple to obtain super-resolution image blocks and outputting the super-resolution image blocks to the image recombination module;
the image recombination module is used for recombining the super-resolution image blocks input by the image reconstruction module to obtain a high-resolution image.
8. The system for super-resolution reconstruction of images according to claim 7, wherein the feature extraction module comprises a shallow network unit and a deep network unit;
the shallow network unit is used for extracting shallow features of the image blocks by adopting a convolutional neural network to obtain a shallow feature map and outputting the shallow feature map to the deep network unit;
the deep network unit adopts a trained deep feature extraction network to respectively extract the channel attention feature and the position attention feature of the shallow feature map at different depths, and the channel attention feature and the position attention feature are fused to obtain feature maps at different depths corresponding to the image blocks;
the deep feature extraction network comprises N cascaded attention model-based double-attention convolutional neural networks, the N cascaded attention model-based double-attention convolutional neural networks correspond to N depths one by one, and N is a positive integer greater than or equal to 2.
9. The system for super-resolution reconstruction of images according to claim 8, wherein the dual attention convolutional neural network comprises a channel attention model, a position attention model and a fusion layer;
the output end of the channel attention model is connected with the input end of the fusion layer, and the output end of the position attention model is connected with the input end of the fusion layer;
the channel attention model is used for obtaining the initial weight of the channel attention by calculating the pixel average value of each channel of the input feature map based on a channel attention mechanism, adjusting the initial weight and performing point multiplication on the initial weight and the input feature map to obtain the channel attention feature of the input feature map;
the position attention model is used for calculating the average value of each channel pixel at each pixel position point of the input feature map based on a position attention mechanism to obtain a position attention initial weight, and after the position attention initial weight is adjusted, the position attention initial weight is subjected to point multiplication with the input feature map to obtain a position attention feature of the input feature map;
the fusion layer is used for fusing the obtained channel attention feature and the position attention feature.
10. A storage medium, characterized in that, when instructions on the storage medium are read by a computer, the instructions cause the computer to execute the super-resolution reconstruction method of an image according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010051767.2A CN111461973A (en) | 2020-01-17 | 2020-01-17 | Super-resolution reconstruction method and system for image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111461973A true CN111461973A (en) | 2020-07-28 |
Family
ID=71683159
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010051767.2A Pending CN111461973A (en) | 2020-01-17 | 2020-01-17 | Super-resolution reconstruction method and system for image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111461973A (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112329867A (en) * | 2020-11-10 | 2021-02-05 | 宁波大学 | MRI image classification method based on task-driven hierarchical attention network |
CN112333456A (en) * | 2020-10-21 | 2021-02-05 | 鹏城实验室 | Live video transmission method based on cloud edge protocol |
CN112560662A (en) * | 2020-12-11 | 2021-03-26 | 湖北科技学院 | Occlusion image identification method based on multi-example attention mechanism |
CN112668619A (en) * | 2020-12-22 | 2021-04-16 | 万兴科技集团股份有限公司 | Image processing method, device, terminal and storage medium |
CN112712488A (en) * | 2020-12-25 | 2021-04-27 | 北京航空航天大学 | Remote sensing image super-resolution reconstruction method based on self-attention fusion |
CN113034642A (en) * | 2021-03-30 | 2021-06-25 | 推想医疗科技股份有限公司 | Image reconstruction method and device and training method and device of image reconstruction model |
CN113205005A (en) * | 2021-04-12 | 2021-08-03 | 武汉大学 | Low-illumination low-resolution face image reconstruction method |
CN113436054A (en) * | 2021-06-28 | 2021-09-24 | 合肥高维数据技术有限公司 | Super-resolution network-based image side information estimation steganography method and storage medium |
CN113706388A (en) * | 2021-09-24 | 2021-11-26 | 上海壁仞智能科技有限公司 | Image super-resolution reconstruction method and device |
CN114004784A (en) * | 2021-08-27 | 2022-02-01 | 西安市第三医院 | Method for detecting bone condition based on CT image and electronic equipment |
CN114547017A (en) * | 2022-04-27 | 2022-05-27 | 南京信息工程大学 | Meteorological big data fusion method based on deep learning |
CN114827482A (en) * | 2021-01-28 | 2022-07-29 | 北京字节跳动网络技术有限公司 | Image brightness adjusting method and device, electronic equipment and medium |
CN117727104A (en) * | 2024-02-18 | 2024-03-19 | 厦门瑞为信息技术有限公司 | Near infrared living body detection device and method based on bilateral attention |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108364023A (en) * | 2018-02-11 | 2018-08-03 | 北京达佳互联信息技术有限公司 | Image-recognizing method based on attention model and system |
CN110148091A (en) * | 2019-04-10 | 2019-08-20 | 深圳市未来媒体技术研究院 | Neural network model and image super-resolution method based on non local attention mechanism |
CN110532955A (en) * | 2019-08-30 | 2019-12-03 | 中国科学院宁波材料技术与工程研究所 | Example dividing method and device based on feature attention and son up-sampling |
- 2020-01-17 CN CN202010051767.2A patent/CN111461973A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108364023A (en) * | 2018-02-11 | 2018-08-03 | 北京达佳互联信息技术有限公司 | Image-recognizing method based on attention model and system |
CN110148091A (en) * | 2019-04-10 | 2019-08-20 | 深圳市未来媒体技术研究院 | Neural network model and image super-resolution method based on non local attention mechanism |
CN110532955A (en) * | 2019-08-30 | 2019-12-03 | 中国科学院宁波材料技术与工程研究所 | Example dividing method and device based on feature attention and son up-sampling |
Non-Patent Citations (4)
Title |
---|
CHENYANG DUAN,NANFENG XIAO: "Parallax-Based Spatial and Channel Attention for Stereo Image Super-Resolution", 《IEEE ACCESS》 * |
YANTING HU,JIE LI,YUANFEI HUANG,AND XINBO GAO: "Channel-Wise and Spatial Feature Modulation Network for Single Image Super-Resolution", 《HTTPS://ARXIV.ORG/ABS/1809.11130》 * |
YULUN ZHANG,KUNPENG LI,KAI LI,LICHEN WANG,BINENG ZHONG,YUN FU: "Image Super-Resolution Using Very Deep Residual Channel Attention Networks", 《HTTPS://ARXIV.ORG/ABS/1807.02758V2》 * |
TIAN XUAN, WANG LIANG, MENG XIANGGUANG: "Image Semantic Segmentation Technology Based on Deep Learning" *
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112333456B (en) * | 2020-10-21 | 2022-05-10 | 鹏城实验室 | Live video transmission method based on cloud edge protocol |
CN112333456A (en) * | 2020-10-21 | 2021-02-05 | 鹏城实验室 | Live video transmission method based on cloud edge protocol |
CN112329867A (en) * | 2020-11-10 | 2021-02-05 | 宁波大学 | MRI image classification method based on task-driven hierarchical attention network |
CN112560662A (en) * | 2020-12-11 | 2021-03-26 | 湖北科技学院 | Occlusion image identification method based on multi-example attention mechanism |
CN112668619A (en) * | 2020-12-22 | 2021-04-16 | 万兴科技集团股份有限公司 | Image processing method, device, terminal and storage medium |
CN112668619B (en) * | 2020-12-22 | 2024-04-16 | 万兴科技集团股份有限公司 | Image processing method, device, terminal and storage medium |
CN112712488A (en) * | 2020-12-25 | 2021-04-27 | 北京航空航天大学 | Remote sensing image super-resolution reconstruction method based on self-attention fusion |
CN112712488B (en) * | 2020-12-25 | 2022-11-15 | 北京航空航天大学 | Remote sensing image super-resolution reconstruction method based on self-attention fusion |
CN114827482A (en) * | 2021-01-28 | 2022-07-29 | 北京字节跳动网络技术有限公司 | Image brightness adjusting method and device, electronic equipment and medium |
CN114827482B (en) * | 2021-01-28 | 2023-11-03 | 抖音视界有限公司 | Image brightness adjusting method and device, electronic equipment and medium |
CN113034642A (en) * | 2021-03-30 | 2021-06-25 | 推想医疗科技股份有限公司 | Image reconstruction method and device and training method and device of image reconstruction model |
CN113034642B (en) * | 2021-03-30 | 2022-05-27 | 推想医疗科技股份有限公司 | Image reconstruction method and device and training method and device of image reconstruction model |
CN113205005B (en) * | 2021-04-12 | 2022-07-19 | 武汉大学 | Low-illumination low-resolution face image reconstruction method |
CN113205005A (en) * | 2021-04-12 | 2021-08-03 | 武汉大学 | Low-illumination low-resolution face image reconstruction method |
CN113436054A (en) * | 2021-06-28 | 2021-09-24 | 合肥高维数据技术有限公司 | Super-resolution network-based image side information estimation steganography method and storage medium |
CN114004784A (en) * | 2021-08-27 | 2022-02-01 | 西安市第三医院 | Method for detecting bone condition based on CT image and electronic equipment |
CN114004784B (en) * | 2021-08-27 | 2022-06-03 | 西安市第三医院 | Method for detecting bone condition based on CT image and electronic equipment |
CN113706388A (en) * | 2021-09-24 | 2021-11-26 | 上海壁仞智能科技有限公司 | Image super-resolution reconstruction method and device |
CN114547017B (en) * | 2022-04-27 | 2022-08-05 | 南京信息工程大学 | Meteorological big data fusion method based on deep learning |
CN114547017A (en) * | 2022-04-27 | 2022-05-27 | 南京信息工程大学 | Meteorological big data fusion method based on deep learning |
CN117727104A (en) * | 2024-02-18 | 2024-03-19 | 厦门瑞为信息技术有限公司 | Near infrared living body detection device and method based on bilateral attention |
CN117727104B (en) * | 2024-02-18 | 2024-05-07 | 厦门瑞为信息技术有限公司 | Near infrared living body detection device and method based on bilateral attention |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111461973A (en) | Super-resolution reconstruction method and system for image | |
CN109903223B (en) | Image super-resolution method based on dense connection network and generation type countermeasure network | |
CN110210539B (en) | RGB-T image saliency target detection method based on multi-level depth feature fusion | |
Liu et al. | A spectral grouping and attention-driven residual dense network for hyperspectral image super-resolution | |
CN111754438B (en) | Underwater image restoration model based on multi-branch gating fusion and restoration method thereof | |
CN111325751A (en) | CT image segmentation system based on attention convolution neural network | |
CN111275618A (en) | Depth map super-resolution reconstruction network construction method based on double-branch perception | |
CN111311518A (en) | Image denoising method and device based on multi-scale mixed attention residual error network | |
CN112149526B (en) | Lane line detection method and system based on long-distance information fusion | |
CN114332482A (en) | Lightweight target detection method based on feature fusion | |
CN115100039B (en) | Lightweight image super-resolution reconstruction method based on deep learning | |
CN112017116A (en) | Image super-resolution reconstruction network based on asymmetric convolution and construction method thereof | |
CN115660955A (en) | Super-resolution reconstruction model, method, equipment and storage medium for efficient multi-attention feature fusion | |
CN111951164A (en) | Image super-resolution reconstruction network structure and image reconstruction effect analysis method | |
CN114612306A (en) | Deep learning super-resolution method for crack detection | |
CN113239825A (en) | High-precision tobacco beetle detection method in complex scene | |
CN116703725A (en) | Method for realizing super resolution for real world text image by double branch network for sensing multiple characteristics | |
CN110163855B (en) | Color image quality evaluation method based on multi-path deep convolutional neural network | |
CN114926337A (en) | Single image super-resolution reconstruction method and system based on CNN and Transformer hybrid network | |
CN117237641A (en) | Polyp segmentation method and system based on dual-branch feature fusion network | |
CN115965844B (en) | Multi-focus image fusion method based on visual saliency priori knowledge | |
CN116188652A (en) | Face gray image coloring method based on double-scale circulation generation countermeasure | |
CN115908130A (en) | Super-resolution reconstruction method based on mixed attention mechanism | |
CN115797181A (en) | Image super-resolution reconstruction method for mine fuzzy environment | |
CN113191947B (en) | Image super-resolution method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20200728 |