CN113837940A - Image super-resolution reconstruction method and system based on dense residual error network
- Publication number: CN113837940A (application CN202111033944.5A)
- Authority
- CN
- China
- Prior art keywords
- network
- super
- resolution reconstruction
- image
- dense residual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
Abstract
The disclosure provides an image super-resolution reconstruction method and system based on a dense residual error network, comprising the following steps: acquiring image information; obtaining an image super-resolution reconstruction result according to the acquired image information and a preset image super-resolution reconstruction model. The image super-resolution reconstruction model is obtained by combining a dense residual error network with an improved Transformer network; in the improved Transformer network, the dot-product result of Query and Key is updated by continuously assigning new weights, yielding a weighted output. The disclosure provides a novel network combining a W-Transformer with a dense residual error network to improve image super-resolution, and this combination of two networks improves overall performance.
Description
Technical Field
The disclosure belongs to the technical field of computer vision, and particularly relates to an image super-resolution reconstruction method and system based on a dense residual error network.
Background
With the broad development of computer vision and continuous innovation in artificial intelligence, single-image super-resolution (SISR) technology has advanced rapidly. In 2006, Hinton et al. proposed the concept of deep learning, whose main idea is to extract high-level semantic features of images through multi-layer nonlinear transformations, emphasize learning the latent distribution of the data, and acquire the ability to judge and predict new data through continuous training. Deep learning thus took center stage in computing and, owing to its strong fitting and learning capability, quickly secured a place in computer vision fields such as image processing. Its rapid development has prompted more and more researchers to introduce deep learning into the super-resolution field.
Dong et al. first applied deep learning to single-image super-resolution (SISR) and proposed the SRCNN algorithm, which employs a fairly simple three-layer convolutional network yet improved markedly on prior methods in performance and computation speed. Following SRCNN, more and more deep-learning-based super-resolution algorithms emerged, such as DRCN, DRRN, LapSRN, SRResNet, and EDSR. However, these models still leave considerable room for improvement in computational complexity and performance.
The inventors of the present disclosure have found the following problem in existing image super-resolution reconstruction methods: after the input image passes through many layers, numerous key information features vanish or are filtered out before reaching the end of the network or the start of the next network, so the output image lacks many key feature points and ends up excessively blurred, or is even reconstructed incorrectly.
Disclosure of Invention
The present disclosure provides an image super-resolution reconstruction method and system based on a dense residual error network, and proposes a novel network combining a W-Transformer with the dense residual error network to improve image super-resolution.
To achieve this purpose, the present disclosure adopts the following technical solutions:
in a first aspect, the present disclosure provides an image super-resolution reconstruction method based on a dense residual error network, including:
acquiring image information:
obtaining an image super-resolution reconstruction result according to the acquired image information and a preset image super-resolution reconstruction model;
the image super-resolution reconstruction model is obtained by combining a dense residual error network with an improved Transformer network; in the improved Transformer network, the dot-product result of Query and Key is updated by continuously assigning new weights, and a weighted output is then obtained.
Furthermore, the dense residual error network adds a dense layer on the basis of the existing direct path, and applies a skip connection to each convolution block.
Furthermore, each convolution block generates a new branch path; each even-layer convolution block is cross-connected to the even-layer convolution blocks below it, and each odd-layer convolution block is cross-connected to the odd-layer convolution blocks below it.
Further, each branch adopts a nonlinear transformation, and a feature pyramid module is added in the branch connection between the first layer and the last layer.
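The parity-based cross-connection rule above can be sketched in plain Python; this is a minimal illustration of which layer pairs get linked, under the assumption that layers are numbered from 1 and every same-parity pair below a block is connected (the exact wiring in the patent's figures may differ):

```python
# Sketch of the parity-based cross connections: block i connects to
# every later block j of the same parity (even with even, odd with odd).
# Layer numbering from 1 is an assumption for illustration only.
def cross_connections(num_layers):
    return [
        (i, j)
        for i in range(1, num_layers + 1)
        for j in range(i + 2, num_layers + 1, 2)  # same parity, below i
    ]

pairs = cross_connections(6)
# the even-even connections are (2, 4), (2, 6), (4, 6)
```

With 6 layers this yields three even-even and three odd-odd links in addition to the direct path.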
Further, the improved Transformer network comprises an encoding layer and a decoding layer; the encoding layer comprises 6 encoders, and the decoding layer comprises 6 decoders.
Further, the encoder includes a weighted self-attention mechanism and a feed-forward neural network; the decoder includes a weighted self-attention mechanism, an encoding-decoding attention, and a feed-forward neural network.
Furthermore, the weighted self-attention mechanism is a weight-parameter distribution mechanism in which a weighting term is added: new weights are continuously assigned to update the dot-product result of Query and Key, giving a weighted output that contains all visible input-sequence information; a new vector matrix is then obtained through a normalized exponential function, and each Value vector is multiplied by the values in the new vector matrix.
In a second aspect, the present disclosure also provides an image super-resolution reconstruction system based on a dense residual error network, including a data collection module and a super-resolution reconstruction module;
the data collection module configured to: acquiring image information:
the super-resolution reconstruction module is configured to: obtaining an image super-resolution reconstruction result according to the acquired image information and a preset image super-resolution reconstruction model;
the image super-resolution reconstruction model is obtained by combining a dense residual error network with an improved Transformer network; in the improved Transformer network, the dot-product result of Query and Key is updated by continuously assigning new weights, and a weighted output is then obtained.
In a third aspect, the present disclosure also provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the dense residual network-based image super-resolution reconstruction method according to the first aspect.
In a fourth aspect, the present disclosure further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method for reconstructing image super resolution based on dense residual error network according to the first aspect when executing the program.
Compared with the prior art, the beneficial effects of the present disclosure are:
1. the present disclosure proposes a novel backbone network for super-resolution reconstruction; global blocks are added in a multi-branch serial block sequence and repeatedly skip and cross with the convolutions, which enhances the extraction of key semantic features, and the disclosed backbone network shows excellent peak signal-to-noise ratio (PSNR) performance when processing low-resolution images with complex scenes;
2. the method adopts the improved Transformer to compensate for the shortcomings of convolutional neural networks in parallel computing and feature extraction, and finds deep feature correspondences through the proposed Weighted Self-Attention mechanism, so the image better recovers features such as texture and detail and can be output at higher resolution; subsequent experiments show that the W-Transformer (Weighted Self-Attention Transformer) improved in this disclosure achieves notable gains in both quantitative and qualitative evaluation of images;
3. the novel network in this disclosure (W-Trans DRN), combining a W-Transformer with a dense residual error network (DRN), has excellent network performance and faster processing speed; experimental results show that the disclosed algorithm outperforms existing advanced super-resolution algorithms such as EDSR, OISR-RK3, MDCN, LW-CSC, and IPT in processing speed and network performance.
Drawings
The accompanying drawings, which form a part hereof, are included to provide a further understanding of the present embodiments, and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the present embodiments and together with the description serve to explain the present embodiments without unduly limiting the present embodiments.
Fig. 1 is a detailed flow chart of example 1 of the present disclosure;
FIG. 2 is an overall network framework of embodiment 1 of the present disclosure;
fig. 3 is a block diagram of a dense residual error network of embodiment 1 of the present disclosure;
FIG. 4 is a global block detail diagram of embodiment 1 of the present disclosure;
FIG. 5 is a W-Transformer framework diagram of example 1 of the present disclosure;
FIG. 6 is a detailed view of a weighted self-attention module of embodiment 1 of the present disclosure;
FIG. 7 is a visual comparison graph of the network performance of 5 models according to embodiment 1 of the present disclosure;
FIG. 8 is a comparison example one of images of W-Trans DRN of example 1 of the present disclosure and other 6 advanced algorithms;
FIG. 9 is a second example of image comparison of W-Trans DRN of example 1 of the present disclosure with 6 other advanced algorithms;
FIG. 10 is a third example of image comparison of W-Trans DRN of example 1 of the present disclosure with 6 other advanced algorithms;
FIG. 11 is a comparison example four of images of W-Trans DRN of example 1 of the present disclosure and 6 other advanced algorithms;
FIG. 12 is a graph of parameters versus peak signal-to-noise ratio (PSNR) for example 1 of the present disclosure;
FIG. 13 is a PSNR versus model runtime reference for embodiment 1 of the present disclosure;
fig. 14 is a graph comparing the network performance of W-Trans DRN of example 1 of the present disclosure and three other models at different data volumes.
Detailed Description:
the present disclosure is further described with reference to the following drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
The super-resolution model is an important basis for studying image super-resolution. Although the super-resolution models mentioned in the background achieve partial success, a single convolutional neural network has obvious deficits in parallel computing and feature extraction. The network model combining the W-Transformer with the dense residual error network proposed in this disclosure compensates for the limited parallelism of a single convolutional neural network, improves the extraction of key information features, and expresses network performance better. With this model, the present disclosure addresses the following three technical questions:
1. whether the network model greatly improves the resolution of the output image;
2. whether combining the Transformer and the convolutional neural network yields a good balance between running time and network performance;
3. whether the network model in the present disclosure has better image processing capability when a large amount of pre-training data is available.
Example 1:
as shown in fig. 1, the present embodiment provides an image super-resolution reconstruction method based on a dense residual network, including:
acquiring image information:
obtaining an image super-resolution reconstruction result according to the acquired image information and a preset image super-resolution reconstruction model;
the image super-resolution reconstruction model is obtained by combining a dense residual error network with an improved Transformer network; in the improved Transformer network, the dot-product result of Query and Key is updated by continuously assigning new weights, and a weighted output is then obtained.
Specifically, this embodiment first uses a dense residual error network (DRN) as the backbone, aiming to increase network depth by adding branch layers and thereby strengthen the information and gradient flow of the network; the DRN maintains direct connections while adding branch connections, further enhancing the extraction of key information features and avoiding the "information loss or filtering" caused by an overly deep network. Second, this embodiment proposes a new Weighted Self-Attention mechanism, which updates the dot-product result of Query and Key by continuously assigning new weights (Weight) to obtain a better weighted output; the mechanism attends to key image information features by continuously updating the weights, thereby weakening irrelevant features, and it helps further improve the modeling capability of self-attention and enlarge the receptive field. Finally, this embodiment combines the improved Transformer (W-Transformer) with the DRN to address the parallel-computing and feature-extraction limitations of convolutional neural networks; the W-Transformer can mine long-range dependencies and is not limited by inductive bias, so it is highly expressive. Experiments further show that the network structure proposed in this embodiment achieves better PSNR (peak signal-to-noise ratio)/SSIM (structural similarity) performance and faster convergence.
As shown in fig. 2, the overall network framework is introduced in detail; this embodiment combines a residual network with a Transformer to further improve image super-resolution. The dense residual error network (DRN) proposed in this embodiment adopts a new form of the traditional residual network: on the basis of the existing direct path it adds branch connections and applies skip connections to each convolution block, so as to better extract the semantic features of interest and avoid key information being filtered or lost during extraction. Because the parallel-computing and feature-extraction capabilities of a residual network are weak, this embodiment combines the DRN with a Transformer; and to better improve network performance and obtain higher-quality images, a new form, the W-Transformer, is proposed, which adopts a Weighted Self-Attention mechanism on the original basis and performs well in extracting information features such as image texture.
As shown in fig. 3, the design of the dense residual error network is described in detail. To improve the extraction of attended features, this embodiment adopts a new connection form: on the basis of a traditional residual network (ResNet) and its original direct connections, each convolution block generates a new branch path; each even-layer convolution block is cross-connected to the even-layer convolution blocks below it, and each odd-layer convolution block is cross-connected to the odd-layer convolution blocks below it. The dense residual error network (DRN) employs M convolutional layers, each containing a 3×3 convolution block, and each layer generates one branch, M branches in total: the even layers generate the even branches and the odd layers generate the odd branches. Each branch carries a 1×1 convolution block and a Global Block, as shown in fig. 4. Each branch applies a nonlinear transformation H_M(·), where H_M(·) is a composite function performing batch normalization, rectified linear units, pooling, or convolution. Finally, the M-th layer receives the feature maps of all previous layers (x_0, …, x_{M-1}), namely:
x_M = H_M([x_0, …, x_{M-1}])    (1)
where [x_0, …, x_{M-1}] denotes the concatenation of the feature maps of branches 0 through M-1; the M-th layer denotes the M-th convolution; and H_M is the composite function used by the M-th convolution (comprising batch normalization, rectified linear units, pooling, convolution, etc.).
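The cascade of equation (1) can be sketched in numpy. The random 1×1 channel-mixing matrix plus ReLU below is an illustrative stand-in for the composite function H_M (batch normalization, pooling, etc. are omitted), not the patent's actual implementation:

```python
import numpy as np

# Dense cascade x_M = H_M([x_0, ..., x_{M-1}]): each layer consumes the
# channel-wise concatenation of all previous feature maps. H_M is reduced
# here to a random 1x1 channel-mixing matrix plus ReLU for illustration.
def dense_cascade(x0, num_layers, growth):
    rng = np.random.default_rng(0)
    feats = [x0]                                   # x_0 has shape (H, W, C0)
    for _ in range(num_layers):
        cat = np.concatenate(feats, axis=-1)       # [x_0, ..., x_{m-1}]
        w = rng.standard_normal((cat.shape[-1], growth)) * 0.1
        feats.append(np.maximum(cat @ w, 0.0))     # stand-in for H_m
    return feats

feats = dense_cascade(np.ones((8, 8, 4)), num_layers=3, growth=4)
# layer m consumes 4 + 4*(m-1) input channels; spatial size is preserved
```

The growing channel count at each layer is exactly what makes the later layers see the feature maps of all earlier branches.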
In addition, in the branch connection between the first layer and the last layer, in order to weaken the position Feature and enhance the semantic Feature, in this embodiment, a Feature Pyramid module (Feature Pyramid network: FPN) is added on the branch, and the low-level Feature and the high-level Feature are fused, so as to reduce the loss of the attention information, thereby obtaining a more accurate information Feature and improving the output capability of the high-quality image.
As shown in fig. 5, the improved Transformer, i.e. the W-Transformer, is introduced in detail. This embodiment builds the overall network from one W-Transformer and a dense residual error network (DRN). The encoding layer (Encoder Block) in the W-Transformer is a stack of 6 encoders, the decoding layer (Decoder Block) is a stack of 6 decoders, every encoder-decoder pair has the same block structure, and fig. 5 shows one encoder-decoder. The main flow of the W-Transformer is: input - encoding - decoding - output. Before using the W-Transformer, each information point on the image must be embedded into a vector of model dimension d_model = 512. The encoder consists mainly of 2 parts: a weighted self-attention mechanism (Weighted Self-Attention) and a feed-forward neural network (Feed Forward). The decoder consists mainly of 3 parts: a weighted self-attention mechanism, an encoding-decoding attention (Encoder-Decoder Attention), and a feed-forward neural network. Unlike the encoder, the decoder adds the encoding-decoding attention, whose main purpose is to help the current node acquire the key semantic features that currently require attention. After obtaining Z_1, it is not passed directly into the feed-forward neural network but first through a normalization layer, to prevent the degeneration problem in deep neural network training. A multilayer perceptron (MLP) is then added: Query and Key are concatenated and passed through a fully connected layer whose activation function is the hyperbolic tangent (tanh), and the result is multiplied by a weight matrix (formula 2), so as to better complete the channel-information interaction task.
a = W_2^T · tanh(W_1 · [Q; K])    (2)
where a is the output vector matrix; Q is the vector Query in the weighted self-attention mechanism; K is the vector Key in the weighted self-attention mechanism; W_2 is a weight matrix; W_1 is a weight of the weighted self-attention mechanism; and (·)^T denotes the transpose.
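The encoder/decoder composition described above can be outlined structurally; the component names below are illustrative labels, not identifiers from the patent:

```python
# Structural sketch of the W-Transformer stack: 6 encoders (weighted
# self-attention + feed-forward) and 6 decoders (weighted self-attention
# + encoder-decoder attention + feed-forward). Names are illustrative.
ENCODER = ("weighted_self_attention", "feed_forward")
DECODER = ("weighted_self_attention", "encoder_decoder_attention", "feed_forward")

def build_w_transformer(depth=6):
    return {
        "encoder_block": [ENCODER] * depth,
        "decoder_block": [DECODER] * depth,
    }

model = build_w_transformer()
# only the decoder carries the extra encoder-decoder attention sublayer
```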
In fig. 5, position information is added to remedy the weighted self-attention mechanism's inability to directly acquire positions. Sinusoidal position codes are used, generated by sine and cosine functions of different frequencies:
PE(pos, 2i) = sin(pos / 10000^(2i/d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
where i is the dimension index of the feature-point vector and pos is the absolute position of the image feature point in the image.
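The sinusoidal position code can be computed directly from the two formulas above; a minimal numpy sketch with the d_model = 512 used in this embodiment:

```python
import numpy as np

def sinusoidal_pe(num_pos, d_model=512):
    # PE(pos, 2i)   = sin(pos / 10000^(2i/d_model))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
    pos = np.arange(num_pos)[:, None]            # absolute positions
    i = np.arange(0, d_model, 2)[None, :]        # even dimension indices
    angle = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((num_pos, d_model))
    pe[:, 0::2] = np.sin(angle)                  # even dims: sine
    pe[:, 1::2] = np.cos(angle)                  # odd dims: cosine
    return pe

pe = sinusoidal_pe(10)
# pe[0] alternates 0, 1, 0, 1, ... since sin(0) = 0 and cos(0) = 1
```

Each position gets a unique, deterministic vector, so no extra parameters need to be learned for position.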
FIG. 6 is a detailed diagram of the Weighted Self-Attention mechanism. The self-attention mechanism is mainly a weight-parameter allocation mechanism, intended to help the network model capture attention information. Weighted self-attention introduces three elements: Query, Key, and Value. The similarity between the Query and each group of Keys is computed to obtain a weight coefficient for each Key, and the Values are then weighted and summed to obtain the final output. Here d_q = d_v = d_k = d_model = 512.
Namely: suppose X ∈ R^(n×d) is the feature of an input sample sequence, where n is the number of input samples (sequence length), d is the single-sample dimension, and L_x is the length of the Source. Query, Key, and Value are defined as linear projections of X:
Q = X·W^Q,  K = X·W^K,  V = X·W^V
where W^Q, W^K, and W^V are learned projection matrices.
In fig. 6, this embodiment adds a Weighted Block: the dot-product result of Query and Key is updated by continuously assigning a new weight (Weight) to obtain a better weighted output, which contains all visible input-sequence information; a new vector matrix L is then obtained through the normalized exponential function (softmax), and each Value vector is multiplied by the values in L. The aim is to attend to key image information features by continuously updating the weight, thereby weakening irrelevant features. Finally, the results are summed and output.
Step 1: compute the score values between the different input vectors, namely:
S = Q·K^T
where S denotes the Scores, i.e. the vector matrix of scores.
Step 2: normalize the Scores matrix so that gradients remain stable, namely:
S_n = S / √(d_k)
where S_n denotes the normalized n-dimensional score matrix.
Step 3: process further with the softmax function, namely:
P = softmax(S_n)
where P denotes the vector values after softmax processing.
Step 4: finally obtain the weighted value matrix, namely:
Z = V·P
where Z denotes the output matrix of weighted values.
The overall formula for this process is as follows:
Z = V · softmax(Q·K^T / √(d_k))
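Steps 1-4 can be traced end-to-end in numpy. The re-weighting performed by the Weighted Block is modeled here as a hypothetical element-wise weight matrix W applied to the scores, since the patent describes iteratively re-weighting the Q·K^T result; with row-vector layout the final product is computed as P·V:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Steps 1-4 of weighted self-attention. W is an illustrative stand-in
# for the Weighted Block's learned re-weighting of the score matrix.
def weighted_self_attention(Q, K, V, W=None):
    d_k = K.shape[-1]
    S = Q @ K.T                        # Step 1: scores S = Q . K^T
    if W is not None:
        S = S * W                      # Weighted Block: re-weight scores
    S_n = S / np.sqrt(d_k)             # Step 2: scale for stable gradients
    P = softmax(S_n)                   # Step 3: softmax normalization
    return P @ V                       # Step 4: weighted value matrix Z

n, d = 4, 8
rng = np.random.default_rng(1)
Q = rng.standard_normal((n, d))
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d))
Z = weighted_self_attention(Q, K, V, W=np.ones((n, n)))
# each row of softmax(S_n) sums to 1, so each row of Z is a convex
# combination of the rows of V
```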
To further demonstrate that the algorithm proposed in this embodiment (W-Trans DRN) has better network performance and excellent image processing capability in super-resolution applications, the following five experiments were conducted; comparisons of experimental data and visual comparisons of images show the superiority and competitiveness of the algorithm.
1. As shown in fig. 7, the existing four algorithms are compared with the algorithm in the present embodiment for visualization of network performance:
Fig. 7 shows a visual comparison of the W-Trans DRN model and the dense residual network (DRN) model of this example with three other algorithms (VDSR, LapSRN, and IPT) on the BSDS100 data set at a scale factor of 2. Fig. 7 further illustrates: (1) the dense residual error network (DRN) proposed in this embodiment is clearly superior in PSNR/SSIM to the traditional ResNet-style networks (VDSR and LapSRN); (2) comparing DRN with the IPT model shows no obvious PSNR gain (only 0.01 dB higher), although the SSIM is clearly better than IPT's, indicating that a single DRN still lacks excellent performance expressiveness; (3) combining the W-Transformer structure with the DRN, i.e. W-Trans DRN, clearly improves PSNR/SSIM over VDSR, LapSRN, and IPT, showing that W-Trans DRN has higher network performance.
2. Data comparison experiments of the existing 11 algorithms and the algorithm in the embodiment are as follows:
this example and 11 existing advanced algorithms: LapSRN, EDSR, SRMDNF, MSRN, MRFN, OISR-RK3, SeaNet, MDCN, MSCN, LW-CSC, IPT were compared experimentally, and in this section, the network performance was comprehensively evaluated by using two evaluation indexes of PSNR and SSIM in the present example, as shown in Table 1:
table 1: comparative data plot of algorithm proposed by TABLE I and existing advanced algorithm
Summarizing the experimental data yields Table 1, in which all suboptimal results are underlined. As can be seen from Table 1: at a scale factor of 2, on the four data sets SET5, SET14, BSDS100, and Urban100, the PSNR values of W-Trans DRN exceed the suboptimal results by 0.51 dB, 1.01 dB, 1.13 dB, and 1.12 dB respectively, and the SSIM values of W-Trans DRN are higher by 0.009, 0.0094, 0.0104, and 0.0047 respectively; at a scale factor of 3, on the same four data sets, the PSNR values exceed the suboptimal results by 0.3 dB, 0.48 dB, 0.84 dB, and 0.49 dB, and the SSIM values are higher by 0.0014, 0.0096, 0.0098, and 0.0112; at a scale factor of 4, the PSNR values exceed the suboptimal results by 1.37 dB, 1.1 dB, 1.17 dB, and 1.27 dB, and the SSIM values are higher by 0.0130, 0.0023, 0.0116, and 0.0163 (SSIM, being a structural index, is dimensionless).
in summary, the W-Trans DRN algorithm proposed in this embodiment achieves better results when the scale factors are 2, 3, and 4.
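The PSNR gains quoted above follow from the standard definition of peak signal-to-noise ratio. The following is a minimal NumPy sketch of that metric; the helper name `psnr` and the toy images are illustrative and do not come from the patent (SSIM additionally requires windowed luminance/contrast/structure statistics and is omitted here):

```python
import numpy as np

def psnr(reference, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio between two images, in dB (higher is better)."""
    diff = reference.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)

# A uniform error of one gray level gives MSE = 1, hence PSNR = 20*log10(255).
ref = np.full((8, 8), 100, dtype=np.uint8)
rec = np.full((8, 8), 101, dtype=np.uint8)
print(round(psnr(ref, rec), 2))  # → 48.13
```

In practice, a 1 dB PSNR gap such as those reported in Table 1 corresponds to roughly a 20% reduction in mean squared error.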
Figs. 8, 9, 10, and 11 show images reconstructed by the algorithm model proposed in this embodiment and six other advanced algorithms on the BSDS100 and Urban100 benchmark data sets at scale factors of 2 and 4. As can be seen from Figs. 8 to 11, the method of this embodiment improves the detail definition of the image compared with the other methods, further indicating that the W-Trans DRN algorithm is highly reliable in recovering complex textures from low-resolution images.
3. Model parameter comparative analysis experiment:
Fig. 12 shows the balance achieved by this embodiment between performance and model size. With a scale factor of 2 on the BSDS100 benchmark data set, the algorithm proposed in this embodiment was tested against six other algorithms (LapSRN, DRCN, MDSR, EDSR, OISR-RK3, and IPT) on the same configuration. As can be seen from the figure, although the proposed algorithm has a higher parameter count than LapSRN, DRCN, and MDSR, its network performance is clearly better than these three algorithms (PSNR of W-Trans DRN = 33.61 dB); compared with EDSR, OISR-RK3, and IPT, the proposed algorithm is superior in both parameter count and network performance. Therefore, the algorithm of this embodiment achieves a better balance between network performance and parameter count.
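The parameter counts compared in Fig. 12 are just sums over the layers of each network. As a hedged illustration, the sketch below counts the weights of a toy three-layer convolutional head; the channel widths and kernel sizes are assumptions chosen for illustration and are not taken from the patent:

```python
def conv2d_params(in_ch, out_ch, k, bias=True):
    """Parameters in a k x k convolution: out_ch filters of size in_ch*k*k, plus biases."""
    return out_ch * (in_ch * k * k + (1 if bias else 0))

# Toy 3-layer SR head: RGB in, 64 feature channels, RGB out (illustrative only).
layers = [(3, 64, 3), (64, 64, 3), (64, 3, 3)]
total = sum(conv2d_params(i, o, k) for i, o, k in layers)
print(total)  # → 40451
```

Summing such per-layer counts over a full DRN or Transformer backbone yields the model sizes plotted on the horizontal axis of a figure like Fig. 12.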
4. Model run time comparative analysis experiment:
Fig. 13 shows the trade-off between performance and runtime. In the experiment, the Set14 benchmark data set with a scale factor of 4 was used for test comparison on the same configuration. The algorithms involved in the comparison are: LapSRN, LW-CSC, MDCN, OISR-RK3, EDSR, IPT, and W-Trans DRN (ours). As can be seen from the figure, LapSRN and LW-CSC are faster than the algorithm of this embodiment, but their PSNR values are significantly lower than that of W-Trans DRN (PSNR of W-Trans DRN = 30.11 dB); compared with the other four algorithms, the algorithm of this embodiment is superior in both runtime and network performance. Therefore, the proposed algorithm shows a better balance between network performance and runtime.
5. Data percentage analysis experiment:
In this embodiment, an experiment on the influence of the data percentage was performed to evaluate the effectiveness of the W-Transformer architecture, and the performance gains from pre-training were analyzed for the CNN-based models (EDSR, LW-CSC), the Transformer-based model (IPT), and the Trans-CNN model (W-Trans DRN). In this section, the impact of the amount of data was analyzed using 20%, 40%, 60%, 80%, and 100% of the ImageNet data set, and the experimental results are shown in Fig. 14. As can be seen from Fig. 14:
(1) when the models are not pre-trained or only lightly pre-trained (< 60% of the data), the CNN-based models show better network performance than the Transformer-based and Trans-CNN-based models;
(2) when the amount of training data increases (> 60%), the performance of the Transformer-based model and the Trans-CNN-based model (W-Trans DRN) is clearly better than that of the CNN-based models;
(3) compared with the pure Transformer model (IPT), the model combining the Transformer and the CNN, namely the W-Trans DRN, achieves better network performance (PSNR); and when the amount of training data exceeds 50%, the W-Trans DRN clearly surpasses the CNN-based models, showing superior network performance.
Example 2:
This embodiment provides an image super-resolution reconstruction system based on a dense residual network, comprising a data collection module and a super-resolution reconstruction module;
the data collection module is configured to acquire image information;
the super-resolution reconstruction module is configured to obtain an image super-resolution reconstruction result according to the acquired image information and a preset image super-resolution reconstruction model;
the image super-resolution reconstruction model is obtained by combining a dense residual network with an improved Transformer network; in the improved Transformer network, the dot-product result of Query and Key is updated by continuously assigning new weights, so as to obtain a weighted output.
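The weighted self-attention described above can be sketched in NumPy under the assumption that the "new weight" is an element-wise weight matrix applied to the Query-Key score matrix before the normalized exponential function (softmax). This reading of the description, and every name and dimension below, are illustrative assumptions rather than the patent's exact formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable normalized exponential function.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def weighted_self_attention(Q, K, V, W):
    """Scaled dot-product attention with an extra weight matrix W applied
    to the Query-Key score matrix, then softmax, then weighting of Values."""
    d_k = Q.shape[-1]
    scores = (Q @ K.T) / np.sqrt(d_k)  # dot-product of Query and Key
    scores = scores * W                # re-weight the score matrix
    attn = softmax(scores, axis=-1)    # new vector matrix via softmax
    return attn @ V                    # multiply each Value vector by its weight

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
W = np.ones((4, 4))  # an all-ones W recovers ordinary scaled dot-product attention
out = weighted_self_attention(Q, K, V, W)
print(out.shape)  # → (4, 8)
```

With W set to all ones the sketch reduces to standard attention, so the weighting term is the only departure from the vanilla Transformer encoder block.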
Example 3:
This embodiment provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the image super-resolution reconstruction method based on a dense residual network described in Embodiment 1 are implemented.
Example 4:
This embodiment provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the steps of the image super-resolution reconstruction method based on a dense residual network described in Embodiment 1 are implemented.
The above description covers only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art can make various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principle of these embodiments shall fall within their protection scope.
Claims (10)
1. An image super-resolution reconstruction method based on a dense residual network, characterized by comprising the following steps:
acquiring image information;
obtaining an image super-resolution reconstruction result according to the acquired image information and a preset image super-resolution reconstruction model;
wherein the image super-resolution reconstruction model is obtained by combining a dense residual network with an improved Transformer network; in the improved Transformer network, the dot-product result of Query and Key is updated by continuously assigning new weights, so as to obtain a weighted output.
2. The image super-resolution reconstruction method based on a dense residual network according to claim 1, wherein the dense residual network adds dense layers on the basis of the existing direct path, and the convolution blocks are connected by skip connections.
3. The image super-resolution reconstruction method based on a dense residual network according to claim 2, wherein each convolution block generates a new branch path; each even-layer convolution block is cross-connected with the even-layer convolution blocks below its position, and each odd-layer convolution block is cross-connected with the odd-layer convolution blocks below its position.
4. The image super-resolution reconstruction method based on a dense residual network according to claim 3, wherein each branch adopts a nonlinear transformation, and a feature pyramid module is added in the branch connection between the first layer and the last layer.
5. The image super-resolution reconstruction method based on a dense residual network according to claim 1, wherein the improved Transformer network comprises an encoding layer and a decoding layer, the encoding layer comprising 6 encoders and the decoding layer comprising 6 decoders.
6. The image super-resolution reconstruction method based on a dense residual network according to claim 5, wherein the encoder comprises a weighted self-attention mechanism and a feed-forward neural network, and the decoder comprises a weighted self-attention mechanism, an encoding-decoding attention, and a feed-forward neural network.
7. The image super-resolution reconstruction method based on a dense residual network according to claim 6, wherein the weighted self-attention mechanism is a mechanism for distributing weight parameters: a weighting term is added, and new weights are continuously assigned to update the dot-product result of Query and Key, so as to obtain a weighted output containing all visible input-sequence information; a new vector matrix is then obtained through a normalized exponential function, and each Value vector is multiplied by the corresponding value in the new vector matrix.
8. An image super-resolution reconstruction system based on a dense residual network, characterized by comprising a data collection module and a super-resolution reconstruction module;
the data collection module is configured to acquire image information;
the super-resolution reconstruction module is configured to obtain an image super-resolution reconstruction result according to the acquired image information and a preset image super-resolution reconstruction model;
wherein the image super-resolution reconstruction model is obtained by combining a dense residual network with an improved Transformer network; in the improved Transformer network, the dot-product result of Query and Key is updated by continuously assigning new weights, so as to obtain a weighted output.
9. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the image super-resolution reconstruction method based on a dense residual network according to any one of claims 1 to 7.
10. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the image super-resolution reconstruction method based on a dense residual network according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111033944.5A CN113837940A (en) | 2021-09-03 | 2021-09-03 | Image super-resolution reconstruction method and system based on dense residual error network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113837940A true CN113837940A (en) | 2021-12-24 |
Family
ID=78962098
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111033944.5A Pending CN113837940A (en) | 2021-09-03 | 2021-09-03 | Image super-resolution reconstruction method and system based on dense residual error network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113837940A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111192200A (en) * | 2020-01-02 | 2020-05-22 | 南京邮电大学 | Image super-resolution reconstruction method based on fusion attention mechanism residual error network |
CN112950475A (en) * | 2021-03-05 | 2021-06-11 | 北京工业大学 | Light field super-resolution reconstruction method based on residual learning and spatial transformation network |
Non-Patent Citations (2)
Title |
---|
OU, J., XIA, H., HUO, W. ET AL: "Single-image super-resolution based on multi-branch residual pyramid network", JOURNAL OF REAL-TIME IMAGE PROCESSING, vol. 18, 20 July 2021 (2021-07-20), pages 1 - 13 * |
马上科普尚尚: "CVPR 2020丨图像超清化+老照片修复技术,拯救你所有的模糊、破损照片", pages 1 - 47, Retrieved from the Internet <URL:http://cloud.tencent.com/developer/article/1652684> * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114581864A (en) * | 2022-03-04 | 2022-06-03 | 哈尔滨工程大学 | Transformer-based dynamic dense alignment vehicle weight identification technology |
CN114581864B (en) * | 2022-03-04 | 2023-04-18 | 哈尔滨工程大学 | Transformer-based dynamic dense alignment vehicle weight identification technology |
CN114331849A (en) * | 2022-03-15 | 2022-04-12 | 之江实验室 | Cross-mode nuclear magnetic resonance hyper-resolution network and image super-resolution method |
CN115205117A (en) * | 2022-07-04 | 2022-10-18 | 中国电信股份有限公司 | Image reconstruction method and device, computer storage medium and electronic equipment |
CN115205117B (en) * | 2022-07-04 | 2024-03-08 | 中国电信股份有限公司 | Image reconstruction method and device, computer storage medium and electronic equipment |
CN116721018A (en) * | 2023-08-09 | 2023-09-08 | 中国电子科技集团公司第十五研究所 | Image super-resolution reconstruction method for generating countermeasure network based on intensive residual error connection |
CN116721018B (en) * | 2023-08-09 | 2023-11-28 | 中国电子科技集团公司第十五研究所 | Image super-resolution reconstruction method for generating countermeasure network based on intensive residual error connection |
CN117522682A (en) * | 2023-12-04 | 2024-02-06 | 无锡日联科技股份有限公司 | Method, device, equipment and medium for reconstructing resolution of radiographic image |
CN117934338A (en) * | 2024-03-22 | 2024-04-26 | 四川轻化工大学 | Image restoration method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113837940A (en) | Image super-resolution reconstruction method and system based on dense residual error network | |
CN109977212B (en) | Reply content generation method of conversation robot and terminal equipment | |
Zhang et al. | Adaptive residual networks for high-quality image restoration | |
CN113159173A (en) | Convolutional neural network model compression method combining pruning and knowledge distillation | |
CN117475038B (en) | Image generation method, device, equipment and computer readable storage medium | |
CN112561028A (en) | Method for training neural network model, and method and device for data processing | |
CN110781893A (en) | Feature map processing method, image processing method, device and storage medium | |
CN110753225A (en) | Video compression method and device and terminal equipment | |
CN108197707A (en) | Compression method based on the convolutional neural networks that global error is rebuild | |
CN110516724A (en) | Visualize the high-performance multilayer dictionary learning characteristic image processing method of operation scene | |
CN115880317A (en) | Medical image segmentation method based on multi-branch feature fusion refining | |
Wang et al. | Reliable identification of redundant kernels for convolutional neural network compression | |
CN114283352A (en) | Video semantic segmentation device, training method and video semantic segmentation method | |
CN113222998A (en) | Semi-supervised image semantic segmentation method and device based on self-supervised low-rank network | |
CN111582091A (en) | Pedestrian identification method based on multi-branch convolutional neural network | |
CN115439367A (en) | Image enhancement method and device, electronic equipment and storage medium | |
CN116016953A (en) | Dynamic point cloud attribute compression method based on depth entropy coding | |
CN113379606B (en) | Face super-resolution method based on pre-training generation model | |
CN113424200A (en) | Methods, apparatuses and computer program products for video encoding and video decoding | |
Yu et al. | Kernel quantization for efficient network compression | |
CN112149803A (en) | Channel pruning method suitable for deep neural network | |
CN116663642A (en) | Model compression method, electronic device and storage medium | |
CN116187401A (en) | Compression method and device for neural network, electronic equipment and storage medium | |
AU2021104479A4 (en) | Text recognition method and system based on decoupled attention mechanism | |
CN115035170A (en) | Image restoration method based on global texture and structure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||