WO2023040108A1 - An image super-resolution magnification model and method therefor - Google Patents

An image super-resolution magnification model and method therefor

Info

Publication number
WO2023040108A1
Authority
WO
WIPO (PCT)
Prior art keywords
resolution
feature
low
image
glrffb
Prior art date
Application number
PCT/CN2021/140258
Other languages
English (en)
French (fr)
Inventor
端木春江
陈诗婷
贺林英
Original Assignee
Zhejiang Normal University (浙江师范大学)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Normal University (浙江师范大学)
Publication of WO2023040108A1 publication Critical patent/WO2023040108A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present invention relates to the technical field of image processing, and more specifically relates to an image super-resolution magnification model and a method thereof.
  • the first type is interpolation-based methods
  • the second type is example-based methods
  • the third type is neural-network-based methods.
  • At present, neural-network-based methods outperform both interpolation-based and example-based methods.
  • the present invention provides an image super-resolution magnification model and method thereof, which can completely and accurately magnify and reconstruct an image.
  • An image super-resolution magnification model, including: a shallow feature extraction module F SF , a multi-level low- and high-resolution feature extraction module F DF , a global multi-level low-resolution feature fusion module F GLRFFB , a global multi-level high-resolution feature fusion module F GHRFFB , and an image reconstruction module F REC ;
  • the shallow feature extraction module F SF is used for performing shallow feature extraction on the input low-resolution image I LR to obtain a shallow feature map H 0 ;
  • the multi-level low- and high-resolution feature extraction module F DF includes M densely connected iterative up-down sampling distillation blocks (IUDDB), which sequentially perform M levels of low-resolution and high-resolution feature extraction, obtaining the low-resolution feature map H DF-L and the high-resolution feature map H DF-H , wherein the input of every IUDDB after the first is the concatenation of the outputs of all preceding IUDDBs;
  • the global multi-level low-resolution feature fusion module F GLRFFB is used to receive M HDF-Ls and perform feature fusion to obtain a fused low-resolution feature map H GLRFFB ;
  • the global multi-level high-resolution feature fusion module F GHRFFB is used to receive M HDF-Hs and perform feature fusion to obtain a fused high-resolution feature map H GHRFFB ;
  • the image reconstruction module F REC is configured to receive the H GLRFFB and the H GHRFFB and generate a super-resolution enlarged image I SR .
  • the shallow feature extraction module F SF uses a convolutional layer to extract a shallow feature map H 0 from the input low-resolution image I LR .
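As a concrete illustration of what this single convolutional layer computes, the sketch below implements a plain single-channel 2D convolution with zero padding in pure Python. The function name and kernel values are illustrative only; this is not the patent's actual Conv SF implementation, which uses learned multi-channel kernels.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D convolution with zero padding ("same" output size),
    illustrating the per-channel operation a convolutional layer performs."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2          # padding that keeps the output size
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(kh):
                for dx in range(kw):
                    iy, ix = y + dy - ph, x + dx - pw
                    if 0 <= iy < h and 0 <= ix < w:  # zero padding outside
                        acc += image[iy][ix] * kernel[dy][dx]
            out[y][x] = acc
    return out
```

With an identity kernel the input passes through unchanged, which is a quick sanity check of the padding arithmetic.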
  • the iterative up-down sampling distillation block IUDDB includes: up-sampling processing block USB, down-sampling processing block DSB, local multi-level low-resolution feature fusion block LLRFFB, local multi-level high-resolution feature fusion block LHRFFB and residual learning module RL;
  • the USB includes a deconvolution layer and an information distillation layer, wherein the input of the deconvolution layer in the i-th upsampling processing block is H USB-in i and the output after the deconvolution operation is H USB-up i ; the information distillation layer receives H USB-up i and performs a channel-split operation to obtain a rough image feature map H rough i and a refined image feature map H refined i ; H rough i is input into all subsequent DSBs in the IUDDB, and H refined i is input into the LHRFFB of the current IUDDB;
  • the DSB includes an average pooling layer, which performs average pooling on the input feature map; the input of each DSB is the concatenation of the H rough i maps output by all USBs before the current DSB;
  • each DSB outputs a low-resolution feature map, which is respectively input into the LLRFFB of the current IUDDB and into all subsequent USBs;
  • the LLRFFB is used to fuse all the received low-resolution feature maps, and perform feature dimensionality reduction on the fused features, and output H LLRFFB-out to the F GLRFFB ;
  • the LHRFFB is used to receive all the H refined i maps and perform feature fusion, completing local multi-level high-resolution feature fusion, and to output H LHRFFB-out to the F GHRFFB ;
  • the residual learning module RL is used to learn the residual between the output of the first DSB and the output of the current DSB, obtain the residual output H IUDDB-b , and input H IUDDB-b into all subsequent IUDDBs, so that the IUDDBs form a densely connected structure.
  • the F GLRFFB includes a feature fusion unit and a deconvolution upsampling unit;
  • the feature fusion unit is used to perform feature fusion on all received low-resolution feature maps, and obtain the fused low-resolution feature maps as the intermediate feature map H GLRFFB-1 ;
  • the deconvolution upsampling unit is configured to deconvolute and amplify the H GLRFFB-1 to obtain the output H GLRFFB of the F GLRFFB .
  • the F REC includes a feature fusion unit and two convolution units connected in series;
  • the feature fusion unit is used to perform feature fusion of the H GLRFFB input to F REC and the H GHRFFB ;
  • the two convolution units connected in series are used to sequentially perform two convolutions on the fused feature map to obtain I SR .
  • An image super-resolution magnification method comprises the following steps:
  • S1. the shallow feature map H 0 is extracted from the input low-resolution image I LR through a convolutional layer.
  • S2 specifically includes the following:
  • upsampling the input feature map, specifically including: performing a deconvolution operation on the i-th input H USB-in i and outputting H USB-up i ; performing a channel-split operation on the deconvolved feature map to obtain a rough image feature map H rough i and a refined image feature map H refined i ; downsampling H rough i ; and performing feature fusion on all the H refined i maps;
  • the first input H USB-in 1 is H 0 ; when i is not 1, the input is the concatenation of the downsampling outputs of the preceding levels;
  • the specific content of S3 includes: performing feature fusion on all the dimensionality-reduced low-resolution feature maps output in S2, taking the fused low-resolution feature map as the intermediate feature map H GLRFFB-1 , and deconvolving and enlarging H GLRFFB-1 to output H GLRFFB ;
  • the specific content of S4 includes: performing feature fusion on all the high-resolution feature maps output in S2 to obtain the fused high-resolution feature map H GHRFFB ;
  • S5 specifically includes: performing feature fusion on the H GLRFFB and the H GHRFFB , and sequentially performing two convolutions on the fused feature map to obtain I SR .
  • the present invention discloses an image super-resolution magnification model and method, and proposes a new neural network for training and super-resolution magnification.
  • the densely connected iterative up-down sampling distillation blocks (IUDDB) iteratively extract image features at low and high resolution; through distillation, part of the features is passed to the next level of high- and low-resolution feature extraction, and part is processed by the global low-resolution fusion block and the global high-resolution fusion block, after which the image is reconstructed by the image reconstruction module.
  • Compared with prior-art image magnification models, this multi-level feature-extraction model and method achieve higher reconstruction performance and better imaging quality, and can magnify images stably and effectively.
  • FIG. 1 is a schematic structural diagram of an image super-resolution magnification model provided by the present invention;
  • FIG. 2 is a schematic structural diagram of the IUDDB in the image super-resolution magnification model provided by the present invention;
  • FIG. 3 is a schematic structural diagram of the USB in the image super-resolution magnification model provided by the present invention;
  • FIG. 4 is a schematic structural diagram of the LLRFFB in the image super-resolution magnification model provided by the present invention;
  • FIG. 5 is a schematic structural diagram of the GLRFFB and GHRFFB in the image super-resolution magnification model provided by the present invention;
  • FIG. 6 is a schematic structural diagram of the REC module in the image super-resolution magnification model provided by the present invention;
  • FIG. 7 is a schematic diagram of the performance curves during training in the experimental part of the embodiment of the present invention;
  • FIG. 8 is a schematic diagram comparing the reconstruction effects of IUDFFN and other methods in the embodiment of the present invention;
  • FIG. 9 is a schematic diagram comparing the reconstruction effects of IUDFFN and other methods in the embodiment of the present invention;
  • FIG. 10 is a schematic diagram comparing the reconstruction effects of IUDFFN and other methods in the embodiment of the present invention.
  • the embodiment of the invention discloses an image super-resolution enlargement model and a method thereof.
  • the entire proposed network structure for super-resolution upscaling is shown in Fig. 1.
  • the proposed network IUDFFN includes a shallow feature extraction module F SF , a multi-level low- and high-resolution feature extraction module F DF , a global multi-level low-resolution feature fusion module (Global multi-level Low-Resolution Feature Fusion Block, GLRFFB) F GLRFFB , a global multi-level high-resolution feature fusion module (Global multi-level High-Resolution Feature Fusion Block, GHRFFB) F GHRFFB , and an image reconstruction module F REC .
  • IUDFFN uses a convolutional layer to extract shallow features H 0 from the input low-resolution image I LR :
  • H 0 is input to the F DF module.
  • the present invention uses M densely connected iterative up-down sampling distillation blocks (Iterative Up-Down sampling Distillation Block, IUDDB) to perform multi-level low-resolution and high-resolution feature extraction.
  • H DF-L , H DF-H = F DF (H 0 )    (2)
  • H DF-L and H DF-H are the low-resolution feature map and high-resolution feature map of the image obtained after H 0 passes through the F DF module, respectively. They are then fed into the GLRFFB and GHRFFB modules, respectively.
  • the operations performed in GLRFFB and GHRFFB can be simplified to:
  • H GLRFFB = F GLRFFB (H DF-L )    (3)
  • H GHRFFB = F GHRFFB (H DF-H )    (4)
  • the image reconstruction module F REC takes H GLRFFB and H GHRFFB as input to generate a high-quality reconstructed image I SR , and this process can be described by formula (5).
  • I SR = F REC (H GLRFFB , H GHRFFB )    (5)
  • The iterative up-down sampling distillation block (IUDDB) in the multi-level low- and high-resolution feature extraction module F DF , the global multi-level low-resolution feature fusion module (Global multi-level Low-Resolution Feature Fusion Block, GLRFFB) F GLRFFB , the global multi-level high-resolution feature fusion module (Global multi-level High-Resolution Feature Fusion Block, GHRFFB) F GHRFFB , and the image reconstruction module F REC are described in more depth below.
  • USB enlarges the image feature map from the low-resolution space to the high-resolution space, and obtains the image high-resolution feature map.
  • the structure of USB is shown in Figure 3.
  • USB mainly includes a deconvolution layer and an information distillation layer (the information distillation operation is the channel split (Channel split) operation).
  • the deconvolution of the input feature map can be described as H USB-up i = Deconv(H USB-in i ), for i = 1, ..., m.
  • m is the number of USBs and DSBs contained in each IUDDB in IUDFFN.
  • The channel-split operation divides the channels of H USB-up i : the present invention marks 3/4 of them as the rough image feature map H rough i , which needs to pass further through the subsequent levels in the IUDDB; the remaining 1/4 is marked as the refined image feature map H refined i , which is directly input into the LHRFFB.
  • the information flow through the information distillation layer can be expressed as: H rough i , H refined i = Distil(H USB-up i )
  • Distil(·) represents the information distillation operation.
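The channel-split distillation can be sketched as below. The 3/4 : 1/4 split ratio follows the description; the function name and the convention of taking the last quarter of the channels as the refined part are assumptions for illustration.

```python
def distil(channels, refined_ratio=0.25):
    """Channel-split information distillation: one quarter of the channels
    becomes the refined feature map, the remaining three quarters stay as
    the rough feature map. `channels` is any list of per-channel maps."""
    n_refined = max(1, int(len(channels) * refined_ratio))
    rough = channels[:-n_refined]      # 3/4 of the channels, kept in-block
    refined = channels[-n_refined:]    # 1/4 of the channels, sent to LHRFFB
    return rough, refined
```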
  • the rough feature map and refined feature map output by the i-th USB in the IUDDB are H rough i and H refined i , respectively.
  • the dense connection method is innovated in the IUDDB: if a certain USB is not the first USB in the IUDDB, the input of this USB comes from the cascade of all previous DSB outputs.
  • the input of the i-th USB in the IUDDB can be expressed by formula (8): H USB-in i = Concat(H DSB-out 1 , H DSB-out 2 , ..., H DSB-out i-1 )    (8)
  • the output of the USB flows in two directions, as shown in Figures 2 and 3: the rough feature map H rough i enters all DSBs after the current USB, and the refined feature map H refined i is input into the LHRFFB.
  • The DSB corresponds to the USB and implements downsampling of high-resolution feature maps to low-resolution feature maps. After a high-resolution feature map passes through a DSB, it becomes a low-resolution feature map, and new low-resolution image features are extracted.
  • The DSB consists of only one average pooling layer; its internal operation is an average pooling of the input feature map.
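Since a DSB is nothing but one average pooling layer, its entire computation can be sketched in a few lines; using a pool size k equal to the downscaling factor is an assumption of this sketch.

```python
def avg_pool2d(fmap, k=2):
    """k x k average pooling with stride k, the entire content of a DSB:
    reduces the spatial resolution of a feature map by a factor of k."""
    h, w = len(fmap) // k, len(fmap[0]) // k
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = sum(fmap[y * k + dy][x * k + dx]
                    for dy in range(k) for dx in range(k))
            out[y][x] = s / (k * k)
    return out
```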
  • the feature map output in DSB has two directions, as shown in Figure 2, one direction is input to all USBs after it, and the other direction is input to LLRFFB.
  • LLRFFB receives multiple levels of low-resolution feature maps output from all DSBs.
  • the structure of LLRFFB is indicated in the red dashed box on the left side of Figure 4.
  • these multiple levels of low-resolution feature maps containing different features are first fused, and then feature dimensionality reduction is performed on the fused features.
  • the process can be expressed as: H LLRFFB-out = Conv 1×1 (Concat(H DSB-out 1 , ..., H DSB-out m ))
  • H LLRFFB-out indicates the output of the module LLRFFB
  • Concat(·) represents the feature fusion operation
  • Conv 1×1 (·) represents the feature dimensionality-reduction operation.
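Fusion followed by a 1x1 convolution amounts to a per-pixel weighted sum over the concatenated channels; the sketch below shows this in pure Python. The weight values are illustrative, and real layers learn them during training.

```python
def conv1x1(channels, weights):
    """1x1 convolution = per-pixel weighted sum across input channels.
    `channels`: list of equally-sized 2D maps (the concatenated features);
    `weights`: one per-channel weight vector per output channel. Fewer
    weight vectors than input channels gives dimensionality reduction."""
    h, w = len(channels[0]), len(channels[0][0])
    out = []
    for wvec in weights:
        fused = [[sum(wc * ch[y][x] for wc, ch in zip(wvec, channels))
                  for x in range(w)] for y in range(h)]
        out.append(fused)
    return out
```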
  • Label 1 in Figure 2 marks the output of the module LLRFFB, which will be input into GLRFFB.
  • LHRFFB: The structure of LHRFFB is shown in the blue dashed box on the right side of Figure 4. Its structure is very simple: it includes only a feature fusion operation, which fuses the tiny high-resolution feature maps output by all m USBs and, after completing local multi-level high-resolution feature fusion, outputs the result. The operation in LHRFFB can be described as: H LHRFFB-out = Concat(H refined 1 , ..., H refined m )
  • H LHRFFB-out represents the output of the LHRFFB module, which is marked with label 2 in Figure 2, and it will be input into GHRFFB.
  • The IUDDB also adopts a residual learning structure different from other network models: as shown by the yellow line at the top of Figure 2, the new residual learning structure connects the output of the first DSB in the IUDDB with the output of the last DSB, so that the IUDDB module only needs to learn the residual between them.
  • This novel residual learning structure can be described by Equation (13).
  • H IUDDB-b represents an output of the IUDDB; this output is input into all subsequent IUDDBs, so that the IUDDBs form a densely connected structure. n represents the n-th IUDDB in the network, and this output is marked by label 3 in Figure 2.
  • the output H IUDDB-b can be scaled.
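Under the assumption that the residual connection is a plain elementwise addition of the first and last DSB outputs, with the optional output scaling mentioned above, the operation can be sketched as follows (the additive form and the scale placement are assumptions, not stated explicitly in the text):

```python
def iuddb_residual_output(first_dsb_out, last_dsb_out, scale=1.0):
    """Residual learning in an IUDDB: combine the first DSB's output with
    the last DSB's output (the learned residual) elementwise, then apply
    an optional scale to the block output H_IUDDB-b."""
    return [scale * (f + l) for f, l in zip(first_dsb_out, last_dsb_out)]
```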
  • Except for the last one, each IUDDB in IUDFFN has three outputs, marked by labels 1, 2, and 3 respectively.
  • Label 1 marks the local multi-level low-resolution feature maps output by the IUDDB after fusion and dimensionality reduction, which are input into GLRFFB; label 2 marks the local multi-level high-resolution feature maps, which are input into GHRFFB; label 3 marks the low-resolution feature map output by the IUDDB to all subsequent IUDDBs. The output of the entire IUDDB therefore consists of these three feature maps.
  • GLRFFB mainly includes two operations, as shown in the red dashed box on the left in Figure 5, one is the feature fusion operation, and the other is the deconvolution upsampling operation.
  • IUDFFN first extracts the shallow feature H 0 of the image in the shallow feature extraction module F SF , and then each IUDDB outputs its low-resolution feature map into the GLRFFB.
  • the first operation in GLRFFB is to fuse all these low-resolution feature maps from different levels: H GLRFFB-1 = Concat(H LLRFFB-out 1 , ..., H LLRFFB-out M )
  • H GLRFFB-1 represents the intermediate feature map output by the GLRFFB module after the first step of operation.
  • the input in GLRFFB is the low-resolution feature map output by multiple levels of IUDDB
  • the input in GHRFFB is the high-resolution feature map output by multiple levels of IUDDB.
  • One approach is for the image reconstruction module in the network to enlarge the image from the low-resolution space to the high-resolution space; another approach is to first upsample the low-resolution feature maps obtained in the network to the high-resolution space, then fuse all high-resolution feature maps there, and finally use the fused high-resolution feature map to reconstruct the final high-resolution image.
  • The second approach does not enlarge the image at the image reconstruction layer of the network, and can make full use of the high- and low-resolution image features extracted by the middle levels of the IUDFFN network.
  • The present invention therefore selects the second approach to fuse the low-resolution and high-resolution feature maps.
  • H GLRFFB = Deconv(H GLRFFB-1 )    (16)
  • Deconv(·) represents the deconvolution operation.
  • H GLRFFB represents the output of the GLRFFB module.
  • GHRFFB: Global multi-level High-Resolution Feature Fusion Block
  • Each IUDDB outputs a high-resolution feature map H LHRFFB-out into the GHRFFB.
  • These high-resolution feature maps are distilled refined features with small scale. Therefore, in GHRFFB, the present invention directly fuses these multi-level high-resolution feature maps and outputs them.
  • the structure of GHRFFB is shown in the blue dashed box on the right in FIG. 5 .
  • the operations performed in GHRFFB can be described as: H GHRFFB = Concat(H LHRFFB-out 1 , ..., H LHRFFB-out M )
  • H GHRFFB indicates the output of the GHRFFB module.
  • the structure of the REC module in IUDFFN is shown in Figure 6. It draws on the design idea of post-upsampling models and includes a feature fusion operation and two concatenated convolution operations.
  • the feature fusion operation performs feature fusion on the high-resolution feature maps output from the GLRFFB and GHRFFB modules input to this module.
  • Using two convolutions in series at the end of the network can effectively stabilize the quality of high-resolution images generated by the network model.
  • the operations in this module can be described as: I SR = Conv 2 (Conv 1 (Concat(H GLRFFB , H GHRFFB )))
  • Conv 1 (·) and Conv 2 (·) denote the operations of the two concatenated convolutions, respectively.
  • I SR represents the high-resolution image output by the IUDFFN network after the image high-resolution magnification and reconstruction process, which corresponds to the low-resolution image I LR input to the network.
  • The design idea of the network model is advanced: it makes full use of the multi-level high-resolution and low-resolution feature maps generated by the middle layers of the network, and innovatively chooses to fuse these feature maps in the high-resolution space, realizing the design concept of the model.
  • The IUDDB in IUDFFN innovatively designs new dense-connection and residual-learning structures: the new dense connection allows the information output by each USB (DSB) module to be transmitted to all subsequent DSB (USB) modules, which not only strengthens feature reuse but also extracts new image features; the new residual-learning structure connects the output of the first DSB in the IUDDB to the output of the last DSB, so that the IUDDB only needs to learn the residual between the two outputs, which reduces computation, speeds up training, and improves performance.
  • the advanced feature distillation structure design is properly introduced in USB in IUDDB, which can not only reduce the network size, but also improve the network reconstruction performance.
  • the convolution operation in the convolution layer is followed by a Leaky ReLU activation function operation.
  • IUDFFN is trained only for the ×3 magnification factor, and the convolution kernel size in USB and DSB is set to 7×7. The purpose is to increase the receptive field of the upsampling and downsampling operations and to mine low-resolution image features more deeply.
  • the other convolution kernel sizes are set to 3 ⁇ 3.
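The standard transposed-convolution output-size formula shows how a 7x7 kernel can realize the x3 upsampling in the USB; the stride-3 / padding-2 setting below is a plausible assumption consistent with tripling the spatial size, not a configuration stated in the text.

```python
def deconv_out_size(n_in, kernel, stride, padding, output_padding=0):
    """Standard transposed-convolution (deconvolution) output size:
    out = (n_in - 1) * stride - 2 * padding + kernel + output_padding."""
    return (n_in - 1) * stride - 2 * padding + kernel + output_padding
```

For example, a 32-pixel input with kernel=7, stride=3, padding=2 comes out at 96 pixels, exactly x3.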
  • this embodiment chooses to use the L1 loss function.
  • this embodiment uses widely used PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity) indicators in the field of image SR for quantitative evaluation.
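PSNR, one of the two quantitative metrics used, is computed directly from the mean squared error between the reconstructed and reference images; a minimal sketch (images flattened to pixel lists, peak value 255 assumed):

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-sized images,
    given as flat pixel lists. Identical images give +infinity."""
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / mse)
```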
  • human visual observation is also used for subjective evaluation.
  • the network model is implemented using the PyTorch framework.
  • the central processing unit CPU of the experimental hardware is i7 8700k
  • the image processor GPU is NVIDIA 2070SUPER
  • the GPU memory is 8GB
  • the computer memory is 16GB.
  • the number of epochs for network learning is set to 700, and the mini-batch size (batch size) is set to 16.
  • This network model uses the DIV2K dataset as a training set, which contains 800 high-definition training images.
  • Before input into network training, this embodiment first performs bi-cubic downsampling on these high-resolution images to obtain the corresponding low-resolution images; the low-resolution and high-resolution image pairs constitute the network training set.
  • the low-resolution images are first randomly cropped into 32 ⁇ 32 image blocks, and then randomly rotated by 90°, 180°, 270° and then input into the network for training.
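The rotation part of this augmentation can be sketched as repeated 90-degree rotations of an image block; applying the function one, two, or three times yields the 90/180/270-degree variants (pure-Python illustration, not the training pipeline's actual code):

```python
def rot90(block):
    """Rotate a 2D image block 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*block)][::-1]
```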
  • this embodiment uses 5 benchmark test sets widely used in the field of image super-resolution: Set5, Set14, BSD100, Urban100, and Manga109.
  • this embodiment adopts the control-variable method and conducts detailed ablation experiments. Including the designed original network, a total of 7 comparison networks are designed. To speed up training, the hyperparameters are adjusted: the batch size is set to 8 and the number of epochs to 100. With a magnification factor of 3 and Set5 as the test set, the best PSNR obtained by each of these 7 networks within 100 epochs is recorded in Table 1.
  • Structure 7, which includes all the network structure designs, achieves the highest performance, proving that the IUDFFN design idea is advanced and the structural arrangement is reasonable.
  • Every module in the network is indispensable: removing any module brings a decrease in network performance.
  • the network scale parameters of IUDFFN mainly include M (the number of IUDDBs) and m (the number of USBs and DSBs in each IUDDB).
  • the hyperparameters in the network are properly adjusted: set the batch size to 8, the epoch to 120, the amplification factor is still 3, and the test set is selected as Set5.
  • the performance curves during training are recorded in Fig. 7.
  • the meaning of the legend M3m6 in the figure is: the value of M is 3, the value of m is 6, and the meanings of other legends can be deduced by analogy.
  • the M3m5 model performs better overall: its performance is higher than that of M2m4, M2m5, M3m4, M4m6, and M4m5, and although slightly lower than that of M3m6, it has far fewer parameters than M3m6 while its performance remains sufficiently good.
  • the scale parameter M in the IUDFFN model is set to 3, and m is set to 5.
  • the classic super-resolution methods include the Bi-cubic method, and the advanced network models that have been proposed include SRCNN, DRCN, LapSRN, DRRN, MemNet, EDSR, RDN, and RCAN, etc.
  • the experimental results for comparison are recorded in Table 2 below.
  • IUDFFN has achieved better objective performance than other advanced methods.
  • Figures 8, 9, and 10 respectively show comparisons of the reconstruction effects of IUDFFN and various advanced methods on different test-set images; below each image, the method used for its reconstruction and its PSNR evaluation value are marked.
  • each embodiment in this specification is described in a progressive manner, each embodiment focuses on the difference from other embodiments, and the same and similar parts of each embodiment can be referred to each other.
  • the description is relatively simple, and for the related information, please refer to the description of the method part.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The present invention discloses an image super-resolution magnification model and method therefor. The model includes a shallow feature extraction module F SF , a multi-level low- and high-resolution feature extraction module F DF , a global multi-level low-resolution feature fusion module F GLRFFB , a global multi-level high-resolution feature fusion module F GHRFFB , and an image reconstruction module F REC . The method is: perform shallow feature extraction on the input low-resolution image I LR to obtain a shallow feature map H 0 ; sequentially perform M levels of low-resolution and high-resolution feature extraction to obtain the low-resolution feature map H DF-L and the high-resolution feature map H DF-H ; receive the M H DF-L maps and perform feature fusion to obtain the fused low-resolution feature map H GLRFFB ; receive the M H DF-H maps and perform feature fusion to obtain the fused high-resolution feature map H GHRFFB ; receive H GLRFFB and H GHRFFB and generate the super-resolution magnified image I SR . The present invention achieves high image reconstruction performance and a good magnification effect.

Description

An image super-resolution magnification model and method therefor
Technical Field
The present invention relates to the technical field of image processing, and more specifically to an image super-resolution magnification model and a method therefor.
Background Art
Current single-image super-resolution magnification methods fall into three classes. The first class is interpolation-based methods, the second class is example-based methods, and the third class is neural-network-based methods. At present, the performance achieved by neural-network-based methods exceeds that of interpolation-based and example-based methods.
Among existing network model structure designs, none has considered making full use of the features of the high-resolution and low-resolution feature maps output by multiple levels of the network for image super-resolution reconstruction.
Therefore, how to provide an image super-resolution magnification model and method with high accuracy and a good image reconstruction effect is a problem that those skilled in the art urgently need to solve.
Summary of the Invention
In view of this, the present invention provides an image super-resolution magnification model and method therefor, which can completely and accurately magnify and reconstruct an image.
To achieve the above object, the present invention adopts the following technical solution:
An image super-resolution magnification model, including: a shallow feature extraction module F SF , a multi-level low- and high-resolution feature extraction module F DF , a global multi-level low-resolution feature fusion module F GLRFFB , a global multi-level high-resolution feature fusion module F GHRFFB , and an image reconstruction module F REC ;
the shallow feature extraction module F SF is used to perform shallow feature extraction on the input low-resolution image I LR to obtain a shallow feature map H 0 ;
the multi-level low- and high-resolution feature extraction module F DF includes M densely connected iterative up-down sampling distillation blocks IUDDB, and is used to sequentially perform M levels of low-resolution and high-resolution feature extraction through the M densely connected IUDDBs, obtaining a low-resolution feature map H DF-L and a high-resolution feature map H DF-H , wherein the input of every IUDDB after the first is the concatenation of the outputs of all preceding IUDDBs;
the global multi-level low-resolution feature fusion module F GLRFFB is used to receive the M H DF-L maps and perform feature fusion, obtaining the fused low-resolution feature map H GLRFFB ;
the global multi-level high-resolution feature fusion module F GHRFFB is used to receive the M H DF-H maps and perform feature fusion, obtaining the fused high-resolution feature map H GHRFFB ;
the image reconstruction module F REC is used to receive H GLRFFB and H GHRFFB and generate the super-resolution magnified image I SR .
Preferably, the shallow feature extraction module F SF uses a convolutional layer to extract the shallow feature map H 0 from the input low-resolution image I LR .
Preferably, the iterative up-down sampling distillation block IUDDB includes: an upsampling processing block USB, a downsampling processing block DSB, a local multi-level low-resolution feature fusion block LLRFFB, a local multi-level high-resolution feature fusion block LHRFFB, and a residual learning module RL;
the USB includes a deconvolution layer and an information distillation layer, wherein the input of the deconvolution layer in the i-th upsampling processing block is H USB-in i , the output after the deconvolution operation of the deconvolution layer is H USB-up i , and the information distillation layer receives H USB-up i and performs a channel-split operation to obtain a rough image feature map H rough i and a refined image feature map H refined i , wherein H rough i is input into all subsequent DSBs in the IUDDB and H refined i is input into the LHRFFB of the current IUDDB;
wherein, when i is 1, the input of the USB is H 0 ; when i is not 1, the input of the current USB is the concatenation of the outputs of all DSBs before the current USB;
the DSB includes an average pooling layer, which is used to perform average pooling on the input feature map, wherein the input of the DSB is the concatenation of the H rough i maps output by all USBs before the current DSB; the DSB outputs a low-resolution feature map, which is respectively input into the LLRFFB of the current IUDDB and into all subsequent USBs;
the LLRFFB is used to fuse all received low-resolution feature maps, perform feature dimensionality reduction on the fused features, and output H LLRFFB-out to the F GLRFFB ;
the LHRFFB is used to perform feature fusion on all received H refined i maps, completing local multi-level high-resolution feature fusion, and to output H LHRFFB-out to the F GHRFFB ;
the residual learning module RL is used to learn the residual between the output of the first DSB and the output of the current DSB, obtain the residual output H IUDDB-b , and input H IUDDB-b into all subsequent IUDDBs, so that the IUDDBs form a densely connected structure.
Preferably, the F GLRFFB includes a feature fusion unit and a deconvolution upsampling unit;
the feature fusion unit is used to perform feature fusion on all received low-resolution feature maps, taking the fused low-resolution feature map as the intermediate feature map H GLRFFB-1 ;
the deconvolution upsampling unit is used to deconvolve and enlarge H GLRFFB-1 , obtaining the output H GLRFFB of the F GLRFFB .
Preferably, the F REC includes one feature fusion unit and two convolution units connected in series;
the feature fusion unit is used to perform feature fusion on the H GLRFFB and H GHRFFB input into F REC ;
the two convolution units connected in series are used to perform two successive convolutions on the fused feature map to obtain I SR .
An image super-resolution magnification method, including the following steps:
S1. performing shallow feature extraction on the input low-resolution image I LR to obtain a shallow feature map H 0 ;
S2. sequentially performing M densely connected levels of low-resolution and high-resolution feature extraction, obtaining a low-resolution feature map H DF-L and a high-resolution feature map H DF-H ;
S3. receiving the M H DF-L maps and performing feature fusion, obtaining the fused low-resolution feature map H GLRFFB ;
S4. receiving the M H DF-H maps and performing feature fusion, obtaining the fused high-resolution feature map H GHRFFB ;
S5. receiving H GLRFFB and H GHRFFB , and generating the super-resolution magnified image I SR .
Preferably, in S1 the shallow feature map H 0 is extracted from the input low-resolution image I LR through a convolutional layer.
Preferably, S2 specifically includes the following:
upsampling the input feature map, specifically including: performing a deconvolution operation on the i-th input H USB-in i and outputting H USB-up i ; performing a channel-split operation on the deconvolved feature map to obtain a rough image feature map H rough i and a refined image feature map H refined i ; downsampling H rough i ; and performing feature fusion on all the H refined i maps;
wherein the first input H USB-in 1 is H 0 ; when i is not 1, the input is the concatenation of the downsampling outputs of the preceding levels;
performing average pooling on the upsampled feature maps, and respectively feeding the average-pooled low-resolution feature maps into feature fusion and into upsampling;
fusing all received low-resolution feature maps, performing feature dimensionality reduction on the fused features, and outputting H LLRFFB-out ;
performing feature fusion on all received H refined i maps, completing local multi-level high-resolution feature fusion, and outputting H LHRFFB-out ;
learning the residual between the upsampling output of the first level and the upsampling output of the current level, obtaining the residual output H IUDDB-b , and performing the next level of upsampling.
优选的,S3的具体内容包括:
将S2输出的降维后的所有低分辨率特征图进行特征融合,获取融合后的低分辨率特征图作为中间特征图H GLRFFB-1
对所述H GLRFFB-1进行反卷积放大,输出H GLRFFB
S4的具体内容包括:
对S2输出的所有高分辨率特征图进行特征融合,获取融合后的高分辨率特征图H GHRFFB
优选的,S5具体包括:将所述H GLRFFB和所述H GHRFFB进行特征融合,将融合后的特征图依次进行两次卷积获得I SR
From the above technical solution it can be seen that, compared with the prior art, the present invention provides an image super-resolution magnification model and method, proposing a new neural network for training and super-resolution magnification. Through densely connected iterative up-down sampling distillation blocks (IUDDB), the network iteratively extracts image features in both the low-resolution and high-resolution spaces; by means of distillation, one part of the features is passed to the next iterative high- and low-resolution feature-extraction module, while the other part is passed to the global low-resolution fusion block and the global high-resolution fusion block for processing; finally, the image is rebuilt by the image reconstruction module. Thanks to its multi-level feature extraction, the model and method achieve higher reconstruction performance and better imaging quality than existing image magnification models and methods, and can magnify images stably and effectively.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of an image super-resolution magnification model provided by the present invention;
Fig. 2 is a schematic structural diagram of the IUDDB in an image super-resolution magnification model provided by the present invention;
Fig. 3 is a schematic structural diagram of the USB in an image super-resolution magnification model provided by the present invention;
Fig. 4 is a schematic structural diagram of the LLRFFB in an image super-resolution magnification model provided by the present invention;
Fig. 5 is a schematic structural diagram of the GLRFFB and GHRFFB in an image super-resolution magnification model provided by the present invention;
Fig. 6 is a schematic structural diagram of the REC in an image super-resolution magnification model provided by the present invention;
Fig. 7 is a schematic diagram of the performance curves during training in the experimental part of an embodiment of the present invention;
Fig. 8 is a schematic comparison of the reconstruction quality of IUDFFN and other methods in an embodiment of the present invention;
Fig. 9 is a schematic comparison of the reconstruction quality of IUDFFN and other methods in an embodiment of the present invention;
Fig. 10 is a schematic comparison of the reconstruction quality of IUDFFN and other methods in an embodiment of the present invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The embodiments of the present invention disclose an image super-resolution magnification model and method.
The proposed network is further described below with reference to the drawings.
The overall structure of the proposed super-resolution magnification network is shown in Fig. 1. The proposed network, IUDFFN, comprises a shallow feature extraction module F_SF, a multi-level low- and high-resolution feature extraction module F_DF, a Global multi-level Low-Resolution Feature Fusion Block (GLRFFB) F_GLRFFB, a Global multi-level High-Resolution Feature Fusion Block (GHRFFB) F_GHRFFB, and an image reconstruction module F_REC.
1. IUDFFN uses one convolutional layer to extract the shallow features H_0 from the input low-resolution image I_LR:
H_0 = F_SF(I_LR) = Conv_SF(I_LR)    (1)
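As a minimal sketch of Eq. (1) in PyTorch (the framework the experiments below use), the shallow feature extraction can be written as follows; the class name, the 3×3 kernel, and the 64-channel width (taken from the SF channel count reported later) are illustrative choices, not mandated by the patent:

```python
import torch
import torch.nn as nn

# Eq. (1): a single convolution extracts the shallow feature map H_0 from I_LR.
class ShallowFeatureExtractor(nn.Module):
    def __init__(self, in_channels=3, out_channels=64):
        super().__init__()
        # 3x3 kernel with padding 1 keeps the spatial size of I_LR unchanged.
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        # the text states that every convolution is followed by a Leaky ReLU
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, i_lr):
        return self.act(self.conv(i_lr))

f_sf = ShallowFeatureExtractor()
h0 = f_sf(torch.randn(1, 3, 32, 32))  # a 32x32 LR patch, as used in training
```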
H_0 is then fed into the F_DF module, in which the invention uses M densely connected Iterative Up-Down sampling Distillation Blocks (IUDDB) to perform multiple levels of low- and high-resolution feature extraction. The operations performed in F_DF can be summarized as:
H_DF-L, H_DF-H = F_DF(H_0)    (2)
where H_DF-L and H_DF-H are the low-resolution and high-resolution feature maps obtained after H_0 passes through F_DF. They are subsequently fed into the GLRFFB and GHRFFB modules, respectively, whose operations can be simplified as:
H_GLRFFB = F_GLRFFB(H_DF-L)    (3)
H_GHRFFB = F_GHRFFB(H_DF-H)    (4)
Finally, the image reconstruction module F_REC takes H_GLRFFB and H_GHRFFB as input and generates a high-quality reconstructed image I_SR, as described by Eq. (5):
I_SR = F_REC(H_GLRFFB, H_GHRFFB)    (5)
The iterative up-down sampling distillation block (IUDDB) inside the multi-level low- and high-resolution feature extraction module F_DF, the Global multi-level Low-Resolution Feature Fusion Block (GLRFFB) F_GLRFFB, the Global multi-level High-Resolution Feature Fusion Block (GHRFFB) F_GHRFFB, and the image reconstruction module F_REC are described in more depth below.
Iterative up-down sampling distillation block (IUDDB) in the multi-level low- and high-resolution feature extraction module F_DF
The structure of the IUDDB is shown in Fig. 2. It is an important component of the whole network and mainly comprises five parts: an Up Sampling Block (USB), a Down Sampling Block (DSB), a Local multi-level Low-Resolution Feature Fusion Block (LLRFFB), a Local multi-level High-Resolution Feature Fusion Block (LHRFFB), and a Residual Learning (RL) structure. These structures are described in detail below.
(1) USB (up-sampling block)
The USB enlarges the image feature map from the low-resolution space into the high-resolution space, producing a high-resolution image feature map. Its structure is shown in Fig. 3. The USB mainly comprises one deconvolution layer and one information-distillation layer (the information-distillation operation is a channel-split operation). Passing a feature map through the deconvolution layer can be described as:
H_up-out^i = Deconv(H_up-in^i)    (6)
where H_up-in^i and H_up-out^i denote the input and output of the deconvolution layer in the i-th USB of the IUDDB, and m is the number of USBs and DSBs contained in each IUDDB of IUDFFN.
After information distillation the information flow is split into two parts. Three quarters of it, H_coarse^i, is designated here as the coarse image feature map; these features must pass through the subsequent levels of the IUDDB. The remaining quarter, H_refined^i, is designated as the refined image feature map; these features are fed directly into the LHRFFB. The passage of the information flow through the information-distillation layer can be expressed as:
H_coarse^i, H_refined^i = Distil(H_up-out^i)    (7)
where Distil(·) denotes the information-distillation operation, and H_coarse^i and H_refined^i are the coarse and refined feature maps output by the i-th USB of the IUDDB.
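The deconvolution of Eq. (6) and the 3/4 : 1/4 channel split of Eq. (7) can be sketched as one small PyTorch module. The 7×7 kernel is stated in the experimental settings below; the stride-3 / padding-2 combination (which makes the output exactly 3× the input size, matching the ×3 scale) and the 80-channel width are assumptions for illustration:

```python
import torch
import torch.nn as nn

# Sketch of one USB: deconvolution (Eq. 6) followed by channel-split
# information distillation (Eq. 7).
class USB(nn.Module):
    def __init__(self, in_channels, out_channels=80):
        super().__init__()
        # low-resolution space -> high-resolution space; output = 3x input size
        self.deconv = nn.ConvTranspose2d(in_channels, out_channels,
                                         kernel_size=7, stride=3, padding=2)
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, h_up_in):
        h_up_out = self.act(self.deconv(h_up_in))
        # channel split: 3/4 coarse features (sent on to later DSBs),
        # 1/4 refined features (sent directly to the LHRFFB)
        c = h_up_out.shape[1]
        h_coarse, h_refined = torch.split(h_up_out, [3 * c // 4, c // 4], dim=1)
        return h_coarse, h_refined

usb = USB(in_channels=64)
coarse, refined = usb(torch.randn(1, 64, 8, 8))
```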
Notably, as shown in Fig. 2, the IUDDB introduces a novel dense-connection scheme: if a USB is not the first USB in the IUDDB, its input is the concatenation of the outputs of all DSBs preceding it. The input of the i-th USB of the IUDDB can be expressed by Eq. (8):
H_up-in^i = Concat(H_down-out^1, ..., H_down-out^(i-1))    (8)
where H_down-out^(i-1) denotes the output of the (i-1)-th DSB and Concat(·) denotes feature concatenation.
The output of a USB goes in two directions, as shown in Figs. 2 and 3: the coarse feature map H_coarse^i enters all DSBs after this USB, while the refined feature map H_refined^i is fed into the LHRFFB.
(2) DSB (down-sampling block)
The DSB is the counterpart of the USB: it down-samples the high-resolution feature map into a low-resolution feature map. After passing through a DSB, a high-resolution feature map becomes a low-resolution feature map from which certain new low-resolution image features are extracted. The DSB consists of a single average-pooling layer, whose operation is:
H_down-out^j = AvgPool(H_down-in^j)    (9)
where H_down-in^j and H_down-out^j denote the input and output of the j-th DSB in the IUDDB. Similar to the USB, the input of a DSB is the concatenation of the coarse feature maps output by all preceding USBs:
H_down-in^j = Concat(H_coarse^1, ..., H_coarse^j)    (10)
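Eqs. (9)-(10) can be sketched as follows; the pool size of 3 is an assumption chosen to invert the ×3 up-sampling of the USB sketch above:

```python
import torch
import torch.nn as nn

# Sketch of one DSB: a single average-pooling layer (Eq. 9) whose input is
# the concatenation of the coarse maps of all preceding USBs (Eq. 10).
class DSB(nn.Module):
    def __init__(self):
        super().__init__()
        self.pool = nn.AvgPool2d(kernel_size=3, stride=3)

    def forward(self, coarse_maps):
        h_down_in = torch.cat(coarse_maps, dim=1)  # dense connection (Eq. 10)
        return self.pool(h_down_in)               # down-sampling (Eq. 9)

dsb = DSB()
# two preceding USBs, each contributing 60 coarse channels at 24x24
h_down_out = dsb([torch.randn(1, 60, 24, 24), torch.randn(1, 60, 24, 24)])
```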
The feature-map output of the DSB also goes in two directions, as shown in Fig. 2: one into all subsequent USBs, the other into the LLRFFB.
(3) Local multi-level low-resolution feature fusion block (LLRFFB)
The LLRFFB receives the multiple levels of low-resolution feature maps output by all DSBs. The red dashed box on the left of Fig. 4 shows its structure. The LLRFFB first fuses these multi-level low-resolution feature maps containing different features, and then reduces the dimensionality of the fused features. The process can be expressed as:
H_LLRFFB-out = Conv_1x1(Concat(H_down-out^1, ..., H_down-out^m))    (11)
where H_down-out^m denotes the output of the m-th DSB in the IUDDB and H_LLRFFB-out denotes the output of the LLRFFB. Concat(·) denotes the feature-fusion operation and Conv_1x1(·) the dimension-reduction operation. Label ① in Fig. 2 marks the output of the LLRFFB, which is fed into the GLRFFB.
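A sketch of Eq. (11) in PyTorch follows; the 64-channel output is an assumption based on the DF channel counts reported in the experimental settings:

```python
import torch
import torch.nn as nn

# Sketch of the LLRFFB: concatenate the outputs of all m DSBs, then reduce
# the channel dimension with a 1x1 convolution (Eq. 11).
class LLRFFB(nn.Module):
    def __init__(self, in_channels, out_channels=64):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, down_outs):
        fused = torch.cat(down_outs, dim=1)  # feature fusion
        return self.reduce(fused)            # dimension reduction

llrffb = LLRFFB(in_channels=3 * 64)
h_llrffb_out = llrffb([torch.randn(1, 64, 8, 8) for _ in range(3)])
```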
(4) Local multi-level high-resolution feature fusion block (LHRFFB)
The blue dashed box on the right of Fig. 4 shows the structure of the LHRFFB. Its structure is very simple: it contains only one feature-fusion operation, which fuses the refined high-resolution feature maps output by all m USBs, completing the local multi-level high-resolution feature fusion. The operation in the LHRFFB can be described as:
H_LHRFFB-out = Concat(H_refined^1, ..., H_refined^m)    (12)
where H_refined^m denotes the refined feature map output by the m-th USB of the IUDDB and H_LHRFFB-out denotes the output of the LHRFFB. It is marked by label ② in Fig. 2 and is fed into the GHRFFB.
(5) RL
Borrowing a residual-learning structure in network design brings two major benefits. First, residual learning effectively suppresses the vanishing-gradient problem that arises during training. Second, residual learning lets the network learn only the residual between the start point and the end point of the connection, effectively reducing the computational complexity of the network and accelerating its convergence. The IUDDB also contains a residual-learning structure unlike that of any other network model. As shown by the yellow line at the top of Fig. 2, this new residual-learning structure connects the output of the first DSB in the IUDDB to the output of the last DSB, so that the IUDDB module only needs to learn the residual between them. This new residual structure can be described by Eq. (13):
H_IUDDB-b^n = H_down-out^1 + H_down-out^m    (13)
where H_IUDDB-b denotes one output of the IUDDB, which is fed into all subsequent IUDDBs so that the IUDDBs form a densely connected structure, and n indexes the n-th IUDDB in the network. Label ③ in Fig. 2 marks the output H_IUDDB-b.
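The residual connection of Eq. (13) reduces to an element-wise sum of two feature maps of identical shape; a trivial sketch:

```python
import torch

# Sketch of the IUDDB residual connection (Eq. 13): the block only learns
# the residual between the first DSB output and the last DSB output, so the
# two are summed before being passed on to all later IUDDBs.
def iuddb_residual(h_down_out_first, h_down_out_last):
    # identical shapes are required for the element-wise sum
    return h_down_out_first + h_down_out_last

first = torch.zeros(1, 64, 8, 8)
last = torch.ones(1, 64, 8, 8)
h_iuddb_b = iuddb_residual(first, last)
```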
(6) Module outputs
As can be seen from Fig. 2, every IUDDB in IUDFFN except the last has three outputs, marked by labels ①, ② and ③. Label ① marks the low-resolution feature map obtained after fusing and dimension-reducing the local multi-level low-resolution feature maps; these maps are fed into the GLRFFB. Label ② marks the high-resolution feature map obtained after fusing the local multi-level high-resolution feature maps; these are fed into the GHRFFB. Label ③ marks the low-resolution feature map that the IUDDB passes to all IUDDBs after it. The overall output of an IUDDB can therefore be described as:
H_LLRFFB-out^k, H_LHRFFB-out^k, H_IUDDB-b^k = F_IUDDB^k(·)    (14)
where F_IUDDB^k(·) denotes the operations performed in the k-th IUDDB, 1 ≤ k ≤ M, and M denotes the number of IUDDBs in the network. Label ① indicates H_LLRFFB-out^k, label ② indicates H_LHRFFB-out^k, and label ③ indicates H_IUDDB-b^k.
2. Global multi-level low-resolution feature fusion block (GLRFFB)
The GLRFFB mainly contains two operations, as shown in the red dashed box on the left of Fig. 5: a feature-fusion operation and a deconvolution up-sampling operation.
IUDFFN first extracts the shallow image feature H_0 in the shallow feature extraction module F_SF; afterwards, each IUDDB outputs a low-resolution feature map H_LLRFFB-out^k to the GLRFFB. The first operation in the GLRFFB fuses all of these low-resolution feature maps from different levels:
H_GLRFFB-1 = Concat(H_LLRFFB-out^1, ..., H_LLRFFB-out^M)    (15)
where H_LLRFFB-out^1 denotes the low-resolution feature map that the first IUDDB of IUDFFN outputs to the GLRFFB module, and H_GLRFFB-1 denotes the intermediate feature map output by the GLRFFB module after this first operation.
The GLRFFB receives the low-resolution feature maps output by the multiple IUDDB levels, while the GHRFFB receives the high-resolution feature maps output by the multiple IUDDB levels. There are two ways to fuse the low- and high-resolution feature maps generated in the IUDFFN network model. One is to first down-sample the high-resolution feature maps to low resolution, then fuse all the low-resolution feature maps, and finally enlarge the image from the low-resolution space to the high-resolution space in the image reconstruction module of the network. The other is to first up-sample the low-resolution feature maps obtained in the network into the high-resolution space, then fuse all the high-resolution feature maps there, and then use the fused high-resolution feature maps to reconstruct the final high-resolution image. The second method does not enlarge the image in the reconstruction layer of the network and can make full use of the high- and low-resolution image features extracted by the intermediate levels of IUDFFN, so the present invention chooses the second method to fuse the low- and high-resolution feature maps.
Therefore, after the feature-fusion operation, the GLRFFB enlarges the fused low-resolution feature map by deconvolution:
H_GLRFFB = Deconv(H_GLRFFB-1)    (16)
where Deconv(·) denotes the deconvolution operation and H_GLRFFB denotes the output of the GLRFFB module.
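Eqs. (15)-(16) can be sketched together; the kernel/stride/padding values are the same illustrative ×3 choices as in the USB sketch, and the 240-channel output follows the GLRFFB channel count given in the experimental settings:

```python
import torch
import torch.nn as nn

# Sketch of the GLRFFB: fuse the M per-IUDDB low-resolution maps (Eq. 15),
# then enlarge the fused map into the high-resolution space by
# deconvolution (Eq. 16).
class GLRFFB(nn.Module):
    def __init__(self, in_channels, out_channels=240):
        super().__init__()
        self.deconv = nn.ConvTranspose2d(in_channels, out_channels,
                                         kernel_size=7, stride=3, padding=2)

    def forward(self, llrffb_outs):
        h_glrffb_1 = torch.cat(llrffb_outs, dim=1)  # Eq. (15)
        return self.deconv(h_glrffb_1)              # Eq. (16)

glrffb = GLRFFB(in_channels=3 * 64)  # M = 3 IUDDBs, 64 channels each
h_glrffb = glrffb([torch.randn(1, 64, 8, 8) for _ in range(3)])
```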
3. Global multi-level high-resolution feature fusion block (GHRFFB)
Each IUDDB outputs a high-resolution feature map H_LHRFFB-out^k. These high-resolution feature maps are refined features obtained by distillation and are small in scale, so in the GHRFFB the invention simply fuses these multi-level high-resolution feature maps and outputs the result. The structure of the GHRFFB is shown in the blue dashed box on the right of Fig. 5. The operation performed in the GHRFFB can be described as:
H_GHRFFB = Concat(H_LHRFFB-out^1, ..., H_LHRFFB-out^M)    (17)
where H_LHRFFB-out^2 denotes the high-resolution feature map that the second IUDDB of IUDFFN outputs to the GHRFFB module, and H_GHRFFB denotes the output of the GHRFFB module.
4. Image reconstruction module
The structure of the REC module in IUDFFN is shown in Fig. 6. It draws on the late-upsampling model design philosophy and contains one feature-fusion operation and two cascaded convolution operations. The feature-fusion operation fuses the high-resolution feature maps coming into this module from the GLRFFB and GHRFFB modules. Using two cascaded convolutions at the end of the network effectively stabilizes the quality of the high-resolution images the model generates. The operation in this module can be described as:
I_SR = Conv_2(Conv_1(Concat(H_GLRFFB, H_GHRFFB)))    (18)
where Conv_1(·) and Conv_2(·) denote the operations performed by the two cascaded convolutions. I_SR denotes the high-resolution image output by the IUDFFN network after the super-resolution reconstruction process, corresponding to the low-resolution input image I_LR.
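Eq. (18) can be sketched directly; the intermediate channel width is an assumption, and the input width of 240 + 240 follows the GLRFFB/GHRFFB channel counts in the experimental settings:

```python
import torch
import torch.nn as nn

# Sketch of the reconstruction module: fuse the two global feature maps,
# then apply two cascaded convolutions to produce the RGB output I_SR (Eq. 18).
class REC(nn.Module):
    def __init__(self, in_channels, mid_channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, mid_channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(mid_channels, 3, kernel_size=3, padding=1)

    def forward(self, h_glrffb, h_ghrffb):
        fused = torch.cat([h_glrffb, h_ghrffb], dim=1)  # feature fusion
        return self.conv2(self.conv1(fused))            # two cascaded convs

rec = REC(in_channels=240 + 240)  # both global fusion blocks output 240 channels
i_sr = rec(torch.randn(1, 240, 24, 24), torch.randn(1, 240, 24, 24))
```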
From the above description of the IUDFFN network model, three main innovations can be identified. (1) An advanced model design philosophy: the model makes full use of the multiple levels of high- and low-resolution feature maps generated by its intermediate levels and, innovatively, chooses to fuse these feature maps in the high-resolution space, realizing the model's design concept. (2) The IUDDB in IUDFFN innovatively designs new dense-connection and residual-learning structures: the new dense connections pass the information output by each USB (DSB) to all DSBs (USBs) after it, which not only strengthens feature reuse but also extracts new image features; the new residual-learning structure connects the output of the first DSB in the IUDDB to the output of the last DSB, so the IUDDB only needs to learn the residual between the two, reducing computation, accelerating training, and improving performance. (3) The advanced feature-distillation structure is introduced appropriately into the USBs of the IUDDB, which both reduces the network size and improves reconstruction performance.
The present invention is further illustrated below with experimental data.
1. Experimental settings
In the IUDFFN model, every convolution operation in a convolutional layer is followed by a Leaky ReLU activation. IUDFFN is trained only for the ×3 magnification factor. The kernel size in the USB and DSB is set to 7×7 in order to enlarge the receptive field of the up-sampling and down-sampling operations and deeply mine the latent relations between low- and high-resolution feature maps. All other kernel sizes are 3×3. In the network-scale study below, the parameters are finally set to M = 3 and m = 5, so the output channel numbers of the SF, DF, GLRFFB, GHRFFB and REC modules of IUDFFN are 64, (320, 80, 64), 240, 240 and 3, respectively.
The network is trained with the L_1 loss function. Performance is evaluated quantitatively with the PSNR (peak signal-to-noise ratio) and SSIM (structural similarity) metrics widely used in image SR, and subjectively by human visual inspection. The network model is implemented in the PyTorch framework; the experimental hardware comprises an i7-8700K CPU, an NVIDIA 2070 SUPER GPU with 8 GB of memory, and 16 GB of RAM. Training runs for 700 epochs with a batch size of 16. The Adam [54] optimizer is used to optimize the learning rate of the network model, with hyper-parameters β_1 = 0.9 and β_2 = 0.999; the initial learning rate is set to 1×10^-4 and decays adaptively as training proceeds.
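The training configuration above can be sketched as a minimal loop; the toy one-layer model stands in for IUDFFN, and the step-decay scheduler is an assumption, since the text only states that the learning rate decreases adaptively as training proceeds:

```python
import torch
import torch.nn as nn

# Sketch of the stated training setup: L1 loss, Adam with beta1=0.9,
# beta2=0.999, initial learning rate 1e-4, batch size 16, 32x32 LR patches.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # toy stand-in for IUDFFN
criterion = nn.L1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))
# illustrative decay schedule (the patent does not specify the schedule)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)

lr_batch = torch.randn(16, 3, 32, 32)
hr_batch = torch.randn(16, 3, 32, 32)

for _ in range(2):  # two illustrative steps instead of 700 epochs
    optimizer.zero_grad()
    loss = criterion(model(lr_batch), hr_batch)
    loss.backward()
    optimizer.step()
    scheduler.step()
```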
2. Training and test sets
The network model is trained on the DIV2K dataset, which contains 800 high-definition training images. Before being fed into the network for training, these high-resolution images are first down-sampled bicubically to obtain the corresponding low-resolution images; the low- and high-resolution image pairs form the training set. The low-resolution images are first randomly cropped into 32×32 patches and then randomly rotated by 90°, 180° or 270° before being fed into the network for training. For performance testing, five benchmark sets widely used in the image super-resolution field are employed: Set5, Set14, BSD100, Urban100 and Manga109.
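The data preparation just described (bicubic ×3 down-sampling, a random 32×32 LR crop, and a random rotation by a multiple of 90°) can be sketched as follows; the function name and tensor layout are illustrative, not taken from the patent:

```python
import random
import torch
import torch.nn.functional as F

# Sketch of the training-pair preparation: bicubic x3 down-sampling of an HR
# image, a random 32x32 LR crop with the aligned HR crop, and a shared random
# rotation by 0/90/180/270 degrees.
def make_training_pair(hr, scale=3, patch=32):
    lr = F.interpolate(hr, scale_factor=1 / scale, mode='bicubic',
                       align_corners=False)
    _, _, h, w = lr.shape
    y = random.randint(0, h - patch)
    x = random.randint(0, w - patch)
    lr_patch = lr[:, :, y:y + patch, x:x + patch]
    hr_patch = hr[:, :, y * scale:(y + patch) * scale,
                        x * scale:(x + patch) * scale]
    k = random.randint(0, 3)  # rotate both patches by k * 90 degrees
    return (torch.rot90(lr_patch, k, dims=(2, 3)),
            torch.rot90(hr_patch, k, dims=(2, 3)))

lr_p, hr_p = make_training_pair(torch.randn(1, 3, 120, 120))
```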
3. Network reliability study and scale selection
(1) Ablation experiments
To verify the reliability and stability of the IUDFFN design philosophy and structural arrangement, detailed ablation experiments were carried out on the main structures of the network using a controlled-variable approach. Including the original network as designed, seven comparison networks were designed in total. To accelerate training, the training hyper-parameters were adjusted: the batch size was set to 8 and the number of epochs to 100. With a magnification factor of 3 and Set5 as the test set, the best PSNR achieved by each of the seven networks within 100 epochs was recorded in Table 1. The table shows that Structure 7, which contains all the structural designs, achieved the highest performance, demonstrating that the IUDFFN design philosophy is advanced and its structural arrangement reasonable: every module of the network is indispensable, and removing any module degrades network performance.
Table 1. Comparison of quantitative results of network models with different structures
(√ indicates the structure is included in the model, × that it is not)
[Table 1 is rendered as images in the original and its contents are not recoverable here.]
(2) Network-scale study
The scale parameters of the IUDFFN network are mainly M (the number of IUDDBs) and m (the number of USBs and DSBs in each IUDDB). In CNN (convolutional neural network) based applications, network performance typically changes as the depth and width of the network, i.e. its scale, grow. Within a certain range, performance improves as the scale increases, but beyond that range problems such as vanishing gradients and over-fitting to the training set appear during training and performance drops. To obtain good values of the two scale-controlling parameters M and m, multiple experiments were run. Again, to speed up the experiments, the hyper-parameters were reduced appropriately: batch size 8, 120 epochs, magnification factor still 3, and Set5 as the test set. The performance curves during training are recorded in Fig. 7, where the legend M3m6 means M = 3 and m = 6, and the other legends follow analogously.
Observing the curves in the figure, the M3m5 model performs well: its performance is above that of M2m4, M2m5, M3m4, M4m6 and M4m5. Although its performance is slightly below that of M3m6, it has far fewer parameters than M3m6 and already performs excellently. To balance parameter count and performance, the scale parameters of the IUDFFN model are finally set to M = 3 and m = 5.
4. Experimental results and analysis
(1) Objective comparison of reconstructed images
Several classic and state-of-the-art super-resolution algorithms and network models were selected for objective comparison: the classic Bi-cubic method, and previously proposed advanced network models including SRCNN, DRCN, LapSRN, DRRN, MemNet, EDSR, RDN and RCAN. The comparison results are recorded in Table 2 below.
Table 2. Quantitative comparison of the proposed network IUDFFN with other state-of-the-art methods or network structures
(the best and second-best results are shown in bold and underlined, respectively)
[Table 2 is rendered as images in the original and its contents are not recoverable here.]
As the table shows, on all test sets except Manga109, with a magnification factor of 3 and PSNR and SSIM as evaluation metrics, IUDFFN achieves better objective performance than the other advanced methods. Specifically, in PSNR, IUDFFN exceeds the advanced model RDN by 0.44 dB, 0.54 dB, 0.43 dB and 0.56 dB on the benchmark sets Set5, Set14, BSD100 and Urban100, respectively, and exceeds the advanced model RCAN by 0.41 dB, 0.46 dB, 0.37 dB and 0.27 dB.
(2) Visual comparison of reconstructed images
The reconstruction quality of the IUDFFN model is compared visually with that of other advanced methods or network models. Figs. 8, 9 and 10 show the comparisons on images from different test sets; the method used for each reconstruction and its PSNR value are annotated below each image.
In Fig. 8, the inside of the sunflower on the left of the true high-resolution image has a grainy texture; only the image reconstructed by the IUDFFN model of the present invention reproduces this graininess, while the images reconstructed by the other methods show it only weakly. In Fig. 9, although the local structure of the building is extremely complex, the image reconstructed by the IUDFFN model is structurally consistent with the true high-resolution image and close to it in texture; moreover, compared with the other methods, the details of the IUDFFN reconstruction are richer. Fig. 10 shows IUDFFN's reconstruction of a manga image: at the character's hair in the upper-left corner, the images reconstructed by all the other advanced methods suffer from rather severe artifacts compared with the original, whereas the image reconstructed by the IUDFFN model of the present invention is only slightly affected by artifacts, is closest to the true high-resolution image, is visually comfortable, and achieves the highest image-reconstruction performance.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts of the embodiments reference may be made to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
The above description of the disclosed embodiments enables a person skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to a person skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An image super-resolution magnification model, characterized by comprising: a shallow feature extraction module F_SF, a multi-level low- and high-resolution feature extraction module F_DF, a global multi-level low-resolution feature fusion module F_GLRFFB, a global multi-level high-resolution feature fusion module F_GHRFFB, and an image reconstruction module F_REC;
the shallow feature extraction module F_SF is configured to perform shallow feature extraction on an input low-resolution image I_LR to obtain a shallow feature map H_0;
the multi-level low- and high-resolution feature extraction module F_DF comprises M densely connected iterative up-down sampling distillation blocks IUDDB and is configured to perform M levels of low- and high-resolution feature extraction in sequence through the M densely connected IUDDBs to obtain a low-resolution feature map H_DF-L and a high-resolution feature map H_DF-H, wherein the input of each IUDDB after the first IUDDB is the concatenation of the outputs of all preceding IUDDBs;
the global multi-level low-resolution feature fusion module F_GLRFFB is configured to receive the M H_DF-L and fuse them to obtain a fused low-resolution feature map H_GLRFFB;
the global multi-level high-resolution feature fusion module F_GHRFFB is configured to receive the M H_DF-H and fuse them to obtain a fused high-resolution feature map H_GHRFFB;
the image reconstruction module F_REC is configured to receive the H_GLRFFB and the H_GHRFFB and generate a super-resolution magnified image I_SR.
2. The image super-resolution magnification model according to claim 1, characterized in that the shallow feature extraction module F_SF uses a convolutional layer to extract the shallow feature map H_0 from the input low-resolution image I_LR.
3. The image super-resolution magnification model according to claim 1, characterized in that the iterative up-down sampling distillation block IUDDB comprises: an up-sampling block USB, a down-sampling block DSB, a local multi-level low-resolution feature fusion block LLRFFB, a local multi-level high-resolution feature fusion block LHRFFB, and a residual-learning module RL;
the USB comprises a deconvolution layer and an information-distillation layer, wherein the input of the deconvolution layer in the i-th up-sampling block is H_up-in^i and its output after the deconvolution operation is H_up-out^i; the information-distillation layer receives the H_up-out^i and performs a channel-split operation to obtain a coarse image feature map H_coarse^i and a refined image feature map H_refined^i, wherein the H_coarse^i is fed into all subsequent DSBs in the IUDDB and the H_refined^i is fed into the LHRFFB of the current IUDDB;
wherein, when i is 1, the input of the USB is H_0; when i is not 1, the input of the current USB is the concatenation of the outputs of all DSBs preceding the current USB;
the DSB comprises an average-pooling layer configured to average-pool the input feature map, wherein the input of the DSB is the concatenation of the H_coarse^i output by all USBs preceding the current DSB; the DSB outputs a low-resolution feature map, which is fed both into the LLRFFB of the current IUDDB and into all USBs that follow within the current IUDDB;
the LLRFFB is configured to fuse all received low-resolution feature maps, reduce the dimensionality of the fused features, and output H_LLRFFB-out to the F_GLRFFB;
the LHRFFB is configured to fuse all received H_refined^i, completing the local multi-level high-resolution feature fusion, and output H_LHRFFB-out to the F_GHRFFB;
the residual-learning module RL is configured to learn the residual between the output of the first DSB in the F_DF and the output of the current DSB, obtain the residual output H_IUDDB-b, and feed H_IUDDB-b into all subsequent IUDDBs, so that the IUDDBs form a densely connected structure.
4. The image super-resolution magnification model according to claim 1, characterized in that the F_GLRFFB comprises a feature-fusion unit and a deconvolution up-sampling unit;
the feature-fusion unit is configured to fuse all received low-resolution feature maps to obtain the fused low-resolution feature map as an intermediate feature map H_GLRFFB-1;
the deconvolution up-sampling unit is configured to enlarge the H_GLRFFB-1 by deconvolution to obtain the output H_GLRFFB of the F_GLRFFB.
5. The image super-resolution magnification model according to claim 1, characterized in that the F_REC comprises one feature-fusion unit and two cascaded convolution units;
the feature-fusion unit is configured to fuse the H_GLRFFB and the H_GHRFFB fed into F_REC;
the two cascaded convolution units are configured to apply two successive convolutions to the fused feature map to obtain I_SR.
6. An image super-resolution magnification method, characterized by comprising the following steps:
S1. performing shallow feature extraction on an input low-resolution image I_LR to obtain a shallow feature map H_0;
S2. performing M densely connected levels of low- and high-resolution feature extraction in sequence to obtain a low-resolution feature map H_DF-L and a high-resolution feature map H_DF-H;
S3. receiving the M H_DF-L and fusing them to obtain a fused low-resolution feature map H_GLRFFB;
S4. receiving the M H_DF-H and fusing them to obtain a fused high-resolution feature map H_GHRFFB;
S5. receiving the H_GLRFFB and the H_GHRFFB and generating a super-resolution magnified image I_SR.
7. The image super-resolution magnification method according to claim 6, characterized in that in S1 the shallow feature map H_0 is extracted from the input low-resolution image I_LR by a convolutional layer.
8. The image super-resolution magnification method according to claim 6, characterized in that S2 specifically comprises the following:
up-sampling the input feature map, specifically: performing a deconvolution operation on the i-th input H_up-in^i and outputting H_up-out^i;
performing a channel-split operation on the deconvolved feature map to obtain a coarse image feature map H_coarse^i and a refined image feature map H_refined^i;
down-sampling the H_coarse^i, and fusing all the H_refined^i;
wherein the first input H_up-in^1 is H_0, and when i is not 1 the input is the concatenation of the down-sampled outputs of the preceding i levels;
average-pooling the up-sampled low-resolution feature map, and subjecting the average-pooled low-resolution feature maps to feature fusion and to up-sampling, respectively;
fusing all received low-resolution feature maps, reducing the dimensionality of the fused features, and outputting H_LLRFFB-out;
fusing all received H_refined^i, completing the local multi-level high-resolution feature fusion, and outputting H_LHRFFB-out;
learning the residual between the up-sampled output of the first level and the up-sampled output of the current level, obtaining the residual output H_IUDDB-b, and proceeding to the up-sampling of the next level.
9. The image super-resolution magnification method according to claim 8, characterized in that S3 specifically comprises:
fusing all the dimension-reduced low-resolution feature maps output by S2 to obtain the fused low-resolution feature map as an intermediate feature map H_GLRFFB-1;
enlarging the H_GLRFFB-1 by deconvolution and outputting H_GLRFFB;
S4 specifically comprises:
fusing all high-resolution feature maps output by S2 to obtain the fused high-resolution feature map H_GHRFFB.
10. The image super-resolution magnification method according to claim 6, characterized in that S5 specifically comprises: fusing the H_GLRFFB and the H_GHRFFB, and applying two successive convolutions to the fused feature map to obtain I_SR.
PCT/CN2021/140258 2021-09-14 2021-12-22 Image super-resolution magnification model and method WO2023040108A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111075866.5A CN113763251B (zh) 2021-09-14 2021-09-14 Image super-resolution magnification model and method
CN202111075866.5 2021-09-14

Publications (1)

Publication Number Publication Date
WO2023040108A1 true WO2023040108A1 (zh) 2023-03-23

Family

ID=78795698

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/140258 WO2023040108A1 (zh) 2021-09-14 2021-12-22 一种图像超分辨率放大模型及其方法

Country Status (2)

Country Link
CN (1) CN113763251B (zh)
WO (1) WO2023040108A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117132472A (zh) * 2023-10-08 2023-11-28 兰州理工大学 Image super-resolution reconstruction method based on forward-backward separable self-attention
CN117495681A (zh) * 2024-01-03 2024-02-02 国网山东省电力公司济南供电公司 Infrared image super-resolution reconstruction system and method
CN117590761A (zh) * 2023-12-29 2024-02-23 广东福临门世家智能家居有限公司 Door-opening state detection method and system for smart homes
CN117132472B (zh) * 2023-10-08 2024-05-31 兰州理工大学 Image super-resolution reconstruction method based on forward-backward separable self-attention

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
CN113763251B (zh) * 2021-09-14 2023-06-16 浙江师范大学 一种图像超分辨率放大模型及其方法

Citations (5)

Publication number Priority date Publication date Assignee Title
US10304193B1 (en) * 2018-08-17 2019-05-28 12 Sigma Technologies Image segmentation and object detection using fully convolutional neural network
CN111861961A (zh) * 2020-07-25 2020-10-30 安徽理工大学 Multi-scale residual fusion model for single-image super-resolution and restoration method thereof
CN112070702A (zh) * 2020-09-14 2020-12-11 中南民族大学 Image super-resolution reconstruction system and method with multi-scale residual-feature discrimination enhancement
CN112862688A (zh) * 2021-03-08 2021-05-28 西华大学 Image super-resolution reconstruction model and method based on a cross-scale attention network
CN113763251A (zh) * 2021-09-14 2021-12-07 浙江师范大学 Image super-resolution magnification model and method

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN109829855B (zh) * 2019-01-23 2023-07-25 南京航空航天大学 Super-resolution reconstruction method based on fusing multi-level feature maps
US11398013B2 (en) * 2019-10-18 2022-07-26 Retrace Labs Generative adversarial network for dental image super-resolution, image sharpening, and denoising
CN111161150B (zh) * 2019-12-30 2023-06-23 北京工业大学 Image super-resolution reconstruction method based on a multi-scale attention cascade network
CN112581409B (zh) * 2021-01-05 2024-05-07 戚如嬅耳纹科技(深圳)有限公司 End-to-end image dehazing method based on a multiple-information-distillation network
CN112884650B (zh) * 2021-02-08 2022-07-19 武汉大学 Hybrid image super-resolution method based on adaptive texture distillation
CN113240580B (zh) * 2021-04-09 2022-12-27 暨南大学 Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation



Also Published As

Publication number Publication date
CN113763251B (zh) 2023-06-16
CN113763251A (zh) 2021-12-07


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21957372

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE