CN115526777A - Blind super-resolution network establishment method, blind super-resolution method and storage medium - Google Patents
Blind super-resolution network establishment method, blind super-resolution method and storage medium
- Publication number
- CN115526777A CN115526777A CN202211081493.7A CN202211081493A CN115526777A CN 115526777 A CN115526777 A CN 115526777A CN 202211081493 A CN202211081493 A CN 202211081493A CN 115526777 A CN115526777 A CN 115526777A
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/73—Deblurring; Sharpening
Abstract
The invention discloses a blind super-resolution network establishment method, a blind super-resolution method and a storage medium, belonging to the field of computer vision. The method comprises the following steps: constructing training samples from high-resolution images and their corresponding degraded images, and dividing them into a training set, a test set and a verification set; constructing a blind super-resolution network comprising a degradation estimation network and a generation network, wherein the generation network comprises an up-sampling network and a feature extraction network containing a plurality of alternately connected deformable convolution layers and feature extraction modules; the degradation estimation network estimates the degradation information at each pixel position of the input image and feeds it to each deformable convolution layer; after the feature extraction network extracts a feature map from the input image, the up-sampling module reconstructs the feature map at a specified magnification of the input-image size to obtain a super-resolution image; and taking the degraded images in the training samples as input images, training, testing and verifying the blind super-resolution network to obtain a blind super-resolution network for performing super-resolution reconstruction on images. The method improves the super-resolution reconstruction effect.
Description
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a blind super-resolution network establishment method, a blind super-resolution method and a storage medium.
Background
With the popularization of smartphones, the rise of live network broadcasting, and the surveillance equipment distributed throughout city streets, images have become an indispensable part of daily life. However, owing to the limitations of capture devices, the influence of complex shooting environments, and the compression losses of network transmission, images often suffer from problems such as noise, compression artifacts and low resolution, which greatly degrade visual quality and adversely affect tasks such as object detection and face recognition. How to improve image resolution on existing hardware has therefore become an urgent problem.
Image super-resolution technology can improve image resolution through algorithms alone, without upgrading the hardware, and has therefore attracted wide attention. However, existing super-resolution techniques are mainly designed for ideal images: they assume that low-resolution images are obtained by Bicubic down-sampling of high-resolution images, and construct data sets in this way to train super-resolution networks. When facing real-scene images, which often carry noise, artifacts and other defects, the performance of existing techniques drops sharply. A blind super-resolution method for real-scene images therefore has great practical value and represents the development trend of super-resolution technology. In recent years, with the development of deep learning, represented by convolutional neural networks, researchers have begun applying deep learning to super-resolution, letting the network automatically extract features from the low-resolution image and construct the high-resolution image from them.
However, because real-scene images carry varied defects and complex, diverse backgrounds, existing deep-learning-based blind super-resolution methods cannot focus well on regions of the image that are texture-rich or severely degraded, so the reconstruction results are unsatisfactory.
Disclosure of Invention
In view of the defects of the prior art and the need for improvement, the invention provides a blind super-resolution network establishment method, a blind super-resolution method and a storage medium, aiming to improve the super-resolution reconstruction effect.
To achieve the above object, according to one aspect of the present invention, there is provided a blind super-resolution network establishment method, including:
performing a degradation operation on each high-resolution image in a high-resolution image data set to obtain corresponding degraded images; constructing training samples from the high-resolution images and their corresponding degraded images, and dividing all the training samples into a training set, a test set and a verification set;
constructing a blind super-resolution network to be trained; the blind super-resolution network comprises a degradation estimation network and a generation network; the degradation estimation network is used for estimating the degradation information at each pixel position of the input image, and the generation network is used for performing super-resolution reconstruction on the input image using this degradation information; the generation network comprises a feature extraction network and an up-sampling network; the feature extraction network comprises a plurality of deformable convolution layers and a plurality of feature extraction modules connected alternately, the degradation information output by the degradation estimation network is input to each deformable convolution layer, and the feature extraction network performs feature extraction on the input image to obtain a feature map; the up-sampling module reconstructs the feature map at a specified magnification of the input-image size to obtain a super-resolution image;
and taking the degraded images in the training samples as input images, and respectively training, testing and verifying the blind super-resolution network to be trained using the training set, the test set and the verification set, to obtain a blind super-resolution network for performing super-resolution reconstruction on images.
The blind super-resolution network established by the invention comprises a degradation estimation network, which estimates the degradation information at each pixel position of the input image, and a generation network, which generates the super-resolution image. The feature extraction network of the generation network introduces the per-pixel degradation information predicted by the degradation estimation network into the generation network through deformable convolution modules, and the offsets of the deformable convolutions are generated from this information.
Further, the degradation estimation network is a UNet network, with a spatial attention module inserted into its encoding module.
The invention uses the UNet network as the backbone of the degradation estimation network and introduces a spatial attention module into the encoding module, so that the degradation estimation network can attend to texture-rich positions in the degraded image, improving the accuracy of the per-pixel degradation estimates and thereby assisting the generation network in producing better high-resolution images.
Further, a channel attention module is inserted into the decoding module of the UNet network.
In a UNet network, the input of a decoding module comes from both the previous decoding module and an encoding module, whose receptive fields differ, so selecting an appropriate receptive field is crucial to the accuracy of degradation estimation. The invention introduces a channel attention module into the decoding module, so that the network can adaptively select receptive fields for degradations of different degrees, effectively improving the accuracy of the corresponding degradation estimates and assisting the generation network in producing better high-resolution images.
Further, training the blind super-resolution network to be trained comprises:
a pre-training stage: training the degradation estimation network with the training set to obtain a trained degradation estimation network;
a joint training stage: jointly training the trained degradation estimation network and the generation network with the training set.
The blind super-resolution network established by the invention contains both a degradation estimation network and a generation network, and its structure is relatively complex, so direct end-to-end training would be difficult. The invention therefore adopts a two-stage training scheme. In the first, pre-training stage, the degradation estimation network is pre-trained so that it acquires good degradation estimation performance; in the second, joint training stage, the pre-trained degradation estimation network and the generation network are trained jointly. This effectively reduces the training difficulty and improves training efficiency while preserving the super-resolution reconstruction effect of the whole network.
Further, the degradation operation includes: blurring the high-resolution image with a spatially varying blur kernel and then down-sampling it; the training sample further includes the spatially varying blur kernel corresponding to the degraded image, and the degradation information estimated by the degradation estimation network is this spatially varying blur kernel;
the loss function for the pre-training phase is:
loss_p = ||k_p - k_g||_1 + ||k_p - k_g||_1 * g(I_LR)
the loss function for the joint training phase is:
loss = ω_1 * loss_p + ω_2 * loss_g
wherein loss denotes the loss of the blind super-resolution network; loss_p and loss_g denote the losses of the degradation estimation network and the generation network, respectively; ω_1 and ω_2 denote the corresponding weights; k_p denotes the spatially varying blur kernel estimated by the degradation estimation network; k_g denotes the spatially varying blur kernel used in the degradation operation; I_LR denotes the input degraded image; g(·) denotes a gradient operator; and ||·||_1 denotes the mean absolute error.
When performing the degradation operation on a high-resolution image, the invention blurs it with a spatially varying blur kernel, so that different pixel positions in the degraded image carry different degradation information; this effectively improves the blind super-resolution network's ability to reconstruct real-scene images. During training, the combination of a degradation gradient loss and a degradation pixel loss serves as the loss function of the degradation estimation network; the degradation gradient loss makes the degradation estimation network focus more on texture-rich positions in the degraded image and improves the estimation accuracy at those positions.
Further, loss_g = ||I_SR - I_HR||_1 + ||g(I_SR) - g(I_HR)||_1
wherein I_SR denotes the super-resolution image output by the generation network, and I_HR denotes the high-resolution image.
The invention adopts the combination of a pixel loss and a gradient loss on the super-resolution image as the loss function of the generation network; the gradient loss makes the generation network pay more attention to texture-rich positions in the degraded image, enhancing the reconstruction quality of the super-resolution image.
Further, the gradient operator is a Scharr operator.
When constructing the loss functions for the degradation estimation network and the generation network, the invention computes gradients with the Scharr operator, which yields a better training effect.
Further, in the process of training the blind super-resolution network to be trained, stochastic gradient descent with a momentum term is used as the optimizer.
According to another aspect of the present invention, there is provided a blind super-resolution method for real-scene images, comprising: inputting a real-scene image into the blind super-resolution network established by the blind super-resolution network establishment method provided by the invention, the network performing super-resolution reconstruction on the real-scene image to obtain a high-resolution image.
According to still another aspect of the present invention, there is provided a computer-readable storage medium comprising a stored computer program; when the computer program is executed by a processor, the apparatus on which the computer-readable storage medium resides is controlled to execute the blind super-resolution network establishment method and/or the blind super-resolution method for real-scene images provided by the invention.
Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:
(1) The invention uses deformable convolution to introduce the per-pixel degradation information predicted by the degradation estimation network into the generation network. Because the degradation information may differ from position to position, the deformable-convolution offsets generated from it also differ across positions, so the deformable convolution can extract more useful information from different positions of the degraded image in a targeted way, improving the super-resolution reconstruction effect.
(2) The invention uses UNet as the backbone of the degradation estimation network, introducing a spatial attention module into the encoding module and a channel attention module into the decoding module. The spatial attention module lets the degradation estimation network focus on texture-rich positions in the degraded image, improving the estimation accuracy at those positions and thereby the reconstruction quality of texture-rich regions; the channel attention module lets the network adaptively select appropriate receptive fields for degradations of different degrees, further improving estimation accuracy and assisting the generation network in producing better high-resolution images.
(3) The invention adopts the combination of a degradation gradient loss and a degradation pixel loss as the loss function of the degradation estimation network; the degradation gradient loss makes the degradation estimation network attend to texture-rich positions in the degraded image, improving estimation accuracy there and assisting the generation network in producing better high-resolution images.
Drawings
Fig. 1 is a flowchart of a blind super-resolution network establishment method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a blind super-resolution network according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a deformable convolution structure according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
In the present application, the terms "first," "second," and the like (if any) in the description and the drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
To address the technical problem that existing super-resolution methods give unsatisfactory reconstruction results, the invention provides a blind super-resolution network establishment method, a blind super-resolution method and a storage medium. The overall idea is as follows: to cope with the varied defects and complex, diverse backgrounds of real-scene images, during blind super-resolution a degradation estimation network within the blind super-resolution network estimates degradation information for each pixel and introduces it into the generation network, so that the generation network can extract more useful information for different positions during reconstruction, improving the super-resolution effect. On this basis, spatial and channel attention mechanisms are introduced into the degradation estimation network, enabling it to better attend to texture-rich positions in the degraded image, improving the accuracy of the degradation estimates and assisting the generation network in producing better high-resolution images.
The following are examples.
Example 1:
A blind super-resolution network establishment method, as shown in fig. 1, includes:
first, performing a degradation operation on each high-resolution image in a high-resolution image data set to obtain corresponding degraded images;
optionally, in this embodiment, the selected high-resolution image data set is specifically a DIV2K data set; in other embodiments of the invention, other datasets of high resolution images may be used.
In this embodiment, the degradation operation on a high-resolution image from the high-resolution image data set proceeds as follows: the high-resolution image is blurred with a spatially varying blur kernel and then down-sampled. With I_HR and I_LR denoting the high-resolution image and the low-resolution (degraded) image respectively, the degradation operation can be expressed as:
I_LR = (k * I_HR)↓
wherein k denotes the spatially varying blur kernel, i.e., the blur kernels at different pixel positions in the image differ; C, H and W denote the number of image channels, the image height and the image width respectively; d denotes the width of the blur kernel at each position; and ↓ denotes the down-sampling operation;
Optionally, the spatially varying blur kernel adopted in this embodiment is composed of anisotropic Gaussian blur kernels, each selected at random; the input image is then convolved with the selected kernels to complete the blurring operation. This embodiment uses the DIV2K data set as input, randomly crops the input images to a size of 320 × 320, samples the kernel width of the anisotropic Gaussian blur kernel from a range depending on the magnification factor s, samples the rotation angle at random, and sets the kernel size d to 21. It should be noted that these parameter values are only exemplary and should not be construed as limiting the invention; in practical applications, other values can be set as needed;
the blurred image is down-sampled by a factor of s, i.e., the top-left pixel of each s × s block is selected, completing the down-sampling;
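The degradation operation described above — a per-pixel blur followed by top-left-pixel down-sampling — can be sketched as follows. This is a minimal single-channel NumPy illustration, not the patent's implementation; the reflect padding mode is an assumption.

```python
import numpy as np

def degrade(hr, kernels, s):
    """Apply a spatially varying blur, then downsample by factor s.

    hr      : (H, W) grayscale high-resolution image (one channel for brevity).
    kernels : (H, W, d, d) per-pixel blur kernels, each summing to 1.
    s       : integer downsampling factor.
    """
    H, W = hr.shape
    d = kernels.shape[-1]
    r = d // 2
    padded = np.pad(hr, r, mode="reflect")
    blurred = np.empty_like(hr, dtype=np.float64)
    for i in range(H):
        for j in range(W):
            # Convolve this pixel's neighborhood with its own kernel.
            patch = padded[i:i + d, j:j + d]
            blurred[i, j] = np.sum(patch * kernels[i, j])
    # I_LR = (k * I_HR)↓ : keep the top-left pixel of each s×s block.
    return blurred[::s, ::s]
```

With delta kernels (all mass at the center) the blur is the identity, so the output reduces to plain strided subsampling, which is a convenient sanity check.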
Then, the high-resolution image, the degraded image and the corresponding spatially varying blur kernel form a training sample. As an alternative implementation, after performing the degradation operation on the high-resolution image data set, this embodiment further applies data-enhancement operations of horizontal and vertical flipping and random rotation (90°, 180°, 270°) to all training samples, and finally divides them into a training set, a validation set and a test set (the validation and test sets being of equal size).
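The augmentation-and-split step above can be sketched with plain Python. The exact split ratio is partially garbled in the source; a 3:1:1 train/validation/test split is assumed here, and the augmentation labels are illustrative placeholders for the actual flip/rotation transforms.

```python
import random

def augment_and_split(samples, seed=0):
    """Augment each (HR, LR, kernel) sample with flips/rotations, then split.

    `samples` is a list of sample identifiers; each is paired with six
    augmentation tags (identity, two flips, three rotations).
    The 3:1:1 split ratio is an assumption, not stated exactly in the source.
    """
    augmented = []
    for s in samples:
        for aug in ("none", "hflip", "vflip", "rot90", "rot180", "rot270"):
            augmented.append((s, aug))
    random.Random(seed).shuffle(augmented)
    n = len(augmented)
    n_train = int(n * 0.6)   # 3 parts of 5
    n_val = int(n * 0.2)     # 1 part of 5 (validation and test equal-sized)
    train = augmented[:n_train]
    val = augmented[n_train:n_train + n_val]
    test = augmented[n_train + n_val:]
    return train, val, test
```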
As shown in fig. 1, this embodiment further includes: constructing a blind super-resolution network to be trained, which comprises a degradation estimation network and a generation network. The degradation estimation network estimates the degradation information at each pixel position of the input image; in this embodiment, this degradation information is specifically the spatially varying blur kernel that produced the degraded image. The generation network performs super-resolution reconstruction on the input image using the degradation information. The network structure of this embodiment is shown in fig. 2;
in the embodiment, a traditional UNet network comprises an encoding module (EncBlock), a decoding module (DecBlock) and an intermediate connection module between the encoding module and the decoding module, wherein on the basis of the traditional UNet network, a Spatial Attention Module (SAM) is added in the encoding module and is used for extracting spatial information of a degraded image to enable the network to pay attention to a position with rich texture, and a Channel Attention Module (CAM) is added in the decoding module, so that the network can adaptively select receptive fields with different degrees for the degraded images with different degrees; referring to fig. 2, in the present embodiment, the encoding module includes a convolutional layer (Conv), an activation function (Relu), a max pooling layer (MaxPool), and a Spatial Attention Module (SAM), the decoding module includes a Channel Attention Module (CAM), a convolutional layer (Conv), and an activation function (Relu), and the intermediate connection module is composed of two convolutional layers and an activation function; it should be noted that the specific structure of the Spatial Attention Module (SAM) and its position in the encoding module, and the specific structure of the Channel Attention Module (CAM) and its position in the decoding module can be flexibly adjusted according to the actual needs; optionally, in this embodiment, the spatial attention module and the channel attention module both adopt a structural form in a CBAM network, the spatial attention module is introduced after the last convolutional layer, and the channel attention module is introduced after the cascade layer;
In this embodiment, the generation network comprises a feature extraction network and an up-sampling network. Referring to fig. 2, the feature extraction network comprises a plurality of deformable convolution layers (DCNs) and a plurality of feature extraction modules connected alternately; the spatially varying blur kernel output by the degradation estimation network is input to each deformable convolution layer, and the feature extraction network performs feature extraction on the input image to obtain a feature map. The up-sampling module reconstructs the feature map at a specified magnification of the input-image size to obtain a super-resolution image. It should be noted that the specific structures of the deformable convolution, the feature extraction network and the up-sampling module can be chosen flexibly according to actual needs. Optionally, in this embodiment the first deformable convolution (i.e., the one between the input image and the feature extraction network) adopts the existing DCNv2 structure; the subsequent deformable convolutions in the feature extraction network are shown in fig. 3, where the offsets of the deformable convolution are obtained by a convolution operation over the spatially varying blur kernel; the feature extraction modules adopt the structural form used in the ESRGAN network, i.e., RRDB modules; and the up-sampling module consists of PixelShuffle;
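The offset-generation step named above — a convolution over the spatially varying blur kernel producing deformable-convolution offsets — can be sketched minimally with a 1×1 convolution, which reduces to a per-pixel matrix multiply. The 1×1 kernel size and the weight matrix are assumptions for illustration; the patent only states that the offsets are obtained by convolving the blur-kernel map.

```python
import numpy as np

def offsets_from_kernel_map(kernel_map, w):
    """Generate deformable-conv offsets from a spatially varying blur kernel.

    kernel_map : (d*d, H, W) — the estimated d×d blur kernel at each pixel,
                 flattened into channels.
    w          : (2*k*k, d*d) — weights of an assumed 1x1 convolution mapping
                 kernel coefficients to offsets.
    Returns (2*k*k, H, W): one (dy, dx) pair per sampling location of a
    k×k deformable kernel, differing at every pixel position.
    """
    dd, H, W = kernel_map.shape
    flat = kernel_map.reshape(dd, H * W)   # one kernel vector per pixel
    return (w @ flat).reshape(-1, H, W)    # per-pixel linear map -> offsets
```

Because the blur kernel differs per pixel, the resulting offsets also differ per pixel, which is exactly what lets the deformable convolution sample different neighborhoods at different positions.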
The generation network of this embodiment uses the deformable convolution layers to introduce the degradation information estimated by the degradation estimation network into the generation network; it can make full use of the spatial information of the blur kernels to generate the offsets of the deformable convolutions, enabling the deformable convolutions to extract more useful information from the degraded image and yielding a higher-quality super-resolution image;
Referring to fig. 2, this embodiment further includes a connection module, composed of a convolutional layer and an activation function, between the degradation estimation network and the generation network; it adjusts the spatially varying blur kernel so that the kernel output by the degradation estimation network can be adapted for input to the deformable convolution layers of the generation network.
Referring to fig. 1, after the training, test and verification sets and the blind super-resolution network to be trained are constructed, the following step is performed: taking the degraded images in the training samples as input images of the blind super-resolution network, and respectively training, testing and verifying it using the training, test and verification sets, to obtain a blind super-resolution network for performing super-resolution reconstruction on images.
Considering that the network structure is relatively complex, an end-to-end training mode is directly adopted, and the training difficulty is relatively large, so that the embodiment trains the blind hyper-diversity network to be trained by adopting a two-stage training mode, and specifically comprises the following steps:
a pre-training stage: training the degradation estimation network by using a training set to obtain a trained degradation estimation network;
a combined training stage: performing joint training on the trained degradation estimation network and the trained generation network by using a training set;
In this two-stage training mode, the first stage (pre-training) gives the degradation estimation network good degradation estimation performance; the second stage (joint training) then trains the pre-trained degradation estimation network together with the generation network. This effectively reduces training difficulty and improves training efficiency while preserving the super-resolution reconstruction effect of the whole network.
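The two-stage scheme above can be sketched as a plain-Python control-flow skeleton. The stub `step` methods and the (degraded image, blur kernel) sample layout are placeholder assumptions; only the training order mirrors the description:

```python
# Stage order of the two-stage training scheme. The "networks" passed in
# are assumed to expose a step() method that runs one optimization step;
# real models, losses and optimizers are deliberately abstracted away.

def pretrain_degradation_net(degradation_net, training_set, epochs):
    """Stage 1: train only the degradation estimation network (loss_p)."""
    for _ in range(epochs):
        for degraded, blur_kernel in training_set:
            degradation_net.step(degraded, blur_kernel)
    return degradation_net

def joint_train(degradation_net, generator_net, training_set, epochs):
    """Stage 2: train both networks jointly (w1*loss_p + w2*loss_g)."""
    for _ in range(epochs):
        for degraded, blur_kernel in training_set:
            kernel_estimate = degradation_net.step(degraded, blur_kernel)
            generator_net.step(degraded, kernel_estimate)
    return degradation_net, generator_net
```

The point of the skeleton is that the generation network is never trained against an untrained degradation estimator: stage 2 always starts from the stage-1 weights.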
To make the network better attend to texture-rich regions of the image, as a preferred implementation this embodiment adopts the combination of a degradation gradient loss and a degradation pixel loss as the loss function of the degradation estimation network, and the combination of a pixel loss and a gradient loss on the super-resolution image as the loss function of the generation network. The degradation gradient loss makes the degradation estimation network focus more on texture-rich positions in the degraded image, improving estimation accuracy there; the gradient loss on the super-resolution image makes the generation network focus more on texture-rich positions, improving the reconstruction quality of the super-resolution image. Specifically, the loss function of the degradation estimation network is:
loss_p = ‖k_p - k_g‖_1 + ‖k_p - k_g‖_1 * g(I_LR)
the loss function of the generated network is:
loss_g = ‖I_SR - I_HR‖_1 + ‖g(I_SR) - g(I_HR)‖_1
Accordingly, the loss function of the pre-training stage is loss_p.
The loss function for the joint training phase is:
loss = ω_1 * loss_p + ω_2 * loss_g
where loss represents the loss of the blind super-resolution network; loss_p and loss_g represent the losses of the degradation estimation network and the generation network respectively, and ω_1 and ω_2 their corresponding weights; k_p represents the spatially varying blur kernel estimated by the degradation estimation network, and k_g the spatially varying blur kernel used in the degradation operation; I_LR represents the input degraded image; g(·) represents a gradient operator, in this embodiment specifically the Scharr operator (experiments show that using the Scharr operator to compute the gradients in the loss functions effectively improves network training); ‖·‖_1 denotes the mean absolute error.
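As a hedged illustration of the two losses, the NumPy sketch below assumes g(·) is an L1-style Scharr gradient magnitude, ‖·‖_1 is the mean absolute error, and represents the spatially varying kernels as simple (H, W) per-pixel maps; none of these representation choices are mandated by the text beyond the formulas themselves:

```python
import numpy as np

# Scharr kernels for horizontal / vertical gradients.
SCHARR_X = np.array([[3, 0, -3], [10, 0, -10], [3, 0, -3]], dtype=float)
SCHARR_Y = SCHARR_X.T

def _filter2d(img, kernel):
    """Naive 'same' 2-D cross-correlation with zero padding."""
    h, w = img.shape
    padded = np.pad(img, 1)
    out = np.empty_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

def scharr_gradient(img):
    """g(.): per-pixel gradient magnitude (L1 form, an assumption)."""
    return np.abs(_filter2d(img, SCHARR_X)) + np.abs(_filter2d(img, SCHARR_Y))

def loss_p(k_pred, k_gt, lr_img):
    """Degradation loss: pixel term plus gradient-weighted term.

    The gradient of the degraded image weights the per-pixel kernel
    error, so texture-rich positions contribute more to the loss.
    """
    err = np.abs(k_pred - k_gt)
    return err.mean() + (err * scharr_gradient(lr_img)).mean()

def loss_g(sr_img, hr_img):
    """Generation loss: pixel L1 plus gradient L1 on the SR image."""
    pixel = np.abs(sr_img - hr_img).mean()
    grad = np.abs(scharr_gradient(sr_img) - scharr_gradient(hr_img)).mean()
    return pixel + grad
```

Note how a flat degraded image (zero gradient) reduces loss_p to the plain pixel term, while textured inputs increase the penalty on kernel errors, which is exactly the texture-focusing behavior claimed above.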
To further improve the training effect, in this embodiment the blind super-resolution network to be trained is optimized with stochastic gradient descent (SGD) with a momentum term: momentum 0.9, weight decay 5×10^-4, batch size 8, an initial learning rate of 10^-3 reduced by a factor of 10 every 50 epochs, and 200 training epochs in total.
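The learning-rate schedule described (initial rate 10^-3, divided by 10 every 50 epochs over 200 epochs) is a plain step decay. A minimal sketch, with illustrative parameter names:

```python
def step_decay_lr(epoch: int, initial_lr: float = 1e-3,
                  drop_every: int = 50, factor: float = 10.0) -> float:
    """Learning rate at a given epoch under the step schedule described
    above: the initial rate is divided by `factor` every `drop_every`
    epochs (epochs are counted from 0)."""
    return initial_lr / (factor ** (epoch // drop_every))
```

Over the 200-epoch run this yields rates of 10^-3, 10^-4, 10^-5 and 10^-6 for the four 50-epoch segments.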
In general, the blind super-resolution network established by this embodiment can accurately estimate the degradation information of the image, enabling the network to better focus on texture-rich regions and effectively improving the super-resolution reconstruction effect.
Example 2:
A blind super-resolution method for real scene images comprises the following steps: a real scene image is input into the blind super-resolution network established by the blind super-resolution network establishment method provided by the invention, and the blind super-resolution network performs super-resolution reconstruction on the real scene image to obtain a high-resolution image.
Example 3:
A computer-readable storage medium comprising a stored computer program; when the computer program is executed by a processor, the apparatus on which the computer-readable storage medium is located is controlled to execute the blind super-resolution network establishment method provided in embodiment 1 above and/or the blind super-resolution method for real scene images provided in embodiment 2 above.
It will be understood by those skilled in the art that the foregoing is only an exemplary embodiment of the present invention, and is not intended to limit the invention to the particular forms disclosed, since various modifications, substitutions and improvements within the spirit and scope of the invention are possible and within the scope of the appended claims.
Claims (10)
1. A blind super-resolution network establishment method, characterized by comprising the following steps:
performing degradation operation on each high-resolution image in the high-resolution image data set to obtain corresponding degraded images; constructing training samples by using the high-resolution images and the corresponding degraded images, and dividing all the training samples into a training set, a test set and a verification set;
constructing a blind super-resolution network to be trained; the blind super-resolution network comprises a degradation estimation network and a generation network; the degradation estimation network is used for estimating degradation information at each pixel position in an input image, and the generation network is used for performing super-resolution reconstruction on the input image using the degradation information; the generation network comprises a feature extraction network and an upsampling module; the feature extraction network comprises a plurality of deformable convolution layers and a plurality of feature extraction modules that are alternately connected, the degradation information output by the degradation estimation network is input to each deformable convolution layer, and the feature extraction network is used for performing feature extraction on the input image to obtain a feature map; the upsampling module is used for reconstructing the feature map at a specified magnification of the input image size to obtain a super-resolution image;
and taking the degraded images in the training samples as input images, and using the training set, the test set and the verification set to respectively train, test and verify the blind super-resolution network to be trained, to obtain a blind super-resolution network for performing super-resolution reconstruction on images.
2. The blind super-resolution network establishment method according to claim 1, wherein the degradation estimation network is a UNet network with a spatial attention module inserted into its coding module.
3. The blind super-resolution network establishment method according to claim 2, wherein a channel attention module is inserted into a decoding module of the UNet network.
4. The blind super-resolution network establishment method according to any one of claims 1 to 3, wherein training the blind super-resolution network to be trained comprises:
a pre-training stage: training the degradation estimation network by using the training set to obtain a trained degradation estimation network;
a joint training stage: and performing joint training on the trained degradation estimation network and the generation network by using the training set.
5. The blind super-resolution network establishment method of claim 4, wherein the degradation operation comprises: blurring the high-resolution image with a spatially varying blur kernel and then down-sampling; the training sample further comprises the spatially varying blur kernel corresponding to the degraded image, and the degradation information estimated by the degradation estimation network is the spatially varying blur kernel;
the loss function of the pre-training phase is:
loss_p = ‖k_p - k_g‖_1 + ‖k_p - k_g‖_1 * g(I_LR)
the loss function of the joint training phase is:
loss = ω_1 * loss_p + ω_2 * loss_g
wherein loss represents the loss of the blind super-resolution network; loss_p and loss_g represent the losses of the degradation estimation network and the generation network respectively; ω_1 and ω_2 represent the corresponding weights; k_p represents the spatially varying blur kernel estimated by the degradation estimation network; k_g represents the spatially varying blur kernel used in the degradation operation; I_LR represents the input degraded image; g(·) represents a gradient operator; ‖·‖_1 denotes the mean absolute error.
6. The blind super-resolution network establishment method according to claim 5, wherein
loss_g = ‖I_SR - I_HR‖_1 + ‖g(I_SR) - g(I_HR)‖_1
wherein I_SR represents the super-resolution image output by the generation network, and I_HR represents the high-resolution image.
7. The blind super-resolution network establishment method of claim 5, wherein the gradient operator is the Scharr operator.
8. The blind super-resolution network establishment method of claim 4, wherein stochastic gradient descent with a momentum term is used as the optimizer in the process of training the blind super-resolution network to be trained.
9. A blind super-resolution method for real scene images, characterized by comprising: inputting a real scene image into a blind super-resolution network established by the blind super-resolution network establishment method of any one of claims 1 to 8, the blind super-resolution network performing super-resolution reconstruction on the real scene image to obtain a high-resolution image.
10. A computer-readable storage medium comprising a stored computer program; when the computer program is executed by a processor, the apparatus on which the computer-readable storage medium is located is controlled to execute the blind super-resolution network establishment method of any one of claims 1 to 8 and/or the blind super-resolution method for real scene images of claim 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211081493.7A CN115526777A (en) | 2022-09-06 | 2022-09-06 | Blind over-separation network establishing method, blind over-separation method and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115526777A true CN115526777A (en) | 2022-12-27 |
Family
ID=84698140
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211081493.7A Pending CN115526777A (en) | 2022-09-06 | 2022-09-06 | Blind over-separation network establishing method, blind over-separation method and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115526777A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115880158A (en) * | 2023-01-30 | 2023-03-31 | 西安邮电大学 | Blind image super-resolution reconstruction method and system based on variational self-coding |
CN115880158B (en) * | 2023-01-30 | 2023-10-27 | 西安邮电大学 | Blind image super-resolution reconstruction method and system based on variation self-coding |
CN116310959A (en) * | 2023-02-21 | 2023-06-23 | 南京智蓝芯联信息科技有限公司 | Method and system for identifying low-quality camera picture in complex scene |
CN116310959B (en) * | 2023-02-21 | 2023-12-08 | 南京智蓝芯联信息科技有限公司 | Method and system for identifying low-quality camera picture in complex scene |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |