CN115526777A - Blind super-resolution network establishment method, blind super-resolution method and storage medium - Google Patents
Blind super-resolution network establishment method, blind super-resolution method and storage medium
- Publication number
- CN115526777A CN115526777A CN202211081493.7A CN202211081493A CN115526777A CN 115526777 A CN115526777 A CN 115526777A CN 202211081493 A CN202211081493 A CN 202211081493A CN 115526777 A CN115526777 A CN 115526777A
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/73—Deblurring; Sharpening
Abstract
The invention discloses a blind super-resolution network establishment method, a blind super-resolution method and a storage medium, belonging to the field of computer vision. The method comprises the following steps: constructing training samples from high-resolution images and their corresponding degraded images, and dividing them into a training set, a test set and a verification set; constructing a blind super-resolution network comprising a degradation estimation network and a generation network, wherein the generation network comprises an up-sampling network and a feature extraction network containing a plurality of alternately connected deformable convolution layers and feature extraction modules; the degradation estimation network estimates the degradation information at each pixel position of the input image and feeds it to each deformable convolution layer; after the feature extraction network extracts a feature map from the input image, the up-sampling module reconstructs the feature map at a specified magnification of the input-image size to obtain a super-resolution image; and taking the degraded images in the training samples as input images, training, testing and verifying the blind super-resolution network to obtain a blind super-resolution network for performing super-resolution reconstruction on images. The method improves the super-resolution reconstruction effect.
Description
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a blind super-resolution network establishment method, a blind super-resolution method and a storage medium.
Background
With the popularization of smartphones, the rise of live network broadcasting, and the surveillance equipment distributed throughout city streets, images have become an indispensable part of daily life. However, owing to the limitations of capture devices, the influence of complex shooting environments, and the compression losses of network transmission, images often suffer from problems such as noise, compression artifacts and low resolution, which greatly degrade visual quality and adversely affect tasks such as object detection and face recognition. How to improve image resolution on existing hardware has therefore become an urgent problem.
Image super-resolution technology can improve image resolution through algorithms alone, without upgrading the hardware, and has therefore attracted wide attention. However, existing super-resolution techniques are mainly designed for ideal images: they assume that low-resolution images are obtained by Bicubic down-sampling of high-resolution images, and construct data sets in this way to train super-resolution networks. When facing real-scene images, which often carry noise, artifacts and other defects, the performance of existing techniques drops sharply. A blind super-resolution method for real-scene images therefore has great practical value and represents the development trend of super-resolution technology. In recent years, with the development of deep learning, represented by convolutional neural networks, researchers have begun applying deep learning to super-resolution, letting the network automatically extract features from the low-resolution image and construct the high-resolution image from them.
However, because real-scene images carry varied defects and complex, diverse backgrounds, existing deep-learning-based blind super-resolution methods cannot focus well on regions of the image that are texture-rich or severely degraded, so the reconstruction results are unsatisfactory.
Disclosure of Invention
In view of the defects of the prior art and the need for improvement, the invention provides a blind super-resolution network establishment method, a blind super-resolution method and a storage medium, aiming to improve the super-resolution reconstruction effect.
To achieve the above object, according to one aspect of the present invention, there is provided a blind super-resolution network establishment method, including:
performing a degradation operation on each high-resolution image in a high-resolution image data set to obtain corresponding degraded images; constructing training samples from the high-resolution images and their corresponding degraded images, and dividing all the training samples into a training set, a test set and a verification set;
constructing a blind super-resolution network to be trained; the blind super-resolution network comprises a degradation estimation network and a generation network; the degradation estimation network is used for estimating the degradation information at each pixel position of the input image, and the generation network is used for performing super-resolution reconstruction on the input image using this degradation information; the generation network comprises a feature extraction network and an up-sampling network; the feature extraction network comprises a plurality of deformable convolution layers and a plurality of feature extraction modules connected alternately, the degradation information output by the degradation estimation network is input to each deformable convolution layer, and the feature extraction network performs feature extraction on the input image to obtain a feature map; the up-sampling module reconstructs the feature map at a specified magnification of the input-image size to obtain a super-resolution image;
and taking the degraded images in the training samples as input images, and respectively training, testing and verifying the blind super-resolution network to be trained using the training set, the test set and the verification set, to obtain a blind super-resolution network for performing super-resolution reconstruction on images.
The blind super-resolution network established by the invention comprises a degradation estimation network, which estimates the degradation information at each pixel position of the input image, and a generation network, which generates the super-resolution image. The feature extraction network of the generation network introduces the per-pixel degradation information predicted by the degradation estimation network into the generation network through deformable convolution modules, and the offsets of the deformable convolutions are generated from this information.
Further, the degradation estimation network is a UNet network, with a spatial attention module inserted into its encoding module.
The invention uses the UNet network as the backbone of the degradation estimation network and introduces a spatial attention module into the encoding module, so that the degradation estimation network can attend to texture-rich positions in the degraded image, improving the accuracy of the per-pixel degradation estimates and thereby assisting the generation network in producing better high-resolution images.
Further, a channel attention module is inserted into the decoding module of the UNet network.
In a UNet network, the input of a decoding module comes from both the previous decoding module and an encoding module, whose receptive fields differ, so selecting an appropriate receptive field is crucial to the accuracy of degradation estimation. The invention introduces a channel attention module into the decoding module, so that the network can adaptively select receptive fields for degradations of different degrees, effectively improving the accuracy of the corresponding degradation estimates and assisting the generation network in producing better high-resolution images.
Further, training the blind super-resolution network to be trained comprises:
a pre-training stage: training the degradation estimation network with the training set to obtain a trained degradation estimation network;
a joint training stage: jointly training the trained degradation estimation network and the generation network with the training set.
The blind super-resolution network established by the invention contains both a degradation estimation network and a generation network, and its structure is relatively complex, so direct end-to-end training would be difficult. The invention therefore adopts a two-stage training scheme. In the first, pre-training stage, the degradation estimation network is pre-trained so that it acquires good degradation estimation performance; in the second, joint training stage, the pre-trained degradation estimation network and the generation network are trained jointly. This effectively reduces the training difficulty and improves training efficiency while preserving the super-resolution reconstruction effect of the whole network.
Further, the degradation operation includes: blurring the high-resolution image with a spatially varying blur kernel and then down-sampling it; the training sample further includes the spatially varying blur kernel corresponding to the degraded image, and the degradation information estimated by the degradation estimation network is this spatially varying blur kernel;
the loss function for the pre-training phase is:
loss_p = ||k_p - k_g||_1 + ||k_p - k_g||_1 * g(I_LR)
the loss function for the joint training phase is:
loss = ω_1 * loss_p + ω_2 * loss_g
wherein loss denotes the loss of the blind super-resolution network; loss_p and loss_g denote the losses of the degradation estimation network and the generation network, respectively; ω_1 and ω_2 denote the corresponding weights; k_p denotes the spatially varying blur kernel estimated by the degradation estimation network; k_g denotes the spatially varying blur kernel used in the degradation operation; I_LR denotes the input degraded image; g(·) denotes a gradient operator; and ||·||_1 denotes the mean absolute error.
When performing the degradation operation on a high-resolution image, the invention blurs it with a spatially varying blur kernel, so that different pixel positions in the degraded image carry different degradation information; this effectively improves the blind super-resolution network's ability to reconstruct real-scene images. During training, the combination of a degradation gradient loss and a degradation pixel loss serves as the loss function of the degradation estimation network; the degradation gradient loss makes the degradation estimation network focus more on texture-rich positions in the degraded image and improves the estimation accuracy at those positions.
Further, loss_g = ||I_SR - I_HR||_1 + ||g(I_SR) - g(I_HR)||_1
wherein I_SR denotes the super-resolution image output by the generation network, and I_HR denotes the high-resolution image.
The invention adopts the combination of a pixel loss and a gradient loss on the super-resolution image as the loss function of the generation network; the gradient loss makes the generation network pay more attention to texture-rich positions in the degraded image, enhancing the reconstruction quality of the super-resolution image.
Further, the gradient operator is a Scharr operator.
When constructing the loss functions for the degradation estimation network and the generation network, the invention computes gradients with the Scharr operator, which yields a better training effect.
Further, in the process of training the blind super-resolution network to be trained, stochastic gradient descent with a momentum term is used as the optimizer.
According to another aspect of the present invention, there is provided a blind super-resolution method for real-scene images, comprising: inputting a real-scene image into the blind super-resolution network established by the blind super-resolution network establishment method provided by the invention, the network performing super-resolution reconstruction on the real-scene image to obtain a high-resolution image.
According to still another aspect of the present invention, there is provided a computer-readable storage medium comprising a stored computer program; when the computer program is executed by a processor, the apparatus on which the computer-readable storage medium resides is controlled to execute the blind super-resolution network establishment method and/or the blind super-resolution method for real-scene images provided by the invention.
Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:
(1) The invention uses deformable convolution to introduce the per-pixel degradation information predicted by the degradation estimation network into the generation network. Because the degradation information may differ from position to position, the deformable-convolution offsets generated from it also differ across positions, so the deformable convolution can extract more useful information from different positions of the degraded image in a targeted way, improving the super-resolution reconstruction effect.
(2) The invention uses UNet as the backbone of the degradation estimation network, introducing a spatial attention module into the encoding module and a channel attention module into the decoding module. The spatial attention module lets the degradation estimation network focus on texture-rich positions in the degraded image, improving the estimation accuracy at those positions and thereby the reconstruction quality of texture-rich regions; the channel attention module lets the network adaptively select appropriate receptive fields for degradations of different degrees, further improving estimation accuracy and assisting the generation network in producing better high-resolution images.
(3) The invention adopts the combination of a degradation gradient loss and a degradation pixel loss as the loss function of the degradation estimation network; the degradation gradient loss makes the degradation estimation network attend to texture-rich positions in the degraded image, improving estimation accuracy there and assisting the generation network in producing better high-resolution images.
Drawings
Fig. 1 is a flowchart of a blind super-resolution network establishment method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a blind super-resolution network according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a deformable convolution structure according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
In the present application, the terms "first," "second," and the like (if any) in the description and the drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
To address the technical problem that existing super-resolution methods give unsatisfactory reconstruction results, the invention provides a blind super-resolution network establishment method, a blind super-resolution method and a storage medium. The overall idea is as follows: to cope with the varied defects and complex, diverse backgrounds of real-scene images, during blind super-resolution a degradation estimation network within the blind super-resolution network estimates degradation information for each pixel and introduces it into the generation network, so that the generation network can extract more useful information for different positions during reconstruction, improving the super-resolution effect. On this basis, spatial and channel attention mechanisms are introduced into the degradation estimation network, enabling it to better attend to texture-rich positions in the degraded image, improving the accuracy of the degradation estimates and assisting the generation network in producing better high-resolution images.
The following are examples.
Example 1:
A blind super-resolution network establishment method, as shown in fig. 1, includes:
first, performing a degradation operation on each high-resolution image in a high-resolution image data set to obtain corresponding degraded images;
optionally, in this embodiment, the selected high-resolution image data set is specifically a DIV2K data set; in other embodiments of the invention, other datasets of high resolution images may be used.
In this embodiment, the degradation operation on a high-resolution image from the high-resolution image data set proceeds as follows: the high-resolution image is blurred with a spatially varying blur kernel and then down-sampled. With I_HR and I_LR denoting the high-resolution image and the low-resolution (degraded) image respectively, the degradation operation can be expressed as:
I_LR = (k * I_HR)↓
wherein k denotes the spatially varying blur kernel, i.e., the blur kernels at different pixel positions in the image differ; C, H and W denote the number of image channels, the image height and the image width respectively; d denotes the width of the blur kernel at each position; and ↓ denotes the down-sampling operation;
Optionally, the spatially varying blur kernel adopted in this embodiment is composed of anisotropic Gaussian blur kernels, each selected at random; the input image is then convolved with the selected kernels to complete the blurring operation. This embodiment uses the DIV2K data set as input, randomly crops the input images to a size of 320 × 320, samples the kernel width of the anisotropic Gaussian blur kernel from a range depending on the magnification factor s, samples the rotation angle at random, and sets the kernel size d to 21. It should be noted that these parameter values are only exemplary and should not be construed as limiting the invention; in practical applications, other values can be set as needed;
the blurred image is down-sampled by a factor of s, i.e., the top-left pixel of each s × s block is selected, completing the down-sampling;
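The degradation operation described above — a per-pixel blur followed by top-left-pixel down-sampling — can be sketched as follows. This is a minimal single-channel NumPy illustration, not the patent's implementation; the reflect padding mode is an assumption.

```python
import numpy as np

def degrade(hr, kernels, s):
    """Apply a spatially varying blur, then downsample by factor s.

    hr      : (H, W) grayscale high-resolution image (one channel for brevity).
    kernels : (H, W, d, d) per-pixel blur kernels, each summing to 1.
    s       : integer downsampling factor.
    """
    H, W = hr.shape
    d = kernels.shape[-1]
    r = d // 2
    padded = np.pad(hr, r, mode="reflect")
    blurred = np.empty_like(hr, dtype=np.float64)
    for i in range(H):
        for j in range(W):
            # Convolve this pixel's neighborhood with its own kernel.
            patch = padded[i:i + d, j:j + d]
            blurred[i, j] = np.sum(patch * kernels[i, j])
    # I_LR = (k * I_HR)↓ : keep the top-left pixel of each s×s block.
    return blurred[::s, ::s]
```

With delta kernels (all mass at the center) the blur is the identity, so the output reduces to plain strided subsampling, which is a convenient sanity check.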
Then, the high-resolution image, the degraded image and the corresponding spatially varying blur kernel form a training sample. As an alternative implementation, after performing the degradation operation on the high-resolution image data set, this embodiment further applies data-enhancement operations of horizontal and vertical flipping and random rotation (90°, 180°, 270°) to all training samples, and finally divides them into a training set, a validation set and a test set (the validation and test sets being of equal size).
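The augmentation-and-split step above can be sketched with plain Python. The exact split ratio is partially garbled in the source; a 3:1:1 train/validation/test split is assumed here, and the augmentation labels are illustrative placeholders for the actual flip/rotation transforms.

```python
import random

def augment_and_split(samples, seed=0):
    """Augment each (HR, LR, kernel) sample with flips/rotations, then split.

    `samples` is a list of sample identifiers; each is paired with six
    augmentation tags (identity, two flips, three rotations).
    The 3:1:1 split ratio is an assumption, not stated exactly in the source.
    """
    augmented = []
    for s in samples:
        for aug in ("none", "hflip", "vflip", "rot90", "rot180", "rot270"):
            augmented.append((s, aug))
    random.Random(seed).shuffle(augmented)
    n = len(augmented)
    n_train = int(n * 0.6)   # 3 parts of 5
    n_val = int(n * 0.2)     # 1 part of 5 (validation and test equal-sized)
    train = augmented[:n_train]
    val = augmented[n_train:n_train + n_val]
    test = augmented[n_train + n_val:]
    return train, val, test
```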
As shown in fig. 1, this embodiment further includes: constructing a blind super-resolution network to be trained, which comprises a degradation estimation network and a generation network. The degradation estimation network estimates the degradation information at each pixel position of the input image; in this embodiment, this degradation information is specifically the spatially varying blur kernel that produced the degraded image. The generation network performs super-resolution reconstruction on the input image using the degradation information. The network structure of this embodiment is shown in fig. 2;
in the embodiment, a traditional UNet network comprises an encoding module (EncBlock), a decoding module (DecBlock) and an intermediate connection module between the encoding module and the decoding module, wherein on the basis of the traditional UNet network, a Spatial Attention Module (SAM) is added in the encoding module and is used for extracting spatial information of a degraded image to enable the network to pay attention to a position with rich texture, and a Channel Attention Module (CAM) is added in the decoding module, so that the network can adaptively select receptive fields with different degrees for the degraded images with different degrees; referring to fig. 2, in the present embodiment, the encoding module includes a convolutional layer (Conv), an activation function (Relu), a max pooling layer (MaxPool), and a Spatial Attention Module (SAM), the decoding module includes a Channel Attention Module (CAM), a convolutional layer (Conv), and an activation function (Relu), and the intermediate connection module is composed of two convolutional layers and an activation function; it should be noted that the specific structure of the Spatial Attention Module (SAM) and its position in the encoding module, and the specific structure of the Channel Attention Module (CAM) and its position in the decoding module can be flexibly adjusted according to the actual needs; optionally, in this embodiment, the spatial attention module and the channel attention module both adopt a structural form in a CBAM network, the spatial attention module is introduced after the last convolutional layer, and the channel attention module is introduced after the cascade layer;
In this embodiment, the generation network comprises a feature extraction network and an up-sampling network. Referring to fig. 2, the feature extraction network comprises a plurality of deformable convolution layers (DCNs) and a plurality of feature extraction modules connected alternately; the spatially varying blur kernel output by the degradation estimation network is input to each deformable convolution layer, and the feature extraction network performs feature extraction on the input image to obtain a feature map. The up-sampling module reconstructs the feature map at a specified magnification of the input-image size to obtain a super-resolution image. It should be noted that the specific structures of the deformable convolution, the feature extraction network and the up-sampling module can be chosen flexibly according to actual needs. Optionally, in this embodiment the first deformable convolution (i.e., the one between the input image and the feature extraction network) adopts the existing DCNv2 structure; the subsequent deformable convolutions in the feature extraction network are shown in fig. 3, where the offsets of the deformable convolution are obtained by a convolution operation over the spatially varying blur kernel; the feature extraction modules adopt the structural form used in the ESRGAN network, i.e., RRDB modules; and the up-sampling module consists of PixelShuffle;
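The offset-generation step named above — a convolution over the spatially varying blur kernel producing deformable-convolution offsets — can be sketched minimally with a 1×1 convolution, which reduces to a per-pixel matrix multiply. The 1×1 kernel size and the weight matrix are assumptions for illustration; the patent only states that the offsets are obtained by convolving the blur-kernel map.

```python
import numpy as np

def offsets_from_kernel_map(kernel_map, w):
    """Generate deformable-conv offsets from a spatially varying blur kernel.

    kernel_map : (d*d, H, W) — the estimated d×d blur kernel at each pixel,
                 flattened into channels.
    w          : (2*k*k, d*d) — weights of an assumed 1x1 convolution mapping
                 kernel coefficients to offsets.
    Returns (2*k*k, H, W): one (dy, dx) pair per sampling location of a
    k×k deformable kernel, differing at every pixel position.
    """
    dd, H, W = kernel_map.shape
    flat = kernel_map.reshape(dd, H * W)   # one kernel vector per pixel
    return (w @ flat).reshape(-1, H, W)    # per-pixel linear map -> offsets
```

Because the blur kernel differs per pixel, the resulting offsets also differ per pixel, which is exactly what lets the deformable convolution sample different neighborhoods at different positions.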
The generation network of this embodiment uses the deformable convolution layers to introduce the degradation information estimated by the degradation estimation network into the generation network; it can make full use of the spatial information of the blur kernels to generate the offsets of the deformable convolutions, enabling the deformable convolutions to extract more useful information from the degraded image and yielding a higher-quality super-resolution image;
Referring to fig. 2, this embodiment further includes a connection module, composed of a convolutional layer and an activation function, between the degradation estimation network and the generation network; it adjusts the spatially varying blur kernel so that the kernel output by the degradation estimation network can be adapted for input to the deformable convolution layers of the generation network.
Referring to fig. 1, after the training, test and verification sets and the blind super-resolution network to be trained are constructed, the following step is performed: taking the degraded images in the training samples as input images of the blind super-resolution network, and respectively training, testing and verifying it using the training, test and verification sets, to obtain a blind super-resolution network for performing super-resolution reconstruction on images.
Considering that the network structure is relatively complex, an end-to-end training mode is directly adopted, and the training difficulty is relatively large, so that the embodiment trains the blind hyper-diversity network to be trained by adopting a two-stage training mode, and specifically comprises the following steps:
a pre-training stage: training the degradation estimation network by using a training set to obtain a trained degradation estimation network;
a combined training stage: performing joint training on the trained degradation estimation network and the trained generation network by using a training set;
In this two-stage training mode, the first stage (pre-training) gives the degradation estimation network good degradation estimation performance; the second stage (joint training) then trains the pre-trained degradation estimation network together with the generation network. This effectively reduces training difficulty and improves training efficiency while preserving the super-resolution reconstruction effect of the whole network.
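The two-stage scheme above can be sketched as a plain-Python control-flow skeleton. The stub `step` methods and the (degraded image, blur kernel) sample layout are placeholder assumptions; only the training order mirrors the description:

```python
# Stage order of the two-stage training scheme. The "networks" passed in
# are assumed to expose a step() method that runs one optimization step;
# real models, losses and optimizers are deliberately abstracted away.

def pretrain_degradation_net(degradation_net, training_set, epochs):
    """Stage 1: train only the degradation estimation network (loss_p)."""
    for _ in range(epochs):
        for degraded, blur_kernel in training_set:
            degradation_net.step(degraded, blur_kernel)
    return degradation_net

def joint_train(degradation_net, generator_net, training_set, epochs):
    """Stage 2: train both networks jointly (w1*loss_p + w2*loss_g)."""
    for _ in range(epochs):
        for degraded, blur_kernel in training_set:
            kernel_estimate = degradation_net.step(degraded, blur_kernel)
            generator_net.step(degraded, kernel_estimate)
    return degradation_net, generator_net
```

The point of the skeleton is that the generation network is never trained against an untrained degradation estimator: stage 2 always starts from the stage-1 weights.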
To make the network better attend to texture-rich regions of the image, as a preferred implementation this embodiment adopts the combination of a degradation gradient loss and a degradation pixel loss as the loss function of the degradation estimation network, and the combination of a pixel loss and a gradient loss on the super-resolution image as the loss function of the generation network. The degradation gradient loss makes the degradation estimation network focus more on texture-rich positions in the degraded image, improving estimation accuracy there; the gradient loss on the super-resolution image makes the generation network focus more on texture-rich positions, improving the reconstruction quality of the super-resolution image. Specifically, the loss function of the degradation estimation network is:
loss_p = ‖k_p - k_g‖_1 + ‖k_p - k_g‖_1 * g(I_LR)
the loss function of the generated network is:
loss_g = ‖I_SR - I_HR‖_1 + ‖g(I_SR) - g(I_HR)‖_1
Accordingly, the loss function of the pre-training stage is loss_p.
The loss function for the joint training phase is:
loss = ω_1 * loss_p + ω_2 * loss_g
where loss represents the loss of the blind super-resolution network; loss_p and loss_g represent the losses of the degradation estimation network and the generation network respectively, and ω_1 and ω_2 their corresponding weights; k_p represents the spatially varying blur kernel estimated by the degradation estimation network, and k_g the spatially varying blur kernel used in the degradation operation; I_LR represents the input degraded image; g(·) represents a gradient operator, in this embodiment specifically the Scharr operator (experiments show that using the Scharr operator to compute the gradients in the loss functions effectively improves network training); ‖·‖_1 denotes the mean absolute error.
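As a hedged illustration of the two losses, the NumPy sketch below assumes g(·) is an L1-style Scharr gradient magnitude, ‖·‖_1 is the mean absolute error, and represents the spatially varying kernels as simple (H, W) per-pixel maps; none of these representation choices are mandated by the text beyond the formulas themselves:

```python
import numpy as np

# Scharr kernels for horizontal / vertical gradients.
SCHARR_X = np.array([[3, 0, -3], [10, 0, -10], [3, 0, -3]], dtype=float)
SCHARR_Y = SCHARR_X.T

def _filter2d(img, kernel):
    """Naive 'same' 2-D cross-correlation with zero padding."""
    h, w = img.shape
    padded = np.pad(img, 1)
    out = np.empty_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

def scharr_gradient(img):
    """g(.): per-pixel gradient magnitude (L1 form, an assumption)."""
    return np.abs(_filter2d(img, SCHARR_X)) + np.abs(_filter2d(img, SCHARR_Y))

def loss_p(k_pred, k_gt, lr_img):
    """Degradation loss: pixel term plus gradient-weighted term.

    The gradient of the degraded image weights the per-pixel kernel
    error, so texture-rich positions contribute more to the loss.
    """
    err = np.abs(k_pred - k_gt)
    return err.mean() + (err * scharr_gradient(lr_img)).mean()

def loss_g(sr_img, hr_img):
    """Generation loss: pixel L1 plus gradient L1 on the SR image."""
    pixel = np.abs(sr_img - hr_img).mean()
    grad = np.abs(scharr_gradient(sr_img) - scharr_gradient(hr_img)).mean()
    return pixel + grad
```

Note how a flat degraded image (zero gradient) reduces loss_p to the plain pixel term, while textured inputs increase the penalty on kernel errors, which is exactly the texture-focusing behavior claimed above.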
To further improve the training effect, in this embodiment the blind super-resolution network to be trained is optimized with stochastic gradient descent (SGD) with a momentum term: momentum 0.9, weight decay 5×10^-4, batch size 8, an initial learning rate of 10^-3 reduced by a factor of 10 every 50 epochs, and 200 training epochs in total.
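The learning-rate schedule described (initial rate 10^-3, divided by 10 every 50 epochs over 200 epochs) is a plain step decay. A minimal sketch, with illustrative parameter names:

```python
def step_decay_lr(epoch: int, initial_lr: float = 1e-3,
                  drop_every: int = 50, factor: float = 10.0) -> float:
    """Learning rate at a given epoch under the step schedule described
    above: the initial rate is divided by `factor` every `drop_every`
    epochs (epochs are counted from 0)."""
    return initial_lr / (factor ** (epoch // drop_every))
```

Over the 200-epoch run this yields rates of 10^-3, 10^-4, 10^-5 and 10^-6 for the four 50-epoch segments.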
In general, the blind super-resolution network established by this embodiment can accurately estimate the degradation information of the image, enabling the network to better focus on texture-rich regions and effectively improving the super-resolution reconstruction effect.
Example 2:
A blind super-resolution method for real scene images comprises the following steps: a real scene image is input into the blind super-resolution network established by the blind super-resolution network establishment method provided by the invention, and the blind super-resolution network performs super-resolution reconstruction on the real scene image to obtain a high-resolution image.
Example 3:
A computer-readable storage medium comprising a stored computer program; when the computer program is executed by a processor, the apparatus on which the computer-readable storage medium is located is controlled to execute the blind super-resolution network establishment method provided in embodiment 1 above and/or the blind super-resolution method for real scene images provided in embodiment 2 above.
It will be understood by those skilled in the art that the foregoing is only an exemplary embodiment of the present invention, and is not intended to limit the invention to the particular forms disclosed, since various modifications, substitutions and improvements within the spirit and scope of the invention are possible and within the scope of the appended claims.
Claims (10)
1. A blind super-resolution network establishment method, characterized by comprising the following steps:
performing degradation operation on each high-resolution image in the high-resolution image data set to obtain corresponding degraded images; constructing training samples by using the high-resolution images and the corresponding degraded images, and dividing all the training samples into a training set, a test set and a verification set;
constructing a blind super-resolution network to be trained; the blind super-resolution network comprises a degradation estimation network and a generation network; the degradation estimation network is used for estimating degradation information at each pixel position in an input image, and the generation network is used for performing super-resolution reconstruction on the input image using the degradation information; the generation network comprises a feature extraction network and an upsampling module; the feature extraction network comprises a plurality of deformable convolution layers and a plurality of feature extraction modules that are alternately connected, the degradation information output by the degradation estimation network is input to each deformable convolution layer, and the feature extraction network is used for performing feature extraction on the input image to obtain a feature map; the upsampling module is used for reconstructing the feature map at a specified magnification of the input image size to obtain a super-resolution image;
and taking the degraded images in the training samples as input images, and using the training set, the test set and the verification set to respectively train, test and verify the blind super-resolution network to be trained, to obtain a blind super-resolution network for performing super-resolution reconstruction on images.
2. The blind super-resolution network establishment method according to claim 1, wherein the degradation estimation network is a UNet network with a spatial attention module inserted into its coding module.
3. The blind super-resolution network establishment method according to claim 2, wherein a channel attention module is inserted into a decoding module of the UNet network.
4. The blind super-resolution network establishment method according to any one of claims 1 to 3, wherein training the blind super-resolution network to be trained comprises:
a pre-training stage: training the degradation estimation network by using the training set to obtain a trained degradation estimation network;
a joint training stage: and performing joint training on the trained degradation estimation network and the generation network by using the training set.
5. The blind super-resolution network establishment method of claim 4, wherein the degradation operation comprises: blurring the high-resolution image with a spatially varying blur kernel and then down-sampling; the training sample further comprises the spatially varying blur kernel corresponding to the degraded image, and the degradation information estimated by the degradation estimation network is the spatially varying blur kernel;
the loss function of the pre-training phase is:
loss_p = ‖k_p - k_g‖_1 + ‖k_p - k_g‖_1 * g(I_LR)
the loss function of the joint training phase is:
loss = ω_1 * loss_p + ω_2 * loss_g
wherein loss represents the loss of the blind super-resolution network; loss_p and loss_g represent the losses of the degradation estimation network and the generation network respectively; ω_1 and ω_2 represent the corresponding weights; k_p represents the spatially varying blur kernel estimated by the degradation estimation network; k_g represents the spatially varying blur kernel used in the degradation operation; I_LR represents the input degraded image; g(·) represents a gradient operator; ‖·‖_1 denotes the mean absolute error.
6. The blind super-resolution network establishment method according to claim 5, wherein
loss_g = ‖I_SR - I_HR‖_1 + ‖g(I_SR) - g(I_HR)‖_1
wherein I_SR represents the super-resolution image output by the generation network, and I_HR represents the high-resolution image.
7. The blind super-resolution network establishment method of claim 5, wherein the gradient operator is the Scharr operator.
8. The blind super-resolution network establishment method of claim 4, wherein stochastic gradient descent with a momentum term is used as the optimizer in the process of training the blind super-resolution network to be trained.
9. A blind super-resolution method for real scene images, characterized by comprising: inputting a real scene image into a blind super-resolution network established by the blind super-resolution network establishment method of any one of claims 1 to 8, the blind super-resolution network performing super-resolution reconstruction on the real scene image to obtain a high-resolution image.
10. A computer-readable storage medium comprising a stored computer program; when the computer program is executed by a processor, the apparatus on which the computer-readable storage medium is located is controlled to execute the blind super-resolution network establishment method of any one of claims 1 to 8 and/or the blind super-resolution method for real scene images of claim 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211081493.7A CN115526777A (en) | 2022-09-06 | 2022-09-06 | Blind over-separation network establishing method, blind over-separation method and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115526777A true CN115526777A (en) | 2022-12-27 |
Family
ID=84698140
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211081493.7A Pending CN115526777A (en) | 2022-09-06 | 2022-09-06 | Blind over-separation network establishing method, blind over-separation method and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115526777A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115880158A (en) * | 2023-01-30 | 2023-03-31 | 西安邮电大学 | Blind image super-resolution reconstruction method and system based on variational self-coding |
CN115880158B (en) * | 2023-01-30 | 2023-10-27 | 西安邮电大学 | Blind image super-resolution reconstruction method and system based on variation self-coding |
CN116310959A (en) * | 2023-02-21 | 2023-06-23 | 南京智蓝芯联信息科技有限公司 | Method and system for identifying low-quality camera picture in complex scene |
CN116310959B (en) * | 2023-02-21 | 2023-12-08 | 南京智蓝芯联信息科技有限公司 | Method and system for identifying low-quality camera picture in complex scene |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |