CN114943646A - Gradient weight loss and attention mechanism super-resolution method based on texture guidance - Google Patents

Gradient weight loss and attention mechanism super-resolution method based on texture guidance

Info

Publication number
CN114943646A
Authority
CN
China
Prior art keywords
image
resolution
loss
generator
network
Prior art date
Legal status
Pending
Application number
CN202210636553.0A
Other languages
Chinese (zh)
Inventor
孙建德 (Sun Jiande)
王海涛 (Wang Haitao)
李静 (Li Jing)
万文博 (Wan Wenbo)
Current Assignee
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date
Filing date
Publication date
Application filed by Shandong Normal University filed Critical Shandong Normal University
Priority to CN202210636553.0A
Publication of CN114943646A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00: Geometric image transformation in the plane of the image
    • G06T 3/40: Scaling the whole image or part thereof
    • G06T 3/4007: Interpolation-based scaling, e.g. bilinear interpolation
    • G06T 3/4046: Scaling the whole image or part thereof using neural networks
    • G06T 3/4053: Super resolution, i.e. output image resolution higher than sensor resolution
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/048: Activation functions
    • G06N 3/08: Learning methods

Abstract

The invention discloses a texture-guided super-resolution method that mainly addresses the severe loss of high-frequency detail when a high-resolution image is generated from a low-resolution image in the prior art. The method mainly comprises the following steps: (1) obtain paired high-resolution and low-resolution training images; (2) construct an image super-resolution network that uses an attention mechanism to weigh and fuse the features of each channel, train the generator network, and constrain the training with a gradient weight loss under texture guidance; (3) using the generator trained in step (2), train a relative discriminator to fine-tune the generator network and obtain the final image super-resolution network model; (4) input a low-resolution image and output a high-resolution generated image through the image super-resolution model. The method recovers high-frequency detail information from the low-resolution image more accurately and can be used in fields such as target recognition and image classification.

Description

Gradient weight loss and attention mechanism super-resolution method based on texture guidance
Technical Field
The invention relates to a super-resolution method in image processing technology, specifically single-image super-resolution; it is used to improve image quality and can be applied in fields such as target recognition and image classification.
Background
With the rapid development of image processing technology, ever more ultra-high-definition display devices are emerging, and the demand for higher-resolution images and videos keeps growing. Image super-resolution is the task of generating a high-resolution image from a low-resolution image, and as a fundamental vision task it has received continuous attention.
In recent years, super-resolution methods have mainly fallen into three categories: interpolation-based methods, reconstruction-based methods, and learning-based methods. Interpolation-based methods such as bicubic and Lanczos interpolation are fast and direct, but they lose the high-frequency detail of the image, the recovered high-resolution image is accompanied by unnatural details such as artifacts, and the image quality is poor. Reconstruction-based methods use prior knowledge to limit the possible solution space and generate sharper details; however, as the magnification factor increases, their performance degrades, the quality of the recovered high-resolution image drops, and they generally take more time and are computationally expensive. Learning-based methods typically use a machine learning algorithm to obtain a nonlinear mapping model between low-resolution and high-resolution images, for example Markov random fields, neighborhood embedding, sparse coding, and random forests. In recent years, learning-based methods have attracted much attention because of their superior performance compared with other super-resolution methods.
Deep-learning-based super-resolution is one kind of learning-based method: a convolutional neural network (CNN) is used to obtain a nonlinear mapping model between the low-resolution and high-resolution images, producing sharper high-resolution images of higher quality. The pioneering work that applied deep learning to super-resolution was SRCNN. Using a CNN to solve the image super-resolution problem is superior to traditional methods because the network can learn richer features from large amounts of data. After SRCNN, VDSR further deepened the network to address single-image super-resolution. EDSR proposed removing the batch normalization (BN) layers, because they normalize the features in a way that may adversely affect final performance. However, the objective functions of these methods focus mainly on minimizing the mean-square reconstruction error, which leads to super-resolved images lacking high-frequency information. To address this, the super-resolution generative adversarial network (SRGAN) was proposed, which can reconstruct finer texture details to some extent; although its scores on evaluation indexes such as peak signal-to-noise ratio and structural similarity are not high, the resulting images are visually more acceptable.
Existing super-resolution methods do not consider the texture relationship between the low-resolution and high-resolution images or the aggregation of global features, and they ignore the fact that a low-resolution image can yield more detailed textures through texture guidance and global feature aggregation.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a gradient weight loss and attention mechanism super-resolution method based on texture guidance so as to improve the detail information of a generated image.
The technical scheme for realizing the purpose is as follows:
A texture-guided gradient weight loss and attention mechanism super-resolution method constructs a CNN-based image super-resolution network model comprising a generator that produces images and a discriminator that judges whether a generated image is real. The generator extracts features from the input low-resolution image and performs nonlinear feature mapping, computes the correlation among feature maps through an attention mechanism to redistribute the weight of each feature map, and obtains the high-resolution image through the attention-based image reconstruction module; the gradient weight loss constrains the training process under texture guidance. The method specifically comprises the following steps:
(1) constructing a data set: acquire high-resolution images from a data set; the low-resolution images are obtained by down-sampling the high-resolution images, yielding paired training data, with the specific formula:
x=F(y)
where F(·) denotes the down-sampling operation, x the low-resolution image, and y the high-resolution image;
(2) constructing an image super-resolution model and training the generator: the model consists of two branch networks, a generator and a discriminator, and comprises a shallow feature extraction module, a nonlinear mapping module, an attention-based image reconstruction module, and a discriminator module; the shallow feature extraction module consists of convolutional layers, the nonlinear mapping module consists of residual dense networks, and the attention-based image reconstruction module consists of an attention module and an up-sampling module; features are extracted and mapped by the CNN, and the resulting features are finally weighted and reconstructed to obtain the high-resolution image;
constructing the objective equation: the generator is trained under a generative adversarial network loss L_G and a gradient weight loss L_gw, specifically:
L = L_G + L_gw
where the generative adversarial network loss is:
L_G = L_1 + λ·L_per + L_adv
in which L_1 = ‖y − f(x)‖, f(x) denotes the generated image, y denotes the high-resolution image, λ is a loss weighting parameter, and L_per is the perceptual loss:
L_per = ‖φ(y) − φ(f(x))‖
where φ(y) and φ(f(x)) denote the features extracted by a CNN from the real image and the generated image, respectively;
L_adv is the adversarial loss of the generator:
L_adv = −E_{X_r}[log(1 − D(X_r, X_f))] − E_{X_f}[log D(X_f, X_r)]
where D(X_r, X_f) is the discriminator, X_r and X_f denote the real image and the generated image respectively, the discriminator outputs the probability that its input is judged to be the real image, and E[·] denotes the expectation over the corresponding distribution;
the gradient weight loss is formulated as:
L_gw = D_gw ⊙ ‖y − f(x)‖
where f(x) denotes the generated image and y the real image, and the weight map is D_gw = (1 + α·D_x)(1 + α·D_y), with
D_x = |∇_x f(x) − ∇_x y|,  D_y = |∇_y f(x) − ∇_y y|
where D_x and D_y are the gradient difference maps between the generated image and the high-resolution image in the horizontal and vertical directions, ∇f(x) denotes the gradient image of the generated image, ∇y denotes the gradient image of the real image, and α is a weight coefficient in the loss function;
(3) training the discriminator: after the initial training of the generator is completed, train the discriminator to fine-tune the generator; a relative discriminator is used instead of a standard discriminator to generate images that are more realistic and of better quality. The discriminator loss function is:
L_D = −E_{X_r}[log D(X_r, X_f)] − E_{X_f}[log(1 − D(X_f, X_r))]
where D(X_r, X_f) is the discriminator, which takes the real image and the generated image as input and outputs the probability that the input is judged to be the real image, and E[·] denotes the expectation over the corresponding distribution.
(4) high-resolution image output: input a low-resolution image and output a high-resolution image through the trained image super-resolution network model, with the formula:
Y = F_SR(x)
where x is the input low-resolution image, F_SR(·) is the trained image super-resolution model, and Y is the output high-resolution image.
More particularly, the specific steps of step (2) are:
(2a) obtaining a feature map m after an input image passes through a shallow feature extraction module, namely two convolution layers;
(2b) carrying out nonlinear mapping on the feature map m through a plurality of residual dense network modules to obtain the feature map m_1;
(2c) constructing the attention-based image reconstruction module; the global description of the feature map u over the whole spatial extent is obtained through global pooling, with the formula:
z_c = F_sq(u_c) = (1/(H×W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)
where H and W are the height and width of the feature map and u_c(i, j) is the value of feature map u at position (i, j); after obtaining the global feature description of u, the relationship between the channels is obtained using:
z = F_ex(z, W) = σ(g(z, W)) = σ(W_2 ReLU(W_1 z))
where W_1 ∈ R^{(C/r)×C} and W_2 ∈ R^{C×(C/r)}, r is the dimensionality-reduction coefficient, ReLU is an activation function, σ(·) is the Sigmoid activation function, and z is the final output scalar;
finally, the activation value (i.e. the weight) of each channel is multiplied by the original feature map u:
x̃_c = F_scale(u_c, s_c) = s_c · u_c
where F_scale(u_c, s_c) is the channel-wise product of the feature map u_c and the scalar s_c;
(2e) training the generator network; the loss function combines the generative adversarial network loss and the gradient weight loss, with the specific formula:
L = L_G + L_gw
The generator loss is:
L_G = L_1 + λ·L_per + L_adv
where L_1 = ‖y − f(x)‖, f(x) denotes the generated image, y denotes the high-resolution image, L_per is the perceptual loss, λ is a loss weighting parameter, and the perceptual loss is computed on the feature map output by layer relu3_3 of a VGG16 network trained on the ImageNet dataset, with the specific formula L_per = ‖φ(y) − φ(f(x))‖, where φ(y) and φ(f(x)) denote the features extracted by the CNN from the real image and the generated image;
L_adv is the adversarial loss of the generator network:
L_adv = −E_{X_r}[log(1 − D(X_r, X_f))] − E_{X_f}[log D(X_f, X_r)]
where D(X_r, X_f) is the discriminator, X_r and X_f are the real image and the generated image, the output is the probability of being judged the real image, and E[·] denotes the expectation over the corresponding distribution;
the gradient weight loss is formulated as:
L_gw = D_gw ⊙ ‖y − f(x)‖
where f(x) denotes the generated image and y the high-resolution image, and D_gw = (1 + α·D_x)(1 + α·D_y), with
D_x = |∇_x f(x) − ∇_x y|,  D_y = |∇_y f(x) − ∇_y y|
where D_x and D_y are the texture (gradient) difference maps between the generated image and the high-resolution image in the horizontal and vertical directions, ∇f(x) denotes the gradient image of the generated image, ∇y denotes the gradient image of the real image, and α is a weight coefficient in the loss function, taken as 4 in the present example.
Compared with the prior art, the invention has the following advantages:
in generating a high resolution image from an input low resolution image, a guiding role of texture should be more emphasized. The correlation among the channels is learned by the introduction of an attention mechanism to emphasize the relation among different channels of the feature map. Because texture edges in the image are more emphasized, the image generated by the embodiment of the invention has more authenticity, the texture details are clearer, and the visual perception quality of the generated image is effectively improved; in addition, the gradient weight loss and the attention mechanism module used in the embodiment of the invention have lower calculation cost, but the effect is obviously improved.
Drawings
FIG. 1 is a flow chart of an image super-resolution network according to an embodiment of the present invention;
FIG. 2 is a diagram of a generator network architecture according to an embodiment of the present invention;
FIG. 3 is a diagram of a residual dense network architecture of an embodiment of the present invention;
FIG. 4 is a diagram of an arbiter network according to an embodiment of the present invention;
FIG. 5 is a graph comparing the effects of the examples of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings, and the embodiments described herein are only for the purpose of more clearly illustrating the present invention and are not intended to limit the scope of the present invention.
Referring to fig. 1, the specific implementation steps of the present invention are as follows:
step 1, constructing a data set.
The high-resolution image y and the corresponding down-sampled low-resolution image x are input separately. The embodiment of the invention uses images from the DIV2K dataset; the input high-resolution images are of size 128 × 128 and the input low-resolution images are of size 32 × 32, where each low-resolution image is obtained by down-sampling the high-resolution image with bicubic interpolation, according to:
x=F(y)
where F(·) denotes the down-sampling operation, x the low-resolution image, and y the high-resolution image. Paired training data are thus obtained.
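The pairing step can be illustrated with a minimal sketch, assuming PyTorch; the function name make_lr_hr_pair, the 4x scale factor, and the value range are illustrative assumptions, not part of the patent.

```python
# Minimal sketch of building a paired sample x = F(y) by bicubic down-sampling.
import torch
import torch.nn.functional as F

def make_lr_hr_pair(hr: torch.Tensor, scale: int = 4):
    """hr: high-resolution patch of shape (C, H, W) with values in [0, 1]."""
    lr = F.interpolate(hr.unsqueeze(0), scale_factor=1.0 / scale,
                       mode="bicubic", align_corners=False)
    return lr.squeeze(0).clamp(0, 1), hr          # (C, H/s, W/s), (C, H, W)

# Example: a 128 x 128 HR patch yields a 32 x 32 LR input, as in the embodiment.
hr_patch = torch.rand(3, 128, 128)
lr_patch, hr_patch = make_lr_hr_pair(hr_patch, scale=4)
```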
Step 2, constructing the image super-resolution network and training the generator.
Construct the image super-resolution model, which consists of a generator and a discriminator and comprises a shallow feature extraction module, a nonlinear mapping module, an image reconstruction module, and a discriminator module; the nonlinear mapping module consists of residual dense networks, and the image reconstruction module consists of an attention module and an up-sampling module. The image super-resolution network constructed by the invention contains several convolutional modules that extract and map features between the low-resolution and high-resolution images, and the extracted features are finally weighted and fused to obtain the final high-resolution image. An objective equation is constructed to train the generator, with the gradient weight loss additionally added to constrain the training process. The generator network structure is shown in FIG. 2.
(2a) The low-resolution image passes through shallow feature extraction, i.e. two convolutional layers, to obtain a feature map m of spatial size 32 × 32.
(2b) The feature map is nonlinearly mapped by the constructed nonlinear mapping module, whose structure is shown in FIG. 3. The module is composed of several residual dense networks; the number of residual blocks is 16 and the number of residual dense networks is 23. Each residual dense network module is composed of three dense network modules fused by residual scaling and maps the feature map to 64 channels of size 32 × 32; a dense network module is composed of a convolutional layer, a LeakyReLU activation layer, a convolutional layer, a LeakyReLU activation layer, and a convolutional layer, built with dense connections. The nonlinearly mapped feature map m_1, of size 64 × 32 × 32, is finally obtained and fed to the attention-based image reconstruction module.
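A minimal sketch of the shallow feature extractor and one residual dense module described in (2a) and (2b), assuming PyTorch; the growth-channel count gc, the residual scaling factor 0.2, and the class names are illustrative assumptions rather than values specified by the patent text.

```python
# Minimal sketch: two-conv shallow extractor + residual dense modules.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Conv-LeakyReLU-Conv-LeakyReLU-Conv with dense (concatenated) connections."""
    def __init__(self, nf=64, gc=32):
        super().__init__()
        self.conv1 = nn.Conv2d(nf, gc, 3, 1, 1)
        self.conv2 = nn.Conv2d(nf + gc, gc, 3, 1, 1)
        self.conv3 = nn.Conv2d(nf + 2 * gc, nf, 3, 1, 1)
        self.lrelu = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        c1 = self.lrelu(self.conv1(x))
        c2 = self.lrelu(self.conv2(torch.cat([x, c1], dim=1)))
        return self.conv3(torch.cat([x, c1, c2], dim=1))

class ResidualDenseModule(nn.Module):
    """Three dense blocks fused by residual scaling."""
    def __init__(self, nf=64, scaling=0.2):
        super().__init__()
        self.blocks = nn.ModuleList([DenseBlock(nf) for _ in range(3)])
        self.scaling = scaling

    def forward(self, x):
        out = x
        for block in self.blocks:
            out = out + self.scaling * block(out)   # residual scaling fusion
        return x + self.scaling * out

class ShallowAndBody(nn.Module):
    """Two-conv shallow feature extractor followed by a stack of residual dense modules."""
    def __init__(self, in_ch=3, nf=64, n_modules=23):
        super().__init__()
        self.shallow = nn.Sequential(nn.Conv2d(in_ch, nf, 3, 1, 1),
                                     nn.Conv2d(nf, nf, 3, 1, 1))
        self.body = nn.Sequential(*[ResidualDenseModule(nf) for _ in range(n_modules)])

    def forward(self, x):
        m = self.shallow(x)      # feature map m
        return self.body(m)      # non-linearly mapped feature map m_1
```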
(2c) The global description of the feature map u over the whole spatial extent is obtained through global pooling:
z_c = F_sq(u_c) = (1/(H×W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)
where H and W are the height and width of the feature map and u_c(i, j) is the value of feature map u at position (i, j).
(2d) After obtaining the global feature description of u, the relationship between the channels is obtained using:
z = F_ex(z, W) = σ(g(z, W)) = σ(W_2 ReLU(W_1 z))
where W_1 ∈ R^{(C/r)×C} and W_2 ∈ R^{C×(C/r)}, r is the dimensionality-reduction coefficient, ReLU is an activation function, σ(·) is the Sigmoid activation function, and z is the final output scalar.
(2e) Finally, the activation value (i.e. the weight) of each channel is multiplied by the original feature map u:
x̃_c = F_scale(u_c, s_c) = s_c · u_c
where F_scale(u_c, s_c) is the channel-wise product of the feature map u_c and the scalar s_c.
(2f) After the convolutional up-sampling module, the feature map obtained in (2e) grows to 64 × 128 × 128, and the last convolution generates the image, changing the feature map from 64 × 128 × 128 to 3 × 128 × 128.
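Steps (2c) to (2f) can be sketched as follows, assuming PyTorch; the reduction ratio r = 16 and the PixelShuffle-based up-sampling are assumptions, since the text only specifies global pooling, the σ(W_2 ReLU(W_1 z)) excitation, channel-wise rescaling, and a convolutional up-sampling module.

```python
# Minimal sketch of the attention-based reconstruction module.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels=64, r=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)         # z_c = (1/HW) sum u_c(i, j)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // r),     # W1 z
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),     # W2 ReLU(W1 z)
            nn.Sigmoid(),                           # sigma(.)
        )

    def forward(self, u):
        b, c, _, _ = u.shape
        z = self.pool(u).view(b, c)
        s = self.fc(z).view(b, c, 1, 1)
        return u * s                                # F_scale(u_c, s_c) = s_c * u_c

class AttentionReconstruction(nn.Module):
    """Channel attention followed by x4 up-sampling and the final RGB convolution."""
    def __init__(self, nf=64, scale=4):
        super().__init__()
        self.attn = ChannelAttention(nf)
        self.upsample = nn.Sequential(
            nn.Conv2d(nf, nf * scale * scale, 3, 1, 1),
            nn.PixelShuffle(scale),                 # 32x32 -> 128x128
            nn.LeakyReLU(0.2, inplace=True),
        )
        self.last = nn.Conv2d(nf, 3, 3, 1, 1)       # 64 channels -> 3 (RGB)

    def forward(self, u):
        return self.last(self.upsample(self.attn(u)))
```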
(2g) Train the network with the training samples generated in step 1 using the Adam stochastic gradient descent algorithm; the loss function combines the generative adversarial network loss and the gradient weight loss, with the specific formula:
L = L_G + L_gw
The generator loss is:
L_G = L_1 + λ·L_per + L_adv
where L_1 = ‖y − f(x)‖, f(x) denotes the generated image, y denotes the high-resolution image, L_per is the perceptual loss, λ is a loss weighting parameter, and the perceptual loss is computed on the feature map output by layer relu3_3 of a VGG16 network trained on the ImageNet dataset, with the specific formula L_per = ‖φ(y) − φ(f(x))‖, where φ(y) and φ(f(x)) denote the features extracted by the CNN from the real image and the generated image;
L_adv is the adversarial loss of the generator network:
L_adv = −E_{X_r}[log(1 − D(X_r, X_f))] − E_{X_f}[log D(X_f, X_r)]
where D(X_r, X_f) is the discriminator, X_r and X_f are the real image and the generated image, the output is the probability of being judged the real image, and E[·] denotes the expectation over the corresponding distribution.
The gradient weight loss is formulated as:
L_gw = D_gw ⊙ ‖y − f(x)‖
where f(x) denotes the generated image and y the high-resolution image, and D_gw = (1 + α·D_x)(1 + α·D_y), with
D_x = |∇_x f(x) − ∇_x y|,  D_y = |∇_y f(x) − ∇_y y|
where D_x and D_y are the texture (gradient) difference maps between the generated image and the high-resolution image in the horizontal and vertical directions, ∇f(x) denotes the gradient image of the generated image, ∇y denotes the gradient image of the real image, and α is a weight coefficient in the loss function, taken as 4 in the present example.
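The gradient weight loss of (2g) can be sketched as below, assuming PyTorch; the forward-difference gradient operator is an assumption (any gradient filter could be substituted), while α = 4 follows the embodiment.

```python
# Minimal sketch of L_gw = D_gw * |y - f(x)| with D_gw = (1 + a*D_x)(1 + a*D_y).
import torch
import torch.nn.functional as F

def image_gradients(img: torch.Tensor):
    """Horizontal and vertical gradient images of a (B, C, H, W) tensor."""
    gx = img[:, :, :, 1:] - img[:, :, :, :-1]     # horizontal differences
    gy = img[:, :, 1:, :] - img[:, :, :-1, :]     # vertical differences
    gx = F.pad(gx, (0, 1, 0, 0))                  # pad back to H x W
    gy = F.pad(gy, (0, 0, 0, 1))
    return gx, gy

def gradient_weight_loss(fake: torch.Tensor, real: torch.Tensor, alpha: float = 4.0):
    """L1 error re-weighted by the gradient difference maps D_x, D_y."""
    gx_f, gy_f = image_gradients(fake)
    gx_r, gy_r = image_gradients(real)
    d_x = (gx_f - gx_r).abs()                     # gradient difference, horizontal
    d_y = (gy_f - gy_r).abs()                     # gradient difference, vertical
    d_gw = (1 + alpha * d_x) * (1 + alpha * d_y)  # texture-guided weight map
    return (d_gw * (real - fake).abs()).mean()
```

In training, this term is simply added to the L1, perceptual, and adversarial terms to form the total generator loss L = L_G + L_gw.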
Step 3, training the discriminator and fine-tuning the generator.
Using the preliminarily trained generator of the image super-resolution network obtained in step 2, train the discriminator so as to fine-tune the generator and obtain the final image super-resolution network model. The discriminator network structure is shown in FIG. 4: it comprises 8 convolutional layers; as the network deepens, the number of features keeps increasing while the feature size keeps shrinking; the activation function is LeakyReLU; finally, two fully connected layers and a final Sigmoid activation yield the probability of being predicted as a natural image, driving the generated image closer to the real image. A relative discriminator is used instead of a standard discriminator to generate more realistic, better-quality images, with the discriminator network loss function:
L_D = −E_{X_r}[log D(X_r, X_f)] − E_{X_f}[log(1 − D(X_f, X_r))]
where D(X_r, X_f) is the discriminator, X_r and X_f are the real image and the generated image, the output is the probability of being judged the real image, and E[·] denotes the expectation over the corresponding distribution.
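A minimal sketch of the relative discriminator loss and the corresponding generator adversarial term, assuming PyTorch and an ESRGAN-style relativistic average formulation; the exact variant of the relative discriminator is an assumption, as the text does not spell it out. Here c_real and c_fake are the raw (pre-Sigmoid) discriminator outputs for the real and generated images.

```python
# Minimal sketch of relativistic discriminator / generator adversarial losses.
import torch
import torch.nn.functional as F

def relativistic_d_loss(c_real: torch.Tensor, c_fake: torch.Tensor):
    """Discriminator loss: real should score above the average fake, and vice versa."""
    loss_real = F.binary_cross_entropy_with_logits(
        c_real - c_fake.mean(), torch.ones_like(c_real))
    loss_fake = F.binary_cross_entropy_with_logits(
        c_fake - c_real.mean(), torch.zeros_like(c_fake))
    return loss_real + loss_fake

def relativistic_g_loss(c_real: torch.Tensor, c_fake: torch.Tensor):
    """Adversarial term of the generator loss under the same relativistic formulation."""
    loss_real = F.binary_cross_entropy_with_logits(
        c_real - c_fake.mean(), torch.zeros_like(c_real))
    loss_fake = F.binary_cross_entropy_with_logits(
        c_fake - c_real.mean(), torch.ones_like(c_fake))
    return loss_real + loss_fake
```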
Step 4, inputting a low-resolution image and outputting a high-resolution generated image.
A low-resolution image is input and passed through the final image super-resolution network model obtained in step 3 to obtain the high-resolution image, with the specific formula:
Y = F_SR(x)
where x is the input low-resolution image, F_SR(·) is the trained image super-resolution model, and Y is the output high-resolution image.
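The inference step Y = F_SR(x) can be sketched as follows, assuming PyTorch and that the fine-tuned generator was serialized as a whole module; the file names and image I/O utilities are illustrative assumptions.

```python
# Minimal inference sketch: load the trained generator and super-resolve one image.
import torch
from torchvision.io import read_image
from torchvision.utils import save_image

generator = torch.load("generator.pth", map_location="cpu")  # assumes torch.save(generator, ...)
generator.eval()

x = read_image("input_lr.png").float() / 255.0       # low-resolution input, (C, H, W)
with torch.no_grad():
    y = generator(x.unsqueeze(0)).clamp(0, 1)        # high-resolution output, (1, C, 4H, 4W)
save_image(y, "output_hr.png")
```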
To verify the effect of the invention, the proposed method is compared with other existing image super-resolution network methods. The evaluation indexes are as follows:
PI (perceptual index) is a comprehensive perceptual image quality criterion proposed in the PIRM super-resolution challenge and is a currently mainstream super-resolution quality index, with the specific formula:
PI = (1/2)((10 − Ma) + NIQE)
where the Ma score uses spatial- and frequency-domain statistics as the feature representation of the image; each group of extracted features is trained in a separate ensemble of regression trees, and the quality score is predicted by regression from a large number of human visual perception scores; a larger Ma score indicates better visual quality of the image.
NIQE (natural image quality evaluator) is based on quality-aware features; a smaller NIQE indicates better image quality. The smaller the resulting perceptual index PI obtained by combining these two quality measures, the better the image quality and visual quality.
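As a worked example of the perceptual index, assuming the Ma and NIQE scores have already been computed by their reference implementations:

```python
# PI = 0.5 * ((10 - Ma) + NIQE); lower PI indicates better perceptual quality.
def perceptual_index(ma_score: float, niqe_score: float) -> float:
    return 0.5 * ((10.0 - ma_score) + niqe_score)

# Example: Ma = 8.2, NIQE = 3.1  ->  PI = 0.5 * (1.8 + 3.1) = 2.45
print(perceptual_index(8.2, 3.1))
```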
The results of this deep-learning-based super-resolution method are shown in FIG. 5. Compared with other super-resolution methods of the same type, the high-resolution images generated by the method recover texture details better and achieve a better perceptual index.

Claims (2)

1. A texture-guided gradient weight loss and attention mechanism super-resolution method, which constructs a CNN-based image super-resolution network model comprising a generator for generating images and a discriminator for judging whether a generated image is real, wherein the generator extracts features from the input low-resolution image and performs nonlinear feature mapping, computes the correlation among feature maps through an attention mechanism and redistributes the weight of each feature map, and obtains the high-resolution image through an attention-based image reconstruction module, and wherein the gradient weight loss constrains the training process under texture guidance, the method specifically comprising the following steps:
(1) constructing a data set: acquire high-resolution images from a data set; the low-resolution images are obtained by down-sampling the high-resolution images, yielding paired training data, with the specific formula:
x=F(y)
where F(·) denotes the down-sampling operation, x the low-resolution image, and y the high-resolution image;
(2) constructing an image super-resolution model and training the generator: the model consists of two branch networks, a generator and a discriminator, and comprises a shallow feature extraction module, a nonlinear mapping module, an attention-based image reconstruction module, and a discriminator module; the shallow feature extraction module consists of convolutional layers, the nonlinear mapping module consists of residual dense networks, and the attention-based image reconstruction module consists of an attention module and an up-sampling module; features are extracted and mapped by the CNN, and the resulting features are finally weighted and reconstructed to obtain the high-resolution image;
constructing the objective equation: the generator is trained under a generative adversarial network loss L_G and a gradient weight loss L_gw, specifically:
L = L_G + L_gw
where the generative adversarial network loss is:
L_G = L_1 + λ·L_per + L_adv
in which L_1 = ‖y − f(x)‖, f(x) denotes the generated image, y denotes the high-resolution image, λ is a loss weighting parameter, and L_per is the perceptual loss:
L_per = ‖φ(y) − φ(f(x))‖
where φ(y) and φ(f(x)) denote the features extracted by a CNN from the real image and the generated image, respectively;
L_adv is the adversarial loss of the generator:
L_adv = −E_{X_r}[log(1 − D(X_r, X_f))] − E_{X_f}[log D(X_f, X_r)]
where D(X_r, X_f) is the discriminator, X_r and X_f denote the real image and the generated image respectively, the discriminator outputs the probability that its input is judged to be the real image, and E[·] denotes the expectation over the corresponding distribution;
the gradient weight loss is formulated as:
L_gw = D_gw ⊙ ‖y − f(x)‖
where f(x) denotes the generated image and y the real image, and the weight map is D_gw = (1 + α·D_x)(1 + α·D_y), with
D_x = |∇_x f(x) − ∇_x y|,  D_y = |∇_y f(x) − ∇_y y|
where D_x and D_y are the gradient difference maps between the generated image and the high-resolution image in the horizontal and vertical directions, ∇f(x) denotes the gradient image of the generated image, ∇y denotes the gradient image of the real image, and α is a weight coefficient in the loss function;
(3) training the discriminator: after the generator is initially trained, train the discriminator to fine-tune the generator; a relative discriminator is used instead of a standard discriminator to generate images that are more realistic and of better quality, the discriminator loss function being:
L_D = −E_{X_r}[log D(X_r, X_f)] − E_{X_f}[log(1 − D(X_f, X_r))]
where D(X_r, X_f) is the discriminator, which takes the real image and the generated image as input and outputs the probability that the input is judged to be the real image, and E[·] denotes the expectation over the corresponding distribution;
(4) high-resolution image output: input a low-resolution image and output a high-resolution image through the trained image super-resolution network model, with the formula:
Y = F_SR(x)
where x is the input low-resolution image, F_SR(·) is the trained image super-resolution model, and Y is the output high-resolution image.
2. The texture-guided gradient weight loss and attention mechanism super-resolution method of claim 1, wherein the specific steps of step (2) are:
(2a) obtaining a feature map m after the input image passes through a shallow feature extraction module, namely two convolution layers;
(2b) carrying out nonlinear mapping on the feature map m through a plurality of residual dense network modules to obtain the feature map m_1;
(2c) constructing the attention-based image reconstruction module; the global description of the feature map u over the whole spatial extent is obtained through global pooling, with the formula:
z_c = F_sq(u_c) = (1/(H×W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)
where H and W are the height and width of the feature map and u_c(i, j) is the value of feature map u at position (i, j); after obtaining the global feature description of u, the relationship between the channels is obtained using:
z = F_ex(z, W) = σ(g(z, W)) = σ(W_2 ReLU(W_1 z))
where W_1 ∈ R^{(C/r)×C} and W_2 ∈ R^{C×(C/r)}, r is the dimensionality-reduction coefficient, ReLU is an activation function, σ(·) is the Sigmoid activation function, and z is the final output scalar;
finally, the activation value (i.e. the weight) of each channel is multiplied by the original feature map u:
x̃_c = F_scale(u_c, s_c) = s_c · u_c
where F_scale(u_c, s_c) is the channel-wise product of the feature map u_c and the scalar s_c;
(2d) the feature map obtained in the step (2c) is processed by a convolution up-sampling module to obtain a result with higher resolution;
(2e) training the generator network; the loss function combines the generative adversarial network loss and the gradient weight loss, with the specific formula:
L = L_G + L_gw
The generator loss is:
L_G = L_1 + λ·L_per + L_adv
where L_1 = ‖y − f(x)‖, f(x) denotes the generated image, y denotes the high-resolution image, L_per is the perceptual loss, λ is a loss weighting parameter, and the perceptual loss is computed on the feature map output by layer relu3_3 of a VGG16 network trained on the ImageNet dataset, with the specific formula L_per = ‖φ(y) − φ(f(x))‖, where φ(y) and φ(f(x)) denote the features extracted by the CNN from the real image and the generated image;
L_adv is the adversarial loss of the generator network:
L_adv = −E_{X_r}[log(1 − D(X_r, X_f))] − E_{X_f}[log D(X_f, X_r)]
where D(X_r, X_f) is the discriminator, X_r and X_f are the real image and the generated image, the output is the probability of being judged the real image, and E[·] denotes the expectation over the corresponding distribution;
the gradient weight loss is formulated as:
L_gw = D_gw ⊙ ‖y − f(x)‖
where f(x) denotes the generated image and y the high-resolution image, and D_gw = (1 + α·D_x)(1 + α·D_y), with
D_x = |∇_x f(x) − ∇_x y|,  D_y = |∇_y f(x) − ∇_y y|
where D_x and D_y are the texture (gradient) difference maps between the generated image and the high-resolution image in the horizontal and vertical directions, ∇f(x) denotes the gradient image of the generated image, ∇y denotes the gradient image of the real image, and α is a weight coefficient in the loss function.
CN202210636553.0A 2022-06-07 2022-06-07 Gradient weight loss and attention mechanism super-resolution method based on texture guidance Pending CN114943646A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210636553.0A CN114943646A (en) 2022-06-07 2022-06-07 Gradient weight loss and attention mechanism super-resolution method based on texture guidance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210636553.0A CN114943646A (en) 2022-06-07 2022-06-07 Gradient weight loss and attention mechanism super-resolution method based on texture guidance

Publications (1)

Publication Number Publication Date
CN114943646A true CN114943646A (en) 2022-08-26

Family

ID=82909874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210636553.0A Pending CN114943646A (en) 2022-06-07 2022-06-07 Gradient weight loss and attention mechanism super-resolution method based on texture guidance

Country Status (1)

Country Link
CN (1) CN114943646A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115131214A (en) * 2022-08-31 2022-09-30 南京邮电大学 Indoor aged person image super-resolution reconstruction method and system based on self-attention
CN115131214B (en) * 2022-08-31 2022-11-29 南京邮电大学 Indoor old man image super-resolution reconstruction method and system based on self-attention
CN117115064A (en) * 2023-10-17 2023-11-24 南昌大学 Image synthesis method based on multi-mode control
CN117115064B (en) * 2023-10-17 2024-02-02 南昌大学 Image synthesis method based on multi-mode control

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination