CN115222592A - Underwater image enhancement method based on super-resolution network and U-Net network and training method of network model - Google Patents
- Publication number
- CN115222592A (application CN202210733444.0A)
- Authority
- CN
- China
- Prior art keywords
- network model
- image
- resolution
- super
- training
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/05—Underwater scenes
Abstract
The invention discloses an underwater image enhancement method based on a super-resolution network and a U-Net network, and a training method for the network models, and belongs to the technical field of underwater image processing. The invention constructs a deep residual super-resolution network and a U-Net underwater image enhancement network based on an SK attention mechanism and a K estimation module, so that image resolution can be improved, image blur can be eliminated, and a natural color-enhanced image can be generated.
Description
Technical Field
The invention relates to an underwater image enhancement method based on a super-resolution network and a U-Net network and a training method of a network model, belonging to the technical field of underwater image processing.
Background
With the progress of science and technology, people continue to explore underwater organisms, underwater resources and similar targets. However, the underwater environment is complex and water strongly attenuates light: water molecules, microorganisms and other matter in the water absorb and reflect it. As a result, captured underwater images suffer from low visibility, low contrast, blurred contours and color distortion, and such low-quality images make it very difficult for researchers to analyze, recognize and detect underwater targets. How to enhance the details of underwater images and recover the information they contain has therefore become a challenging problem.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides an underwater image enhancement method based on a super-resolution network and a U-Net network and a training method of a network model, so as to solve the problem that underwater images are difficult to enhance in the prior art.
In order to solve the technical problem, the invention is realized by adopting the following scheme:
the invention provides an underwater image enhancement method based on a super-resolution network and a U-Net network, which comprises the following steps:
acquiring an underwater original image;
inputting the underwater original image into a trained deep residual super-resolution network model, and outputting a high-resolution underwater image;
inputting the high-resolution underwater image into a trained U-Net network model, and outputting an underwater enhanced image;
the deep residual super-resolution network model and the U-Net network model both comprise a generator network and a discriminator network;
the generator network of the deep residual super-resolution network model comprises a plurality of convolution blocks, each consisting of a deep residual channel attention block DRCAB and an additional convolution layer with a tanh activation function;
the generator network of the U-Net network model comprises an encoder module, a K estimation module, a converter module, an SK attention module and a decoder module which are sequentially connected.
The invention also provides a training method for the deep residual super-resolution network model and the U-Net network model, the trained models being used for enhancing underwater images. The training method comprises the following steps:
acquiring a training data set, wherein the training data set comprises original images and the distorted images paired with them;
and inputting the image samples in the training data set into the pre-established deep residual super-resolution network model and the U-Net network model, respectively, and performing alternating iterative training until the loss function value of the model no longer decreases, at which point training is finished.
Preferably, the training data sets are respectively a USR-248 super-resolution data set and an EUVP paired underwater image data set.
Preferably, the pre-established deep residual super-resolution network model and the pre-established U-Net network model both comprise a generator network and a discriminator network.
Preferably, the generator network of the pre-established deep residual super-resolution network model comprises three convolution blocks connected in sequence, and each convolution block comprises a deep residual channel attention block DRCAB, a convolution layer and a tanh activation function layer.
Preferably, the DRCAB includes a first convolutional layer, a first BN batch normalization layer, a first softmax activation function layer, a second convolutional layer, a second BN batch normalization layer, a third convolutional layer, an avgpool2d average pooling layer, a fourth convolutional layer, a second softmax activation function layer, a fifth convolutional layer, and an upsampling layer, which are sequentially connected.
Preferably, the generator network of the pre-established U-Net network model comprises an encoder module, a K estimation module, a converter module, an SK attention module and a decoder module which are connected in sequence; the encoder and the decoder of each layer in the generator network are connected by skip connections.
Preferably, the SK attention module comprises a Split module, a Fuse module and a Select module which are connected in sequence.
Preferably, the Split module performs multiple convolutions on the input image using 2 convolution kernels of different sizes;
the Fuse module calculates the weights of the 2 convolution kernels and sums the feature maps of the two branches element-wise:
$$U = \tilde{U} + \hat{U}$$
in the above formula, $\tilde{U}$ is the weight feature map extracted by the first convolution kernel, and $\hat{U}$ is the weight feature map extracted by the second convolution kernel;
$U_c$ generates a feature map $S_c$ through a global average pooling layer, and $S_c$ generates a compact feature map $z$ through the fully connected layer, with the calculation formula:
$$z = f(S_c) = \delta(B(S_c W))$$
in the formula, C, H and W are the sizes of the input image, $\delta$ is the ReLU activation function, and B is the BN batch normalization layer;
and the Select module calculates the weights of the 2 convolution kernels through a softmax activation function, applies the weights to the feature map z to obtain 2 new feature maps, and then fuses them by connection to obtain the final output image.
Preferably, the method for calculating the model loss function value includes:
calculating global similarity loss:
$$L_2(G) = \mathbb{E}_{X \sim Y}\left[\lVert Y - G(X) \rVert_2\right]$$
in the above formula, X is a distorted image, Y is a real image, G is the generator network, and G(X) is the image generated by the generator; $\mathbb{E}_{X \sim Y}$ denotes the expectation over distorted images and their corresponding real images, and $\lVert Y - G(X) \rVert_2$ is the distance between the real image and the image generated by the generator;
calculating the perception loss:
in the above formula, r, g and b respectively represent the differences of the normalized red, green and blue channel values between the image G(X) generated by the generator and the real image Y, and the remaining symbol is the average of the red channels;
calculating the content loss:
in the above formula, X and Y are respectively a distorted image and a real image; the feature term denotes the feature maps extracted from the fourth and fifth convolutional layers of the pre-trained VGG-19 network, and the norm term is the distance between the feature map of the real image and the feature map of the distorted image;
the total loss was calculated:
$$L_g(G) = \lambda_c L_c(G) + \lambda_p L_p(G) + \lambda_2 L_2(G)$$
in the above formula, $\lambda_c$, $\lambda_p$ and $\lambda_2$ are the weight values of the content loss, perceptual loss and global similarity loss, respectively.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention constructs a deep residual super-resolution network (SRDRCAM) and a U-Net underwater image enhancement network based on an SK attention mechanism and a K estimation module, so that image resolution can be improved, image blur can be eliminated, and a natural color-enhanced image can be generated.
2. In addition to the global similarity loss and the perceptual loss, the invention adds a content loss. Optimizing the global similarity loss preserves the overall structure of the input image; optimizing the perceptual loss lets the network better recover the detail information of the underwater image; and the content loss makes the content of the output image more similar to that of the input image.
3. The invention provides an end-to-end network structure, and the method does not need any underwater image imaging model parameters in the stages of training and testing.
4. The network has fewer parameters, trains faster, and performs better than other models.
Drawings
FIG. 1 is a network architecture of an SK attention mechanism provided by an embodiment of the present invention;
FIG. 2 is a network structure of a deep residual channel attention block provided by an embodiment of the present invention;
FIG. 3 is a generator network structure of a deep residual super-resolution network model provided by an embodiment of the invention;
FIG. 4 is a generator network structure of the U-Net network model provided by an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
In the description of the present invention, it is to be understood that the terms "central," "longitudinal," "lateral," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like are used in the orientations and positional relationships indicated in the drawings, which are based on the orientations and positional relationships indicated in the drawings, and are used for convenience in describing the present invention and for simplicity in description, but do not indicate or imply that the device or element so referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and thus should not be construed as limiting the present invention.
Example 1:
This embodiment provides a training method for a deep residual super-resolution network (SRDRCAM) model and a U-Net network model; the trained models can effectively enhance underwater images. The training steps are as follows:
Step one: Preparing the training data set
The 2x, 4x and 8x super-resolution networks are trained using the publicly available super-resolution dataset USR-248, which contains 2x, 4x and 8x images, in order to eliminate detail blur in underwater images. The EUVP paired underwater image dataset is used to train the improved U-Net network. The number of training iterations for the super-resolution network and the U-Net network is set to 10, and the batch size is set to 1. All images are resized to 256x256.
Step two: Constructing the SRDRCAM network structure and the U-Net network structure
The SRDRCAM network architecture includes a generator network and a discriminator network. As shown in FIG. 3, the generator network consists of three deep residual channel attention blocks (DRCAB), three convolutional layers (Conv) and three tanh activation functions; each DRCAB, convolutional layer and tanh activation function together form one 2x super-resolution stage. The network structure of the DRCAB is shown in FIG. 2: the DRCAB includes a convolutional layer and eight repeated residual channel attention multiplication blocks, followed by a convolutional layer and finally an upsampling layer. Specifically, it includes a first convolutional layer, a first BN batch normalization layer, a first softmax activation function layer, a second convolutional layer, a second BN batch normalization layer, a third convolutional layer, an avgpool2d average pooling layer, a fourth convolutional layer, a second softmax activation function layer, a fifth convolutional layer and an upsampling layer, connected in sequence. For an input image of size w x h x 3, the first pass through DRCAB, Conv and tanh yields a 2w x 2h x 3 output; the second pass yields a 4w x 4h x 3 output; and the third pass yields an 8w x 8h x 3 output, so that upscaling eliminates the blur of the underwater image.
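The size progression through the three stages can be checked with a few lines of Python. This is only an illustrative sketch: the function name `srdrcam_output_shapes` is hypothetical, and it assumes each DRCAB, Conv and tanh stage exactly doubles width and height while keeping 3 channels, as the text describes.

```python
def srdrcam_output_shapes(width, height, channels=3, stages=3):
    """Return the (width, height, channels) produced after each 2x stage."""
    shapes = []
    w, h = width, height
    for _ in range(stages):
        # Each DRCAB + Conv + tanh stage doubles both spatial dimensions.
        w, h = 2 * w, 2 * h
        shapes.append((w, h, channels))
    return shapes


# Example: a 32 x 24 x 3 input passes through the 2x, 4x and 8x stages.
shapes = srdrcam_output_shapes(32, 24)
```

For a 32 x 24 input the three stages give 64 x 48, 128 x 96 and 256 x 192 outputs.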
The U-Net network structure is improved, and a K estimation module and a clear image generation module are designed. The K estimation module is the core of the improved U-Net network. Inspired by U-Net-based image enhancement algorithms in this field, the improved U-Net architecture is used to generate underwater image features and enhance underwater images. An SK attention mechanism is also added to the U-Net network to modify its architecture. The network structure of the SK attention mechanism is shown in FIG. 1. The SK attention network assigns different weights to different convolution kernels, i.e., it dynamically generates convolution kernels for images of different scales. It mainly consists of three parts, Split, Fuse and Select:
The Split part performs multiple convolution operations on the input image using convolution kernels of different sizes; the present invention employs convolution kernels of sizes 3x3 and 5x5.
The Fuse part calculates the weight of each convolution kernel; the feature maps of the two branches are summed element-wise, with the calculation formula:
$$U = \tilde{U} + \hat{U}$$
In the above formula, $\tilde{U}$ is the weight feature map extracted by the 3x3 convolution kernel, and $\hat{U}$ is the weight feature map extracted by the 5x5 convolution kernel. $U_c$ passes through a global average pooling (GAP) layer to produce channel statistics, giving a feature map $S_c$ of dimension C x 1; $S_c$ then passes through the fully connected layer to produce a compact feature map $z$ (dimension d x 1), with the calculation formulas:
$$z = f(S_c) = \delta(B(S_c W))$$
$$d = \max(C/r, L)$$
where C (number of channels), H (height) and W (width) are the size of the input image, $\delta$ is the ReLU activation function, B is the BN batch normalization layer, L is a value selected according to the sizes of the two convolution kernels and is set to 32 in the present invention, and r is the compression factor; the dimension of z is the number of convolution kernels, the weight matrix W of the fully connected layer has dimension d x C, and d is the feature dimension after the fully connected layer.
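As a small worked example of d = max(C/r, L): the sketch below uses L = 32 as stated in the text, while the compression factor r = 16, the channel counts and the use of integer division are assumptions chosen only for illustration (the text does not give them). The function name `compact_dim` is hypothetical.

```python
def compact_dim(channels, reduction, floor=32):
    """Compute d = max(C / r, L), using integer division for C / r (an assumption)."""
    return max(channels // reduction, floor)


# With C = 64 and r = 16, C / r = 4 is below L = 32, so d falls back to L.
d_small = compact_dim(64, 16)
# With C = 1024 and r = 16, C / r = 64 exceeds L = 32, so d = 64.
d_large = compact_dim(1024, 16)
```

The lower bound L keeps the compact feature from becoming too small when the channel count is low.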
And the Select part calculates the weights of the 2 convolution kernels through softmax, then applies the weights to the feature map z to obtain 2 new feature maps, and then performs connection fusion to obtain a final output image.
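The Fuse and Select steps can be sketched in plain Python on tiny feature maps. This is a minimal sketch under stated assumptions, not the patented implementation: the two branch feature maps are passed in directly instead of being computed by 3x3 and 5x5 convolutions, batch normalization is omitted, and the softmax weights are applied to the two branch feature maps and summed (the original SK-attention formulation), whereas the text describes applying them to z. The names `sk_fuse_select`, `w_fc`, `w_a` and `w_b` are hypothetical.

```python
import math


def sk_fuse_select(u3, u5, w_fc, w_a, w_b):
    """Fuse and Select for two branch feature maps shaped [C][H][W]."""
    C, H, W = len(u3), len(u3[0]), len(u3[0][0])
    # Fuse: element-wise sum of the two branches.
    u = [[[u3[c][i][j] + u5[c][i][j] for j in range(W)]
          for i in range(H)] for c in range(C)]
    # Global average pooling gives one statistic per channel (S_c).
    s = [sum(sum(row) for row in u[c]) / (H * W) for c in range(C)]
    # Fully connected layer plus ReLU gives the compact vector z (BN omitted).
    z = [max(0.0, sum(w_fc[k][c] * s[c] for c in range(C)))
         for k in range(len(w_fc))]
    out, weights = [], []
    for c in range(C):
        # Select: per-channel softmax over the two branches.
        a = sum(w_a[c][k] * z[k] for k in range(len(z)))
        b = sum(w_b[c][k] * z[k] for k in range(len(z)))
        m = max(a, b)  # subtract the maximum for numerical stability
        ea, eb = math.exp(a - m), math.exp(b - m)
        wa, wb = ea / (ea + eb), eb / (ea + eb)
        weights.append((wa, wb))
        out.append([[wa * u3[c][i][j] + wb * u5[c][i][j] for j in range(W)]
                    for i in range(H)])
    return out, weights


# Two channels of 2 x 2 features per branch; d = 1 for brevity.
u3 = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
u5 = [[[3.0, 2.0], [1.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
w_fc = [[0.5, 0.5]]
w_a = [[1.0], [0.0]]
w_b = [[0.0], [0.0]]
out, weights = sk_fuse_select(u3, u5, w_fc, w_a, w_b)
```

Because the two weights of each channel come from a softmax, they always sum to 1, so the output is a per-channel convex combination of the two branches.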
The underwater image enhancement network (U-Net) comprises a generator network and a discriminator network. As shown in FIG. 4, the generator network includes an encoder module, a K estimation module, a converter module, an SK attention module and a decoder module, which are connected in sequence; the encoder and the decoder of each layer in the generator network are connected by skip connections. The end-to-end underwater image enhancement U-Net network compensates the color model to generate a natural color-enhanced image.
Step three: optimizing a loss function
Because common error metrics cannot reflect how well an image has been optimized in every respect, the present invention addresses this by optimizing the loss function so that the output image is closer to the real image. The invention uses 3 loss functions: global similarity loss ($L_2$), perceptual loss ($L_p$) and content loss ($L_c$). The specific steps are as follows:
global similarity loss function:
$$L_2(G) = \mathbb{E}_{X \sim Y}\left[\lVert Y - G(X) \rVert_2\right]$$
in the above equation, X is a distorted image, Y is a real image, G is the generator, and G(X) is the image generated by the generator; $\mathbb{E}_{X \sim Y}$ denotes the expectation over distorted images and their corresponding real images, and $\lVert Y - G(X) \rVert_2$ is the distance between the real image and the image generated by the generator.
The global similarity loss function $L_2$ concerns the overall visual effect: it measures the difference between the real image and the image enhanced by the method of the present invention, and aims to improve the visual quality of the output image.
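A minimal sketch of the global similarity loss, assuming that the expectation is approximated by the mean over a batch and that each image is given as a flat list of pixel values. The function name `global_similarity_loss` is hypothetical.

```python
import math


def global_similarity_loss(real_batch, generated_batch):
    """Batch mean of the L2 norm ||Y - G(X)||_2 for flat pixel lists."""
    norms = []
    for y, g in zip(real_batch, generated_batch):
        # L2 distance between one real image and its generated counterpart.
        norms.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(y, g))))
    return sum(norms) / len(norms)


# Two toy "images" of two pixels each: distances are 5.0 and 0.0.
loss = global_similarity_loss([[0.0, 0.0], [1.0, 1.0]],
                              [[3.0, 4.0], [1.0, 1.0]])
```

The loss is zero only when every generated image matches its real counterpart pixel for pixel.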
Perceptual loss function:
in the above equation, r, g and b represent the differences of the normalized red, green and blue channel values between the generated image G(X) and the real image Y, respectively, and the remaining symbol is the average of the red channel.
The invention can eliminate the blue-green color cast of the image by adopting the perception loss, so that the generated image is more real, and the distortion of the image is reduced.
Content loss function:
Because the red component of light is attenuated most severely underwater, underwater images tend to look greenish or bluish. Therefore, in constructing the loss function, the present invention introduces a content loss function $L_c$ in addition to the global similarity loss and the perceptual loss; it corrects the color distribution of the image so that the details of the enhanced image are clearer. The calculation formula is as follows:
in the above formula, X and Y are respectively a distorted image and a real image; the feature term denotes the high-level feature maps extracted from the fourth and fifth convolutional layers of a pre-trained VGG-19 network, and the norm term is the distance between the feature map of the real image and the feature map of the distorted image.
Calculating the total loss function, which combines the individual loss functions:
$$L_g(G) = \lambda_c L_c(G) + \lambda_p L_p(G) + \lambda_2 L_2(G)$$
in the above formula, $\lambda_c$, $\lambda_p$ and $\lambda_2$ are the weight values of the content loss, perceptual loss and global similarity loss, respectively.
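The weighted combination can be written as a one-line function. The text does not state the values of the three weights, so the defaults of 1.0 and the numbers in the example below are placeholders, and the function name `total_generator_loss` is hypothetical.

```python
def total_generator_loss(content, perceptual, similarity,
                         lam_c=1.0, lam_p=1.0, lam_2=1.0):
    """L_g = lam_c * L_c + lam_p * L_p + lam_2 * L_2 (weights are placeholders)."""
    return lam_c * content + lam_p * perceptual + lam_2 * similarity


# Placeholder loss values and weights, chosen only to make the sum easy to follow.
total = total_generator_loss(2.0, 4.0, 8.0, lam_c=0.5, lam_p=0.25, lam_2=0.125)
```

Each term contributes 1.0 in this example, which shows how the weights balance losses of different magnitudes.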
Step four: network model training and setup
The network model training of the invention actually trains two networks, namely the SRDRCAM network and the U-Net network. First, the public super-resolution dataset USR-248 is used to train the SRDRCAM network, with the discriminator judging whether an image is real or fake. The generator is then trained against the discriminator, and the generator network is optimized according to the global similarity loss function, the perceptual loss function and the content loss function from step three. The optimization is carried out in the PyTorch framework: the loss is passed to the optimizer, which minimizes it, and the discriminator and the generator are updated alternately until the loss function value no longer decreases, at which point network training is complete. After the SRDRCAM network model has been trained, the U-Net network model is trained on the paired dataset EUVP, and the same steps are repeated until its training is complete.
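The alternating schedule with a "loss no longer decreases" stopping rule can be sketched independently of any deep learning framework. In this sketch `disc_step` and `gen_step` are hypothetical callables standing in for one discriminator update and one generator update; the `patience` and `min_delta` parameters are assumptions, since the text does not say how the plateau is detected.

```python
def train_alternating(disc_step, gen_step, max_epochs, patience=1, min_delta=0.0):
    """Alternate discriminator and generator updates until the loss plateaus."""
    best = float("inf")
    stale = 0
    history = []
    for epoch in range(max_epochs):
        disc_step(epoch)        # update the discriminator first
        loss = gen_step(epoch)  # then update the generator and read its loss
        history.append(loss)
        if loss < best - min_delta:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break           # the loss has stopped decreasing
    return history


# Stand-in steps: the generator loss follows a fixed toy sequence.
toy_losses = [1.0, 0.6, 0.4, 0.4, 0.3]
disc_calls = []
history = train_alternating(disc_calls.append, lambda e: toy_losses[e], max_epochs=5)
```

With the toy sequence above, training stops at the fourth epoch, the first one in which the loss fails to improve.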
Inputting a low-resolution image into a trained SRDRCAM network model to obtain a deblurred high-resolution image, inputting the generated high-resolution image into the trained U-Net network model, and compensating the color model by the U-Net network by using an end-to-end underwater image enhancement network to generate a natural color enhancement image. At the moment, the network training is completely finished, and the goal of enhancing the underwater image is achieved.
This example uses the Adam optimizer to train the model, with the learning rate set to 0.0002, the momentum to 0.5, the batch size to 1, and the number of iteration cycles for both the SRDRCAM and U-Net networks to 20. The network of this embodiment is implemented using the PyTorch framework, and the network model is trained on an NVIDIA RTX 3060 GPU and an i5 10440 KF CPU.
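To make the optimizer settings concrete, the following is a sketch of a single Adam update for a scalar parameter. It assumes that the stated momentum of 0.5 corresponds to Adam's first-moment coefficient beta1; the values beta2 = 0.999 and eps = 1e-8 are commonly used defaults and are not given in the text. The function name `adam_step` is hypothetical.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.0002, beta1=0.5, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; returns (param, m, v)."""
    m = beta1 * m + (1 - beta1) * grad         # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * grad * grad  # second-moment estimate
    m_hat = m / (1 - beta1 ** t)               # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# First step (t = 1) from param = 1.0 with gradient 2.0.
p, m, v = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate.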
Example 2
This embodiment provides an underwater image enhancement method based on a super-resolution network and a U-Net network, in which underwater image enhancement is carried out using the deep residual super-resolution network model and the U-Net network model trained in Example 1. The underwater image enhancement method comprises the following steps:
acquiring an underwater original image;
inputting the underwater original image into the trained deep residual super-resolution network model, and outputting a high-resolution underwater image;
and inputting the underwater image with high resolution into the trained U-Net network model, and outputting an underwater enhanced image.
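The two-stage inference order can be sketched as a simple function composition. The callables `super_resolve` and `color_enhance` are hypothetical stand-ins for the trained SRDRCAM generator and the trained U-Net generator; the toy functions in the example (nearest-neighbour 2x upscaling and a clipped brightness gain) only illustrate the data flow and do not model the trained networks.

```python
def enhance_underwater_image(image, super_resolve, color_enhance):
    """Run super-resolution first, then U-Net enhancement, on one image."""
    high_res = super_resolve(image)     # stage 1: trained SRDRCAM generator
    enhanced = color_enhance(high_res)  # stage 2: trained U-Net generator
    return enhanced


def toy_upscale_2x(img):
    # Stand-in for stage 1: repeat every pixel and every row twice.
    return [[p for p in row for _ in range(2)] for row in img for _ in range(2)]


def toy_gain(img):
    # Stand-in for stage 2: double each value and clip to 1.0.
    return [[min(1.0, p * 2) for p in row] for row in img]


result = enhance_underwater_image([[0.1, 0.6]], toy_upscale_2x, toy_gain)
```

The enhancement stage always receives the already upscaled image, matching the order of the steps above.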
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.
Claims (10)
1. An underwater image enhancement method based on a super-resolution network and a U-Net network is characterized by comprising the following steps:
acquiring an underwater original image;
inputting the underwater original image into a trained deep residual super-resolution network model, and outputting a high-resolution underwater image;
inputting the high-resolution underwater image into a trained U-Net network model, and outputting an underwater enhanced image;
the deep residual super-resolution network model and the U-Net network model both comprise a generator network and a discriminator network;
the generator network of the deep residual super-resolution network model comprises a plurality of convolution blocks, each consisting of a deep residual channel attention block DRCAB and an additional convolution layer with a tanh activation function;
the generator network of the U-Net network model comprises an encoder module, a K estimation module, a converter module, an SK attention module and a decoder module which are sequentially connected.
2. A method for training a deep residual super-resolution network model and a U-Net network model, wherein the trained deep residual super-resolution network model and U-Net network model are used for underwater image enhancement according to claim 1, and the method comprises the following steps:
acquiring a training data set, wherein the training data set comprises original images and the distorted images paired with them;
and inputting the image samples in the training data set into the pre-established deep residual super-resolution network model and the U-Net network model, respectively, and performing alternating iterative training until the loss function value of the model no longer decreases, at which point training is finished.
3. The method for training the deep residual super-resolution network model and the U-Net network model according to claim 2, wherein the training data sets are a USR-248 super-resolution data set and an EUVP paired underwater image data set, respectively.
4. The method for training the deep residual super-resolution network model and the U-Net network model of claim 2, wherein the pre-established deep residual super-resolution network model and the U-Net network model each comprise a generator network and a discriminator network.
5. The method for training the deep residual super-resolution network model and the U-Net network model according to claim 4, wherein the generator network of the pre-established deep residual super-resolution network model comprises three sequentially connected convolution blocks, each convolution block comprising a deep residual channel attention block DRCAB, a convolution layer and a tanh activation function layer.
6. The method for training the deep residual super-resolution network model and the U-Net network model of claim 5, wherein the DRCAB comprises a first convolutional layer, a first BN batch normalization layer, a first softmax activation function layer, a second convolutional layer, a second BN batch normalization layer, a third convolutional layer, an avgpool2d average pooling layer, a fourth convolutional layer, a second softmax activation function layer, a fifth convolutional layer, and an upsampling layer, which are connected in sequence.
7. The method for training the deep residual super-resolution network model and the U-Net network model of claim 4, wherein the generator network of the pre-established U-Net network model comprises an encoder module, a K estimation module, a converter module, an SK attention module and a decoder module which are connected in sequence; the encoder and the decoder of each layer in the generator network are connected by skip connections.
8. The method for training the deep residual super-resolution network model and the U-Net network model of claim 7, wherein the SK attention module comprises a Split module, a Fuse module and a Select module which are connected in sequence.
9. The method for training the deep residual super-resolution network model and the U-Net network model of claim 8, wherein the Split module performs multiple convolutions on the input image by using convolution kernels of 2 different sizes;
the Fuse module calculates the weights of the 2 convolution kernels and sums the two feature maps element-wise:
$U = \tilde{U} + \hat{U}$
in the above formula, $\tilde{U}$ is the weight feature map extracted by the first convolution kernel, and $\hat{U}$ is the weight feature map extracted by the second convolution kernel;
$U_c$ passes through a global average pooling layer to generate the feature map $S_c$, and $S_c$ passes through a fully connected layer to generate a compact feature map $z$, calculated as:
$z = f(S_c) = \delta(B(S_c W))$
in the formula, C, H and W are the dimensions of the input image, $\delta$ is the ReLU activation function, and B is the BN batch normalization layer;
and the Select module calculates the weights of the 2 convolution kernels through a softmax activation function, applies the weights to the feature map z to obtain 2 new feature maps, and then connects and fuses them to obtain the final output image.
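The Fuse and Select steps of claim 9 can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The function name `sk_fuse_select`, the weight matrices `w_fc`, `w_a`, `w_b`, and all shapes are hypothetical. The BN layer is omitted for brevity. The sketch also assumes that the softmax weights rescale the two branch feature maps, as in the usual selective-kernel design, and that "connection" means channel-wise concatenation; both are only one reading of the claim.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sk_fuse_select(u1, u2, w_fc, w_a, w_b):
    """u1, u2: branch feature maps of shape (C, H, W) from the two kernels.
    w_fc: (C, d) fully connected weights; w_a, w_b: (d, C) per-branch weights.
    All names and shapes are illustrative assumptions."""
    u = u1 + u2                              # Fuse: element-wise sum
    s = u.mean(axis=(1, 2))                  # global average pooling -> (C,)
    z = np.maximum(s @ w_fc, 0.0)            # compact feature z = ReLU(S W), BN omitted
    logits = np.stack([z @ w_a, z @ w_b])    # one row of logits per branch -> (2, C)
    attn = softmax(logits, axis=0)           # Select: weights sum to 1 across branches
    v1 = attn[0][:, None, None] * u1         # reweight branch 1 per channel
    v2 = attn[1][:, None, None] * u2         # reweight branch 2 per channel
    return np.concatenate([v1, v2], axis=0), attn  # assumed channel concatenation
```

For two inputs of shape (C, H, W), the sketch returns a tensor with 2C channels and the per-channel branch weights, which sum to 1 across the two branches.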
10. The method for training the deep residual super-resolution network model and the U-Net network model according to claim 2, wherein the method for calculating the model loss function value comprises:
calculating global similarity loss:
$L_2(G) = E_{X \sim Y}\left[\lVert Y - G(X) \rVert_2\right]$
in the above equation, X is a distorted image, Y is a real image, G is the generator network, and G(X) is the image generated by the generator; $E_{X \sim Y}$ denotes the expectation over paired distorted and real images, and $\lVert Y - G(X) \rVert_2$ is the distance between the real image and the image generated by the generator;
calculating the perceptual loss:
in the above formula, r, g and b respectively denote the differences in the normalized red, green and blue channel values between the image G(X) generated by the generator and the real image Y, and the remaining term is the average of the red channels;
calculating the content loss:
in the above formula, X and Y are the distorted image and the real image, respectively; the feature term denotes the feature maps extracted from the fourth and fifth convolutional layers of the pre-trained VGG-19 network, and the distance term is the distance between the feature map of the real image and the feature map of the distorted image;
calculating the total loss:
$L_g(G) = \lambda_c L_C(G) + \lambda_p L_P(G) + \lambda_2 L_2(G)$
in the above formula, $\lambda_c$, $\lambda_p$ and $\lambda_2$ are the weights of the content loss, the perceptual loss and the global similarity loss, respectively.
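The global similarity loss and the weighted total loss of claim 10 can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the applicant's code: the function names are hypothetical, the expectation is approximated by a mean over a batch of paired images, and the content and perceptual loss values are passed in as precomputed numbers because their formulas are not reproduced in this text. No particular values for the weights are implied.

```python
import numpy as np

def global_similarity_loss(y, g_x):
    """Approximate L_2(G) = E[||Y - G(X)||_2] by a batch mean.
    y, g_x: arrays of shape (N, C, H, W); the names are assumptions."""
    diff = (y - g_x).reshape(y.shape[0], -1)      # flatten each image
    return float(np.linalg.norm(diff, axis=1).mean())  # mean of per-image L2 norms

def total_loss(l_c, l_p, l_2, lam_c, lam_p, lam_2):
    """L_g(G) = lam_c * L_C(G) + lam_p * L_P(G) + lam_2 * L_2(G)."""
    return lam_c * l_c + lam_p * l_p + lam_2 * l_2
```

A caller would compute the three loss terms for a batch and combine them with `total_loss`, choosing the three weights as training hyperparameters.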
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210733444.0A CN115222592A (en) | 2022-06-27 | 2022-06-27 | Underwater image enhancement method based on super-resolution network and U-Net network and training method of network model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115222592A (en) | 2022-10-21 |
Family
ID=83609803
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210733444.0A Pending CN115222592A (en) | 2022-06-27 | 2022-06-27 | Underwater image enhancement method based on super-resolution network and U-Net network and training method of network model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115222592A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116467946A (en) * | 2023-04-21 | 2023-07-21 | Nanjing University of Information Science and Technology | Deep learning-based mode prediction product downscaling method |
CN116467946B (en) * | 2023-04-21 | 2023-10-27 | Nanjing University of Information Science and Technology | Deep learning-based mode prediction product downscaling method |
CN116740650A (en) * | 2023-08-10 | 2023-09-12 | Qingdao Agricultural University | Crop breeding monitoring method and system based on deep learning |
CN116740650B (en) * | 2023-08-10 | 2023-10-20 | Qingdao Agricultural University | Crop breeding monitoring method and system based on deep learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111739078B (en) | Monocular unsupervised depth estimation method based on context attention mechanism | |
CN113658051B (en) | Image defogging method and system based on cyclic generation countermeasure network | |
CN110675321B (en) | Super-resolution image reconstruction method based on progressive depth residual error network | |
CN111784602B (en) | Method for generating countermeasure network for image restoration | |
CN113012172B (en) | AS-UNet-based medical image segmentation method and system | |
CN108022213A (en) | Video super-resolution algorithm for reconstructing based on generation confrontation network | |
CN111192200A (en) | Image super-resolution reconstruction method based on fusion attention mechanism residual error network | |
CN115222592A (en) | Underwater image enhancement method based on super-resolution network and U-Net network and training method of network model | |
CN110163246A (en) | The unsupervised depth estimation method of monocular light field image based on convolutional neural networks | |
CN109214989B (en) | Single image super resolution ratio reconstruction method based on Orientation Features prediction priori | |
CN106204447A (en) | The super resolution ratio reconstruction method with convolutional neural networks is divided based on total variance | |
CN111179167A (en) | Image super-resolution method based on multi-stage attention enhancement network | |
CN102915527A (en) | Face image super-resolution reconstruction method based on morphological component analysis | |
CN112837224A (en) | Super-resolution image reconstruction method based on convolutional neural network | |
CN111681166A (en) | Image super-resolution reconstruction method of stacked attention mechanism coding and decoding unit | |
CN111583285A (en) | Liver image semantic segmentation method based on edge attention strategy | |
CN113870124B (en) | Weak supervision-based double-network mutual excitation learning shadow removing method | |
CN111861886B (en) | Image super-resolution reconstruction method based on multi-scale feedback network | |
CN115578255B (en) | Super-resolution reconstruction method based on inter-frame sub-pixel block matching | |
CN115565056A (en) | Underwater image enhancement method and system based on condition generation countermeasure network | |
CN114842216A (en) | Indoor RGB-D image semantic segmentation method based on wavelet transformation | |
CN111383200A (en) | CFA image demosaicing method based on generative antagonistic neural network | |
CN114170286A (en) | Monocular depth estimation method based on unsupervised depth learning | |
CN115880158A (en) | Blind image super-resolution reconstruction method and system based on variational self-coding | |
CN116739899A (en) | Image super-resolution reconstruction method based on SAUGAN network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||