CN115565056A - Underwater image enhancement method and system based on conditional generative adversarial network - Google Patents

Underwater image enhancement method and system based on conditional generative adversarial network

Info

Publication number
CN115565056A
CN115565056A (application CN202211179797.7A)
Authority
CN
China
Prior art keywords
image
layer
underwater
global
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211179797.7A
Other languages
Chinese (zh)
Inventor
李振波 (Li Zhenbo)
李一鸣 (Li Yiming)
李飞 (Li Fei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Agricultural University
Priority to CN202211179797.7A
Publication of CN115565056A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/05 Underwater scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/32 Normalisation of the pattern dimensions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/42 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A 40/80 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in fisheries management
    • Y02A 40/81 Aquaculture, e.g. of fish

Abstract

The invention provides an underwater image enhancement method and system based on a conditional generative adversarial network. The method corrects the color of degraded underwater images by extracting and fusing multi-scale local features and global features, improves feature extraction by constructing an attention module (AMU) tailored to underwater image enhancement, and improves the quality of the generated images while suppressing noise by introducing perceptual loss and total variation loss during training. The method can provide clear underwater scene information for high-level vision tasks in smart aquaculture, such as behavior monitoring and disease identification, and promote the healthy, sustainable development of intensive smart aquaculture.

Description

Underwater image enhancement method and system based on conditional generative adversarial network
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an underwater image enhancement method and system based on a conditional generative adversarial network.
Background
By 2021, the world aquaculture industry had reached a scale of $2094.2 billion. With its rapid development, tasks such as fish-school behavior monitoring and fish disease identification are expanding, so clear underwater images are needed to provide highly usable image resources for these high-level vision tasks. Related studies have shown that, compared with the original images, enhanced images improve key-point matching, object detection, object tracking, and similar tasks. In smart aquaculture, vision tasks such as underwater organism monitoring and underwater fish tracking likewise require clear underwater imagery. However, unlike the atmospheric environment, the water body absorbs and scatters light, and suspended particles are present in the water; these effects cause degradation such as color cast and blurring in underwater images and hinder related underwater work.
Degraded underwater images are difficult to apply directly to smart-aquaculture tasks and challenge traditional image processing techniques, so researchers have gradually developed underwater image enhancement methods. Traditional approaches rely on fixed parameters and physical models, enhancing a degraded image by adjusting its pixel values; they handle only a single type of environment and cannot adapt to the many complex underwater conditions. Convolutional neural networks (CNNs), owing to their strong performance, have found widespread use in computer vision, and researchers have introduced them into underwater image enhancement: the CNN-based framework UIE-Net performs color correction, and the underwater residual convolutional neural network URCNN combines a CNN with a residual learning strategy. Since the emergence of generative adversarial networks, which are widely applied to image processing, text generation, and audio/video generation, their ability to produce realistic data in an adversarial manner has compensated for the lack of pre-degradation images in underwater datasets. WaterGAN was proposed to generate paired underwater image datasets and perform color correction; by combining a cycle-consistent generative adversarial network (CycleGAN) with the dark channel prior algorithm, other researchers proposed an underwater image restoration method based on a multi-scale CycleGAN (MCycleGAN); and FUnIE-GAN, a new underwater image enhancement model based on a conditional generative adversarial network (CGAN), was proposed together with the EUVP dataset of paired and unpaired underwater images. These learning-based methods are trained on large amounts of data to accommodate varied underwater environments. Improving the clarity of underwater images therefore remains an urgent problem for smart aquaculture.
Disclosure of Invention
In order to solve the above problems, the invention provides an underwater image enhancement method and system based on a conditional generative adversarial network, which perform color correction on degraded underwater images so as to provide a clear visual environment for subsequent vision tasks.
On one hand, the invention provides an underwater image enhancement method based on a conditional generative adversarial network, comprising the following steps:
Step 1: acquiring paired sets of degraded underwater images and the corresponding clean images, and dividing them into a training set and a test set;
Step 2: scaling all images to the same size;
Step 3: model construction, comprising: extracting global and local features of the image with an encoder-decoder structure; fusing the global features with the local features at each scale; restoring the image by upsampling the global features layer by layer, each upsampling layer being connected with the fused features of the corresponding scale; and feeding the generated image to a discriminator network, which judges whether the image comes from real data and drives the generator network to adjust;
Step 4: training and testing the model, and saving the tested model;
Step 5: processing actual underwater images with the tested model.
Preferably, the encoder-decoder structure is a modified U-Net network comprising 8 downsampling layers, which extracts global and local features of the input image by layer-by-layer convolution.
More preferably, each downsampling layer consists of a LeakyReLU layer, a two-dimensional convolution layer, and a batch normalization layer.
Preferably, the attention module for underwater image enhancement is constructed from the SENet and NAM modules in the downsampling process, by replacing the global average pooling module of SENet with the batch normalization scale factors of the NAM module.
More preferably, in the attention module, the input feature map is processed by a batch normalization layer and a 1 × 1 convolution, multiplied by the weight coefficients, passed through a ReLU activation function, a 1 × 1 convolution layer, and a sigmoid activation function, and finally joined to the input feature map by a skip connection.
Preferably, global and local feature fusion is performed before the layer-by-layer upsampling result is skip-connected with the downsampling result of the same resolution, the fusion process being:

Step 4-1: a convolution layer with kernel size 1 × 1 and stride 1 adjusts the channel number c_g of the global feature f_g to the channel number c_i of the local feature map f_l^i at scale i:

f_{g1} = F_{conv}(f_g, W)

where F_{conv} denotes the convolution operation and W is a learnable weight;

Step 4-2: f_{g1} is copied h_i × w_i times, where h_i and w_i are the height and width of the local feature map f_l^i at scale i:

f_{g2} = F_{copy}(f_{g1}, num = h_i × w_i)

Step 4-3: f_{g2} is reshaped to the same dimensions h_i × w_i × c_i as f_l^i:

f_{g3} = F_{re}(f_{g2}, size = h_i × w_i × c_i)

where F_{re} denotes the reshaping operation;

Step 4-4: f_{g3} and f_l^i are concatenated:

f_{out} = F_{concat}(f_l^i, f_{g3}).
Preferably, the image restoration is performed with a modified U-Net network comprising 8 upsampling layers, each corresponding to a downsampling layer.
More preferably, each of the upsampling layers includes a ReLU layer, a bilinear upsampling layer, a convolution layer, and a batch normalization layer.
Preferably, the overall objective function of the model training loss is:

L = L_{WGAN-GP} + λ_1 L_1 + λ_2 L_p + λ_3 L_{TV}

where L_{WGAN-GP}, L_1, L_p, and L_{TV} are all loss functions and λ_1 = 10^{-1}, λ_2 = 10^{-2}, λ_3 = 10^{-3}; the WGAN-GP loss is

L_{WGAN-GP} = E[D(G(x))] − E[D(gt)] + λ E_{x̂}[(‖∇_{x̂} D(x̂)‖_2 − 1)^2]

where x is the degraded underwater image, gt is the real underwater image with good detail, x̂ is sampled uniformly between the generated image G(x) and the real image gt, and λ = 10.
In another aspect, the invention provides an underwater image enhancement system based on a conditional generative adversarial network, comprising:
a data set construction module for acquiring paired sets of degraded underwater images and the corresponding clean images and dividing them into a training set and a test set;
an image processing module for scaling all images to the same size;
a model building module, comprising: extracting global and local features of the image with an encoder-decoder structure; fusing the global features with the local features at each scale; restoring the image by upsampling the global features layer by layer, each upsampling layer being connected with the fused features of the corresponding scale; and feeding the generated image to a discriminator network, which judges whether the image comes from real data and drives the generator network to adjust;
a model training and testing module for inputting images into the model for training and testing and saving the tested model; and
a model application module for processing actual underwater images with the tested model.
The beneficial effects of the invention are as follows. Aiming at the degradation of underwater images, the invention provides an underwater image enhancement method and system based on a conditional generative adversarial network. An attention module for underwater image enhancement (AMU) is constructed at the end of the feature extraction network, improving the feature extraction effect; the difference between trained model weights is used to highlight key features, and a weight sparsity penalty is applied to the attention module, improving computational efficiency; and perceptual loss and total variation loss are introduced, so that the generated image carries high-level semantic information similar to a real image, strengthening the generator network's image generation and suppressing image noise.
Drawings
FIG. 1 is a diagram of a prior-art GAN model architecture;
FIG. 2 is a flowchart of the underwater image enhancement method based on a conditional generative adversarial network according to an embodiment of the invention;
FIG. 3 is a diagram of the SE module embedded in ResNet;
FIG. 4 is a diagram of the channel attention submodule;
FIG. 5 is a diagram of the spatial attention submodule;
FIG. 6 is a block diagram of the SE module and the AMU module;
FIG. 7 is a visual comparison of images enhanced with the method of the invention on the UGAN dataset.
Detailed Description
The embodiments are described in detail below with reference to the accompanying drawings.
Owing to the adversarial training scheme of generative adversarial networks (GANs), GANs perform well in fields such as text generation and image processing. A GAN contains two models, a generator and a discriminator. During training, the generator receives random noise z and produces an instance similar to the real data, denoted G(z), in an attempt to fool the discriminator. The discriminator judges whether an instance is forged by the generator or comes from the real data: its input is an instance x, and its output D(x) is the probability that x is real data. The two sides are optimized alternately over successive iterations until they reach equilibrium, i.e., the generator produces instances with good detail and the discriminator can hardly tell its output from real data. The overall flow of a GAN is shown in FIG. 1.
The objective function of the GAN model is:

min_G max_D V(D, G) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]

Here max_D V(D, G) means that, with the generator fixed, the parameters of the discriminator D are updated by maximizing the cross-entropy loss V(D, G); min_G means that the generator minimizes this cross-entropy loss given a discriminator that maximizes it over real and fake instances. During training, the discriminator's parameters are generally updated first, because at the start of training the discriminator performs poorly and cannot push the generator to produce higher-quality instances.
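The alternating update can be sketched in PyTorch as follows (an illustrative sketch only: the module names, the non-saturating generator objective, and the use of BCEWithLogitsLoss are assumptions, not details from the patent):

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def gan_step(G, D, opt_G, opt_D, real, z):
    """One alternating GAN update: discriminator first, then generator."""
    # Discriminator: push D(real) toward 1 and D(G(z)) toward 0.
    opt_D.zero_grad()
    fake = G(z).detach()                  # stop gradients flowing into G
    d_real, d_fake = D(real), D(fake)
    loss_D = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    loss_D.backward()
    opt_D.step()
    # Generator: push D(G(z)) toward 1 (non-saturating form of min_G).
    opt_G.zero_grad()
    d_gen = D(G(z))
    loss_G = bce(d_gen, torch.ones_like(d_gen))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```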
Compared with a traditional generative adversarial network, the underwater image enhancement method of the invention, which is based on a conditional generative adversarial network, introduces condition information into the network input, making the generation result of the whole network more stable and controllable. FIG. 2 is its flowchart. The method comprises the following steps:
step 1: data set construction
The UGAN dataset, generated by learning the mapping between degraded and clean images with CycleGAN, is selected as the training and test data for the method. In this embodiment the dataset contains 6128 image pairs: 6000 pairs are selected as the training set and the remaining 128 pairs as the test set.
Step 2: image pre-processing
Image preprocessing mainly unifies the image size, scaling all images to a consistent resolution. In this embodiment of the invention, all images are scaled to 256 × 256.
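As a minimal illustration of this step (the bilinear interpolation default is an assumption; the patent only specifies the target size):

```python
import torchvision.transforms as T

# Scale every image to 256 x 256 and convert it to a tensor in [0, 1].
preprocess = T.Compose([
    T.Resize((256, 256)),
    T.ToTensor(),
])
```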
Step 3: Multi-scale feature extraction
The global feature map generally contains overall information of the image, such as color, texture, and shape, and can strengthen the model's perception of the scene. Following the classical U-Net, the invention extracts global and local features with an encoder-decoder structure. In the global and local feature extraction network, the 4 downsampling layers of the original U-Net are expanded to 8, so that local features are extracted at more scales and the semantic information of the global feature map is enriched. In addition, unlike the max-pooling downsampling of U-Net, global and local features are extracted from the input image by layer-by-layer convolution: each downsampling layer consists of a LeakyReLU layer, a two-dimensional convolution layer (kernel size 4, stride 2), and a batch normalization layer, and the final output size is 1 × 1 × c_g, where c_g is the number of channels. This downsampling scheme improves local feature extraction, giving the generated image more detail.
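A sketch of such a downsampling stack in PyTorch (the channel widths and the LeakyReLU slope of 0.2 are assumptions; the patent specifies only the layer types, kernel size 4, stride 2, and the 8-layer depth):

```python
import torch.nn as nn

def down_block(c_in, c_out):
    # LeakyReLU -> 4x4 stride-2 convolution -> batch normalization.
    return nn.Sequential(
        nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
    )

class Encoder(nn.Module):
    """8 stride-2 layers: a 256x256 input shrinks to a 1x1 global feature."""
    def __init__(self, channels=(3, 64, 128, 256, 512, 512, 512, 512, 512)):
        super().__init__()
        self.blocks = nn.ModuleList(
            down_block(channels[i], channels[i + 1]) for i in range(8)
        )

    def forward(self, x):
        feats = []                       # multi-scale local feature maps
        for blk in self.blocks:
            x = blk(x)
            feats.append(x)
        return feats[-1], feats[:-1]     # 1x1 global feature + local features
```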
As the number of downsampling layers increases, the number of features grows. To make the network focus on the key features of the image, an attention module for underwater image enhancement (AMU) is constructed from the SENet and NAM modules in the downsampling process; the module attends to both detail information and context information, improving the feature extraction effect.
The SENet model can be conveniently embedded in other network structures. SENet focuses on relations along the channel dimension and comprises a Squeeze operation and an Excitation operation. In the Squeeze operation, the model uses global average pooling to encode the features of the whole spatial extent of each channel into a global feature map. In the Excitation operation, SENet learns a weight coefficient for each channel, strengthening the model's ability to discriminate between channel features. In related experiments, embedding the SE module into other networks such as ResNet and VGG-16 clearly improves their error metrics; the SE module embedded in ResNet is shown in FIG. 3.
The normalization-based attention module (NAM) suppresses insignificant feature weights: by imposing a sparsity penalty on the attention weights, it maintains network performance while improving the efficiency of the weight computation. NAM is built on the CBAM module and redesigns its channel and spatial attention submodules; in a residual network it is embedded at the end of the residual structure. In the channel attention submodule, NAM uses the scale factors of batch normalization:

B_out = BN(B_in) = γ · (B_in − μ_B) / sqrt(σ_B^2 + ε) + β

where γ and β are trainable transformation parameters and μ_B and σ_B are the mean and standard deviation of the mini-batch B. The channel attention submodule is shown in FIG. 4, where M_c denotes the output, γ is the scale factor of each channel, and ω is the weight of each channel.
The normalized scale factor is also applied in the spatial attention submodule, where it is called pixel normalization. The spatial attention submodule is shown in FIG. 5, where M_s denotes the output and λ is the scale factor.
In the invention, the global average pooling module of SENet is replaced by the batch normalization scale factors of the NAM module to better suppress non-salient features. The structure of the AMU module is shown in FIG. 6: the input feature map is processed by a batch normalization layer and a 1 × 1 convolution, multiplied by the weight coefficients, passed through a ReLU activation function, a 1 × 1 convolution layer, and a sigmoid activation function, and finally joined to the input feature map by a skip connection.
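One possible reading of this description in PyTorch (a hypothetical reconstruction: the interpretation of "the weight coefficients" as NAM-style normalized BN scale factors, the channel-preserving 1 × 1 convolutions, and the additive form of the skip connection are all assumptions):

```python
import torch
import torch.nn as nn

class AMU(nn.Module):
    """Sketch of the AMU attention module: BN -> 1x1 conv, scaled by
    BN-derived channel weights, then ReLU -> 1x1 conv -> sigmoid, with a
    skip connection back to the input feature map."""
    def __init__(self, channels):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        y = self.conv1(self.bn(x))
        # NAM-style weights: each channel's BN scale factor gamma,
        # normalized over all channels.
        gamma = self.bn.weight.abs()
        w = (gamma / gamma.sum()).view(1, -1, 1, 1)
        y = torch.relu(y * w)
        att = torch.sigmoid(self.conv2(y))
        return x + x * att   # skip connection with the input (assumed form)
```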
Step 4: Global and local feature fusion
To let the global feature map, with its high-level semantic information, improve the processing of low-resolution images and the color and detail of the enhanced image, a global and local feature fusion module is constructed before the layer-by-layer upsampling result is skip-connected with the downsampling result of the same resolution; this suppresses artifacts in the enhanced image. The module proceeds as follows:
first, the global feature map f is formed by a convolution layer with a convolution kernel size of 1 × 1 and a step size of 1 g C number of channels g Local feature map f adjusted to correspond to scale i l The same number of channels c i The step is represented as f g1 =F conv (f g ,W)
Wherein, F conv Representing a convolution operation, W is a learnable weight.
Then, for f g1 Making a copy with the number h i ×w i Wherein h is i And w i Local feature map f of scale i l Length and width of, the operation being represented as
f g2 =F copy (f g1 ,num=h i ×w i )
Then, f is mixed g2 Remodelling with f l Same dimension h i ×w i ×c i
f g3 =F re (f g2 ,size=h i ×w i ×c i )
Wherein, F re A remolding operation is shown.
Finally, f is g3 And f l Performing a connecting operation
f out =F concat (f l ,f g3 )
At this point the global feature map has completed the convolution, copy, reshaping, and concatenation steps.
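In PyTorch the four steps collapse into a 1 × 1 convolution, a broadcasted expansion (which performs the copy and reshape at once), and a concatenation (a sketch; the example channel counts are assumptions):

```python
import torch
import torch.nn as nn

def fuse_global_local(f_g, f_l, proj):
    """Fuse a 1x1 global feature with a local feature map at one scale.
    f_g: (B, c_g, 1, 1); f_l: (B, c_i, h_i, w_i);
    proj: the 1x1 convolution carrying the learnable weight W."""
    b, c_i, h, w = f_l.shape
    f_g1 = proj(f_g)                       # step 4-1: channel adjustment
    f_g3 = f_g1.expand(b, c_i, h, w)       # steps 4-2/4-3: copy and reshape
    return torch.cat([f_l, f_g3], dim=1)   # step 4-4: concatenation

# usage sketch with assumed channel counts c_g = 512, c_i = 256
proj = nn.Conv2d(512, 256, kernel_size=1, stride=1)
f_out = fuse_global_local(torch.randn(2, 512, 1, 1),
                          torch.randn(2, 256, 32, 32), proj)
```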
And 5: feature upsampling
The global feature map is restored to an image by layer-by-layer upsampling, and each upsampling layer is concatenated with the fused features of the same size to correct the color cast of the original image. The 4 upsampling layers of U-Net are expanded to 8, corresponding to the downsampling layers of the feature extraction stage. Each upsampling layer comprises a ReLU layer, a bilinear upsampling layer, a convolution layer (kernel size 4, stride 2), and a batch normalization layer; the output size is 256 × 256 × c_g, where c_g is the number of channels.
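A sketch of one such upsampling layer. Since a stride-2 convolution alone would undo a ×2 bilinear upsample, this sketch assumes a ×4 bilinear upsample followed by the stated 4 × 4 stride-2 convolution, giving the net ×2 per layer needed to go from 1 × 1 to 256 × 256 over 8 layers; other readings are possible:

```python
import torch.nn as nn

def up_block(c_in, c_out):
    # ReLU -> bilinear upsampling -> 4x4 stride-2 conv -> batch norm.
    return nn.Sequential(
        nn.ReLU(inplace=True),
        nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
    )
```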
Step 6: image discrimination
The generated image is sent to the PatchGAN discriminator network; the input size is 256 × 256 × c_g, where c_g is the number of channels. PatchGAN maps the input image to an N × N matrix in which each element is the discrimination value for a small region of the image. This kind of discrimination judges more of the image's details; only when all regions show good detail is the whole image judged real.
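A minimal PatchGAN discriminator sketch (the channel widths follow the common pix2pix setting, and instance normalization is chosen for compatibility with the gradient penalty used below; both are assumptions, not values from the patent):

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Maps an image to an N x N grid of per-patch real/fake scores."""
    def __init__(self, c_in=3, base=64):
        super().__init__()
        def block(ci, co, stride, norm=True):
            layers = [nn.Conv2d(ci, co, 4, stride, 1)]
            if norm:
                layers.append(nn.InstanceNorm2d(co))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers
        self.net = nn.Sequential(
            *block(c_in, base, 2, norm=False),
            *block(base, base * 2, 2),
            *block(base * 2, base * 4, 2),
            *block(base * 4, base * 8, 1),
            nn.Conv2d(base * 8, 1, 4, 1, 1),   # one score per patch
        )

    def forward(self, x):
        return self.net(x)                     # (B, 1, N, N) score matrix

# a 256 x 256 input yields roughly a 30 x 30 grid of patch scores
scores = PatchDiscriminator()(torch.randn(1, 3, 256, 256))
```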
And 7: model training and testing
A WGAN-GP loss function is introduced in the model training stage to stabilize training:

L_{WGAN-GP} = E[D(G(x))] − E[D(gt)] + λ E_{x̂}[(‖∇_{x̂} D(x̂)‖_2 − 1)^2]

where x is the degraded underwater image, gt is the real underwater image with good detail, x̂ is sampled uniformly between the generated image G(x) and the real image gt, and λ is the weight coefficient of the gradient penalty.
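The gradient penalty term can be sketched as follows (a standard WGAN-GP implementation; the patent states only the uniform sampling between G(x) and gt and the weight λ):

```python
import torch

def gradient_penalty(D, real, fake, lam=10.0):
    """Penalize the critic's gradient norm deviating from 1 on samples
    drawn uniformly on the line between real and generated images."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    d_hat = D(x_hat)
    grads = torch.autograd.grad(outputs=d_hat, inputs=x_hat,
                                grad_outputs=torch.ones_like(d_hat),
                                create_graph=True)[0]
    grad_norm = grads.flatten(1).norm(2, dim=1)
    return lam * ((grad_norm - 1) ** 2).mean()
```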
In addition, the L1 loss causes the generator to produce less blurring than the L2 loss; therefore, the invention introduces the L1 loss:

L_1 = E[‖gt − G(x)‖_1]
the invention introduces a perception loss function, and restricts the generated image on the depth characteristic layer surface, so that the generated image has high-level semantic information similar to a real image. The perception loss model is trained on the basis of a VGG-19 network, and weight distribution is carried out on the characteristic matching of each module, wherein the formula is as follows:
Figure BDA0003866241950000094
wherein the content of the first and second substances,
Figure BDA0003866241950000095
a jth convolution layer, J representing a reference image,
Figure BDA0003866241950000096
to an enhanced image.
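A sketch of such a VGG-19 feature-matching loss (the chosen layer indices, the equal per-module weights, and the L1 feature distance are assumptions, since the patent states only that VGG-19 features of several modules are matched with per-module weights; inputs are assumed already ImageNet-normalized, and the weights API requires torchvision ≥ 0.13):

```python
import torch.nn as nn
import torchvision.models as models

class PerceptualLoss(nn.Module):
    """Match VGG-19 features of the enhanced and reference images."""
    def __init__(self, layer_ids=(3, 8, 17, 26), weights=(1.0, 1.0, 1.0, 1.0)):
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()
        for p in vgg.parameters():
            p.requires_grad_(False)      # the loss network stays frozen
        self.vgg, self.layer_ids, self.weights = vgg, layer_ids, weights

    def forward(self, enhanced, reference):
        loss, x, y = 0.0, enhanced, reference
        for i, layer in enumerate(self.vgg):
            x, y = layer(x), layer(y)
            if i in self.layer_ids:
                w = self.weights[self.layer_ids.index(i)]
                loss = loss + w * nn.functional.l1_loss(x, y)
            if i >= max(self.layer_ids):
                break
        return loss
```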
To reduce the noise of the generated image and increase its smoothness, the invention introduces the classical total variation loss:

L_{TV} = E[‖∇_h G(x)‖ + ‖∇_v G(x)‖]

where ∇_h is the horizontal gradient operator and ∇_v the vertical gradient operator.
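A direct sketch of this loss (whether the gradients enter with absolute value or squared is not stated; absolute value is assumed here):

```python
def tv_loss(img):
    """Total variation: horizontal plus vertical gradient magnitudes,
    encouraging smoothness and suppressing noise in the generated image."""
    dh = (img[..., :, 1:] - img[..., :, :-1]).abs().mean()
    dv = (img[..., 1:, :] - img[..., :-1, :]).abs().mean()
    return dh + dv
```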
The overall objective function is:

L = L_{WGAN-GP} + λ_1 L_1 + λ_2 L_p + λ_3 L_{TV}

Before training, all images involved in training were scaled to 256 × 256. Model training was performed on an Intel(R) Xeon(R) E5-2630 v4 CPU and an NVIDIA GTX 1080 GPU, with the environment configured as PyTorch 1.5. The loss weights were set to λ = 10, λ_1 = 10^{-1}, λ_2 = 10^{-2}, λ_3 = 10^{-3}. An Adam optimizer replaced the traditional gradient descent optimization, with initial learning rate 1e-4, β_1 = 0.5, β_2 = 0.99, batch size 16, and 50 training iterations.
A comparison of evaluation metrics on the UGAN dataset is shown in Table 1.

TABLE 1. Evaluation metric comparison on the UGAN dataset

Method           PSNR↑     SSIM↑    UIQM↑    UCIQE↑
Fusion           18.2647   0.6437   2.7266   0.0625
IBLA             20.2019   0.6059   3.1725   0.0523
UDCP             18.6979   0.6171   3.5883   0.0415
ULAP             20.6336   0.6535   3.3515   0.0533
UGAN             23.3311   0.7497   2.8354   0.0392
FunieGAN         22.8422   0.7248   3.1934   0.0788
WaterNet         23.5637   0.7491   2.4786   0.0393
Style-Transfer   24.2179   0.7714   2.9364   0.0695
UWCNN            17.2855   0.6332   2.3561   0.0452
MLFcGAN          25.1974   0.7982   4.1145   0.0533
MA-cGAN          26.1698   0.8281   5.0935   0.0638
MA-cGAN shows clear advantages on the PSNR and SSIM metrics. Notably, the traditional methods generally score lower than the learning-based methods, which reflects the advantage of learning-based approaches. Among the no-reference metrics, MA-cGAN leads clearly on UIQM, meaning that images enhanced with the method of the invention reach a good level of color balance, sharpness, and contrast.
Because the real images lack matched clean counterparts, only no-reference metrics are selected to evaluate the quality of the enhanced results; the comparison is shown in Table 2.
TABLE 2. No-reference evaluation metric comparison on the real dataset

Method           UIQM↑    UICM↑    UISM↑    UIConM↑   UCIQE↑
Fusion           3.8687   3.2421   1.8985   0.0536    0.0465
IBLA             3.7646   3.6631   1.3435   0.0643    0.0476
UDCP             3.4876   3.1727   1.0694   0.0319    0.0297
ULAP             3.6588   3.6379   1.2364   0.0719    0.0488
UGAN             2.4739   2.5876   0.9506   0.0374    0.0314
FunieGAN         2.5422   3.1297   1.1004   0.0526    0.0469
WaterNet         2.7389   2.6592   1.0301   0.0584    0.0421
Style-Transfer   3.3795   3.0789   1.1373   0.0939    0.0513
UWCNN            2.5208   2.2156   1.1123   0.0417    0.0494
MLFcGAN          3.4831   3.0118   1.2366   0.0549    0.0536
MA-cGAN          4.0794   3.3511   1.1626   0.0517    0.0562
MA-cGAN performs well on the UIQM and UCIQE metrics, indicating that results produced with the method of the invention have good color density and sharpness. On some no-reference metrics the traditional methods outscore the learning-based ones; their results may have more saturated colors, but images with excessive saturation may be unusable for subsequent work such as object detection. The results also show that the method of the invention (MA-cGAN) can be applied to a variety of underwater environments.
FIG. 7 compares the enhanced images on the UGAN dataset. The results show that the learning-based methods achieve better effects than the traditional ones. The results of the traditional methods mostly exhibit over-saturation, e.g., UDCP, IBLA, and ULAP, while Fusion shows overexposure. Among the learning-based methods, the GAN-based approaches such as UGAN, FunieGAN, and Style-Transfer lose some texture information, and the CNN-based methods, including WaterNet and UWCNN, lack detail. Unlike these results, the effect of MLFcGAN appears more natural, and compared with MLFcGAN the results of the method of the invention are further improved in color saturation.
Step 8: Underwater image processing using the trained model.
In addition, the invention also provides an underwater image enhancement system based on a conditional generative adversarial network, comprising:
a data set construction module for acquiring paired sets of degraded underwater images and the corresponding clean images and dividing them into a training set and a test set;
an image processing module for scaling all images to the same size;
a global and local feature extraction module for extracting global and local features of the image with an encoder-decoder structure;
a global and local feature fusion module for fusing the global feature map with the local features at each scale;
a feature upsampling module for restoring the image by upsampling the global feature map layer by layer, each upsampling layer being connected with the fused features of the corresponding scale;
an image discrimination module for feeding the generated image to the discriminator network, which judges whether the image comes from real data and drives the generator network to adjust;
a model training and testing module for inputting images into the model for training and testing and saving the tested model; and
a model application module for processing underwater images with the tested model.
The present invention is not limited to the above embodiments; any changes or substitutions that can readily occur to those skilled in the art within the technical scope of the invention also fall within its scope. Therefore, the protection scope of the invention is defined by the claims.

Claims (10)

1. An underwater image enhancement method based on a conditional generative adversarial network, comprising the following steps:
Step 1: acquiring paired sets of degraded underwater images and the corresponding clean images, and dividing them into a training set and a test set;
Step 2: scaling all images to the same size;
Step 3: model construction, comprising: extracting global and local features of the image with an encoder-decoder structure; fusing the global features with the local features at each scale; restoring the image by upsampling the global features layer by layer, each upsampling layer being connected with the fused features of the corresponding scale; and feeding the generated image to a discriminator network, which judges whether the image comes from real data and drives the generator network to adjust;
Step 4: training and testing the model, and saving the tested model;
Step 5: processing actual underwater images with the tested model.
2. The underwater image enhancement method based on a conditional generative adversarial network of claim 1, wherein the encoder-decoder structure is a modified U-Net network comprising 8 downsampling layers, which extracts global and local features of the input image by layer-by-layer convolution.
3. The underwater image enhancement method based on a conditional generative adversarial network of claim 2, wherein each downsampling layer consists of a LeakyReLU layer, a two-dimensional convolution layer, and a batch normalization layer.
4. The underwater image enhancement method based on a conditional generative adversarial network of claim 2, wherein an attention module for underwater image enhancement is constructed from the SENet and NAM modules in the downsampling process by replacing the global average pooling module of SENet with the batch normalization scale factors of the NAM module.
5. The underwater image enhancement method based on a conditional generative adversarial network of claim 4, wherein, in the attention module, the input feature map is processed by a batch normalization layer and a 1 × 1 convolution, multiplied by the weight coefficients, passed through a ReLU activation function, a 1 × 1 convolution layer, and a sigmoid activation function, and finally joined to the input feature map by a skip connection.
6. The underwater image enhancement method based on a conditional generative adversarial network of claim 2, wherein global and local feature fusion is performed before the layer-by-layer upsampling result is skip-connected with the downsampling result of the same resolution, the fusion process being:

Step 4-1: a convolution layer with kernel size 1 × 1 and stride 1 adjusts the channel number c_g of the global feature f_g to the channel number c_i of the local feature map f_l^i at scale i:

f_{g1} = F_{conv}(f_g, W)

where F_{conv} denotes the convolution operation and W is a learnable weight;

Step 4-2: f_{g1} is copied h_i × w_i times, where h_i and w_i are the height and width of the local feature map f_l^i at scale i:

f_{g2} = F_{copy}(f_{g1}, num = h_i × w_i)

Step 4-3: f_{g2} is reshaped to the same dimensions h_i × w_i × c_i as f_l^i:

f_{g3} = F_{re}(f_{g2}, size = h_i × w_i × c_i)

where F_{re} denotes the reshaping operation;

Step 4-4: f_{g3} and f_l^i are concatenated:

f_{out} = F_{concat}(f_l^i, f_{g3}).
7. The underwater image enhancement method based on a conditional generative adversarial network of claim 2, wherein the image restoration is performed with a modified U-Net network comprising 8 upsampling layers, each corresponding to a downsampling layer.
8. The method of claim 7, wherein each upsampling layer comprises a ReLU layer, a bilinear upsampling layer, a convolution layer, and a batch normalization layer.
9. The underwater image enhancement method based on a conditional generative adversarial network of claim 1, wherein the overall objective function of the model training loss is:

L = L_{WGAN-GP} + λ_1 L_1 + λ_2 L_p + λ_3 L_{TV}

where L_{WGAN-GP}, L_1, L_p, and L_{TV} are all loss functions and λ_1 = 10^{-1}, λ_2 = 10^{-2}, λ_3 = 10^{-3}; the WGAN-GP loss is

L_{WGAN-GP} = E[D(G(x))] − E[D(gt)] + λ E_{x̂}[(‖∇_{x̂} D(x̂)‖_2 − 1)^2]

where x is the degraded underwater image, gt is the real underwater image with good detail, x̂ is sampled uniformly between the generated image G(x) and the real image gt, and λ = 10.
10. An underwater image enhancement system based on a conditional generative adversarial network, comprising:
a data set construction module for acquiring paired sets of degraded underwater images and the corresponding clean images and dividing them into a training set and a test set;
an image processing module for scaling all images to the same size;
a model building module, comprising: extracting global and local features of the image with an encoder-decoder structure; fusing the global features with the local features at each scale; restoring the image by upsampling the global features layer by layer, each upsampling layer being connected with the fused features of the corresponding scale; and feeding the generated image to a discriminator network, which judges whether the image comes from real data and drives the generator network to adjust;
a model training and testing module for inputting images into the model for training and testing and saving the tested model; and
a model application module for processing actual underwater images with the tested model.
CN202211179797.7A 2022-09-27 2022-09-27 Underwater image enhancement method and system based on conditional generative adversarial network Pending CN115565056A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211179797.7A CN115565056A (en) Underwater image enhancement method and system based on conditional generative adversarial network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211179797.7A CN115565056A (en) Underwater image enhancement method and system based on conditional generative adversarial network

Publications (1)

Publication Number Publication Date
CN115565056A true CN115565056A (en) 2023-01-03

Family

ID=84742138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211179797.7A Pending CN115565056A (en) 2022-09-27 2022-09-27 Underwater image enhancement method and system based on condition generation countermeasure network

Country Status (1)

Country Link
CN (1) CN115565056A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116029947A (en) * 2023-03-30 2023-04-28 之江实验室 Complex optical image enhancement method, device and medium for severe environment
CN116681627A (en) * 2023-08-03 2023-09-01 佛山科学技术学院 Cross-scale fusion self-adaptive underwater image generation countermeasure enhancement method
CN116681627B (en) * 2023-08-03 2023-11-24 佛山科学技术学院 Cross-scale fusion self-adaptive underwater image generation countermeasure enhancement method
CN117391975A (en) * 2023-12-13 2024-01-12 中国海洋大学 Efficient real-time underwater image enhancement method and model building method thereof
CN117391975B (en) * 2023-12-13 2024-02-13 中国海洋大学 Efficient real-time underwater image enhancement method and model building method thereof
CN117808712A (en) * 2024-02-28 2024-04-02 山东科技大学 Image correction method based on underwater camera

Similar Documents

Publication Publication Date Title
CN112001960B (en) Monocular image depth estimation method based on multi-scale residual error pyramid attention network model
CN115565056A (en) Underwater image enhancement method and system based on conditional generative adversarial network
CN112132959A (en) Digital rock core image processing method and device, computer equipment and storage medium
CN113392711B (en) Smoke semantic segmentation method and system based on high-level semantics and noise suppression
CN111583285A (en) Liver image semantic segmentation method based on edge attention strategy
CN109949200B (en) Filter subset selection and CNN-based steganalysis framework construction method
CN113256494B (en) Text image super-resolution method
CN115620010A (en) Semantic segmentation method for RGB-T bimodal feature fusion
CN115063318A (en) Adaptive frequency-resolved low-illumination image enhancement method and related equipment
CN116563693A (en) Underwater image color restoration method based on lightweight attention mechanism
CN116739899A (en) Image super-resolution reconstruction method based on SAUGAN network
CN112037225A (en) Marine ship image segmentation method based on convolutional nerves
CN115641391A (en) Infrared image colorizing method based on dense residual error and double-flow attention
CN115100165A (en) Colorectal cancer T staging method and system based on tumor region CT image
Zhang et al. Mffe: Multi-scale feature fusion enhanced net for image dehazing
CN116823659A (en) Low-light level image enhancement method based on depth feature extraction
CN111814693A (en) Marine ship identification method based on deep learning
CN116137043A (en) Infrared image colorization method based on convolution and transfomer
CN114663315B (en) Image bit enhancement method and device for generating countermeasure network based on semantic fusion
CN115660979A (en) Attention mechanism-based double-discriminator image restoration method
CN115205624A (en) Cross-dimension attention-convergence cloud and snow identification method and equipment and storage medium
Wu et al. Fish Target Detection in Underwater Blurred Scenes Based on Improved YOLOv5
CN116523985B (en) Structure and texture feature guided double-encoder image restoration method
CN117151990B (en) Image defogging method based on self-attention coding and decoding
CN117314751A (en) Remote sensing image super-resolution reconstruction method based on generation type countermeasure network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination