CN115713469A - Underwater image enhancement method for generating countermeasure network based on channel attention and deformation - Google Patents

Underwater image enhancement method for generating countermeasure network based on channel attention and deformation

Info

Publication number
CN115713469A
Authority
CN
China
Prior art keywords
module
convolution
deformation
channel attention
convolution kernel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211394443.4A
Other languages
Chinese (zh)
Inventor
王宁
陈廷凯
孔祥军
陈延政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Maritime University
Original Assignee
Dalian Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Maritime University filed Critical Dalian Maritime University
Priority to CN202211394443.4A
Publication of CN115713469A
Legal status: Pending

Landscapes

  • Image Processing (AREA)

Abstract

The invention relates to an underwater image enhancement method based on a channel attention and deformation generative adversarial network, which comprises the following steps: acquiring underwater images to construct a data set, and dividing the data set into a training set and a test set; constructing an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights; constructing a deformable convolution module, oriented to convolution-kernel offsets, for feature extraction; fusing the adaptive channel attention module and the deformable convolution module into a generative adversarial network; training the generative adversarial network on the training set data to obtain a trained generative adversarial network; and inputting the test set data into the trained generative adversarial network to obtain enhanced underwater images. By using a single-hidden-layer neural network and global average pooling, adaptive channel attention modules with different receptive fields are constructed, which helps reduce the influence of mixed noise on the feature layers and improves the enhancement consistency of objects of interest at different scene depths.

Description

Underwater image enhancement method for generating confrontation network based on channel attention and deformation
Technical Field
The invention belongs to the field of underwater intelligent fishing robots and relates to an underwater image enhancement method based on a channel attention and deformation generative adversarial network.
Background
The ocean holds precious resources for sustainable human development and is of strategic importance for high-quality development. Underwater optical images have the advantages of strong information-bearing capacity, low cost and rich content, and play a vital role in fields such as underwater archaeology, sunken-ship salvage, marine ranching and monitoring. It is noted that, owing to the inherent absorption and scattering of water, underwater optical images usually exhibit comprehensive degradation during imaging, such as uneven illumination, color cast, low contrast and blurred edge details, which seriously affects subsequent tasks such as detection, recognition and image segmentation.
With the rapid development of artificial intelligence, data-driven deep learning techniques have been widely applied to tasks such as underwater image denoising, detail recovery and super-resolution. It is noted that underwater image enhancement can be regarded as a domain transfer from degraded images to clear images. Currently, underwater image enhancement methods mainly include model-free enhancement methods, model-based restoration methods and data-driven domain mapping methods.
(1) Model-free enhancement method
To address color distortion and low contrast in a single color space, researchers have proposed contrast-limited adaptive histogram equalization (CLAHE) and white balance (WB) techniques. Using bilateral and trilateral filtering, researchers have designed a multi-scale Retinex framework for underwater image enhancement. In addition, by fusing four representative weight maps (Laplacian contrast, local contrast, saliency and exposure) with white balance and histogram equalization strategies, researchers have proposed the Fusion underwater image enhancement method.
(2) Model-based restoration method
By estimating the transmission map and background light, researchers proposed the Dark Channel Prior (DCP) based approach. By comprehensively considering the transition differences between in-air and underwater scenes, researchers built the Underwater Dark Channel Prior (UDCP) framework. By addressing the practical problem that pixel values in the blue channel are occasionally lower than those in the red channel, researchers proposed a two-channel based approach to underwater image enhancement.
(3) Domain mapping method based on data driving
By combining aerial images and depth images, researchers proposed the two-stage WaterGAN scheme for underwater image enhancement. Using a corrected underwater imaging model and scene parameters, researchers proposed the UWCNN method based on synthesized images. To remove the limitation of requiring paired underwater images, researchers proposed a weakly supervised learning approach with a cycle-consistency loss. To significantly improve training stability in underwater image enhancement, researchers devised a Wasserstein GAN method by computing the distance between the data distribution and the model distribution.
Existing underwater image enhancement methods for complex seabed environments mainly have the following shortcomings: (1) model-free enhancement methods struggle to handle diverse degradation types; (2) model-based restoration methods require the estimation of a large number of parameters and involve complex modeling; (3) data-driven domain mapping methods introduce substantial underwater mixed noise, incomplete foreground-object construction, and poor enhancement consistency for objects of interest at different scene depths.
Disclosure of Invention
In order to solve the above problems, the present invention provides the following technical solution: an underwater image enhancement method based on a channel attention and deformation generative adversarial network, comprising the following steps:
acquiring underwater images to construct a data set, and dividing the data set into a training set and a test set;
constructing an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights;
constructing a deformable convolution module, oriented to convolution-kernel offsets, for feature extraction;
fusing the adaptive channel attention module and the deformable convolution module into a generative adversarial network;
training the generative adversarial network on the training set data to obtain a trained generative adversarial network;
and inputting the test set data into the trained generative adversarial network to obtain enhanced underwater images.
Further: the process of constructing the adaptive channel attention module with multi-scale receptive fields for re-calibrating channel weights is as follows:
the convolution operation is represented as:
Figure BDA0003932851450000021
wherein the content of the first and second substances,
Figure BDA0003932851450000022
the ith layer representing the input features X,
Figure BDA0003932851450000024
represents the ith layer weight of the kth filter,
Figure BDA0003932851450000023
representing convolution operations, U k Representing the convolution output, c represents the number of feature layers;
the fusion operation is represented as:
Figure BDA0003932851450000031
wherein the content of the first and second substances,
Figure BDA0003932851450000032
and
Figure BDA0003932851450000033
respectively representing the characteristics obtained using the 3 x 3 and 5 x 5 filters,
Figure BDA0003932851450000034
is a fusion feature obtained by pixel-by-pixel addition;
the average response for each feature map is calculated using global average pooling, which is expressed as:
Figure BDA0003932851450000035
wherein the content of the first and second substances,
Figure BDA0003932851450000036
which is representative of the value of the response of the channel,
Figure BDA0003932851450000037
representing the fused features, w and h are the width and height of the feature map, respectively, and m and n are the abscissa and ordinate indices of the features, respectively;
the nonlinear mapping capability is improved by utilizing a single hidden layer neural network, which is expressed as:
z fc =R(W fc s) (4)
wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003932851450000038
the output of the hidden layer is represented by,
Figure BDA0003932851450000039
representing the weight between the input layer and the hidden layer, d = max (c/R, L) representing the number of nodes of the hidden layer, R representing the reduction rate of the channel dimension, L representing the minimum value of the number of channels, and R being a leak ReLU activation function;
further, a suitable spatial receptive field is adaptively selected, which is expressed as:
Figure BDA00039328514500000310
Figure BDA00039328514500000311
wherein the content of the first and second substances,
Figure BDA00039328514500000312
and
Figure BDA00039328514500000313
the weights between the hidden and output layers for the 3 x 3 and 5 x 5 branches respectively,
Figure BDA00039328514500000314
and
Figure BDA00039328514500000315
channel attention weights for the 3 × 3 and 5 × 5 branches, respectively;
the recalibrated signature V is expressed as:
Figure BDA00039328514500000316
wherein the content of the first and second substances,
Figure BDA00039328514500000317
it is the output characteristic of the entire adaptive channel attention module, which indicates an element-by-element product.
Further, the process of constructing the deformable convolution module, oriented to convolution-kernel offsets, for feature extraction is as follows:
The sampling positions of the standard convolution kernel are expressed as:
P = {(u, v) | u, v ∈ {−k, −(k−1), …, k}} (8)
wherein u and v respectively denote the abscissa and ordinate of the convolution-kernel sampling positions, k = ⌊w_f / 2⌋ denotes the edge position of the convolution kernel, and w_f denotes the current convolution-kernel width;
The sampling positions of the deformable convolution are expressed as:
p_m^d = p_0 + p_m + Δp_m, m = 1, 2, …, n (9)
wherein p_0 denotes the centre point of the convolution kernel, p_m = (x_m, y_m) is the standard sampling position of the m-th element in the convolution kernel, x_m and y_m respectively denote the abscissa and ordinate of that standard sampling position, m = 1, 2, …, n indexes the elements of the convolution kernel, n = |P| is the total number of elements in the convolution kernel, Δp_m denotes the offset of the m-th element, p_m^d = (x_m^d, y_m^d) is the deformed sampling position of the m-th element, and x_m^d and y_m^d respectively denote the abscissa and ordinate of the deformed position;
The deformed sampling positions of the standard convolution kernel should satisfy:
0 ≤ x_m^d ≤ w − 1 (10)
0 ≤ y_m^d ≤ h − 1 (11)
i.e. they must remain inside the feature map;
The feature value X(x_m^d, y_j) at the non-integer position (x_m^d, y_j) is computed along the X direction using bilinear interpolation, which is expressed as:
X(x_m^d, y_j) = (x_{i+1} − x_m^d)·X(p_tl) + (x_m^d − x_i)·X(p_tr) (12)
wherein p_tl = (x_i, y_j), p_tr = (x_{i+1}, y_j), p_bl = (x_i, y_{j+1}) and p_br = (x_{i+1}, y_{j+1}) respectively denote the upper-left, upper-right, lower-left and lower-right integer coordinate positions closest to the deformed position;
The feature value X(x_m^d, y_{j+1}) at the coordinate (x_m^d, y_{j+1}) is expressed as:
X(x_m^d, y_{j+1}) = (x_{i+1} − x_m^d)·X(p_bl) + (x_m^d − x_i)·X(p_br) (13)
The feature value X(p_m^d) at the position p_m^d is expressed as:
X(p_m^d) = (y_{j+1} − y_m^d)·X(x_m^d, y_j) + (y_m^d − y_j)·X(x_m^d, y_{j+1}) (14)
wherein X(p_m^d) is the feature value of the m-th element of the convolution kernel at its deformed position p_m^d;
The output of the entire deformable convolution module is expressed as:
O(x_0, y_0) = Σ_{m=1}^{n} w_m · σ(w_m) · X(p_m^d) (15)
wherein O(x_0, y_0) is the output of the deformable convolution operation, w_m denotes the weight of the m-th element, and σ(w_m) is the modulation operation used to emphasize the importance of the m-th offset position.
Further, the generative adversarial network comprises a generator and a discriminator connected in series;
the generator comprises an encoder and a decoder connected in series;
the encoder comprises an adaptive channel attention module and a deformable convolution module connected in series.
Further, the losses employed to train the generative adversarial network on the training set data include the WGAN-GP loss and an image gradient difference loss.
An underwater image enhancement device based on a channel attention and deformation generative adversarial network, comprising:
an acquisition module, configured to acquire underwater images to construct a data set and divide the data set into a training set and a test set;
a first construction module, configured to construct an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights;
a second construction module, configured to construct a deformable convolution module, oriented to convolution-kernel offsets, for feature extraction;
a fusion module, configured to fuse the adaptive channel attention module and the deformable convolution module into a generative adversarial network;
a training module, configured to train the generative adversarial network on the training set data to obtain a trained generative adversarial network; and
an obtaining module, configured to input the test set data into the trained generative adversarial network to obtain enhanced underwater images.
The underwater image enhancement method based on the channel attention and deformation generative adversarial network considers the speckle noise, Gaussian noise and impulse noise introduced during underwater imaging, and constructs adaptive channel attention modules with receptive fields of different scales by means of a single-hidden-layer neural network and global average pooling; considering that conventional stacked convolutional layers offer only very limited foreground-object encoding capability, the invention further provides a feature extraction method based on a deformable convolution network using a layer-by-layer convolution-kernel offset strategy. By combining the adaptive channel attention module and the deformable convolution module, the proposed method has the following beneficial effects:
(1) Using a single-hidden-layer neural network and global average pooling, adaptive channel attention modules with different receptive fields are constructed, which, on the one hand, reduces the influence of mixed noise (speckle, Gaussian and impulse noise) on the feature layers and, on the other hand, improves the enhancement consistency of objects of interest at different scene depths;
(2) A convolution-kernel offset method and an offset-position modulation mechanism are constructed, and a feature extraction strategy based on a deformable convolution network is provided, thereby enhancing the foreground-object encoding capability at the spatial level;
(3) Combining the adaptive channel attention module and the deformable convolution network with the L1 loss, the image gradient difference loss and the adversarial loss, a generative adversarial network framework is constructed, improving underwater image enhancement performance at both the channel and spatial levels.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a diagram of an adaptive channel attention module;
FIG. 2 is a diagram of a morphed convolution module;
FIG. 3 is a block diagram of an overall underwater image enhancement framework;
FIG. 4 is a chart of subjective evaluation results of UIEB data sets;
FIG. 5 is a diagram of subjective evaluation results of URPC data sets;
FIG. 6 is a graph of keypoint match comparison results.
Detailed Description
It should be noted that, in the case of conflict, the embodiments and features of the embodiments of the present invention may be combined with each other, and the present invention will be described in detail with reference to the accompanying drawings and embodiments.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
The relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise. Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description. Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate. Any specific values in all examples shown and discussed herein are to be construed as exemplary only and not as limiting. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
In the description of the present invention, it is to be understood that the orientation or positional relationship indicated by the directional terms such as "front, rear, upper, lower, left, right", "lateral, vertical, horizontal" and "top, bottom", etc., are generally based on the orientation or positional relationship shown in the drawings, and are used for convenience of description and simplicity of description only, and in the absence of any contrary indication, these directional terms are not intended to indicate and imply that the device or element so referred to must have a particular orientation or be constructed and operated in a particular orientation, and therefore should not be considered as limiting the scope of the present invention: the terms "inner and outer" refer to the inner and outer relative to the profile of the respective component itself.
For ease of description, spatially relative terms such as "over …", "upper surface", "above", and the like may be used herein to describe the spatial positional relationship of one device or feature to other devices or features as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if a device in the figures is turned over, devices described as "above" or "on" other devices or configurations would then be oriented "below" or "under" the other devices or configurations. Thus, the exemplary term "above …" may include both the orientation "above …" and the orientation "below …". The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
It should be noted that the terms "first", "second", and the like are used to define the components, and are only used for convenience of distinguishing the corresponding components, and the terms have no special meanings unless otherwise stated, and therefore, the scope of the present invention should not be construed as being limited.
An underwater image enhancement method based on a channel attention and deformation generative adversarial network comprises the following steps:
S1: acquiring underwater images to construct a data set, and dividing the data set into a training set and a test set;
S2: constructing an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights;
S3: constructing a deformable convolution module, oriented to convolution-kernel offsets, for feature extraction;
S4: fusing the adaptive channel attention module and the deformable convolution module into a generative adversarial network;
S5: training the generative adversarial network on the training set data to obtain a trained generative adversarial network;
S6: inputting the test set data into the trained generative adversarial network to obtain enhanced underwater images.
The steps S1, S2, S3, S4, S5 and S6 are executed in sequence.
further: the process of constructing an adaptive channel attention module with a multi-scale receptive field for re-scaling channel weights is as follows:
the adaptive channel attention module is mainly used for recalibrating the channel weight. Note that enhancing features at the channel level using convolution kernels of two different receptive fields can significantly improve visual quality. Thus, as shown in FIG. 1, the convolution operation can be expressed as:
Figure BDA0003932851450000081
wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003932851450000082
the i-th layer representing the input feature X,
Figure BDA0003932851450000083
represents the ith layer weight of the kth filter,
Figure BDA0003932851450000084
representing a convolution operation, U k Representing the convolution output and c representing the number of feature layers.
The fusion operation can then be expressed as:
Figure BDA0003932851450000085
wherein the content of the first and second substances,
Figure BDA0003932851450000086
and
Figure BDA0003932851450000087
respectively representing the characteristics obtained using the 3 x 3 and 5 x 5 filters.
Figure BDA0003932851450000088
Is a fused feature obtained using a pixel-by-pixel addition operation.
Then, the average response of each feature map is computed using global average pooling, which can be expressed as:
Figure BDA0003932851450000089
wherein the content of the first and second substances,
Figure BDA00039328514500000810
which is representative of the value of the response of the channel,
Figure BDA00039328514500000811
representing the fused features, w and h are the width and height of the feature map, respectively, and m and n are the abscissa and ordinate indices of the features, respectively.
In addition, a single hidden layer neural network is utilized to improve the nonlinear mapping capability, which can be expressed as:
z fc =R(W fc s) (4)
wherein the content of the first and second substances,
Figure BDA00039328514500000812
the output of the hidden layer is represented by,
Figure BDA00039328514500000813
represents the weight between the input layer and the hidden layer, d = max (c/R, L) represents the number of hidden layer nodes, R represents the rate of reduction of channel dimensions, L represents the minimum value of the number of channels, and R is the leak ReLU activation function.
Further, a suitable spatial receptive field is adaptively selected, which can be expressed as:
Figure BDA0003932851450000091
Figure BDA0003932851450000092
wherein the content of the first and second substances,
Figure BDA0003932851450000093
and
Figure BDA0003932851450000094
are respectively provided withAre the weights between the hidden and output layers for the 3 x 3 and 5 x 5 branches,
Figure BDA0003932851450000095
and
Figure BDA0003932851450000096
channel attention weights for the 3 x 3 and 5 x 5 branches, respectively.
Finally, the recalibrated signature V may be expressed as:
Figure BDA0003932851450000097
wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003932851450000098
it is the output characteristic of the entire adaptive channel attention module, which represents the element-by-element product.
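For illustration only, the adaptive channel attention module described by equations (1)-(7) can be sketched in PyTorch as follows; the class name, the reduction rate r and the minimum hidden width L are assumptions of this sketch and are not values fixed by the patent.

```python
# Illustrative PyTorch sketch of the adaptive channel attention module (Eqs. 1-7).
# Hyper-parameters (reduction rate r, minimum width L) are assumptions, not patent values.
import torch
import torch.nn as nn

class AdaptiveChannelAttention(nn.Module):
    def __init__(self, channels: int, r: int = 8, L: int = 32):
        super().__init__()
        d = max(channels // r, L)                      # hidden-layer width, d = max(c/r, L)
        # Two branches with different receptive fields, Eq. (1)
        self.branch3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, 5, padding=2)
        # Single-hidden-layer network, Eq. (4)
        self.fc = nn.Linear(channels, d)
        self.act = nn.LeakyReLU(0.1)
        # Per-branch output layers producing channel attention logits, Eqs. (5)-(6)
        self.fc3 = nn.Linear(d, channels)
        self.fc5 = nn.Linear(d, channels)

    def forward(self, x):
        u3 = self.branch3(x)                           # 3x3 branch features
        u5 = self.branch5(x)                           # 5x5 branch features
        u = u3 + u5                                    # pixel-wise fusion, Eq. (2)
        s = u.mean(dim=(2, 3))                         # global average pooling, Eq. (3)
        z = self.act(self.fc(s))                       # hidden-layer response, Eq. (4)
        logits = torch.stack([self.fc3(z), self.fc5(z)], dim=1)   # (N, 2, C)
        a = torch.softmax(logits, dim=1)               # soft attention over branches, Eqs. (5)-(6)
        a3 = a[:, 0].unsqueeze(-1).unsqueeze(-1)
        a5 = a[:, 1].unsqueeze(-1).unsqueeze(-1)
        return a3 * u3 + a5 * u5                       # recalibrated features, Eq. (7)
```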
Further: the process of constructing the deformation convolution module for feature extraction and convolution kernel offset orientation is as follows:
in order to enhance the integrity of foreground object construction, a feature extraction strategy based on a deformation convolution network is provided by constructing a convolution kernel migration method, which is beneficial to enhancing the encoding capability of the foreground object from a spatial level. As shown in fig. 2, the sample positions of the standard convolution kernel can be expressed as:
P={(u,v)|u,v∈{-k,-(k-1),…,k}} (8)
wherein u and v respectively represent the abscissa and ordinate of the sampling position of the convolution kernel,
Figure BDA0003932851450000099
representing the edge positions of the convolution kernel, w f Represents the current convolution kernel width;
accordingly, the sampling position of the morpho-convolution can be expressed as:
Figure BDA00039328514500000910
wherein p is 0 Representing the center point of the convolution kernel, p m =(x m ,y m ) Is the m-th element standard sample position, x, in the convolution kernel m And y m Respectively representing the abscissa and ordinate of the standard sampling position of the m-th element in the convolution kernel, m =1,2, \ 8230, n represents the index of the element in the convolution kernel, n = | P | is the total number of elements in the convolution kernel, and Δ P m Representing the magnitude of the offset of the mth element in the convolution kernel,
Figure BDA00039328514500000911
is the deformed sample position of the mth element in the convolution kernel,
Figure BDA00039328514500000912
and
Figure BDA00039328514500000913
respectively representing the abscissa and the ordinate of the deformation position of the mth element in the convolution kernel;
note that the deformed sampling positions should satisfy:
Figure BDA00039328514500000914
Figure BDA00039328514500000915
to account for eigenvalues at non-integer positions
Figure BDA0003932851450000101
To the problem of difficult direct acquisition, the present invention uses bilinear interpolation techniques to compute non-integer positions from the X-direction
Figure BDA0003932851450000102
Characteristic value of
Figure BDA0003932851450000103
It can be expressed as:
Figure BDA0003932851450000104
wherein p is tl =(x i ,y j ),p tr =(x i+1 ,y j ),p bl =(x i ,y j+1 ) And p br =(x i+1 ,y j+1 ) And respectively representing the integral coordinate positions of the upper left, the upper right, the lower left and the lower right closest to the deformation position.
Similarly, the coordinates are
Figure BDA0003932851450000105
Characteristic value of
Figure BDA0003932851450000106
Can be expressed as:
Figure BDA0003932851450000107
further, the position
Figure BDA0003932851450000108
Characteristic value of
Figure BDA0003932851450000109
Can be expressed as:
Figure BDA00039328514500001010
wherein the content of the first and second substances,
Figure BDA00039328514500001011
is that the mth element in the convolution kernel is at the deformation position
Figure BDA00039328514500001012
The characteristic value of (2).
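As an illustration of equations (12)-(14), the following minimal sketch samples a 2-D feature map at a non-integer deformed position by blending its four integer neighbours; the function name, the array layout and the unit grid spacing are assumptions of this sketch.

```python
# Minimal illustration of the bilinear interpolation in Eqs. (12)-(14): the feature value
# at a non-integer deformed position (xd, yd) is blended from its four integer neighbours.
import numpy as np

def bilinear_sample(feat: np.ndarray, xd: float, yd: float) -> float:
    """feat is a 2-D feature map indexed as feat[y, x]; assumes 0 <= xd < W-1 and 0 <= yd < H-1."""
    xi, yj = int(np.floor(xd)), int(np.floor(yd))
    # Eq. (12): interpolate along x on the row y_j
    top = (xi + 1 - xd) * feat[yj, xi] + (xd - xi) * feat[yj, xi + 1]
    # Eq. (13): interpolate along x on the row y_{j+1}
    bottom = (xi + 1 - xd) * feat[yj + 1, xi] + (xd - xi) * feat[yj + 1, xi + 1]
    # Eq. (14): interpolate along y between the two intermediate values
    return (yj + 1 - yd) * top + (yd - yj) * bottom
```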
Finally, the output of the entire deformable convolution module can be expressed as:
O(x_0, y_0) = Σ_{m=1}^{n} w_m · σ(w_m) · X(p_m^d) (15)
wherein O(x_0, y_0) is the output of the deformable convolution operation, w_m denotes the weight of the m-th element, and σ(w_m) is the modulation operation used to emphasize the importance of the m-th offset position.
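A hedged sketch of such a modulated deformable convolution block is given below, built on torchvision.ops.deform_conv2d, which performs the offset sampling and bilinear interpolation of equations (9) and (12)-(14) internally (the mask argument requires torchvision 0.9 or later). The class name, the initialization and the choice of predicting offsets and modulation logits with a single regular convolution are assumptions of this sketch, not details fixed by the patent.

```python
# Illustrative sketch of a modulated deformable convolution block (Eqs. 8-15)
# built on torchvision.ops.deform_conv2d; all layer names and sizes are assumptions.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformableConvBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.k = k
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, k, k))
        nn.init.kaiming_uniform_(self.weight, a=1)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        # One regular conv predicts, per output pixel, the offsets Δp_m (2 values per
        # kernel element) and a modulation logit used for σ(w_m) in Eq. (15).
        self.offset_mask = nn.Conv2d(in_ch, 3 * k * k, kernel_size=k, padding=k // 2)
        nn.init.zeros_(self.offset_mask.weight)
        nn.init.zeros_(self.offset_mask.bias)

    def forward(self, x):
        out = self.offset_mask(x)
        o1, o2, m = torch.chunk(out, 3, dim=1)
        offset = torch.cat([o1, o2], dim=1)            # Δp_m for every kernel element, Eq. (9)
        mask = torch.sigmoid(m)                        # σ(w_m), the modulation term of Eq. (15)
        # deform_conv2d samples X(p_m^d) with bilinear interpolation (Eqs. 12-14)
        return deform_conv2d(x, offset, self.weight, self.bias,
                             padding=self.k // 2, mask=mask)
```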
Further, as shown in FIG. 3, the generative adversarial network comprises a generator and a discriminator connected in series;
the generator comprises an encoder and a decoder connected in series;
wherein the adaptive channel attention module and the deformable convolution module are integrated in series into the encoder framework.
The losses employed to train the generative adversarial network on the training set data include the WGAN-GP loss and an image gradient difference loss.
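For illustration, a possible form of these training losses is sketched below: a WGAN-GP gradient penalty for the discriminator, and a generator objective combining the adversarial term with an L1 term (mentioned among the beneficial effects above) and an image gradient-difference loss. The loss weights and function names are assumptions of this sketch, not values specified by the patent.

```python
# Hedged sketch of the training losses: WGAN-GP gradient penalty, L1 loss and an
# image gradient-difference loss. Loss weights are assumptions, not patent values.
import torch
import torch.nn.functional as F

def gradient_penalty(discriminator, real, fake):
    """WGAN-GP penalty on straight-line interpolates between real and fake images."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    score = discriminator(mixed)
    grad, = torch.autograd.grad(score.sum(), mixed, create_graph=True)
    return ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

def gradient_difference_loss(pred, target):
    """Penalise differences between horizontal/vertical image gradients."""
    dx_p, dy_p = pred[..., :, 1:] - pred[..., :, :-1], pred[..., 1:, :] - pred[..., :-1, :]
    dx_t, dy_t = target[..., :, 1:] - target[..., :, :-1], target[..., 1:, :] - target[..., :-1, :]
    return (dx_p - dx_t).abs().mean() + (dy_p - dy_t).abs().mean()

def generator_loss(discriminator, fake, target, w_adv=1.0, w_l1=100.0, w_gdl=10.0):
    adv = -discriminator(fake).mean()                  # WGAN generator term
    return w_adv * adv + w_l1 * F.l1_loss(fake, target) + w_gdl * gradient_difference_loss(fake, target)
```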
An underwater image enhancement device based on a channel attention and deformation generative adversarial network, comprising:
an acquisition module, configured to acquire underwater images to construct a data set and divide the data set into a training set and a test set;
a first construction module, configured to construct an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights;
a second construction module, configured to construct a deformable convolution module, oriented to convolution-kernel offsets, for feature extraction;
a fusion module, configured to fuse the adaptive channel attention module and the deformable convolution module into a generative adversarial network;
a training module, configured to train the generative adversarial network on the training set data to obtain a trained generative adversarial network; and
an obtaining module, configured to input the test set data into the trained generative adversarial network to obtain enhanced underwater images.
In order to fully demonstrate the effectiveness and superiority of the proposed method, the underwater image enhancement method based on the channel attention and deformation generative adversarial network proposed in this application (abbreviated as ACADGAN) is compared with restoration methods based on physical models, model-free enhancement methods and image-to-image translation techniques.
The comparison scenes from the UIEB data set are shown in FIG. 4 and mainly include artificial light (first column), blue light (second column), green light (third column), dim light (fourth column) and violet light (fifth column). From FIG. 4 it is clear that the image restoration methods based on physical models struggle to obtain satisfactory visual performance. Specifically, the UDCP method makes the color cast even more noticeable in the above scenes, and the model-free underwater image enhancement method, i.e. the UCM method, tends to introduce a reddish effect. It is noted that the FUnIE-GAN and UWCNN methods achieve only very limited enhancement, and the FUnIE-GAN method causes significant color deviation in the violet-light case. Intuitively, the ACADGAN method proposed by the invention achieves the best visual enhancement quality. A consistent conclusion can also be drawn from FIG. 5, which shows the subjective evaluation results on the URPC data set.
To further demonstrate the superiority of the proposed ACADGAN method, the peak signal-to-noise ratio (PSNR), structural similarity (SSIM), underwater image colorfulness measure (UICM), sharpness measure (UISM) and contrast measure (UIConM) indices are compared on the UIEB and URPC data sets, and the comparison results are summarized in Table 1 and Table 2, respectively.
Table 1 shows the quantitative comparison of objective image quality on the UIEB data set (the table is reproduced as an image in the original publication).
Table 2 shows the quantitative comparison of objective image quality on the URPC data set (the table is reproduced as an image in the original publication).
From Tables 1 and 2 we can see that the proposed ACADGAN method achieves the best (bold) or second-best (underlined) performance on most metrics. In particular, from the optimal SSIM and UIConM indices obtained on the UIEB data set, we can conclude that the proposed ACADGAN method effectively preserves image structure, texture and contrast, and also preserves image content well. Furthermore, on the URPC data set the proposed ACADGAN method achieves the best UIQM score, which means the enhanced images are more consistent with human visual perception. More importantly, the proposed ACADGAN method obtains the best UCIQE score, which means the enhanced images achieve a better balance of chroma, saturation and contrast.
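For reference, the full-reference metrics in Tables 1-2 (PSNR and SSIM) can be computed with scikit-image as sketched below; the no-reference underwater metrics (UICM, UISM, UIConM, UIQM, UCIQE) require dedicated implementations that are not reproduced here, and the function name below is an assumption.

```python
# Full-reference metrics of the kind reported in Tables 1-2, via scikit-image.
# Note: the channel_axis argument requires scikit-image >= 0.19.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def full_reference_scores(enhanced: np.ndarray, reference: np.ndarray):
    """Both inputs are uint8 RGB arrays of identical shape."""
    psnr = peak_signal_noise_ratio(reference, enhanced, data_range=255)
    ssim = structural_similarity(reference, enhanced, channel_axis=-1, data_range=255)
    return psnr, ssim
```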
In order to verify the effectiveness and superiority of the proposed ACADGAN method from the perspective of basic feature expression, SIFT, Harris and Canny methods are used to extract keypoints, corner points and pixel-level edges, respectively. The keypoint matching results are shown in FIG. 6, from which we can clearly see that, because of the poor quality of the original underwater images, few keypoints can be correctly extracted and matched. Meanwhile, restoration or enhancement methods improve the keypoint matching performance, and the proposed ACADGAN method achieves the best keypoint matching performance, which means it recovers more of the essential features of degraded underwater images. The mean evaluation performance on the UIEB and URPC data sets is provided in Tables 3 and 4.
table 3 shows the UIEB data set characteristic expression comparison
Figure BDA0003932851450000121
Figure BDA0003932851450000131
TABLE 4 characterization comparison of URPC datasets
Figure BDA0003932851450000132
From these results we can clearly see that the proposed ACADGAN method achieves the best or second-best performance in extracting SIFT keypoints, Harris corner points and Canny edges, which indicates that the proposed ACADGAN method facilitates the extraction of essential features.
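As an illustration of how such counts can be obtained, the sketch below extracts SIFT keypoints, Harris corners and Canny edges with OpenCV; all thresholds and parameters are assumptions of this sketch (and SIFT additionally requires an OpenCV build that provides cv2.SIFT_create), not the settings used for Tables 3-4.

```python
# Illustrative feature-extraction counts of the kind reported in Tables 3-4, using OpenCV.
# Thresholds and parameters below are assumptions, not the values used in the patent.
import cv2
import numpy as np

def basic_feature_counts(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    sift_kp = cv2.SIFT_create().detect(gray, None)               # SIFT keypoints
    harris = cv2.cornerHarris(np.float32(gray), 2, 3, 0.04)      # Harris response map
    corners = int((harris > 0.01 * harris.max()).sum())          # thresholded corner count
    edges = cv2.Canny(gray, 100, 200)                            # Canny edge map
    return len(sift_kp), corners, int((edges > 0).sum())
```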
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (6)

1. An underwater image enhancement method based on a channel attention and deformation generative adversarial network, characterized by comprising the following steps:
acquiring underwater images to construct a data set, and dividing the data set into a training set and a test set;
constructing an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights;
constructing a deformable convolution module, oriented to convolution-kernel offsets, for feature extraction;
fusing the adaptive channel attention module and the deformable convolution module into a generative adversarial network;
training the generative adversarial network on the training set data to obtain a trained generative adversarial network;
and inputting the test set data into the trained generative adversarial network to obtain enhanced underwater images.
2. The underwater image enhancement method based on the channel attention and deformation generative adversarial network according to claim 1, characterized in that the process of constructing the adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights is as follows:
The convolution operation is expressed as:
U_k = W_k * X = Σ_{i=1}^{c} W_k^i * X^i (1)
wherein X^i denotes the i-th layer of the input feature X, W_k^i denotes the i-th layer weight of the k-th filter, * denotes the convolution operation, U_k denotes the convolution output, and c denotes the number of feature layers;
The fusion operation is expressed as:
U = U_{3×3} + U_{5×5} (2)
wherein U_{3×3} and U_{5×5} respectively denote the features obtained using the 3×3 and 5×5 filters, and U is the fused feature obtained by pixel-wise addition;
The average response of each feature map is calculated using global average pooling, which is expressed as:
s_k = (1/(w×h)) Σ_{m=1}^{w} Σ_{n=1}^{h} U^k(m, n) (3)
wherein s_k denotes the channel response value, U^k denotes the k-th layer of the fused feature, w and h are respectively the width and height of the feature map, and m and n are respectively the abscissa and ordinate indices of the feature;
The nonlinear mapping capability is improved by a single-hidden-layer neural network, which is expressed as:
z_fc = R(W_fc s) (4)
wherein z_fc ∈ ℝ^d denotes the output of the hidden layer, W_fc ∈ ℝ^{d×c} denotes the weights between the input layer and the hidden layer, d = max(c/r, L) denotes the number of hidden-layer nodes, r denotes the reduction rate of the channel dimension, L denotes the minimum number of channels, and R is the Leaky ReLU activation function;
Further, a suitable spatial receptive field is adaptively selected, which is expressed as:
a_{3×3} = e^{A z_fc} / (e^{A z_fc} + e^{B z_fc}) (5)
a_{5×5} = e^{B z_fc} / (e^{A z_fc} + e^{B z_fc}) (6)
wherein A and B respectively denote the weights between the hidden layer and the output layer for the 3×3 and 5×5 branches, and a_{3×3} and a_{5×5} respectively denote the channel attention weights of the 3×3 and 5×5 branches;
The recalibrated feature map V is expressed as:
V = a_{3×3} ⊙ U_{3×3} + a_{5×5} ⊙ U_{5×5} (7)
wherein V is the output feature of the entire adaptive channel attention module and ⊙ denotes the element-wise product.
3. The underwater image enhancement method based on the channel attention and deformation generative adversarial network according to claim 1, characterized in that the process of constructing the deformable convolution module, oriented to convolution-kernel offsets, for feature extraction is as follows:
The sampling positions of the standard convolution kernel are expressed as:
P = {(u, v) | u, v ∈ {−k, −(k−1), …, k}} (8)
wherein u and v respectively denote the abscissa and ordinate of the convolution-kernel sampling positions, k = ⌊w_f / 2⌋ denotes the edge position of the convolution kernel, and w_f denotes the current convolution-kernel width;
The sampling positions of the deformable convolution are expressed as:
p_m^d = p_0 + p_m + Δp_m, m = 1, 2, …, n (9)
wherein p_0 denotes the centre point of the convolution kernel, p_m = (x_m, y_m) is the standard sampling position of the m-th element in the convolution kernel, x_m and y_m respectively denote the abscissa and ordinate of that standard sampling position, m = 1, 2, …, n indexes the elements of the convolution kernel, n = |P| is the total number of elements in the convolution kernel, Δp_m denotes the offset of the m-th element, p_m^d = (x_m^d, y_m^d) is the deformed sampling position of the m-th element, and x_m^d and y_m^d respectively denote the abscissa and ordinate of the deformed position;
The deformed sampling positions of the standard convolution kernel should satisfy:
0 ≤ x_m^d ≤ w − 1 (10)
0 ≤ y_m^d ≤ h − 1 (11)
i.e. they must remain inside the feature map;
The feature value X(x_m^d, y_j) at the non-integer position (x_m^d, y_j) is computed along the X direction using bilinear interpolation, which is expressed as:
X(x_m^d, y_j) = (x_{i+1} − x_m^d)·X(p_tl) + (x_m^d − x_i)·X(p_tr) (12)
wherein p_tl = (x_i, y_j), p_tr = (x_{i+1}, y_j), p_bl = (x_i, y_{j+1}) and p_br = (x_{i+1}, y_{j+1}) respectively denote the upper-left, upper-right, lower-left and lower-right integer coordinate positions closest to the deformed position;
The feature value X(x_m^d, y_{j+1}) at the coordinate (x_m^d, y_{j+1}) is expressed as:
X(x_m^d, y_{j+1}) = (x_{i+1} − x_m^d)·X(p_bl) + (x_m^d − x_i)·X(p_br) (13)
The feature value X(p_m^d) at the position p_m^d is expressed as:
X(p_m^d) = (y_{j+1} − y_m^d)·X(x_m^d, y_j) + (y_m^d − y_j)·X(x_m^d, y_{j+1}) (14)
wherein X(p_m^d) is the feature value of the m-th element of the convolution kernel at its deformed position p_m^d;
The output of the entire deformable convolution module is expressed as:
O(x_0, y_0) = Σ_{m=1}^{n} w_m · σ(w_m) · X(p_m^d) (15)
wherein O(x_0, y_0) is the output of the deformable convolution operation, w_m denotes the weight of the m-th element, and σ(w_m) is the modulation operation used to emphasize the importance of the m-th offset position.
4. The underwater image enhancement method based on the channel attention and deformation generative adversarial network according to claim 1, characterized in that:
the generative adversarial network comprises a generator and a discriminator connected in series;
the generator comprises an encoder and a decoder connected in series;
the encoder comprises an adaptive channel attention module and a deformable convolution module connected in series.
5. The underwater image enhancement method based on the channel attention and deformation generative adversarial network according to claim 1, characterized in that the losses employed to train the generative adversarial network on the training set data include the WGAN-GP loss and an image gradient difference loss.
6. An underwater image enhancement device based on a channel attention and deformation generative adversarial network, characterized by comprising:
an acquisition module, configured to acquire underwater images to construct a data set and divide the data set into a training set and a test set;
a first construction module, configured to construct an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights;
a second construction module, configured to construct a deformable convolution module, oriented to convolution-kernel offsets, for feature extraction;
a fusion module, configured to fuse the adaptive channel attention module and the deformable convolution module into a generative adversarial network;
a training module, configured to train the generative adversarial network on the training set data to obtain a trained generative adversarial network; and
an obtaining module, configured to input the test set data into the trained generative adversarial network to obtain enhanced underwater images.
CN202211394443.4A 2022-11-08 2022-11-08 Underwater image enhancement method for generating countermeasure network based on channel attention and deformation Pending CN115713469A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211394443.4A CN115713469A (en) 2022-11-08 2022-11-08 Underwater image enhancement method for generating countermeasure network based on channel attention and deformation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211394443.4A CN115713469A (en) 2022-11-08 2022-11-08 Underwater image enhancement method for generating countermeasure network based on channel attention and deformation

Publications (1)

Publication Number Publication Date
CN115713469A true CN115713469A (en) 2023-02-24

Family

ID=85232509

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211394443.4A Pending CN115713469A (en) 2022-11-08 2022-11-08 Underwater image enhancement method for generating countermeasure network based on channel attention and deformation

Country Status (1)

Country Link
CN (1) CN115713469A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116563145A (en) * 2023-04-26 2023-08-08 北京交通大学 Underwater image enhancement method and system based on color feature fusion
CN116563145B (en) * 2023-04-26 2024-04-05 北京交通大学 Underwater image enhancement method and system based on color feature fusion
CN116579918A (en) * 2023-05-19 2023-08-11 哈尔滨工程大学 Attention mechanism multi-scale image conversion method based on style independent discriminator
CN116579918B (en) * 2023-05-19 2023-12-26 哈尔滨工程大学 Attention mechanism multi-scale image conversion method based on style independent discriminator

Similar Documents

Publication Publication Date Title
Li et al. Underwater image enhancement via medium transmission-guided multi-color space embedding
Wang et al. An experimental-based review of image enhancement and image restoration methods for underwater imaging
Tian et al. Deep learning on image denoising: An overview
Islam et al. Simultaneous enhancement and super-resolution of underwater imagery for improved visual perception
Qi et al. SGUIE-Net: Semantic attention guided underwater image enhancement with multi-scale perception
Wu et al. A two-stage underwater enhancement network based on structure decomposition and characteristics of underwater imaging
Liu et al. IPMGAN: Integrating physical model and generative adversarial network for underwater image enhancement
US20220189017A1 (en) Medical image processing method and apparatus, image processing method and apparatus, terminal and storage medium
CN111275637A (en) Non-uniform motion blurred image self-adaptive restoration method based on attention model
CN115713469A (en) Underwater image enhancement method for generating countermeasure network based on channel attention and deformation
CN109376611A (en) A kind of saliency detection method based on 3D convolutional neural networks
Shi et al. Low-light image enhancement algorithm based on retinex and generative adversarial network
CN108460742A (en) A kind of image recovery method based on BP neural network
CN113449691A (en) Human shape recognition system and method based on non-local attention mechanism
CN113284061B (en) Underwater image enhancement method based on gradient network
CN112633274A (en) Sonar image target detection method and device and electronic equipment
Sun et al. Underwater image enhancement with encoding-decoding deep CNN networks
Han et al. UIEGAN: Adversarial learning-based photorealistic image enhancement for intelligent underwater environment perception
CN115861094A (en) Lightweight GAN underwater image enhancement model fused with attention mechanism
CN115700731A (en) Underwater image enhancement method based on dual-channel convolutional neural network
Wang et al. Underwater color disparities: Cues for enhancing underwater images toward natural color consistencies
CN116934592A (en) Image stitching method, system, equipment and medium based on deep learning
CN115272072A (en) Underwater image super-resolution method based on multi-feature image fusion
CN114862707A (en) Multi-scale feature recovery image enhancement method and device and storage medium
Qiao et al. Adaptive deep learning network with multi-scale and multi-dimensional features for underwater image enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination