CN115713469A - Underwater image enhancement method based on channel attention and a deformable-convolution generative adversarial network - Google Patents
Underwater image enhancement method based on channel attention and a deformable-convolution generative adversarial network
- Publication number
- CN115713469A (application number CN202211394443.4A)
- Authority
- CN
- China
- Prior art keywords
- module
- convolution
- deformation
- channel attention
- convolution kernel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Processing (AREA)
Abstract
The invention relates to an underwater image enhancement method based on a channel-attention and deformable-convolution generative adversarial network, which comprises the following steps: acquiring underwater images to construct a data set, and dividing the data set into a training set and a test set; constructing an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights; constructing a deformable convolution module, based on convolution-kernel offsets, for feature extraction; fusing the adaptive channel attention module and the deformable convolution module into a generative adversarial network; training the generative adversarial network on the training-set data to obtain a trained generative adversarial network; and inputting the test-set data into the trained generative adversarial network to obtain enhanced underwater images. By means of a single-hidden-layer neural network and the global average pooling technique, adaptive channel attention modules with different receptive fields are constructed, which helps reduce the influence of mixed noise on the feature layers and improves the enhancement consistency of objects of interest at different scene depths.
Description
Technical Field
The invention belongs to the field of underwater intelligent fishing robots, and relates to an underwater image enhancement method based on a channel-attention and deformable-convolution generative adversarial network.
Background
The ocean holds precious wealth for the sustainable development of humankind and is also of strategic importance for high-quality development. Underwater optical images have the advantages of strong information-bearing capacity, low cost, and rich content, and play a vital role in fields such as underwater archaeology, sunken-ship salvage, marine ranching, and monitoring. Notably, owing to the inherent absorption and scattering effects of water, underwater optical images usually exhibit comprehensive degradation during imaging, such as uneven illumination, color cast, low contrast, and blurred edge details, which seriously affects subsequent tasks such as detection, identification, and image segmentation.
With the rapid development of artificial intelligence, data-driven deep learning techniques have been widely applied to tasks such as underwater image denoising, detail recovery, and super-resolution. Notably, underwater image enhancement can be regarded as a domain-switching operation from degraded to clear images. Current underwater image enhancement methods mainly fall into three categories: model-free enhancement methods, model-based restoration methods, and data-driven domain mapping methods.
(1) Model-free enhancement method
To address color distortion and low contrast in a single color space, researchers have proposed Contrast Limited Adaptive Histogram Equalization (CLAHE) and White Balance (WB) techniques. Using bilateral and trilateral filtering, researchers have designed a multi-scale Retinex framework to realize underwater image enhancement. In addition, by fusing four representative weight maps (Laplacian contrast, local contrast, saliency, and exposure) with white-balance and histogram-equalization strategies, researchers have proposed Fusion-based underwater image enhancement methods.
(2) Model-based restoration method
By estimating the transmission map and background light, researchers have proposed a Dark Channel Prior (DCP) based approach. By comprehensively considering the transmission differences between in-air and underwater scenes, researchers have built the Underwater Dark Channel Prior (UDCP) framework. To address the practical problem that pixel values in the blue channel are occasionally lower than those in the red channel, researchers have proposed a two-channel-based approach to underwater image enhancement.
(3) Domain mapping method based on data driving
By combining aerial images and depth images, researchers have proposed a two-stage WaterGAN scheme to achieve underwater image enhancement. Using a corrected underwater imaging model and scene parameters, researchers have proposed the UWCNN method based on synthesized images. To remove the limitation of requiring paired underwater images, researchers have proposed a weakly supervised learning approach with a cycle-consistency loss. To significantly improve training stability in underwater image enhancement, researchers have created a Wasserstein GAN method that measures the distance between the data distribution and the model distribution.
Existing underwater image enhancement methods for complex seabed environments mainly have the following shortcomings: (1) model-free enhancement methods have difficulty handling diverse degradation types; (2) model-based restoration methods need to estimate a large number of parameters and involve complex modeling; (3) data-driven domain mapping methods introduce a large amount of underwater mixed noise, construct foreground objects incompletely, and show poor enhancement consistency of objects of interest at different scene depths.
Disclosure of Invention
In order to solve the above problems, the present invention provides the following technical solution: an underwater image enhancement method based on a channel-attention and deformable-convolution generative adversarial network, comprising the following steps:
acquiring underwater images to construct a data set, and dividing the data set into a training set and a test set;
constructing an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights;
constructing a deformable convolution module, based on convolution-kernel offsets, for feature extraction;
fusing the adaptive channel attention module and the deformable convolution module into a generative adversarial network;
training the generative adversarial network on the training-set data to obtain a trained generative adversarial network;
and inputting the test-set data into the trained generative adversarial network to obtain enhanced underwater images.
Further: the process of constructing the adaptive channel attention module with multi-scale receptive fields for re-calibrating channel weights is as follows:
the convolution operation is represented as:
wherein the content of the first and second substances,the ith layer representing the input features X,represents the ith layer weight of the kth filter,representing convolution operations, U k Representing the convolution output, c represents the number of feature layers;
the fusion operation is represented as:
wherein the content of the first and second substances,andrespectively representing the characteristics obtained using the 3 x 3 and 5 x 5 filters,is a fusion feature obtained by pixel-by-pixel addition;
the average response for each feature map is calculated using global average pooling, which is expressed as:
wherein the content of the first and second substances,which is representative of the value of the response of the channel,representing the fused features, w and h are the width and height of the feature map, respectively, and m and n are the abscissa and ordinate indices of the features, respectively;
the nonlinear mapping capability is improved by utilizing a single hidden layer neural network, which is expressed as:
z fc =R(W fc s) (4)
wherein, the first and the second end of the pipe are connected with each other,the output of the hidden layer is represented by,representing the weight between the input layer and the hidden layer, d = max (c/R, L) representing the number of nodes of the hidden layer, R representing the reduction rate of the channel dimension, L representing the minimum value of the number of channels, and R being a leak ReLU activation function;
further, a suitable spatial receptive field is adaptively selected, which is expressed as:
wherein the content of the first and second substances,andthe weights between the hidden and output layers for the 3 x 3 and 5 x 5 branches respectively,andchannel attention weights for the 3 × 3 and 5 × 5 branches, respectively;
the recalibrated signature V is expressed as:
wherein the content of the first and second substances,it is the output characteristic of the entire adaptive channel attention module, which indicates an element-by-element product.
Further, the process of constructing the deformable convolution module, which performs feature extraction based on convolution-kernel offsets, is as follows:
the sampling positions of a standard convolution kernel are expressed as:
$$P = \{(u, v) \mid u, v \in \{-k, -(k-1), \ldots, k\}\} \qquad (8)$$
where $u$ and $v$ denote the abscissa and ordinate of a convolution-kernel sampling position, respectively, $k = (w_f - 1)/2$ denotes the edge position of the convolution kernel, and $w_f$ denotes the current convolution-kernel width;
the sampling positions of the deformable convolution are expressed as:
$$p'_m = p_0 + p_m + \Delta p_m = (x'_m, y'_m) \qquad (9)$$
where $p_0$ denotes the center point of the convolution kernel; $p_m = (x_m, y_m)$ is the standard sampling position of the $m$-th element in the convolution kernel, with $x_m$ and $y_m$ its abscissa and ordinate; $m = 1, 2, \ldots, n$ is the index of the element in the convolution kernel; $n = |P|$ is the total number of elements in the convolution kernel; $\Delta p_m$ denotes the offset of the $m$-th element; and $p'_m = (x'_m, y'_m)$ is the deformed sampling position of the $m$-th element, with $x'_m$ and $y'_m$ its abscissa and ordinate;
the deformed sampling positions of the standard convolution kernel should satisfy:
$$0 \le x'_m \le w - 1, \quad 0 \le y'_m \le h - 1 \qquad (10)$$
where $w$ and $h$ are the width and height of the feature map; the feature value at the non-integer position is computed using the bilinear interpolation technique, first along the X direction:
$$F(x'_m, y_j) = (x_{i+1} - x'_m) F(p_{tl}) + (x'_m - x_i) F(p_{tr}), \quad F(x'_m, y_{j+1}) = (x_{i+1} - x'_m) F(p_{bl}) + (x'_m - x_i) F(p_{br}) \qquad (11)$$
where $p_{tl} = (x_i, y_j)$, $p_{tr} = (x_{i+1}, y_j)$, $p_{bl} = (x_i, y_{j+1})$, and $p_{br} = (x_{i+1}, y_{j+1})$ denote the top-left, top-right, bottom-left, and bottom-right integer coordinate positions closest to the deformed position, respectively;
then along the Y direction:
$$F(p'_m) = (y_{j+1} - y'_m) F(x'_m, y_j) + (y'_m - y_j) F(x'_m, y_{j+1}) \qquad (12)$$
where $F(p'_m)$ is the feature value of the $m$-th element in the convolution kernel at the deformed position $p'_m$;
the output of the entire deformable convolution module is expressed as:
$$O(x_0, y_0) = \sum_{m=1}^{n} w_m \cdot F(p'_m) \cdot \sigma(\Delta m_m) \qquad (13)$$
where $O(x_0, y_0)$ is the output of performing the deformable convolution operation, $w_m$ denotes the weight of the $m$-th element, and $\sigma(\Delta m_m)$ denotes the modulation operation used to emphasize the importance of the $m$-th offset position.
Further, the generative adversarial network comprises a generator and a discriminator connected in series;
the generator comprises an encoder and a decoder connected in series;
the encoder comprises an adaptive channel attention module and a deformable convolution module connected in series.
Further, the losses employed for training the generative adversarial network on the training-set data include the WGAN-GP loss and an image gradient difference loss.
An underwater image enhancement device based on a channel-attention and deformable-convolution generative adversarial network, comprising:
an acquisition module: for acquiring underwater images to construct a data set, and dividing the data set into a training set and a test set;
a first construction module: for constructing an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights;
a second construction module: for constructing a deformable convolution module, based on convolution-kernel offsets, for feature extraction;
a fusion module: for fusing the adaptive channel attention module and the deformable convolution module into a generative adversarial network;
a training module: for training the generative adversarial network on the training-set data to obtain a trained generative adversarial network;
an obtaining module: for inputting the test-set data into the trained generative adversarial network to obtain enhanced underwater images.
With the underwater image enhancement method based on the channel-attention and deformable-convolution generative adversarial network, the speckle noise, Gaussian noise, and impulse noise introduced during underwater imaging are taken into account, and adaptive channel attention modules with receptive fields of different scales are constructed by means of a single-hidden-layer neural network and the global average pooling technique. Considering that conventional stacked convolutional layers have only very limited foreground-object encoding capability, the invention provides a feature extraction method based on a deformable convolution network using a layer-by-layer convolution-kernel offset strategy. By combining the adaptive channel attention module and the deformable convolution module, the proposed underwater image enhancement method has the following beneficial effects:
(1) An adaptive channel attention module with different receptive fields is constructed using a single-hidden-layer neural network and the global average pooling technique, which, on the one hand, reduces the influence of mixed noise (speckle, Gaussian, and impulse noise) on the feature layers and, on the other hand, improves the enhancement consistency of objects of interest at different scene depths;
(2) A convolution-kernel offset method and an offset-position modulation mechanism are constructed, and a feature extraction strategy based on a deformable convolution network is provided, so that the foreground-object encoding capability is enhanced at the spatial level;
(3) Combining the adaptive channel attention module and the deformable convolution network with the $L_1$ loss, the image gradient difference loss, and the generative adversarial loss, a generative adversarial network framework is constructed, improving underwater image enhancement performance at both the channel and spatial levels.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a diagram of an adaptive channel attention module;
FIG. 2 is a diagram of a morphed convolution module;
FIG. 3 is a block diagram of an overall underwater image enhancement framework;
FIG. 4 is a diagram of subjective evaluation results on the UIEB data set;
FIG. 5 is a diagram of subjective evaluation results on the URPC data set;
FIG. 6 is a diagram of keypoint matching comparison results.
Detailed Description
It should be noted that, in the case of conflict, the embodiments and features of the embodiments of the present invention may be combined with each other, and the present invention will be described in detail with reference to the accompanying drawings and embodiments.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
The relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise. Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description. Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate. Any specific values in all examples shown and discussed herein are to be construed as exemplary only and not as limiting. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
In the description of the present invention, it is to be understood that the orientation or positional relationship indicated by the directional terms such as "front, rear, upper, lower, left, right", "lateral, vertical, horizontal" and "top, bottom", etc., are generally based on the orientation or positional relationship shown in the drawings, and are used for convenience of description and simplicity of description only, and in the absence of any contrary indication, these directional terms are not intended to indicate and imply that the device or element so referred to must have a particular orientation or be constructed and operated in a particular orientation, and therefore should not be considered as limiting the scope of the present invention: the terms "inner and outer" refer to the inner and outer relative to the profile of the respective component itself.
For ease of description, spatially relative terms such as "over", "upper surface", "above", and the like may be used herein to describe the spatial positional relationship of one device or feature to other devices or features as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if a device in the figures is turned over, devices described as "above" or "on" other devices or configurations would then be oriented "below" or "under" the other devices or configurations. Thus, the exemplary term "above" may include both the "above" and "below" orientations. The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
It should be noted that the terms "first", "second", and the like are used to define the components, and are only used for convenience of distinguishing the corresponding components, and the terms have no special meanings unless otherwise stated, and therefore, the scope of the present invention should not be construed as being limited.
An underwater image enhancement method based on a channel-attention and deformable-convolution generative adversarial network comprises the following steps:
S1: acquiring underwater images to construct a data set, and dividing the data set into a training set and a test set;
S2: constructing an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights;
S3: constructing a deformable convolution module, based on convolution-kernel offsets, for feature extraction;
S4: fusing the adaptive channel attention module and the deformable convolution module into a generative adversarial network;
S5: training the generative adversarial network on the training-set data to obtain a trained generative adversarial network;
S6: inputting the test-set data into the trained generative adversarial network to obtain enhanced underwater images.
Steps S1, S2, S3, S4, S5, and S6 are executed in sequence.
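The sequence of steps S1–S6 can be sketched as a minimal driver routine. Every function, name, and value below is a hypothetical placeholder standing in for the corresponding step, not the patented implementation:

```python
def build_dataset(num_images=10, train_ratio=0.8):
    # S1: acquire underwater images and split them into training and test sets.
    images = [f"img_{i}" for i in range(num_images)]   # placeholder file names
    split = int(num_images * train_ratio)
    return images[:split], images[split:]

def run_enhancement_pipeline():
    train_set, test_set = build_dataset()
    # S2-S4: build the two modules and fuse them into a generator/discriminator
    # network (placeholder dictionary standing in for real network objects).
    network = {
        "encoder": ["adaptive_channel_attention", "deformable_convolution"],
        "decoder": ["upsampling_blocks"],
        "discriminator": ["conv_blocks"],
    }
    # S5: training on the training-set data would happen here (omitted).
    trained = {"network": network, "trained_on": len(train_set)}
    # S6: apply the trained network to the test set.
    return [f"enhanced({name})" for name in test_set], trained
```

The 8:2 split ratio and image counts are illustrative only; the patent does not specify them.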
further: the process of constructing an adaptive channel attention module with a multi-scale receptive field for re-scaling channel weights is as follows:
the adaptive channel attention module is mainly used for recalibrating the channel weight. Note that enhancing features at the channel level using convolution kernels of two different receptive fields can significantly improve visual quality. Thus, as shown in FIG. 1, the convolution operation can be expressed as:
wherein, the first and the second end of the pipe are connected with each other,the i-th layer representing the input feature X,represents the ith layer weight of the kth filter,representing a convolution operation, U k Representing the convolution output and c representing the number of feature layers.
The fusion operation can then be expressed as:
wherein the content of the first and second substances,andrespectively representing the characteristics obtained using the 3 x 3 and 5 x 5 filters.Is a fused feature obtained using a pixel-by-pixel addition operation.
Then, the average response of each feature map is computed using global average pooling, which can be expressed as:
wherein the content of the first and second substances,which is representative of the value of the response of the channel,representing the fused features, w and h are the width and height of the feature map, respectively, and m and n are the abscissa and ordinate indices of the features, respectively.
In addition, a single hidden layer neural network is utilized to improve the nonlinear mapping capability, which can be expressed as:
z fc =R(W fc s) (4)
wherein the content of the first and second substances,the output of the hidden layer is represented by,represents the weight between the input layer and the hidden layer, d = max (c/R, L) represents the number of hidden layer nodes, R represents the rate of reduction of channel dimensions, L represents the minimum value of the number of channels, and R is the leak ReLU activation function.
Further, a suitable spatial receptive field is adaptively selected, which can be expressed as:
wherein the content of the first and second substances,andare respectively provided withAre the weights between the hidden and output layers for the 3 x 3 and 5 x 5 branches,andchannel attention weights for the 3 x 3 and 5 x 5 branches, respectively.
Finally, the recalibrated signature V may be expressed as:
wherein, the first and the second end of the pipe are connected with each other,it is the output characteristic of the entire adaptive channel attention module, which represents the element-by-element product.
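As a minimal illustrative sketch (not the patented implementation), the channel-recalibration chain described above — global average pooling, the single-hidden-layer network with Leaky ReLU, the softmax selection between branches, and the element-wise reweighting — can be traced in plain Python on a toy pair of branch features. All shapes, weights, and values below are hypothetical:

```python
import math

def gap(feature):
    # Global average pooling: mean response of each channel of a fused
    # feature, given as a list of 2-D channel maps (list of rows).
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]

def leaky_relu(x, slope=0.01):
    return x if x >= 0 else slope * x

def hidden_layer(s, W):
    # Single-hidden-layer mapping z = R(W s), with R the Leaky ReLU.
    return [leaky_relu(sum(w * v for w, v in zip(row, s))) for row in W]

def branch_attention(z, A, B):
    # Per-channel softmax over the 3x3 and 5x5 branches, so a_c + b_c = 1.
    a, b = [], []
    for row_a, row_b in zip(A, B):
        ea = math.exp(sum(w * v for w, v in zip(row_a, z)))
        eb = math.exp(sum(w * v for w, v in zip(row_b, z)))
        a.append(ea / (ea + eb))
        b.append(eb / (ea + eb))
    return a, b

def recalibrate(U3, U5, a, b):
    # V_c = a_c * U3_c + b_c * U5_c, element-wise per channel.
    return [[[a[c] * U3[c][i][j] + b[c] * U5[c][i][j]
              for j in range(len(U3[c][0]))]
             for i in range(len(U3[c]))]
            for c in range(len(U3))]

# Toy data: c = 2 channels of 2 x 2 branch features (hypothetical values).
U3 = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
U5 = [[[3.0, 3.0], [3.0, 3.0]], [[4.0, 4.0], [4.0, 4.0]]]
fused = [[[U3[c][i][j] + U5[c][i][j] for j in range(2)] for i in range(2)]
         for c in range(2)]                       # pixel-wise fusion
s = gap(fused)
z = hidden_layer(s, [[0.1, 0.0], [0.0, 0.1]])     # d = 2 hidden nodes
a, b = branch_attention(z, [[1, 0], [0, 1]], [[0, 1], [1, 0]])
V = recalibrate(U3, U5, a, b)
```

In a real module the branch features come from 3 × 3 and 5 × 5 convolutions and the weight matrices are learned; here they are fixed small matrices so the chain can be followed by hand.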
Further: the process of constructing the deformation convolution module for feature extraction and convolution kernel offset orientation is as follows:
in order to enhance the integrity of foreground object construction, a feature extraction strategy based on a deformation convolution network is provided by constructing a convolution kernel migration method, which is beneficial to enhancing the encoding capability of the foreground object from a spatial level. As shown in fig. 2, the sample positions of the standard convolution kernel can be expressed as:
P={(u,v)|u,v∈{-k,-(k-1),…,k}} (8)
wherein u and v respectively represent the abscissa and ordinate of the sampling position of the convolution kernel,representing the edge positions of the convolution kernel, w f Represents the current convolution kernel width;
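For a concrete check of the standard sampling grid: with $w_f = 3$ the edge position is $k = 1$, giving the familiar nine offsets of a 3 × 3 kernel. A tiny illustrative sketch (the function name is hypothetical):

```python
def kernel_grid(w_f):
    # P = {(u, v) | u, v in {-k, ..., k}}, with k = (w_f - 1) // 2
    # for an odd kernel width w_f.
    k = (w_f - 1) // 2
    return [(u, v) for u in range(-k, k + 1) for v in range(-k, k + 1)]

grid = kernel_grid(3)   # n = |P| = 9 offsets, from (-1, -1) to (1, 1)
```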
accordingly, the sampling position of the morpho-convolution can be expressed as:
wherein p is 0 Representing the center point of the convolution kernel, p m =(x m ,y m ) Is the m-th element standard sample position, x, in the convolution kernel m And y m Respectively representing the abscissa and ordinate of the standard sampling position of the m-th element in the convolution kernel, m =1,2, \ 8230, n represents the index of the element in the convolution kernel, n = | P | is the total number of elements in the convolution kernel, and Δ P m Representing the magnitude of the offset of the mth element in the convolution kernel,is the deformed sample position of the mth element in the convolution kernel,andrespectively representing the abscissa and the ordinate of the deformation position of the mth element in the convolution kernel;
note that the deformed sampling positions should satisfy:
to account for eigenvalues at non-integer positionsTo the problem of difficult direct acquisition, the present invention uses bilinear interpolation techniques to compute non-integer positions from the X-directionCharacteristic value ofIt can be expressed as:
wherein p is tl =(x i ,y j ),p tr =(x i+1 ,y j ),p bl =(x i ,y j+1 ) And p br =(x i+1 ,y j+1 ) And respectively representing the integral coordinate positions of the upper left, the upper right, the lower left and the lower right closest to the deformation position.
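The bilinear interpolation step above — interpolate along X at the two integer rows, then along Y — can be checked numerically in a few lines of plain Python. The feature map `F` and its indexing convention `F[y][x]` are illustrative assumptions:

```python
import math

def bilinear(F, x, y):
    # F[y][x]: feature map as a list of rows; (x, y): non-integer position.
    xi, yj = math.floor(x), math.floor(y)
    # Interpolate along the X direction at rows y_j and y_{j+1}.
    top = (xi + 1 - x) * F[yj][xi] + (x - xi) * F[yj][xi + 1]
    bot = (xi + 1 - x) * F[yj + 1][xi] + (x - xi) * F[yj + 1][xi + 1]
    # Interpolate the two partial results along the Y direction.
    return (yj + 1 - y) * top + (y - yj) * bot

F = [[0.0, 1.0],
     [2.0, 3.0]]                      # toy feature map: F[y][x] = x + 2*y
value = bilinear(F, 0.25, 0.75)       # -> 0.25 + 2*0.75 = 1.75
```

Because the toy map is linear in x and y, the interpolated value matches the closed form exactly, which makes the two-step scheme easy to verify by hand.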
Then interpolating along the Y direction gives:
$$F(p'_m) = (y_{j+1} - y'_m) F(x'_m, y_j) + (y'_m - y_j) F(x'_m, y_{j+1}) \qquad (12)$$
where $F(p'_m)$ is the feature value of the $m$-th element in the convolution kernel at the deformed position $p'_m$.
Finally, the output of the entire deformable convolution module can be expressed as:
$$O(x_0, y_0) = \sum_{m=1}^{n} w_m \cdot F(p'_m) \cdot \sigma(\Delta m_m) \qquad (13)$$
where $O(x_0, y_0)$ is the output of performing the deformable convolution operation, $w_m$ denotes the weight of the $m$-th element, and $\sigma(\Delta m_m)$ denotes the modulation operation used to emphasize the importance of the $m$-th offset position.
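Putting the pieces above together, one output sample of a modulated deformable convolution can be sketched in pure Python. All tensors are toy values chosen so the deformed positions stay inside the map; in the real module the offsets and modulation scalars are predicted by a separate learned layer, and this sketch is not the patented implementation:

```python
import math

def bilinear(F, x, y):
    # Bilinear interpolation at a non-integer position (x, y) of F[y][x].
    xi, yj = math.floor(x), math.floor(y)
    dx, dy = x - xi, y - yj
    return ((1 - dx) * (1 - dy) * F[yj][xi] + dx * (1 - dy) * F[yj][xi + 1] +
            (1 - dx) * dy * F[yj + 1][xi] + dx * dy * F[yj + 1][xi + 1])

def deformable_conv_at(F, x0, y0, weights, offsets, modulations, k=1):
    # O(x0, y0) = sum_m w_m * F(p0 + p_m + dp_m) * sigma(dm_m),
    # with sigma the logistic sigmoid modulating each offset position.
    out, m = 0.0, 0
    for v in range(-k, k + 1):        # standard grid positions p_m
        for u in range(-k, k + 1):
            du, dv = offsets[m]       # learned offset dp_m of the m-th element
            feat = bilinear(F, x0 + u + du, y0 + v + dv)
            out += weights[m] * feat * (1.0 / (1.0 + math.exp(-modulations[m])))
            m += 1
    return out

F = [[1.0] * 4 for _ in range(4)]     # constant toy feature map
o = deformable_conv_at(F, 1, 1,
                       weights=[1.0] * 9,
                       offsets=[(0.25, 0.25)] * 9,
                       modulations=[0.0] * 9)
```

On the constant map every interpolated value is 1.0, so the output reduces to the sum of the nine weights scaled by the modulation sigmoid.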
Further, as shown in FIG. 3, the generative adversarial network comprises a generator and a discriminator connected in series;
the generator comprises an encoder and a decoder connected in series;
and the adaptive channel attention module and the deformable convolution module are organically integrated in series into the encoder framework.
The losses employed for training the generative adversarial network on the training-set data include the WGAN-GP loss and an image gradient difference loss.
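The image gradient difference loss mentioned above penalises mismatch between the spatial gradients of the enhanced image and the reference image, which sharpens edges that a plain pixel-wise loss tends to blur. A hedged pure-Python sketch of one common first-order form (the exact formulation and weighting used in the patent are not reproduced here, and the norm order `alpha = 1` is an assumption):

```python
def gradient_difference_loss(pred, target, alpha=1):
    # pred/target: equal-sized 2-D lists (single-channel images).
    # Sum of | |grad target| - |grad pred| |^alpha over horizontal
    # and vertical finite differences.
    h, w = len(pred), len(pred[0])
    loss = 0.0
    for i in range(h):
        for j in range(w - 1):        # horizontal gradients
            gp = abs(pred[i][j + 1] - pred[i][j])
            gt = abs(target[i][j + 1] - target[i][j])
            loss += abs(gt - gp) ** alpha
    for i in range(h - 1):
        for j in range(w):            # vertical gradients
            gp = abs(pred[i + 1][j] - pred[i][j])
            gt = abs(target[i + 1][j] - target[i][j])
            loss += abs(gt - gp) ** alpha
    return loss

pred = [[0.0, 0.0], [0.0, 0.0]]       # blurred prediction (no edges)
ref = [[0.0, 1.0], [0.0, 1.0]]        # sharp reference with a vertical edge
loss = gradient_difference_loss(pred, ref)
```

For identical images the loss is zero; a prediction that blurs away a sharp edge in the reference is penalised in proportion to the lost gradient magnitude.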
An underwater image enhancement device based on a channel-attention and deformable-convolution generative adversarial network, comprising:
an acquisition module: for acquiring underwater images to construct a data set, and dividing the data set into a training set and a test set;
a first construction module: for constructing an adaptive channel attention module with multi-scale receptive fields for recalibrating channel weights;
a second construction module: for constructing a deformable convolution module, based on convolution-kernel offsets, for feature extraction;
a fusion module: for fusing the adaptive channel attention module and the deformable convolution module into a generative adversarial network;
a training module: for training the generative adversarial network on the training-set data to obtain a trained generative adversarial network;
an obtaining module: for inputting the test-set data into the trained generative adversarial network to obtain enhanced underwater images.
In order to fully demonstrate the effectiveness and superiority of the proposed method, the underwater image enhancement method based on the channel-attention and deformable-convolution generative adversarial network proposed in this application (ACADGAN for short) is compared with physical-model-based restoration methods, model-free enhancement methods, and image-to-image translation techniques.
The comparison scenes on the UIEB data set are shown in FIG. 4, mainly including artificial light (first column), blue light (second column), green light (third column), dim light (fourth column), and violet light (fifth column). From FIG. 4 it is clear that the physical-model-based image restoration methods have difficulty achieving satisfactory visual performance. Specifically, the UDCP method makes the color deviation more noticeable in the above scenes. The model-free underwater image enhancement method, i.e., the UCM method, may introduce a reddish effect. Notably, the FUnIE-GAN and UWCNN methods achieve only very limited enhancement, and the FUnIE-GAN method causes significant color deviation in the violet-light case. Intuitively, the ACADGAN method proposed by the invention achieves the best visual enhancement quality. Consistent conclusions can also be drawn from FIG. 5, the subjective evaluation results on the URPC data set.
To further demonstrate the superiority of the proposed ACADGAN method, the peak signal-to-noise ratio (PSNR), structural similarity (SSIM), underwater image colorfulness measure (UICM), underwater image sharpness measure (UISM), and underwater image contrast measure (UIConM) indices are used for comparison on the UIEB and URPC data sets, and the comparison results are summarized in Table 1 and Table 2, respectively.
Table 1 Quantitative comparison of objective image quality on the UIEB data set
Table 2 Quantitative comparison of objective image quality on the URPC data set
From Tables 1 and 2 we can see that the proposed ACADGAN method achieves the best (in bold) or second-best (underlined) performance on most metrics. In particular, from the optimal SSIM and UIConM indices obtained on the UIEB data set, we can conclude that the proposed ACADGAN method effectively preserves image structure, texture, and contrast, while also preserving image content well. Furthermore, on the URPC data set, the proposed ACADGAN method achieves the optimal UIQM score, which means the enhanced images are more consistent with human visual perception. More importantly, it obtains the optimal UCIQE score, which means the enhanced images achieve a better balance of chroma, saturation, and contrast.
To verify the effectiveness and superiority of the proposed ACADGAN method from the perspective of basic feature expression, the SIFT, Harris and Canny methods are used to extract key points, corner points and pixel-level edges, respectively. The key-point matching results are shown in Fig. 6, from which we can clearly see that, because of the poor quality of the original underwater images, few key points can be correctly extracted and matched. Applying a restoration or enhancement method improves the key-point matching performance. Moreover, the proposed ACADGAN method achieves the best key-point matching performance, which means it recovers more of the essential features of degraded underwater images. The mean evaluation performance on the UIEB and URPC data sets is provided in Tables 3 and 4.
Table 3. Feature expression comparison on the UIEB data set
Table 4. Feature expression comparison on the URPC data set
From these tables we can clearly see that the proposed ACADGAN method achieves the best or second-best performance in extracting SIFT key points, Harris corner points and Canny edges, which indicates that it facilitates the extraction of essential features.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (6)
1. An underwater image enhancement method based on a channel attention and deformation generative adversarial network, characterized in that the method comprises the following steps:
acquiring underwater images to construct a data set, and dividing the data set into a training set and a test set;
constructing an adaptive channel attention module with a multi-scale receptive field for recalibrating channel weights;
constructing a deformation convolution module, oriented to convolution kernel offsets, for feature extraction;
fusing the adaptive channel attention module and the deformation convolution module into a generative adversarial network;
training the generative adversarial network on the training set data to obtain a trained generative adversarial network; and
inputting the test set data into the trained generative adversarial network to obtain enhanced underwater images.
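The first step of claim 1, dividing the collected underwater images into a training set and a test set, can be sketched as follows. The 80/20 ratio, the seed, and the file names are illustrative assumptions; the claim does not specify them:

```python
import random

def split_dataset(image_paths, train_ratio=0.8, seed=42):
    """Shuffle reproducibly, then split into training and test subsets."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    cut = int(len(paths) * train_ratio)
    return paths[:cut], paths[cut:]

# Hypothetical file names standing in for the acquired underwater images
images = [f"underwater_{i:04d}.png" for i in range(100)]
train, test = split_dataset(images)
print(len(train), len(test))  # → 80 20
```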
2. The underwater image enhancement method based on the channel attention and deformation generative adversarial network of claim 1, characterized in that the adaptive channel attention module with a multi-scale receptive field for recalibrating channel weights is constructed as follows:
the convolution operation is represented as:

$$U_k = \sum_{i=1}^{c} W_k^i * X^i \qquad (1)$$

where $X^i$ represents the $i$-th layer of the input features $X$, $W_k^i$ represents the $i$-th layer weight of the $k$-th filter, $*$ represents the convolution operation, $U_k$ represents the convolution output, and $c$ represents the number of feature layers;
the fusion operation is represented as:

$$\tilde{U} = U_{3\times 3} + U_{5\times 5} \qquad (2)$$

where $U_{3\times 3}$ and $U_{5\times 5}$ respectively represent the features obtained using the 3 × 3 and 5 × 5 filters, and $\tilde{U}$ is the fused feature obtained by pixel-by-pixel addition;
the average response of each feature map is calculated using global average pooling, which is expressed as:

$$s_k = \frac{1}{w \times h} \sum_{m=1}^{w} \sum_{n=1}^{h} \tilde{U}_k(m, n) \qquad (3)$$

where $s_k$ represents the response value of the $k$-th channel, $\tilde{U}_k$ represents the $k$-th fused feature map, $w$ and $h$ are the width and height of the feature map, respectively, and $m$ and $n$ are the abscissa and ordinate indices of the features, respectively;
the nonlinear mapping capability is improved by using a single-hidden-layer neural network, which is expressed as:

$$z_{fc} = R(W_{fc}\, s) \qquad (4)$$

where $z_{fc} \in \mathbb{R}^{d}$ represents the output of the hidden layer, $W_{fc} \in \mathbb{R}^{d \times c}$ represents the weights between the input layer and the hidden layer, $d = \max(c/r, L)$ is the number of hidden-layer nodes, $r$ is the reduction rate of the channel dimension, $L$ is the minimum number of channels, and $R$ is the Leaky ReLU activation function;
further, a suitable spatial receptive field is adaptively selected, which is expressed as:

$$a_k = \frac{e^{A_k z_{fc}}}{e^{A_k z_{fc}} + e^{B_k z_{fc}}}, \qquad b_k = \frac{e^{B_k z_{fc}}}{e^{A_k z_{fc}} + e^{B_k z_{fc}}} \qquad (5)(6)$$

where $A, B \in \mathbb{R}^{c \times d}$ are the weights between the hidden layer and the output layer for the 3 × 3 and 5 × 5 branches, respectively, $A_k$ and $B_k$ denote their $k$-th rows, and $a_k$ and $b_k$ are the channel attention weights of the 3 × 3 and 5 × 5 branches, respectively, with $a_k + b_k = 1$;
the recalibrated feature map $V$ is expressed as:

$$V_k = a_k \cdot U_{3\times 3}^{k} + b_k \cdot U_{5\times 5}^{k} \qquad (7)$$
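The attention path of claim 2, i.e. fusion, global average pooling, a single hidden layer, a per-channel branch softmax, and recalibration, can be sketched end to end in NumPy. The two branch convolution outputs are stubbed with random feature maps, and all sizes, the random weights, and the 0.01 Leaky ReLU slope are illustrative assumptions, since only the data flow of Eqs. (1)-(7) is being illustrated:

```python
import numpy as np

rng = np.random.default_rng(0)
c, h, w = 8, 16, 16            # channels, height, width (illustrative)
r, L = 4, 4                    # channel reduction rate and minimum hidden size
d = max(c // r, L)             # hidden-layer width, d = max(c/r, L)

# Stand-ins for the 3x3 and 5x5 branch outputs of the convolution in Eq. (1)
U3 = rng.standard_normal((c, h, w))
U5 = rng.standard_normal((c, h, w))

U = U3 + U5                                    # Eq. (2): pixel-wise fusion
s = U.mean(axis=(1, 2))                        # Eq. (3): global average pooling -> (c,)

W_fc = rng.standard_normal((d, c))
z = W_fc @ s
z_fc = np.where(z > 0, z, 0.01 * z)            # Eq. (4): Leaky ReLU hidden layer

A = rng.standard_normal((c, d))                # hidden->output weights, 3x3 branch
B = rng.standard_normal((c, d))                # hidden->output weights, 5x5 branch
ea, eb = np.exp(A @ z_fc), np.exp(B @ z_fc)
a = ea / (ea + eb)                             # Eqs. (5)-(6): per-channel softmax,
b = eb / (ea + eb)                             # so a + b = 1 for every channel

V = a[:, None, None] * U3 + b[:, None, None] * U5   # Eq. (7): recalibrated features
print(V.shape, bool(np.allclose(a + b, 1.0)))  # → (8, 16, 16) True
```

Each channel ends up as a convex combination of its 3 × 3 and 5 × 5 branch responses, which is how the module selects a receptive field per channel.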
3. The underwater image enhancement method based on the channel attention and deformation generative adversarial network of claim 1, characterized in that the deformation convolution module, oriented to convolution kernel offsets, for feature extraction is constructed as follows:
the sampling positions of a standard convolution kernel are expressed as:

$$P = \{(u, v) \mid u, v \in \{-k, -(k-1), \ldots, k\}\} \qquad (8)$$

where $u$ and $v$ respectively represent the abscissa and ordinate of a convolution kernel sampling position, $k = (w_f - 1)/2$ represents the edge position of the convolution kernel, and $w_f$ represents the current convolution kernel width;
the sampling positions of the deformation convolution are expressed as:

$$p_m^{d} = p_0 + p_m + \Delta p_m = (x_m^{d},\, y_m^{d}) \qquad (9)$$

where $p_0$ represents the center point of the convolution kernel, $p_m = (x_m, y_m)$ is the standard sampling position of the $m$-th element in the convolution kernel, $x_m$ and $y_m$ respectively represent its abscissa and ordinate, $m = 1, 2, \ldots, n$ indexes the elements in the convolution kernel, $n = |P|$ is the total number of elements, $\Delta p_m$ represents the offset of the $m$-th element, $p_m^{d}$ is its deformed sampling position, and $x_m^{d}$ and $y_m^{d}$ respectively represent the abscissa and ordinate of the deformed position;
the deformed sampling positions of the standard convolution kernel should remain within the feature map, i.e.:

$$0 \le x_m^{d} \le w - 1, \qquad 0 \le y_m^{d} \le h - 1 \qquad (10)$$
the feature value at the non-integer deformed position is computed with the bilinear interpolation technique, first along the $x$ direction:

$$X(p_t) = (x_{i+1} - x_m^{d})\, X(p_{tl}) + (x_m^{d} - x_i)\, X(p_{tr}) \qquad (11)$$
$$X(p_b) = (x_{i+1} - x_m^{d})\, X(p_{bl}) + (x_m^{d} - x_i)\, X(p_{br}) \qquad (12)$$

where $p_{tl} = (x_i, y_j)$, $p_{tr} = (x_{i+1}, y_j)$, $p_{bl} = (x_i, y_{j+1})$ and $p_{br} = (x_{i+1}, y_{j+1})$ respectively represent the upper-left, upper-right, lower-left and lower-right integer coordinate positions closest to the deformed position; then along the $y$ direction:

$$X(p_m^{d}) = (y_{j+1} - y_m^{d})\, X(p_t) + (y_m^{d} - y_j)\, X(p_b) \qquad (13)$$

where $X(p_m^{d})$ is the feature value of the $m$-th element in the convolution kernel at the deformed position $p_m^{d}$;
the output of the entire deformation convolution module is represented as:

$$Y(p_0) = \sum_{m=1}^{n} w_m \cdot X(p_m^{d}) \qquad (14)$$

where $w_m$ is the convolution weight of the $m$-th element in the convolution kernel.
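The bilinear sampling at the heart of the deformation convolution in claim 3 can be sketched in NumPy for a single offset sample. Variable names follow the claim's notation ($x_i$, $y_j$ are the top-left integer corner); the toy feature map and offset values are illustrative, and the position is assumed to lie strictly inside the map:

```python
import numpy as np

def bilinear_sample(X: np.ndarray, x: float, y: float) -> float:
    """Sample feature map X at a non-integer deformed position (x, y)."""
    xi, yj = int(np.floor(x)), int(np.floor(y))   # top-left integer corner
    xi1, yj1 = xi + 1, yj + 1                     # assumes (x, y) is interior
    # Interpolate along x on the top and bottom rows, then along y
    top = (xi1 - x) * X[yj, xi] + (x - xi) * X[yj, xi1]
    bot = (xi1 - x) * X[yj1, xi] + (x - xi) * X[yj1, xi1]
    return (yj1 - y) * top + (y - yj) * bot

X = np.arange(16, dtype=float).reshape(4, 4)  # toy feature map
# Standard sample position (1, 1) shifted by a learned offset (0.5, 0.25)
val = bilinear_sample(X, 1.5, 1.25)
print(val)  # → 6.5
```

A full deformable convolution layer repeats this sampling for every kernel element and weights the sampled values by the kernel weights; in practice a library implementation such as `torchvision.ops.deform_conv2d` would be used rather than this loop-free sketch.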
4. The underwater image enhancement method based on the channel attention and deformation generative adversarial network of claim 1, characterized in that:
the generative adversarial network comprises a generator and a discriminator connected in series;
the generator comprises an encoder and a decoder connected in series; and
the encoder comprises an adaptive channel attention module and a deformation convolution module connected in series.
5. The underwater image enhancement method based on the channel attention and deformation generative adversarial network of claim 1, characterized in that the losses employed to train the generative adversarial network on the training set data include the WGAN-GP loss and the image gradient difference loss.
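Of the two losses named in claim 5, the image gradient difference loss penalizes mismatch between the spatial gradients of the enhanced image and the reference, which encourages sharp edges. A minimal NumPy sketch (the α = 1 exponent and the plain forward differences are assumptions; the claim does not give the exact formulation, and the WGAN-GP term additionally requires an autograd framework, so it is not sketched here):

```python
import numpy as np

def gradient_difference_loss(gt: np.ndarray, pred: np.ndarray, alpha: float = 1.0) -> float:
    """Sum of differences between the |horizontal| and |vertical| image gradients."""
    def grads(img):
        gx = np.abs(np.diff(img, axis=1))   # horizontal gradient magnitudes
        gy = np.abs(np.diff(img, axis=0))   # vertical gradient magnitudes
        return gx, gy
    gx_t, gy_t = grads(gt.astype(np.float64))
    gx_p, gy_p = grads(pred.astype(np.float64))
    return float(np.sum(np.abs(gx_t - gx_p) ** alpha) + np.sum(np.abs(gy_t - gy_p) ** alpha))

gt = np.tile(np.arange(4.0), (4, 1))       # smooth horizontal ramp
pred = gt.copy()
print(gradient_difference_loss(gt, pred))  # → 0.0 (identical gradients)
```

Note that the loss is zero for any prediction whose gradients match the reference, even under a constant brightness shift, which is why it is combined with an adversarial term rather than used alone.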
6. An underwater image enhancement device based on a channel attention and deformation generative adversarial network, characterized in that the device comprises:
an acquisition module, configured to acquire underwater images to construct a data set and divide the data set into a training set and a test set;
a first construction module, configured to construct an adaptive channel attention module with a multi-scale receptive field for recalibrating channel weights;
a second construction module, configured to construct a deformation convolution module, oriented to convolution kernel offsets, for feature extraction;
a fusion module, configured to fuse the adaptive channel attention module and the deformation convolution module into a generative adversarial network;
a training module, configured to train the generative adversarial network on the training set data to obtain a trained generative adversarial network; and
an obtaining module, configured to input the test set data into the trained generative adversarial network to obtain enhanced underwater images.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211394443.4A CN115713469A (en) | 2022-11-08 | 2022-11-08 | Underwater image enhancement method for generating countermeasure network based on channel attention and deformation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115713469A true CN115713469A (en) | 2023-02-24 |
Family
ID=85232509
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211394443.4A Pending CN115713469A (en) | 2022-11-08 | 2022-11-08 | Underwater image enhancement method for generating countermeasure network based on channel attention and deformation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115713469A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116563145A (en) * | 2023-04-26 | 2023-08-08 | 北京交通大学 | Underwater image enhancement method and system based on color feature fusion |
CN116563145B (en) * | 2023-04-26 | 2024-04-05 | 北京交通大学 | Underwater image enhancement method and system based on color feature fusion |
CN116579918A (en) * | 2023-05-19 | 2023-08-11 | 哈尔滨工程大学 | Attention mechanism multi-scale image conversion method based on style independent discriminator |
CN116579918B (en) * | 2023-05-19 | 2023-12-26 | 哈尔滨工程大学 | Attention mechanism multi-scale image conversion method based on style independent discriminator |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||