CN114936983A - Underwater image enhancement method and system based on depth cascade residual error network - Google Patents


Info

Publication number
CN114936983A
CN114936983A (application CN202210680325.3A)
Authority
CN
China
Prior art keywords
network
image
residual network
underwater
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210680325.3A
Other languages
Chinese (zh)
Inventor
赵铁松
蔡晓文
江楠峰
胡可鉴
陈炜玲
胡锦松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202210680325.3A
Publication of CN114936983A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/34Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/05Underwater scenes
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/80Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in fisheries management
    • Y02A40/81Aquaculture, e.g. of fish

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Ultra Sonic Diagnosis Equipment (AREA)

Abstract

The invention relates to an underwater image enhancement method and system based on a deep cascaded residual network. The method comprises the following steps: S1: construct a deep cascaded residual network, and build a training set and a test set in a set proportion; S2: partition the input image into blocks and feed them into the three cascaded subnets of the network, which performs forward propagation to yield the clear image output by the trained network; S3: compute the loss of the output image against the target image, and back-propagate the error according to the loss value to update the network weights; S4: judge whether training of the network is complete and, if so, select the best model; S5: feed the test set into the best model for testing and judge whether it meets expectations; S6: feed the underwater degraded image into the tested deep cascaded residual network to obtain the enhanced underwater image. The method and system help correct the color cast of underwater images, improve contrast and sharpness, and improve the overall visual effect.

Description

Underwater image enhancement method and system based on depth cascade residual error network
Technical Field
The invention belongs to the technical field of image enhancement and restoration, and in particular relates to an underwater image enhancement method and system based on a deep cascaded residual network.
Background
Underwater images often suffer from noise, color distortion, and low contrast due to the attenuation of light as it propagates through water. These problems complicate various tasks, such as the automated detection and identification of fish and other marine species. Accordingly, many underwater image enhancement methods have been proposed to restore or enhance degraded underwater images. To improve underwater image quality, approaches based on enhancement priors, physical models, and deep learning have all been explored. Prior-based methods directly process image pixel values to enhance specific image attributes such as color, contrast, and brightness, while physical-model-based methods use image characteristics and a physical imaging model to recover a clear image. Recently, deep neural networks have achieved remarkable performance in both high-level vision tasks and image processing, owing to their powerful modeling capability and their ability to learn rich features from large amounts of training data. Several deep-learning-based underwater image enhancement methods have likewise been proposed that improve image quality by extracting effective features from synthetic data. Although these methods have made great progress on underwater image tasks, their performance still leaves much room for improvement: underwater images exhibit several types of distortion, and methods that rely on a fixed end-to-end supervised training scheme lack the flexibility to handle degraded images, so image details are lost.
Disclosure of Invention
The invention aims to provide an underwater image enhancement method and system based on a deep cascaded residual network that help correct the color cast of underwater images, improve contrast and sharpness, and improve the overall visual effect.
To achieve this aim, the invention adopts the following technical scheme: an underwater image enhancement method based on a deep cascaded residual network, comprising the following steps:
Step S1: construct a deep cascaded residual network and set its parameters; build a training set and a test set in a set proportion, the training set comprising underwater degraded images and corresponding real images;
Step S2: partition the underwater degraded images of the training set into blocks according to a set ratio and feed them into the three cascaded subnets of the network, so that the network performs forward propagation and outputs a clear image;
Step S3: compute the loss of the network output against the corresponding real image, and back-propagate the error according to the loss value to update the network weights;
Step S4: judge whether training of the network is complete; if so, select the best trained model and go to step S5, otherwise return to step S2;
Step S5: feed the test set into the best model for testing, and judge from the test results whether the model meets expectations; if so, go to step S6, otherwise return to step S2;
Step S6: feed the underwater degraded image to be enhanced into the tested network to obtain the enhanced underwater image.
Furthermore, the deep cascaded residual network consists of three cascaded subnets that restore the degraded underwater image step by step, from coarse to fine. The input image is partitioned in a 4-2-1 scheme before entering the network: the image is divided into 4 non-overlapping blocks fed into the first subnet, into 2 non-overlapping blocks fed into the second subnet, and the original image is fed into the third subnet. The first two subnets are gated encoder-decoder subnetworks used to learn context information, and the third is an original-resolution subnetwork that preserves the required fine textures without any up- or down-sampling. To further improve information transfer between the subnets and visual quality, different modules are embedded between the subnetworks: a detail enhancement block (DEB) learns multi-scale features of the image, and a supervised recovery block (SRB) fuses the preceding information for the final recovery.
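As a concrete illustration of the 4-2-1 blocking described above, the following NumPy sketch splits an input image into non-overlapping blocks. `split_blocks` is a hypothetical helper name, and the choice of top/bottom halves for the 2-block split is an assumption, since the text does not fix the orientation.

```python
import numpy as np

def split_blocks(img, n):
    """Split an H x W x C image into n non-overlapping blocks (4-2-1 scheme).

    Hypothetical helper: n=4 gives the four quadrants for the first subnet,
    n=2 gives two halves for the second subnet (orientation assumed), and
    n=1 returns the original image for the third subnet.
    """
    h, w = img.shape[:2]
    if n == 4:
        return [img[:h // 2, :w // 2], img[:h // 2, w // 2:],
                img[h // 2:, :w // 2], img[h // 2:, w // 2:]]
    if n == 2:
        return [img[:h // 2], img[h // 2:]]
    return [img]

img = np.zeros((256, 256, 3), dtype=np.float32)
patches = split_blocks(img, 4)
print(len(patches), patches[0].shape)  # 4 (128, 128, 3)
```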
Furthermore, the gated encoder-decoder subnetwork first applies a channel attention module to account for the different weights carried by different channel features, and then uses dilated convolutional layers in place of transposed convolutional layers to raise the spatial resolution of the features in the decoder, further enlarging the receptive field and avoiding detail loss.
Further, the original-resolution subnetwork preserves details from the input image to the output image without any down-sampling. Considering the influence of color and of the water body on underwater images, it uses a channel attention block and a pixel attention block to capture pixel and channel information and produce a better enhancement. The subnetwork is composed of several original-resolution blocks, each containing a channel attention block and a pixel attention block.
Further, the detail enhancement block embeds detail features at different scales through a multi-level pyramid structure to obtain the final result. It comprises two 3×3 front-end convolutional layers and several 1×1 convolutional layers. First, the output of the first subnet passes through the front-end convolutional layers, and the result is down-sampled to 1/8, 1/16, and 1/32 of its size to build a three-scale detail pyramid. Second, 1×1 convolutional layers reduce the dimensionality, and the feature maps are up-sampled back to the original size. Finally, the outputs are concatenated and a 3×3 convolutional layer produces the final output. Details of the underwater image in the first subnetwork are thus reconstructed by fusing features at different scales, and the detail-rich feature map is passed on to the next subnetwork. The detail enhancement block is expressed as:
r_0 = σ(C_{3-1}(C_{3-2}(I_{net1-out}))),
r_1 = D_8(r_0), r_2 = D_16(r_0), r_3 = D_32(r_0),
r_11 = σ(C_{1-1}(r_1)), r_22 = σ(C_{1-2}(r_2)), r_33 = σ(C_{1-3}(r_3)),    (1)
r_4 = U_8(r_11), r_5 = U_16(r_22), r_6 = U_32(r_33),
D_out = C_{3-3}(Cat(r_4, r_5, r_6)),
where C_{i-j} denotes a convolutional layer with kernel size i and index j, σ is the ReLU activation function, D_p and U_p denote the pooling and up-sampling operations, and p is the scale factor.
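The three-scale pyramid of Eq. (1) can be sketched at the shape level as follows. Mean pooling for D_p, nearest-neighbour upsampling for U_p, and standing in for the learned 1×1/3×3 convolutions with a plain ReLU are simplifying assumptions made here, so only the multi-scale routing and concatenation are shown.

```python
import numpy as np

def avg_pool(x, p):
    """Downsample an H x W x C map by factor p with mean pooling (the D_p role)."""
    h, w, c = x.shape
    return x[:h - h % p, :w - w % p].reshape(h // p, p, w // p, p, c).mean(axis=(1, 3))

def upsample(x, p):
    """Nearest-neighbour upsampling by factor p (the U_p role)."""
    return x.repeat(p, axis=0).repeat(p, axis=1)

def detail_pyramid(r0):
    """Shape-level sketch of the detail enhancement block in Eq. (1)."""
    relu = lambda x: np.maximum(x, 0)         # sigma; the C_1-j convs are omitted
    branches = []
    for p in (8, 16, 32):                     # three-scale detail pyramid
        r = relu(avg_pool(r0, p))             # downsample, then activate
        branches.append(upsample(r, p))       # back to the original size
    return np.concatenate(branches, axis=-1)  # Cat(r_4, r_5, r_6)

r0 = np.random.rand(64, 64, 8).astype(np.float32)
out = detail_pyramid(r0)
print(out.shape)  # (64, 64, 24)
```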
Further, the supervised restoration block takes the output of the second subnet as a supervision signal and, aided by the supervised prediction, generates an attention map that suppresses less informative features so that only useful features are passed on for training. The process is as follows: first, the output of the second subnetwork is processed by a 1×1 convolutional layer to generate a corresponding residual image y_0; at the same time, the input image of the third subnetwork is processed in the same way to generate y_1. Then y_1 is added to y_0 to generate y_2, which passes through a 1×1 convolutional layer and a sigmoid activation function to generate an attention map. Next, the attention map is multiplied with y_0 to obtain y_3, which contains more of the useful information of the enhanced image. Third, through a skip connection, y_3 is combined with the supervision signal to generate y_4. Finally, y_4 and y_1 are combined into the final feature map, which is fed into the original-resolution subnetwork. Specifically:
y_0 = C_{1-4}(Out_{Stage2}), y_1 = C_{1-5}(In_{Stage3}),
y_2 = y_0 + y_1,
y_3 = ω(C_{1-6}(y_2)) · y_0,    (2)
y_4 = y_3 + Out_{Sub-Network-2},
S_out = Cat(y_4, y_1),
where ω is the sigmoid activation function and C_{i-j} denotes a convolutional layer with kernel size i and index j.
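The attention-gated fusion of Eq. (2) can be illustrated with the sigmoid gate alone. Replacing the 1×1 convolutions C_{1-4} to C_{1-6} with identity maps is an assumption made purely for brevity; what remains is the gating, the skip connection, and the final concatenation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def srb(out_stage2, in_stage3):
    """Sketch of Eq. (2) with the 1x1 convolutions replaced by identity maps."""
    y0 = out_stage2                  # residual from the second subnet
    y1 = in_stage3                   # projected input of the third subnet
    y2 = y0 + y1
    y3 = sigmoid(y2) * y0            # attention map suppresses weak features
    y4 = y3 + out_stage2             # skip connection to the supervision signal
    return np.concatenate([y4, y1], axis=-1)  # S_out = Cat(y4, y1)

a = np.random.rand(32, 32, 4)
b = np.random.rand(32, 32, 4)
s = srb(a, b)
print(s.shape)  # (32, 32, 8)
```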
Further, a weighted sum of the smooth L1 loss and the perceptual loss is used as the training loss of the network; the training process is evaluated in real time, and the trained network and its data are saved in real time. The smooth L1 loss function is expressed as:
L_S = (1/N) Σ_{i=1}^{N} f(y'_i − y_i),    (3)
f(x) = 0.5 x² if |x| < 1, and |x| − 0.5 otherwise,    (4)
where y'_i and y_i denote the real image and the enhanced image at pixel i, and N is the total number of pixels. To obtain a more realistic image, a perceptual loss function is also introduced, which measures the feature difference between the output image and the real image.
The perceptual loss function is expressed as:
L_per = Σ_{j=1}^{M} (1 / (C_j H_j W_j)) ‖V_j(Φ(y')) − V_j(Φ(y))‖,    (5)
where V_j(Φ(y')) and V_j(Φ(y)) denote the enhanced feature map and the real feature map at the jth layer of the VGG network; C_j, H_j, and W_j are the dimensions of the feature map of the jth convolutional layer; and M is the number of feature layers used in the perceptual loss function.
The total loss function is a weighted combination of the two losses above:
L_loss = L_S + λ · L_per,    (6)
where λ adjusts the relative weight of the perceptual loss term.
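A minimal NumPy sketch of the smooth L1 term and the weighted total loss described above. Here `l_per` is a stand-in scalar for the VGG-based perceptual term, and the value of `lam` is illustrative rather than taken from this document.

```python
import numpy as np

def smooth_l1(y_true, y_pred):
    """Smooth L1 loss: quadratic below a difference of 1, linear above."""
    d = np.abs(y_true - y_pred)
    return np.mean(np.where(d < 1.0, 0.5 * d ** 2, d - 0.5))

def total_loss(y_true, y_pred, l_per, lam=0.2):
    """Weighted total loss L_loss = L_S + lam * L_per (Eq. (6)); lam is illustrative."""
    return smooth_l1(y_true, y_pred) + lam * l_per

y_ref = np.array([0.0, 2.0])
y_out = np.array([0.5, 0.0])
print(smooth_l1(y_ref, y_out))  # 0.8125 = (0.5*0.5**2 + (2.0-0.5)) / 2
```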
Further, the network training process is evaluated using the full-reference performance indices PSNR and SSIM and the no-reference indices UCIQE and UIQM.
Further, training uses the real-world underwater dataset UIEB, which consists of 890 real underwater degraded images, their corresponding reference images, and 60 underwater degraded images to be enhanced.
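One possible way to build the training and test sets "in proportion" from the UIEB pairs. The 90/10 ratio, the file-name pattern, and the pairing convention are assumptions for illustration; the text only states that the split is proportional.

```python
import random

def split_dataset(pairs, train_ratio=0.9, seed=0):
    """Shuffle (degraded, reference) pairs and split them proportionally."""
    rng = random.Random(seed)                 # fixed seed for reproducibility
    idx = list(range(len(pairs)))
    rng.shuffle(idx)
    cut = int(len(pairs) * train_ratio)
    return [pairs[i] for i in idx[:cut]], [pairs[i] for i in idx[cut:]]

# 890 paired images, as in the UIEB dataset; the names are hypothetical
pairs = [(f"deg_{i}.png", f"ref_{i}.png") for i in range(890)]
train, test = split_dataset(pairs)
print(len(train), len(test))  # 801 89
```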
The invention also provides an underwater image enhancement system based on the deep cascaded residual network, comprising a memory, a processor, and computer program instructions stored on the memory and executable by the processor; when the processor executes the instructions, the steps of the method above are implemented.
Compared with the prior art, the invention has the following beneficial effects: the method and system solve the problem that existing underwater image enhancement algorithms cannot handle multiple underwater distortions simultaneously. A deep cascaded residual network is constructed in which several cascaded subnetworks enhance the degraded image from coarse to fine. The first two subnetworks use attention and a gated fusion strategy to learn multi-scale context information, while the last subnet preserves fine spatial detail. To further generate realistic images, detail enhancement blocks and a supervised restoration block are embedded between the subnets, progressively refining the coarse image residual through detail restoration and attention supervision. Experimental results show that the method corrects the color cast of underwater images, improves contrast and sharpness, and improves the overall visual effect.
Drawings
FIG. 1 is a flow chart of a method implementation of an embodiment of the present invention;
FIG. 2 is a diagram of the overall framework of the deep cascaded residual network in an embodiment of the present invention;
FIG. 3 is a block diagram of the original-resolution sub-network in an embodiment of the present invention;
FIG. 4 is a block diagram of the detail enhancement block in an embodiment of the present invention;
FIG. 5 is a block diagram of the supervised restoration block in an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in FIG. 1, the present embodiment provides an underwater image enhancement method based on a deep cascaded residual network, comprising the following steps:
Step S1: construct a deep cascaded residual network (CURE-Net) and set its parameters; build a training set and a test set in a set proportion, the training set comprising underwater degraded images and corresponding real images.
Step S2: partition the underwater degraded images of the training set into blocks according to a set ratio and feed them into the three cascaded subnets of the network, so that the network performs forward propagation and outputs a clear image.
Step S3: compute the loss of the network output against the corresponding real image, and back-propagate the error according to the loss value to update the network weights.
Step S4: judge whether training of the network is complete; if so, select the best trained model and go to step S5, otherwise return to step S2.
Step S5: feed the test set into the best model for testing, and judge from the test results whether the model meets expectations; if so, go to step S6, otherwise return to step S2.
Step S6: feed the underwater degraded image to be enhanced into the tested network to obtain the enhanced underwater image.
As shown in FIG. 2, the deep cascaded residual network consists of three cascaded subnets that restore the degraded underwater image step by step, from coarse to fine. The input image is partitioned in a 4-2-1 scheme before entering the network: the image is divided into 4 non-overlapping blocks fed into the first subnet, into 2 non-overlapping blocks fed into the second subnet, and the original image is fed into the third subnet. The first two subnets are Gated Encoder-Decoder Sub-Networks used to learn context information, and the third is an Original Resolution Sub-Network that preserves the required fine textures without any up- or down-sampling. To further improve information transfer between the subnetworks and visual quality, different modules are embedded between them: a Detail Enhancement Block (DEB) learns multi-scale features of the image, and a Supervised Recovery Block (SRB) fuses the preceding information for the final recovery.
As shown in FIG. 2, the gated encoder-decoder subnetwork (GESNet) first applies a Channel Attention Block to account for the different weights carried by different channel features, and then uses dilated convolutional layers in place of transposed convolutional layers to raise the spatial resolution of the features in the decoder, further enlarging the receptive field and avoiding detail loss.
As shown in FIG. 3, the Original Resolution Sub-Network preserves details from the input image to the output image without any down-sampling. Considering the influence of color and of the water body on underwater images, it uses a Channel Attention Block and a Pixel Attention Block to capture pixel and channel information and produce a better enhancement. The subnetwork is composed of several original resolution blocks, each containing a channel attention block and a pixel attention block.
As shown in FIG. 4, the detail enhancement block embeds detail features at different scales through a multi-level pyramid structure to obtain the final result. It comprises two 3×3 front-end convolutional layers and several 1×1 convolutional layers. First, the output of the first subnet passes through the front-end convolutional layers, and the result is down-sampled to 1/8, 1/16, and 1/32 of its size to build a three-scale detail pyramid. Second, 1×1 convolutional layers reduce the dimensionality, and the feature maps are up-sampled back to the original size. Finally, the outputs are concatenated and a 3×3 convolutional layer produces the final output. Details of the underwater image in the first subnetwork are thus reconstructed by fusing features at different scales, and the detail-rich feature map is passed on to the next subnetwork. The detail enhancement block helps recover the color of the underwater image and improve its visibility, and is expressed as:
r_0 = σ(C_{3-1}(C_{3-2}(I_{net1-out}))),
r_1 = D_8(r_0), r_2 = D_16(r_0), r_3 = D_32(r_0),
r_11 = σ(C_{1-1}(r_1)), r_22 = σ(C_{1-2}(r_2)), r_33 = σ(C_{1-3}(r_3)),    (1)
r_4 = U_8(r_11), r_5 = U_16(r_22), r_6 = U_32(r_33),
D_out = C_{3-3}(Cat(r_4, r_5, r_6)),
where C_{i-j} denotes a convolutional layer with kernel size i and index j, σ is the ReLU activation function, D_p and U_p denote the pooling and up-sampling operations, and p is the scale factor.
As shown in FIG. 5, the supervised restoration block takes the output of the second subnet as a supervision signal and, aided by the supervised prediction, generates an attention map that suppresses less informative features so that only useful features are passed on for training. The process is as follows: first, the output of the second subnetwork is processed by a 1×1 convolutional layer to generate a corresponding residual image y_0; at the same time, the input image of the third subnetwork is processed in the same way to generate y_1. Then y_1 is added to y_0 to generate y_2, which passes through a 1×1 convolutional layer and a sigmoid activation function to generate an attention map. Next, the attention map is multiplied with y_0 to obtain y_3, which contains more of the useful information of the enhanced image. Third, through a skip connection, y_3 is combined with the supervision signal to generate y_4. Finally, y_4 and y_1 are combined into the final feature map, which is fed into the original-resolution subnetwork. Specifically:
y_0 = C_{1-4}(Out_{Stage2}), y_1 = C_{1-5}(In_{Stage3}),
y_2 = y_0 + y_1,
y_3 = ω(C_{1-6}(y_2)) · y_0,    (2)
y_4 = y_3 + Out_{Sub-Network-2},
S_out = Cat(y_4, y_1),
where ω is the sigmoid activation function and C_{i-j} denotes a convolutional layer with kernel size i and index j.
In this embodiment, a weighted sum of the smooth L1 loss and the perceptual loss is used as the training loss of the network; the training process is evaluated in real time, and the trained network and its data are saved in real time. The smooth L1 loss function is expressed as:
L_S = (1/N) Σ_{i=1}^{N} f(y'_i − y_i),    (3)
f(x) = 0.5 x² if |x| < 1, and |x| − 0.5 otherwise,    (4)
where y'_i and y_i denote the real image and the enhanced image at pixel i, and N is the total number of pixels. To obtain a more realistic image, a perceptual loss function is also introduced, which measures the feature difference between the output image and the real image.
The perceptual loss function is expressed as:
L_per = Σ_{j=1}^{M} (1 / (C_j H_j W_j)) ‖V_j(Φ(y')) − V_j(Φ(y))‖,    (5)
where V_j(Φ(y')) and V_j(Φ(y)) denote the enhanced feature map and the real feature map at the jth layer of the VGG network; C_j, H_j, and W_j are the dimensions of the feature map of the jth convolutional layer; and M is the number of feature layers used in the perceptual loss function.
The total loss function is a weighted combination of the two losses above:
L_loss = L_S + λ · L_per,    (6)
where λ adjusts the relative weight of the perceptual loss term.
In this embodiment, the network training process is evaluated using the full-reference performance indices PSNR and SSIM and the no-reference indices UCIQE and UIQM.
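Of the four indices, PSNR is simple enough to state directly; the following is the standard full-reference formula (SSIM, UCIQE, and UIQM involve more machinery and are typically computed with standard toolboxes).

```python
import numpy as np

def psnr(ref, img, peak=1.0):
    """Peak signal-to-noise ratio between a reference and an enhanced image."""
    mse = np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")                  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)  # in dB

a = np.zeros((4, 4))
b = np.full((4, 4), 0.1)
print(round(psnr(a, b), 1))  # 20.0
```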
In this embodiment, training uses the real-world underwater dataset UIEB, which consists of 890 real underwater degraded images, their corresponding reference images, and 60 underwater degraded images to be enhanced.
The present embodiment also provides an underwater image enhancement system based on the deep cascaded residual network, comprising a memory, a processor, and computer program instructions stored on the memory and executable by the processor; when the processor executes the instructions, the steps of the method above are implemented.
Experiments demonstrate that the proposed method outperforms current state-of-the-art methods. The comparison algorithms include IBLA, RGHS, ULAP, UWCNN, WaterNet, LCNet and Ucolor. The experiments were performed on the UIEB dataset, with the following results:
[Comparison results table rendered as an image in the original publication.]
In addition, the present embodiment performs ablation experiments on the modules to demonstrate the effectiveness of the modules proposed by the invention; the specific data are shown in the following table:
[Ablation results table rendered as an image in the original publication.]
The UIEB test results in the table show the performance gain from the different modules. Clearly, when the detail enhancement module (Detail Enhancement Block) and the supervised restoration module (Supervised Recovery Block) are applied together between two subnets, PSNR reaches its maximum of 26.55 dB; when either the supervised restoration module or the detail enhancement module is omitted, network performance degrades to different degrees; without any module connections between subnets, performance drops significantly, with PSNR reaching only 25.08 dB. Reasonable use of the detail enhancement module and the supervised restoration module thus contributes markedly to the network.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.
The foregoing is directed to preferred embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. However, any simple modification, equivalent change and modification of the above embodiments according to the technical essence of the present invention are within the protection scope of the technical solution of the present invention.

Claims (10)

1. An underwater image enhancement method based on a depth cascade residual error network is characterized by comprising the following steps:
step S1: constructing a deep cascade residual error network and setting parameters of the deep cascade residual error network; constructing a training set and a testing set in proportion, wherein the training set comprises an underwater degraded image and a corresponding real image;
step S2: partitioning the underwater degraded images in the training set according to a set proportion, and then respectively inputting the partitioned underwater degraded images into three cascade subnets of the deep cascade residual error network, so that the deep cascade residual error network performs forward propagation to obtain a clear image output by the trained network;
step S3: calculating a loss value of an output image of the deep cascade residual error network compared with a corresponding real image, and performing error back propagation according to the loss value to update a weight value of the deep cascade residual error network;
step S4: judging whether the training of the deep cascade residual error network is complete; if so, selecting the best model obtained during training and executing step S5; otherwise, returning to step S2;
step S5: inputting the test set into the best model of the deep cascade residual error network for testing, and judging from the test results whether the best model meets the expected requirements; if so, executing step S6; otherwise, returning to step S2;
step S6: and inputting the underwater degraded image to be enhanced into the tested depth cascade residual error network to obtain the enhanced underwater image.
2. The underwater image enhancement method based on the depth cascade residual error network, wherein the depth cascade residual error network consists of three cascaded subnets that restore the degraded underwater image progressively from coarse to fine; the input image is divided into blocks in a 4-2-1 proportion and fed into the depth cascade residual error network, that is, the image is divided into 4 non-overlapping blocks input into the first subnet, divided into 2 non-overlapping blocks input into the second subnet, and the original image is input into the third subnet; the first two subnets adopt a gated encoder-decoder subnetwork for learning context information, and the third subnet adopts an original-resolution subnetwork that retains the required fine textures without using any up- or down-sampling operation; to further improve information transfer between the subnets and visual quality, the depth cascade residual error network embeds different modules between different subnets: a detail enhancement module DEB is embedded to learn multi-scale features of the image, and a supervised restoration module SRB is embedded to fuse the preceding information for the final restoration.
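A minimal sketch of the 4-2-1 block partition described in this claim. The split axes (quadrants for 4 blocks, top/bottom halves for 2) are assumptions, since the claim does not specify how the non-overlapping blocks are cut.

```python
import numpy as np

def split_blocks(img, n):
    """Split an H x W x C image into n non-overlapping blocks, n in {1, 2, 4}."""
    h, w = img.shape[:2]
    if n == 1:
        return [img]                                   # original image, third subnet
    if n == 2:
        return [img[: h // 2], img[h // 2:]]           # assumed top/bottom halves
    if n == 4:
        return [img[: h // 2, : w // 2], img[: h // 2, w // 2:],
                img[h // 2:, : w // 2], img[h // 2:, w // 2:]]  # quadrants
    raise ValueError("n must be 1, 2 or 4")

def cascade_inputs(img):
    """Coarse-to-fine inputs for the three cascaded subnets (4-2-1)."""
    return split_blocks(img, 4), split_blocks(img, 2), split_blocks(img, 1)
```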
3. The underwater image enhancement method based on the depth cascade residual error network as claimed in claim 2, wherein the gated encoder-decoder subnetwork first adopts a channel attention module to account for the different weights of information carried by different channel features, and second replaces transposed convolutional layers with dilated convolutional layers to improve the spatial resolution of the features in the decoder, further enlarging the receptive field and avoiding detail loss.
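The dilated-convolution idea in this claim might be sketched as below (PyTorch): a dilated 3×3 convolution enlarges the receptive field while keeping the feature resolution, in place of a transposed convolution. The dilation rate of 2 and the single-layer layout are assumptions, not disclosed in the claim.

```python
import torch
import torch.nn as nn

class DilatedDecoderBlock(nn.Module):
    """One decoder block using a dilated 3x3 conv instead of a transposed conv."""
    def __init__(self, ch, dilation=2):
        super().__init__()
        # padding = dilation keeps the spatial size unchanged for a 3x3 kernel
        self.conv = nn.Conv2d(ch, ch, 3, padding=dilation, dilation=dilation)

    def forward(self, x):
        return torch.relu(self.conv(x))
```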
4. The underwater image enhancement method based on the depth cascade residual error network as claimed in claim 2, wherein the original-resolution subnetwork retains details from the input image to the output image without using any down-sampling operation; considering the influence of color and the water body on underwater images, the original-resolution subnetwork adopts channel attention blocks and pixel attention blocks to exploit pixel and channel information and generate a better enhancement; the original-resolution subnetwork is composed of a plurality of original-resolution blocks, each containing a channel attention block and a pixel attention block.
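The channel and pixel attention blocks of the original-resolution subnetwork might be sketched as follows (PyTorch). The squeeze-and-excitation layout, the reduction factor of 4, and the residual connection are common choices assumed here; the claim does not specify them.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Re-weight each channel by a learned global statistic."""
    def __init__(self, ch, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.fc(x)

class PixelAttention(nn.Module):
    """Re-weight each spatial location with a single attention map."""
    def __init__(self, ch, reduction=4):
        super().__init__()
        self.pa = nn.Sequential(
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, 1, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.pa(x)

class OriginalResolutionBlock(nn.Module):
    """Conv + channel attention + pixel attention, no up/down-sampling."""
    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)
        self.ca = ChannelAttention(ch)
        self.pa = PixelAttention(ch)

    def forward(self, x):
        return x + self.pa(self.ca(torch.relu(self.conv(x))))
```

Stacking several such blocks gives the original-resolution subnetwork, which preserves spatial detail end to end.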
5. The underwater image enhancement method based on the depth cascade residual error network as claimed in claim 2, wherein the detail enhancement module embeds detail features of different scales based on a multilayer pyramid structure to obtain the final result; the detail enhancement module comprises two front-end 3×3 convolutional layers and several 1×1 convolutional layers; first, the output of the first subnet passes through the front-end convolutional layers, whose output is down-sampled by factors of 1/8, 1/16 and 1/32 to build a three-scale detail pyramid; second, the 1×1 convolutional layers are used for dimensionality reduction, and the feature maps are up-sampled to the original size; finally, the outputs are concatenated, and a final output is generated through a 3×3 convolutional layer; the details of the underwater image are reconstructed in the first subnet by fusing features of different scales, and the resulting rich detail feature map is passed to the next subnet; the detail enhancement module is specifically represented as follows:
[Equation rendered as an image in the original publication.]
wherein C_{i-j} denotes a convolutional layer, i denotes the size of the convolution kernel, j denotes the jth convolutional layer, σ is the ReLU activation function, D_p and U_p denote the pooling and up-sampling operations respectively, and p denotes the scale size.
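The detail pyramid of this claim can be sketched as follows (PyTorch): front-end features are pooled to 1/8, 1/16 and 1/32 scale (D_p), reduced with 1×1 convolutions, up-sampled back (U_p), concatenated, and fused by a 3×3 convolution. Channel counts, average pooling, and bilinear up-sampling are assumptions; the claim only names the operations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DetailEnhancementBlock(nn.Module):
    """Three-scale detail pyramid with 1x1 reduction and 3x3 fusion."""
    def __init__(self, ch):
        super().__init__()
        # two front-end 3x3 convolutional layers
        self.front = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        # one 1x1 reduction layer per pyramid scale (assumed ch//4 output)
        self.reduce = nn.ModuleList([nn.Conv2d(ch, ch // 4, 1) for _ in range(3)])
        self.fuse = nn.Conv2d(ch + 3 * (ch // 4), ch, 3, padding=1)

    def forward(self, x):
        f = self.front(x)
        h, w = f.shape[2:]
        outs = [f]
        for scale, red in zip((8, 16, 32), self.reduce):
            d = F.avg_pool2d(f, kernel_size=scale)        # D_p: 1/scale pooling
            d = red(d)                                    # 1x1 dimensionality reduction
            outs.append(F.interpolate(d, size=(h, w),     # U_p: back to full size
                                      mode="bilinear", align_corners=False))
        return self.fuse(torch.cat(outs, dim=1))          # final 3x3 fusion
```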
6. The underwater image enhancement method based on the depth cascade residual error network as claimed in claim 2, wherein the supervised restoration module uses the output of the second subnet as a supervision signal and, with the help of the supervised prediction, generates an attention map to suppress less informative features and allow only useful features to be trained; the process is represented as follows: first, the output of the second subnet is processed by a 1×1 convolutional layer to generate a corresponding residual image y_0; at the same time, the input image of the third subnet is processed in the same way to generate y_1; then y_1 is added to y_0 to generate y_2, which passes through a 1×1 convolutional layer and a sigmoid activation function to generate an attention map; next, the generated attention map is multiplied by y_0 to obtain y_3, which contains more useful information of the enhanced image; third, using a skip connection, y_3 is combined with the supervision signal to generate y_4; finally, y_4 and y_1 are combined to obtain the final feature map, which is input into the original-resolution subnetwork; specifically:
[Equation rendered as an image in the original publication.]
where ω is the sigmoid activation function, C_{i-j} denotes a convolutional layer, i denotes the size of the convolution kernel, and j denotes the jth convolutional layer.
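The y_0 … y_4 steps of this claim can be sketched as follows (PyTorch). The channel widths and the interpretation of the two "combine" steps as additions are assumptions; the claim names the operations but not the layer sizes.

```python
import torch
import torch.nn as nn

class SupervisedRecoveryBlock(nn.Module):
    """Attention-gated fusion of the second subnet's output (supervision signal)
    with the third subnet's input, following the y_0..y_4 steps."""
    def __init__(self, in_ch, ch):
        super().__init__()
        self.conv_sup = nn.Conv2d(in_ch, ch, 1)   # 1x1 conv on supervision signal
        self.conv_in = nn.Conv2d(in_ch, ch, 1)    # 1x1 conv on third-subnet input
        self.conv_att = nn.Conv2d(ch, ch, 1)      # 1x1 conv before sigmoid

    def forward(self, sup, x):
        y0 = self.conv_sup(sup)                   # residual image from subnet 2
        y1 = self.conv_in(x)                      # processed third-subnet input
        y2 = y0 + y1
        att = torch.sigmoid(self.conv_att(y2))    # attention map
        y3 = att * y0                             # keep only informative features
        y4 = y3 + y0                              # skip connection with supervision
        return y4 + y1                            # final feature map for subnet 3
```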
7. The underwater image enhancement method based on the deep cascade residual error network, wherein a weighted sum of the smooth L1 loss and the perceptual loss is used as the training loss of the network, the training process is evaluated in real time, and the network and the data obtained by training are saved in real time; the smooth L1 loss function is expressed as:
L_S = (1/N) · Σ_{i=1}^{N} smooth_L1(y′_i − y_i)
smooth_L1(x) = 0.5 · x², if |x| < 1; |x| − 0.5, otherwise
where y′_i and y_i denote the enhanced image and the real image at pixel i, and N is the total number of pixels; to obtain a more realistic image, a perceptual loss function is introduced, which measures the feature difference between the output image and the real image;
the perceptual loss function is expressed as:
L_per = (1/M) · Σ_{j=1}^{M} (1 / (C_j · H_j · W_j)) · ||V_j(Φ(y′)) − V_j(Φ(y))||²
where V_j(Φ(y′)) and V_j(Φ(y)) respectively denote the enhanced feature map and the real feature map of the jth layer of the VGG network; C_j, H_j and W_j denote the dimensions of the feature map of the jth convolutional layer in the VGG network; and M is the number of feature layers used in the perceptual loss function;
the total loss function is the weighted sum of the two functions above:
L_loss = L_S + λ · L_per (6)
where λ adjusts the relative weight of the perceptual loss term.
8. The underwater image enhancement method based on the deep cascade residual error network as claimed in claim 7, wherein the network training process is evaluated using the full-reference performance metrics PSNR and SSIM and the no-reference performance metrics UCIQE and UIQM.
9. The underwater image enhancement method based on the depth cascade residual error network, wherein training adopts the real-world underwater dataset UIEB; the UIEB dataset consists of 890 real underwater degraded images with corresponding reference images, plus 60 underwater degraded images to be enhanced.
10. An underwater image enhancement system based on a depth cascade residual network, comprising a memory, a processor and computer program instructions stored on the memory and executable by the processor, which when executed by the processor, are capable of implementing the method steps of any of claims 1 to 9.
CN202210680325.3A 2022-06-16 2022-06-16 Underwater image enhancement method and system based on depth cascade residual error network Pending CN114936983A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210680325.3A CN114936983A (en) 2022-06-16 2022-06-16 Underwater image enhancement method and system based on depth cascade residual error network

Publications (1)

Publication Number Publication Date
CN114936983A true CN114936983A (en) 2022-08-23

Family

ID=82869219

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210680325.3A Pending CN114936983A (en) 2022-06-16 2022-06-16 Underwater image enhancement method and system based on depth cascade residual error network

Country Status (1)

Country Link
CN (1) CN114936983A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110223234A (en) * 2019-06-12 2019-09-10 杨勇 Depth residual error network image super resolution ratio reconstruction method based on cascade shrinkage expansion
CN112070668A (en) * 2020-08-18 2020-12-11 西安理工大学 Image super-resolution method based on deep learning and edge enhancement
CN112348036A (en) * 2020-11-26 2021-02-09 北京工业大学 Self-adaptive target detection method based on lightweight residual learning and deconvolution cascade
CN112419219A (en) * 2020-11-25 2021-02-26 广州虎牙科技有限公司 Image enhancement model training method, image enhancement method and related device
US20210390339A1 (en) * 2020-06-15 2021-12-16 Dalian University Of Technology Depth estimation and color correction method for monocular underwater images based on deep neural network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"CURE-Net: A Cascaded Deep Network for Underwater Image Enhancement", IEEE Journal of Oceanic Engineering, vol. 49, no. 1, 31 January 2024, pages 226-236 *
吴从中; 陈曦; 季栋; 詹曙: "Image denoising combining deep residual learning and perceptual loss", Journal of Image and Graphics (中国图象图形学报), no. 10, 16 October 2018, pages 55-63 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination