CN117218033B - Underwater image restoration method, device, equipment and medium - Google Patents

Underwater image restoration method, device, equipment and medium

Info

Publication number
CN117218033B
Authority
CN
China
Prior art keywords
image
unit
background light
feature extraction
output end
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311256209.XA
Other languages
Chinese (zh)
Other versions
CN117218033A (en)
Inventor
郑建华
赵若琳
刘双印
曹亮
朱蓉
冯大春
罗智杰
李锦慧
张子豪
傅雨莎
陆俊德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongkai University of Agriculture and Engineering
Original Assignee
Zhongkai University of Agriculture and Engineering
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongkai University of Agriculture and Engineering filed Critical Zhongkai University of Agriculture and Engineering
Priority to CN202311256209.XA
Publication of CN117218033A
Application granted
Publication of CN117218033B
Active legal status
Anticipated expiration legal status

Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A — TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/30Assessment of water resources

Landscapes

  • Image Processing (AREA)

Abstract

The application is suitable for the technical field of image restoration, and provides an underwater image restoration method, device, equipment and medium. The underwater image restoration method comprises the following steps: extracting a red channel image, a green channel image and a blue channel image of an RGB image to be restored; respectively carrying out background light estimation on the red channel image, the green channel image and the blue channel image to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value; fusing the RGB image to be restored, the red background light estimated value, the green background light estimated value and the blue background light estimated value to obtain a background light image; performing transmission map estimation on the RGB image to be restored and the background light image to obtain a transmission map of the RGB image to be restored; generating an initial restored image of the RGB image to be restored based on the background light image and the transmission map; and adjusting the brightness of pixels meeting a preset condition in the initial restored image to obtain a final restored image. The underwater image restoration method can improve the underwater image restoration effect.

Description

Underwater image restoration method, device, equipment and medium
Technical Field
The present disclosure relates to the field of image restoration technologies, and in particular, to a method, an apparatus, a device, and a medium for restoring an underwater image.
Background
Underwater image restoration has wide application in underwater scenes. High-quality underwater images are required in fields such as marine ecological environment monitoring, underwater archaeology, underwater robot detection, and underwater oil and gas pipeline maintenance. In an aquaculture environment, feed residues and fish and shrimp feces produce a large number of suspended particles of different sizes in the water, which makes the scattering effect more pronounced, increases the blur of underwater images and reduces their contrast; in addition, the water turbidity and medium disturbance caused by the swimming of fish and shrimp lead to even more severe scattering, making the imaging environment harsher. Therefore, restoring images with good visual quality in aquaculture scenes suffering from severe and complex degradation is of great significance for the subsequent monitoring of basic aquatic product information, such as growth monitoring and appearance and weight data, and for the automated and intelligent development of the aquaculture industry. The underwater image restoration method is thus one of the most basic and key preprocessing methods for realizing mechanization and intelligence in the aquaculture industry, and the quality of the restoration result directly determines the accuracy of subsequent information estimation and intelligent decision-making.
Current underwater image restoration methods fall mainly into two types. The first is underwater image restoration based on traditional image processing, which models and analyzes the underwater image using underwater optical principles and the imaging process, derives an imaging model suitable for the underwater environment, and solves the imaging model using inverse problem theory. The second is underwater image restoration based on deep learning, which combines deep learning techniques with a physical model to describe the underwater imaging process, making full use of physical principles so that the network can learn a more accurate image restoration mapping; different physical model parameters are input to adapt to different underwater imaging scenes, thereby realizing the restoration of the underwater image. Existing underwater image restoration methods are aimed at a single type of underwater scene, and the generalization performance of the algorithms is insufficient. In addition, the underwater scene is relatively complex due to water depth and the presence of marine organisms, light attenuates quickly, and color cast is severe. A plain deep learning network does not include dedicated modules to handle these factors, its background light and transmission map estimation is still inaccurate, and the underwater image restoration effect is therefore unsatisfactory.
Disclosure of Invention
The application provides an underwater image restoration method, device, equipment and medium, which can solve the problem that the underwater image restoration effect is not ideal.
In a first aspect, an embodiment of the present application provides an underwater image restoration method, including:
extracting a red channel image, a green channel image and a blue channel image of an RGB image to be restored;
respectively carrying out background light estimation on the red channel image, the green channel image and the blue channel image to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value;
fusing the RGB image to be restored, the red background light estimated value, the green background light estimated value and the blue background light estimated value to obtain a background light image;
performing transmission map estimation on the RGB image to be restored and the background light image to obtain a transmission map of the RGB image to be restored;
generating an initial restoration image of the RGB image to be restored based on the background light image and the transmission image;
adjusting the brightness of pixels meeting a preset condition in the initial restored image to obtain a final restored image; the preset condition is that the brightness of the pixel is larger than a preset brightness upper limit value or smaller than a preset brightness lower limit value.
Optionally, performing background light estimation on the red channel image, the green channel image and the blue channel image to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value, including:
respectively carrying out background light estimation on the red channel image, the green channel image and the blue channel image by using a background light estimation network to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value;
the background light estimation network comprises three background light estimation units;
the three background light estimation units are in one-to-one correspondence with the red channel image, the green channel image and the blue channel image;
the background light estimation unit includes: a first shallow feature extraction module, a second shallow feature extraction module, a channel deep feature extraction module, a feature fusion module, a pixel weight determination module and a global average pooling module which are sequentially connected; the output end of the second shallow feature extraction module is connected with the input end of the channel deep feature extraction module and the input end of the feature fusion module, the output end of the channel deep feature extraction module is connected with the input end of the feature fusion module, the output end of the feature fusion module is connected with the input end of the pixel weight determination module, and the output end of the pixel weight determination module is connected with the input end of the global average pooling module;
The input end of a first shallow feature extraction module of the background light estimation unit corresponding to the red channel image receives the red channel image, and a global average pooling module of the background light estimation unit corresponding to the red channel image outputs a red background light estimation value of the red channel image;
the input end of a first shallow feature extraction module of the background light estimation unit corresponding to the green channel image receives the green channel image, and a global average pooling module of the background light estimation unit corresponding to the green channel image outputs a green background light estimation value of the green channel image;
the input end of the first shallow feature extraction module of the background light estimation unit corresponding to the blue channel image receives the blue channel image, and the global average pooling module of the background light estimation unit corresponding to the blue channel image outputs a blue background light estimation value of the blue channel image.
Optionally, the first shallow feature extraction module and the second shallow feature extraction module each include a feature extraction module;
the feature extraction module comprises a feature extraction convolution layer, a normalization layer and a PReLU activation function which are sequentially connected; the input end of the feature extraction convolution layer is the input end of the feature extraction module, and the output end of the PReLU activation function is the output end of the feature extraction module;
The feature fusion module comprises: a multiplication sub-module, an addition sub-module and a fusion sub-module; the output end of the second shallow feature extraction module is connected with the input ends of the multiplication sub-module and the addition sub-module, the output end of the channel deep feature extraction module is connected with the input end of the multiplication sub-module, the output end of the multiplication sub-module is connected with the input end of the addition sub-module, and the output end of the addition sub-module is connected with the input end of the fusion sub-module;
the fusion sub-module comprises: a fusion convolution layer, a normalization layer and a PReLU activation function which are sequentially connected; the input end of the fusion convolution layer is the input end of the fusion sub-module, and the output end of the PReLU activation function is the output end of the fusion sub-module;
the pixel weight determination module comprises a weight determination convolution layer, a normalization layer and a Sigmoid activation function which are connected in sequence; the input end of the weight determination convolution layer is the input end of the pixel weight determination module, and the output end of the Sigmoid activation function is the output end of the pixel weight determination module.
Optionally, performing transmission map estimation on the RGB image to be restored and the background light image to obtain a transmission map of the RGB image to be restored, including:
splicing the RGB image to be restored with the background light image to obtain a background light restoration image;
Performing transmission map estimation on the background light restoration image by using a transmission map estimation network to obtain a transmission map of an RGB image to be restored;
the transmission map estimation network includes: a first layer feature extraction unit, a first addition unit, a first layer feature extraction fusion unit, a transmission map estimation unit, a first downsampling unit, a second layer feature extraction unit, a second addition unit, a second layer feature extraction fusion unit, a first upsampling unit, a second downsampling unit, a third layer feature extraction fusion unit and a second upsampling unit;
the background light restoration image is input into the input ends of the first layer feature extraction unit and the first downsampling unit respectively; the output end of the transmission map estimation unit is the output end of the transmission map estimation network and outputs the transmission map of the RGB image to be restored;
the output end of the first layer feature extraction unit is connected with the input end of the first addition unit, the output end of the first downsampling unit is connected with the input ends of the second downsampling unit and the second layer feature extraction unit, the output end of the second downsampling unit is connected with the input end of the third layer feature extraction fusion unit, the output end of the third layer feature extraction fusion unit is connected with the input end of the second upsampling unit, the output end of the second upsampling unit is connected with the input end of the second addition unit, the output end of the second addition unit is connected with the input end of the second layer feature extraction fusion unit, the output end of the second layer feature extraction fusion unit is connected with the input end of the first upsampling unit, the output end of the first upsampling unit is connected with the input end of the first addition unit, the output end of the first addition unit is connected with the input end of the first layer feature extraction fusion unit, and the output end of the first layer feature extraction fusion unit is connected with the input end of the transmission map estimation unit;
The feature extraction unit includes a feature extraction module.
Optionally, the feature extraction fusion unit includes:
a first shallow feature extraction module, a second shallow feature extraction module, a deep feature extraction module, a multiplication module and an addition module which are connected in sequence; the output end of the second shallow feature extraction module is connected with the input ends of the multiplication module and the addition module, the output end of the deep feature extraction module is connected with the input end of the multiplication module, the output end of the multiplication module is connected with the input end of the addition module, and the output end of the addition module is connected with the input end of the transmission map estimation unit;
the input end of the first shallow feature extraction module is the input end of the feature extraction fusion unit, and the output end of the addition module is the output end of the feature extraction fusion unit;
the transmission map estimation unit comprises a transmission map estimation convolution layer, a normalization layer and a Sigmoid activation function which are connected in sequence;
the input end of the transmission map estimation convolution layer is the input end of the transmission map estimation unit, and the output end of the Sigmoid activation function is the output end of the transmission map estimation unit.
Optionally, generating an initial restoration image of the RGB image to be restored based on the background light image and the transmission map includes:
By the formula:

J_c(x) = (I_c(x) − B_c) / t_c(x) + B_c

an initial restored image J_c of the RGB image to be restored is acquired;

wherein t_c represents the transmission map, I_c represents the RGB image to be restored, x represents the position of a pixel, B_c represents the background light image, B_c = {B_r, B_g, B_b}, B_r represents the red background light estimated value, B_g represents the green background light estimated value, and B_b represents the blue background light estimated value.
Optionally, adjusting the brightness of the pixels in the initial restored image that meet the preset condition to obtain a final restored image includes:
extracting features of the initial restoration image by utilizing a brightness adjustment network, determining pixels meeting preset conditions in the initial restoration image, and adjusting brightness of the pixels meeting the preset conditions in the initial restoration image to obtain a final restoration image;
the brightness adjustment network includes: an HSV conversion unit, a feature extraction unit, a channel attention unit, a channel multiplication unit, a space attention unit, a space multiplication unit, a feature addition unit, an image fusion unit, an HSV inverse conversion unit and an image addition unit;
the input ends of the HSV conversion unit and the image addition unit are the input ends of the brightness adjustment network, the initial restoration image is received, the output end of the image addition unit is the output end of the brightness adjustment network, and the final restoration image is output;
The output end of the HSV conversion unit is connected with the input end of the feature extraction unit, the output end of the feature extraction unit is connected with the input ends of the channel attention unit, the space attention unit, the channel multiplication unit and the space multiplication unit, the output end of the channel attention unit is connected with the input end of the channel multiplication unit, the output end of the space attention unit is connected with the input end of the space multiplication unit, the output end of the channel multiplication unit and the output end of the space multiplication unit are both connected with the input end of the feature addition unit, the output end of the feature addition unit is connected with the input end of the image fusion unit, the output end of the image fusion unit is connected with the input end of the HSV inverse conversion unit, and the output end of the HSV inverse conversion unit is connected with the input end of the image addition unit;
the image fusion unit comprises a fusion convolution layer, a normalization layer and a Sigmoid activation function which are connected in sequence;
the input end of the fusion convolution layer is the input end of the image fusion unit, and the output end of the Sigmoid activation function is the output end of the image fusion unit.
Optionally, adjusting the brightness of the pixels in the initial restored image that satisfy the preset condition includes:
if the brightness of the pixel is larger than the preset brightness upper limit value, the brightness of the pixel is adjusted to be the preset brightness upper limit value;
And if the brightness of the pixel is smaller than the preset brightness lower limit value, adjusting the brightness of the pixel to the preset brightness lower limit value.
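For illustration only, the brightness clipping rule above may be expressed as a short sketch on the V (brightness) channel; the threshold values used here are hypothetical placeholders rather than values specified by the application.

```python
import numpy as np

def clamp_brightness(v, v_low=0.05, v_high=0.95):
    """v: brightness (V) channel in [0, 1]; thresholds are hypothetical presets.
    Pixels above the upper limit are set to the upper limit, pixels below the
    lower limit are set to the lower limit, all other pixels are unchanged."""
    return np.clip(v, v_low, v_high)
```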
In a second aspect, an embodiment of the present application provides an underwater image restoration device, including:
the extraction module is used for extracting a red channel image, a green channel image and a blue channel image of the RGB image to be restored;
the background light estimating module is used for respectively carrying out background light estimation on the red channel image, the green channel image and the blue channel image to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value;
the fusion module is used for fusing the RGB image to be restored, the red background light estimated value, the green background light estimated value and the blue background light estimated value to obtain a background light image;
the transmission map estimation module is used for carrying out transmission map estimation on the RGB image to be restored and the background light image to obtain a transmission map of the RGB image to be restored;
the generating module is used for generating an initial restoration image of the RGB image to be restored based on the background light image and the transmission map;
the brightness adjusting module is used for adjusting the brightness of pixels meeting preset conditions in the initial restored image to obtain a final restored image; the preset condition is that the brightness of the pixel is larger than the preset value of the upper limit of the brightness or smaller than the preset value of the lower limit of the brightness.
In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the above-mentioned underwater image restoration method when executing the above-mentioned computer program.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the above-described underwater image restoration method.
The scheme of the application has the following beneficial effects:
in the embodiment of the application, a red channel image, a green channel image and a blue channel image of an RGB image to be restored are extracted; background light estimation is then carried out on the red channel image, the green channel image and the blue channel image respectively to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value; the RGB image to be restored, the red background light estimated value, the green background light estimated value and the blue background light estimated value are fused to obtain a background light image; transmission map estimation is then carried out on the RGB image to be restored and the background light image to obtain a transmission map of the RGB image to be restored; an initial restored image of the RGB image to be restored is generated based on the background light image and the transmission map; and finally the brightness of pixels meeting a preset condition in the initial restored image is adjusted to obtain a final restored image. By obtaining the red, green and blue background light estimated values, the background light of the three color channels of the RGB image to be restored is estimated separately, which improves the accuracy with which the background light image describes the background light in the RGB image to be restored; the transmission map is then obtained based on this high-accuracy background light image, which improves the quality of the initial restored image generated from the background light image and the transmission map; and adjusting the brightness of the pixels meeting the preset condition effectively adjusts the overall brightness of the initial restored image, thereby further improving the underwater image restoration effect.
Other advantages of the present application will be described in detail in the detailed description section that follows.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that a person skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is a flow chart of an underwater image restoration method according to an embodiment of the present application;
fig. 2 is a block diagram of a background light estimation network according to an embodiment of the present application;
FIG. 3 is a block diagram of a transmission map estimation network according to an embodiment of the present application;
FIG. 4 is a block diagram of a brightness adjustment network according to an embodiment of the present disclosure;
FIG. 5 is a flowchart illustrating an exemplary embodiment of an underwater image restoration system according to the present disclosure;
FIG. 6 is a comparison of RGB images to be restored and final restored images provided in an embodiment of the present application;
fig. 7 is a schematic structural diagram of an underwater image restoration device according to an embodiment of the present disclosure;
Fig. 8 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
In addition, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
Aiming at the problem that the existing underwater image restoration effect is not ideal, an embodiment of the present application provides an underwater image restoration method. A red channel image, a green channel image and a blue channel image of an RGB image to be restored are extracted; background light estimation is then carried out on the red channel image, the green channel image and the blue channel image respectively to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value; the RGB image to be restored, the red background light estimated value, the green background light estimated value and the blue background light estimated value are fused to obtain a background light image; transmission map estimation is then carried out on the RGB image to be restored and the background light image to obtain a transmission map of the RGB image to be restored; an initial restored image of the RGB image to be restored is generated based on the background light image and the transmission map; and finally the brightness of pixels meeting a preset condition in the initial restored image is adjusted to obtain a final restored image. By obtaining the red, green and blue background light estimated values, the background light of the three color channels of the RGB image to be restored is estimated separately, which improves the accuracy with which the background light image describes the background light in the RGB image to be restored; the transmission map is then obtained based on this high-accuracy background light image, which improves the quality of the initial restored image generated from the background light image and the transmission map; and adjusting the brightness of the pixels meeting the preset condition effectively adjusts the overall brightness of the initial restored image, thereby further improving the underwater image restoration effect.
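For illustration only, the overall flow described above may be sketched as follows. The network objects (backlight_net, transmission_net, brightness_net) are hypothetical placeholders for the modules detailed in the following steps, and the broadcast-style fusion used for the background light image is an assumption made for the sketch, not a limitation of the application.

```python
# Hypothetical end-to-end sketch of the pipeline; network objects are placeholders.
import torch

def restore_underwater_image(rgb, backlight_net, transmission_net, brightness_net):
    """rgb: float tensor of shape (1, 3, H, W) in [0, 1]."""
    r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]

    # Step 12: per-channel background light estimation (one scalar per channel).
    B_r, B_g, B_b = backlight_net(r, g, b)

    # Step 13: build a background light image (assumption: broadcast each scalar
    # estimate over its channel plane).
    B = torch.cat([B_r.view(1, 1, 1, 1).expand_as(r),
                   B_g.view(1, 1, 1, 1).expand_as(g),
                   B_b.view(1, 1, 1, 1).expand_as(b)], dim=1)

    # Step 14: transmission map estimation on the concatenated input.
    t = transmission_net(torch.cat([rgb, B], dim=1))

    # Step 15: invert the imaging model to get the initial restored image.
    J = (rgb - B) / t.clamp(min=1e-3) + B

    # Step 16: brightness adjustment of pixels outside the preset range.
    return brightness_net(J.clamp(0.0, 1.0))
```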
The underwater image restoration method provided by the application is exemplified below.
As shown in fig. 1, the underwater image restoration method provided by the present application includes the following steps:
step 11, extracting a red channel image, a green channel image and a blue channel image of the RGB image to be restored.
In some embodiments of the present application, the red channel image, the green channel image and the blue channel image of the RGB image to be restored may be extracted by image processing software such as the open source computer vision library OpenCV (Open Source Computer Vision Library).
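For illustration only, a minimal OpenCV sketch of this channel extraction follows; the image path is a hypothetical placeholder.

```python
# Channel extraction sketch using OpenCV; the image path is hypothetical.
import cv2

bgr = cv2.imread("underwater.png")   # OpenCV loads images in BGR order
blue, green, red = cv2.split(bgr)    # split into three single-channel images
# red, green and blue are the per-channel images used for background light
# estimation in the following step.
```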
The RGB image to be restored is, illustratively, an underwater image to be restored, such as an image of an underwater animal or an image of an underwater facility.
It is worth mentioning that extracting the images of the three channels of the RGB image to be restored splits the RGB image to be restored, which facilitates processing in the subsequent steps.
And step 12, respectively carrying out background light estimation on the red channel image, the green channel image and the blue channel image to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value.
Specifically, background light estimation is performed on a red channel image, a green channel image and a blue channel image based on a deep learning and attention mechanism respectively, so as to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value.
In some embodiments of the present application, a background light estimation network may be used to perform background light estimation on a red channel image, a green channel image, and a blue channel image, respectively, to obtain a red background light estimated value, a green background light estimated value, and a blue background light estimated value.
The background light estimation network comprises three background light estimation units;
the three background light estimation units are in one-to-one correspondence with the red channel image, the green channel image and the blue channel image;
the background light estimation unit includes: a first shallow feature extraction module, a second shallow feature extraction module, a channel deep feature extraction module, a feature fusion module, a pixel weight determination module and a global average pooling module which are sequentially connected; the output end of the second shallow feature extraction module is connected with the input end of the channel deep feature extraction module and the input end of the feature fusion module, the output end of the channel deep feature extraction module is connected with the input end of the feature fusion module, the output end of the feature fusion module is connected with the input end of the pixel weight determination module, and the output end of the pixel weight determination module is connected with the input end of the global average pooling module;
the input end of a first shallow feature extraction module of the background light estimation unit corresponding to the red channel image receives the red channel image, and a global averaging pooling module of the background light estimation unit corresponding to the red channel image outputs a red background light estimation value of the red channel image;
The input end of a first shallow feature extraction module of the background light estimation unit corresponding to the green channel image receives the green channel image, and a global averaging pooling module of the background light estimation unit corresponding to the green channel image outputs a green background light estimation value of the green channel image;
the input end of the first shallow feature extraction module of the background light estimation unit corresponding to the blue channel image receives the blue channel image, and the global average pooling module of the background light estimation unit corresponding to the blue channel image outputs a blue background light estimation value of the blue channel image.
The first shallow feature extraction module and the second shallow feature extraction module are used for extracting shallow features of an input image, the channel deep feature extraction module is used for extracting deep features of the image, the feature fusion module is used for fusing the shallow features and the deep features, the pixel weight determination module is used for determining weights of pixels in output of the feature fusion module, and the global average pooling module is used for estimating a value of background light based on the weights of the pixels.
The feature extraction module comprises a feature extraction convolution layer, a normalization layer and a PReLU activation function which are sequentially connected; the input end of the feature extraction convolution layer is the input end of the feature extraction module, and the output end of the PReLU activation function is the output end of the feature extraction module. The feature fusion module comprises a multiplication sub-module, an addition sub-module and a fusion sub-module; the output end of the second shallow feature extraction module is connected with the input ends of the multiplication sub-module and the addition sub-module, the output end of the channel deep feature extraction module is connected with the input end of the multiplication sub-module, the output end of the multiplication sub-module is connected with the input end of the addition sub-module, and the output end of the addition sub-module is connected with the input end of the fusion sub-module. The fusion sub-module comprises a fusion convolution layer, a normalization layer and a PReLU activation function which are sequentially connected; the input end of the fusion convolution layer is the input end of the fusion sub-module, and the output end of the PReLU activation function is the output end of the fusion sub-module. The pixel weight determination module comprises a weight determination convolution layer, a normalization layer and a Sigmoid activation function which are connected in sequence; the input end of the weight determination convolution layer is the input end of the pixel weight determination module, and the output end of the Sigmoid activation function is the output end of the pixel weight determination module. The channel deep feature extraction module is a channel attention mechanism operation module and comprises a global pooling layer, a maximum pooling layer, a multi-layer perception sub-module, an addition sub-module and a Sigmoid activation function; the output end of the second shallow feature extraction module is connected with the input ends of the global pooling layer and the maximum pooling layer, the output ends of the global pooling layer and the maximum pooling layer are connected with the input end of the multi-layer perception sub-module, the output end of the multi-layer perception sub-module is connected with the input end of the addition sub-module, and the output end of the addition sub-module is the output end of the channel attention mechanism operation module.
The global pooling layer and the maximum pooling layer in the channel attention mechanism operation module are used for performing global pooling and maximum pooling respectively on the input features to obtain two feature descriptors, the multi-layer perception sub-module is used for processing the two feature descriptors respectively to obtain representative, improved feature vectors, and the Sigmoid activation function is used for obtaining the channel attention weights; the feature vectors are weighted with the channel attention weights, thereby extracting the deep features.
The difference between the feature extraction convolution layer, the fusion convolution layer, and the weight determination convolution layer is the number of convolution kernels.
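For illustration only, the CBP block, the CBS block and the channel attention mechanism operation module described above may be sketched in PyTorch as follows. The kernel size, stride and padding follow the K3/S1/P1 annotations given below for fig. 2, while the reduction ratio of the multi-layer perception sub-module is an assumption; the figure labels the channel attention CA (CoordAttention), whereas this sketch implements the simpler global/maximum pooling channel attention described in the text.

```python
import torch
import torch.nn as nn

class CBP(nn.Module):
    """Conv + BatchNorm + PReLU (the feature extraction / fusion blocks)."""
    def __init__(self, in_ch, out_ch, k=3, s=1, p=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, k, s, p),
            nn.BatchNorm2d(out_ch),
            nn.PReLU(),
        )
    def forward(self, x):
        return self.body(x)

class CBS(nn.Module):
    """Conv + BatchNorm + Sigmoid (the pixel weight determination block)."""
    def __init__(self, in_ch, out_ch, k=3, s=1, p=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, k, s, p),
            nn.BatchNorm2d(out_ch),
            nn.Sigmoid(),
        )
    def forward(self, x):
        return self.body(x)

class ChannelAttention(nn.Module):
    """Channel deep feature extraction: global pool + max pool -> shared MLP -> Sigmoid."""
    def __init__(self, channels, reduction=4):   # reduction ratio is an assumption
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.PReLU(),
            nn.Linear(channels // reduction, channels),
        )
    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))            # global pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))             # maximum pooling branch
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)  # channel attention weights
        return x * w                                  # weighted deep features
```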
Illustratively, a red channel image with a size of 3×224×224 enters the first shallow feature extraction module in the corresponding background light estimation unit and undergoes two rounds of feature extraction through the first shallow feature extraction module and the second shallow feature extraction module, where the stride of the feature extraction convolution layer is 1, the convolution kernel size is 3×3 and the number of convolution kernels is 16, so that the shallow features of the red channel image with a size of 16×224×224 are obtained. The shallow features then enter the channel deep feature extraction module, and the deep features of the red channel image are obtained based on the channel attention mechanism. The deep features enter the feature fusion module, where they are multiplied with the shallow features to obtain a first feature representation, and the first feature representation is added to the shallow features to obtain a second feature representation. The second feature representation then enters the fusion sub-module in the feature fusion module, where the stride of the fusion convolution layer is 1, the convolution kernel size is 3×3 and the number of convolution kernels is 8, so that the fused features are obtained. The fused features enter the pixel weight determination module, which determines the weight of each pixel in the fused features with a weight determination convolution layer whose stride is 1, whose convolution kernel size is 3×3 and whose number of convolution kernels is 1. Finally, the global average pooling module estimates the red background light value based on the pixel weights.
The shallow features refer to features extracted from the bottom pixel information of the image, and generally include basic visual elements such as edges, corner points, textures and the like, and the features mainly reflect local details and basic structures of the image. In the restoration of underwater images, shallow features can help capture the outline, texture details and color changes of underwater objects, and facilitate the preliminary analysis and recognition of underwater scenes. Deep features are high-level abstract features extracted from an image, and have stronger representation capability, and the features reflect higher-level semantic information, such as the shape, structure, combination relation and the like of objects in the image. In the restoration of underwater images, deep features can help the model identify the nature of the underwater organism and the features of the underwater environment.
It is worth mentioning that, through the combined use of multi-level feature extraction, the channel attention mechanism, feature fusion and multi-scale information, this step can effectively capture the characteristics of the background light, improve the accuracy and stability of the background light estimation, and thereby improve the accuracy of the red, green and blue background light estimated values, so that the background light of the three color channels of the RGB image to be restored is estimated accurately.
The background light estimation network described above is exemplified in connection with a specific example.
As shown in fig. 2, the RGB image to be restored is input into the background light estimation network, which contains three background light estimation units. In each background light estimation unit, the first two CBP blocks are the first shallow feature extraction module and the second shallow feature extraction module respectively, where a CBP block consists of a convolution operation (Conv, Convolution), a normalization layer (BN, Batch Normalization) and a parametric rectified linear unit (PReLU, Parametric Rectified Linear Unit) activation function. The channel attention mechanism (CA, CoordAttention) is the channel deep feature extraction module, and its output end is connected with the input end of the multiplication sub-module in the feature fusion module; Mul is the multiplication sub-module and Add is the addition sub-module. The CBP block connected with the output end of the addition sub-module is the fusion sub-module, and the CBS block connected with the output end of the fusion sub-module is the pixel weight determination module, where a CBS block consists of a convolution operation (Conv, Convolution), a normalization layer (BN, Batch Normalization) and a Sigmoid activation function. The output end of the pixel weight determination module is connected with a global average pooling (GAP, Global Average Pooling) module. After the calculation, the three background light estimation units of the network output the red background light estimated value B_r, the green background light estimated value B_g and the blue background light estimated value B_b respectively. In the figure, K3 indicates a convolution kernel size of 3×3, C16 indicates 16 convolution kernels, C8 indicates 8 convolution kernels, C1 indicates 1 convolution kernel, S1 indicates a convolution stride of 1, and P1 indicates a pixel padding of 1; pixel padding adds extra pixel values around the boundary of the input in order to control the size of the feature output after the convolution operation.
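Assembling the blocks sketched above, one background light estimation unit of fig. 2 may be interpreted as follows; the channel counts follow the K3/C16/C8/C1/S1/P1 annotations, and this is an illustrative interpretation rather than a reference implementation of the application.

```python
# Reuses the CBP, CBS and ChannelAttention sketches defined earlier.
class BackgroundLightUnit(nn.Module):
    """One per-channel background light estimation unit (fig. 2), as a sketch."""
    def __init__(self, in_ch=3):                 # 3 follows the 3x224x224 size given in the text
        super().__init__()
        self.shallow1 = CBP(in_ch, 16)           # first shallow feature extraction (K3 C16 S1 P1)
        self.shallow2 = CBP(16, 16)              # second shallow feature extraction
        self.ca = ChannelAttention(16)           # channel deep feature extraction
        self.fuse = CBP(16, 8)                   # fusion sub-module (K3 C8 S1 P1)
        self.pixel_weight = CBS(8, 1)            # pixel weight determination (K3 C1 S1 P1)

    def forward(self, x):
        shallow = self.shallow2(self.shallow1(x))
        deep = self.ca(shallow)                      # channel-attention weighted deep features
        fused = self.fuse(deep * shallow + shallow)  # multiply, add, then fuse
        weights = self.pixel_weight(fused)           # per-pixel weights in [0, 1]
        # Global average pooling over the weight map yields the scalar estimate.
        return weights.mean(dim=(2, 3))
```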
And step 13, fusing the RGB image to be restored, the red background light estimated value, the green background light estimated value and the blue background light estimated value to obtain a background light image.
In some embodiments of the present application, the RGB image to be restored, the red background light estimated value, the green background light estimated value and the blue background light estimated value may be fused by image processing software such as the open source computer vision library OpenCV (Open Source Computer Vision Library) to obtain a background light image.
It is worth mentioning that by fusing the RGB image to be restored with the three background light estimated values, the accuracy of describing the background light in the RGB image to be restored by the background light image can be improved.
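The fusion operator itself is not spelled out in this step; one plausible reading, in which each scalar estimate is broadcast over a plane of the same spatial size as the RGB image to be restored, is sketched below purely for illustration.

```python
import numpy as np

def build_background_light_image(rgb, B_r, B_g, B_b):
    """rgb: (H, W, 3) float array; B_r/B_g/B_b: scalar estimates in [0, 1].
    Returns a background light image with the same spatial size as the input."""
    h, w, _ = rgb.shape
    # Broadcast each per-channel estimate over the image plane (an assumption,
    # since the text only states that the image and the estimates are "fused").
    return np.stack([np.full((h, w), B_r),
                     np.full((h, w), B_g),
                     np.full((h, w), B_b)], axis=-1).astype(np.float32)
```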
And step 14, carrying out transmission map estimation on the RGB image to be restored and the background light image to obtain a transmission map of the RGB image to be restored.
Specifically, the RGB image to be restored and the background light image are spliced (concatenated) to obtain a background light restoration image, and then transmission map estimation is carried out on the background light restoration image based on deep learning and an attention mechanism to obtain the transmission map of the RGB image to be restored.
In some embodiments of the present application, a transmission map estimation network may be used to perform transmission map estimation on a background light restoration image, so as to obtain a transmission map of an RGB image to be restored.
The transmission map estimation network includes: a first layer feature extraction unit, a first addition unit, a first layer feature extraction fusion unit, a transmission map estimation unit, a first downsampling unit, a second layer feature extraction unit, a second addition unit, a second layer feature extraction fusion unit, a first upsampling unit, a second downsampling unit, a third layer feature extraction fusion unit and a second upsampling unit;
the background light restoration image is input into the input ends of the first layer feature extraction unit and the first downsampling unit respectively; the output end of the transmission map estimation unit is the output end of the transmission map estimation network and outputs the transmission map of the RGB image to be restored;
the output end of the first layer feature extraction unit is connected with the input end of the first addition unit, the output end of the first downsampling unit is connected with the input ends of the second downsampling unit and the second layer feature extraction unit, the output end of the second downsampling unit is connected with the input end of the third layer feature extraction fusion unit, the output end of the third layer feature extraction fusion unit is connected with the input end of the second upsampling unit, the output end of the second upsampling unit is connected with the input end of the second addition unit, the output end of the second addition unit is connected with the input end of the second layer feature extraction fusion unit, the output end of the second layer feature extraction fusion unit is connected with the input end of the first upsampling unit, the output end of the first upsampling unit is connected with the input end of the first addition unit, the output end of the first addition unit is connected with the input end of the first layer feature extraction fusion unit, and the output end of the first layer feature extraction fusion unit is connected with the input end of the transmission map estimation unit.
It should be noted that the feature extraction unit includes a feature extraction module, and the feature extraction fusion unit includes a first shallow feature extraction module, a second shallow feature extraction module, a deep feature extraction module, a multiplication module and an addition module which are connected in sequence; the output end of the second shallow feature extraction module is connected with the input ends of the multiplication module and the addition module, the output end of the deep feature extraction module is connected with the input end of the multiplication module, the output end of the multiplication module is connected with the input end of the addition module, and the output end of the addition module is connected with the input end of the transmission map estimation unit; the input end of the first shallow feature extraction module is the input end of the feature extraction fusion unit, and the output end of the addition module is the output end of the feature extraction fusion unit. The transmission map estimation unit comprises a transmission map estimation convolution layer, a normalization layer and a Sigmoid activation function which are connected in sequence; the input end of the transmission map estimation convolution layer is the input end of the transmission map estimation unit, and the output end of the Sigmoid activation function is the output end of the transmission map estimation unit. The deep feature extraction module is a spatial attention mechanism operation module and comprises a global average pooling layer, a maximum pooling layer, a splicing layer, a spatial attention convolution layer and a Sigmoid activation function; the output end of the second shallow feature extraction module is connected with the input ends of the global average pooling layer and the maximum pooling layer, the output ends of the global average pooling layer and the maximum pooling layer are connected with the input end of the splicing layer, the output end of the splicing layer is connected with the input end of the spatial attention convolution layer, the output end of the spatial attention convolution layer is connected with the input end of the Sigmoid activation function, and the output end of the Sigmoid activation function is the output end of the spatial attention mechanism operation module.
The spatial attention convolution layer and the Sigmoid activation function in the spatial attention mechanism operation module are used for generating a spatial attention map, thereby extracting the deep features.
The first layer feature extraction unit is used for extracting features of the background light restoration image, the first addition unit and the second addition unit are both used for carrying out an addition operation on their respective inputs, the feature extraction fusion unit is used for extracting shallow features and deep features and fusing them, the transmission map estimation unit is used for estimating the transmittance, the first downsampling unit is used for carrying out the first downsampling, the second downsampling unit is used for carrying out the second downsampling, and the first upsampling unit and the second upsampling unit are both used for upsampling their corresponding inputs.
Illustratively, the background light restoration image with a size of 6×224×224 is input into the first layer feature extraction unit and the first downsampling unit of the transmission map estimation network. The first downsampling unit produces a second feature representation with a size of 6×112×112, and the second downsampling unit produces a third feature representation with a size of 6×64×64. In the third layer, the third feature representation undergoes feature extraction by the first shallow feature extraction module and the second shallow feature extraction module in the third layer feature extraction fusion unit to obtain the third shallow features; the third shallow features are input into the deep feature extraction module, where deep feature extraction is performed using the spatial attention mechanism to obtain the third deep features; the third deep features and the third shallow features are input into the multiplication module to obtain the third initial features; the third initial features and the third shallow features are input into the addition module to obtain the third final features; and the third final features are input into the second upsampling unit to obtain a third final feature representation with a size of 16×112×112. The third final feature representation enters the second layer, where it is added, in the second addition unit, to the second initial features obtained by the second layer feature extraction unit to obtain the second fused features; the second fused features are processed by the second layer feature extraction fusion unit to obtain the second final features, which are upsampled by the first upsampling unit to obtain a second final feature representation with a size of 16×224×224. The second final feature representation enters the first layer, where it is added, in the first addition unit, to the first initial features obtained by processing the background light restoration image with the first layer feature extraction unit to obtain the first fused features; the first fused features are processed by the first layer feature extraction fusion unit to obtain the first final features; and the first final features enter the transmission map estimation unit for transmission map estimation, yielding the transmission map of the RGB image to be restored. The convolution layers in the feature extraction units and the feature extraction fusion units are feature extraction convolution layers; the convolution kernel size of the transmission map estimation convolution layer in the transmission map estimation unit is 3×3, and the number of convolution kernels is 16.
It is worth mentioning that this step can understand the background light restoration image at different levels through multi-level feature representation and fusion and capture the multi-scale features needed for transmission map estimation; the attention mechanism makes the model focus more on the regions required for the transmission map; the feature fusion integrates information from different levels; and the downsampling and upsampling operations preserve and restore the effective information of the features, thereby improving the accuracy of the transmission map estimation. The higher the value of the transmission map, the higher the degree to which light penetrates the water and the better the image quality obtained based on the transmission map. By accurately estimating the transmission map, the effects of scattering and absorption in the underwater image can be effectively removed, realizing the enhancement and restoration of the image.
The above transmission map estimation network is exemplified in the following with reference to a specific example.
As shown in fig. 3, the background light restoration image is input into the transmission map estimation network, which comprises three layers; the two outputs obtained by downsampling the background light restoration image twice serve as the inputs of the second layer and the third layer respectively. The third layer comprises the second downsampling unit (the downsampling (DWSP, DownSampling) of the third layer in the figure), the third layer feature extraction fusion unit and the second upsampling unit (the upsampling (UPSP, UpSampling) of the third layer in the figure). The second layer comprises the first downsampling unit (the DWSP of the second layer in the figure), the second layer feature extraction unit (the first CBP of the second layer in the figure), the second addition unit (the first Add of the second layer in the figure), the second layer feature extraction fusion unit and the first upsampling unit (the UPSP of the second layer in the figure). The first layer comprises the first layer feature extraction unit (the first CBP of the first layer in the figure), the first addition unit (the first Add of the first layer in the figure), the first layer feature extraction fusion unit and the transmission map estimation unit (the CBS of the first layer in the figure). The feature extraction fusion unit of each layer consists of two sequentially connected CBP blocks (i.e. the first shallow feature extraction module and the second shallow feature extraction module described above), spatial attention (SA, Shuffle Attention) (i.e. the deep feature extraction module described above), the multiplication module Mul and the addition module Add after the SA. In the figure, K3 indicates a convolution kernel size of 3×3, C16 indicates 16 convolution kernels, C3 indicates 3 convolution kernels, S1 indicates a convolution stride of 1, and P1 indicates a pixel padding of 1.
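Building on the CBP and CBS blocks sketched earlier, the three-level structure of fig. 3 may be interpreted as follows. The downsampling and upsampling operators, the spatial attention variant (the figure labels it SA, Shuffle Attention) and the 3-channel output corresponding to the C3 annotation are assumptions made for illustration.

```python
# Reuses the CBP and CBS sketches defined earlier.
class SpatialAttention(nn.Module):
    """Deep feature extraction for the transmission branch: channel-wise average
    and maximum maps, concatenation, convolution, Sigmoid -> spatial attention map."""
    def __init__(self, k=3):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, k, padding=k // 2)
    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn

class FeatureExtractFusion(nn.Module):
    """Two CBP blocks -> spatial attention -> multiply -> add (fig. 3)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.shallow1 = CBP(in_ch, out_ch)
        self.shallow2 = CBP(out_ch, out_ch)
        self.sa = SpatialAttention()
    def forward(self, x):
        shallow = self.shallow2(self.shallow1(x))
        deep = self.sa(shallow)
        return deep * shallow + shallow

class TransmissionMapNet(nn.Module):
    """Three-level sketch of the transmission map estimation network of fig. 3."""
    def __init__(self, in_ch=6, mid_ch=16):
        super().__init__()
        self.down1 = nn.AvgPool2d(2)              # first downsampling (operator assumed)
        self.down2 = nn.AvgPool2d(2)              # second downsampling
        self.feat1 = CBP(in_ch, mid_ch)           # first layer feature extraction
        self.feat2 = CBP(in_ch, mid_ch)           # second layer feature extraction
        self.fuse1 = FeatureExtractFusion(mid_ch, mid_ch)
        self.fuse2 = FeatureExtractFusion(mid_ch, mid_ch)
        self.fuse3 = FeatureExtractFusion(in_ch, mid_ch)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.out = CBS(mid_ch, 3)                 # transmission map estimation unit
    def forward(self, x):
        d1 = self.down1(x)                        # second layer input
        d2 = self.down2(d1)                       # third layer input
        f3 = self.up(self.fuse3(d2))              # third layer, then upsample
        f2 = self.up(self.fuse2(self.feat2(d1) + f3))
        f1 = self.fuse1(self.feat1(x) + f2)
        return self.out(f1)                       # transmission map in [0, 1]
```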
Step 15, generating an initial restoration image of the RGB image to be restored based on the background light image and the transmission map.
Specifically, the initial restored image J_c of the RGB image to be restored is obtained by the following formula:

J_c(x) = (I_c(x) − B_c) / t_c(x) + B_c

wherein t_c represents the transmission map, I_c represents the RGB image to be restored, x represents the position of a pixel, and B_c represents the background light image, B_c = {B_r, B_g, B_b}, where B_r represents the red background light estimated value, B_g represents the green background light estimated value, and B_b represents the blue background light estimated value.
In some embodiments of the present application, this step may be performed with an existing imaging model, such as the IFM imaging model, to obtain the initial restored image.
It is worth mentioning that by calculating the transmission map and the background light image of the RGB image to be restored, and simultaneously processing the background light and the transmittance of the RGB image to be restored, the quality of the obtained initial restored image is high.
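As a concrete illustration, the following NumPy sketch applies the inversion above. The function name, the array shapes and the lower bound `t_min` on the transmission (a common safeguard against division by very small values) are assumptions made for this example and are not taken from the patent.

```python
import numpy as np

def ifm_restore(I, B, t, t_min=0.1):
    """Invert the imaging model per channel: J_c(x) = (I_c(x) - B_c) / t_c(x) + B_c.

    I: RGB image to be restored, shape (H, W, 3), values in [0, 1]
    B: background light image, shape (H, W, 3)
    t: transmission map, shape (H, W, 3), values in (0, 1]
    t_min: assumed lower bound on t to keep the division numerically stable
    """
    J = (I - B) / np.maximum(t, t_min) + B
    return np.clip(J, 0.0, 1.0)   # keep the initial restored image in a valid range
```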
And step 16, adjusting the brightness of the pixels meeting the preset condition in the initial restored image to obtain a final restored image.
The preset condition is that the brightness of the pixel is larger than the preset value of the upper limit of the brightness or smaller than the preset value of the lower limit of the brightness.
Specifically, the feature extraction is performed on the initial restoration image by using a brightness adjustment network combining deep learning and attention mechanisms, pixels meeting preset conditions in the initial restoration image are determined, and the brightness of the pixels meeting the preset conditions in the initial restoration image is adjusted to obtain a final restoration image.
The brightness adjustment network includes: the device comprises an HSV conversion unit, a feature extraction unit, a channel attention unit, a channel multiplication unit, a space attention unit, a space multiplication unit, a feature addition unit, an image fusion unit, an HSV reverse conversion unit and an image addition unit;
the input ends of the HSV conversion unit and the image addition unit are the input ends of the brightness adjustment network, the initial restoration image is received, the output end of the image addition unit is the output end of the brightness adjustment network, and the final restoration image is output;
the output end of the HSV conversion unit is connected with the input end of the feature extraction unit, the output end of the feature extraction unit is connected with the input ends of the channel attention unit, the space attention unit, the channel multiplication unit and the space multiplication unit, the output end of the channel attention unit is connected with the input end of the channel multiplication unit, the output end of the space attention unit is connected with the input end of the space multiplication unit, the output end of the channel multiplication unit and the output end of the space multiplication unit are both connected with the input end of the feature addition unit, the output end of the feature addition unit is connected with the input end of the image fusion unit, the output end of the image fusion unit is connected with the input end of the HSV inverse conversion unit, and the output end of the HSV inverse conversion unit is connected with the input end of the image addition unit;
The image fusion unit comprises a fusion convolution layer, a normalization layer and a Sigmoid activation function which are connected in sequence; the input end of the fusion convolution layer is the input end of the image fusion unit, and the output end of the Sigmoid activation function is the output end of the image fusion unit.
When the brightness of the pixel satisfying the preset condition in the initial restored image is adjusted, if the brightness of the pixel is greater than the preset brightness upper limit value, the brightness of the pixel is adjusted to the preset brightness upper limit value; and if the brightness of the pixel is smaller than the preset brightness lower limit value, adjusting the brightness of the pixel to the preset brightness lower limit value.
The above-mentioned HSV conversion unit is used for converting the input image from the RGB color space to the HSV color space; the feature extraction unit includes a feature extraction module for extracting the features of the image in the HSV color space; the channel attention unit is used for extracting the features of different levels in the image features; the spatial attention unit is used for determining the pixels whose brightness needs to be adjusted and adjusting their brightness; the channel multiplication unit and the spatial multiplication unit are both used for multiplying their two corresponding inputs; the feature addition unit is used for adding its two inputs; the image fusion unit is used for fusing the input features; the HSV inverse conversion unit is used for converting the fused features from the HSV color space to the RGB color space; and the image addition unit is used for adding the image input to the brightness adjustment network and the output of the HSV inverse conversion unit.
For example, if the brightness upper limit preset value is 7 and the brightness of a pixel is 9, the brightness of the pixel is too high and is adjusted to 7; if the brightness lower limit preset value is 3 and the brightness of a pixel is 1, the brightness of the pixel is too low and is adjusted to 3; and if the brightness of a pixel is 5, which lies between the upper limit preset value and the lower limit preset value, the brightness of the pixel does not need to be adjusted.
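A one-line NumPy clamp expresses this rule; the limits 3 and 7 are simply the values from the example above, and the function name is illustrative.

```python
import numpy as np

def clamp_brightness(v, lower=3.0, upper=7.0):
    """Clamp pixel brightness to the preset lower and upper limits.

    Brightness above `upper` is pulled down to `upper`, brightness below `lower`
    is raised to `lower`, and values in between are left unchanged.
    """
    return np.clip(v, lower, upper)

# clamp_brightness(np.array([9.0, 1.0, 5.0])) -> array([7., 3., 5.])
```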
The initial restored image is input into the HSV conversion unit of the brightness adjustment network for color space conversion, converting it from the RGB color space to the HSV color space to obtain an initial restored HSV image. Features are extracted from this image by the feature extraction unit to obtain an initial restoration feature with a size of 16×224×224. The initial restoration feature is input into the channel attention unit to extract features of different levels, and the initial restoration feature and these different-level features are input into the channel multiplication unit and multiplied to obtain a channel feature. The initial restoration feature is also input into the spatial attention unit, which determines the pixels meeting the preset condition; the spatial attention output and the initial restoration feature are multiplied in the spatial multiplication unit, adjusting the brightness of those pixels to obtain a brightness feature. The channel feature and the brightness feature are input into the feature addition unit and added to obtain a brightness combination feature, the brightness combination feature is input into the image fusion unit for feature fusion to obtain a brightness fusion feature, and the brightness fusion feature is input into the HSV inverse conversion unit and converted from the HSV color space back to the RGB color space to obtain an RGB brightness fusion feature. Finally, the initial restored image and the RGB brightness fusion feature are added in the image addition unit to obtain the final restored image with a size of 3×224×224. The fusion convolution layer in the image fusion unit has a stride of 1, a convolution kernel size of 3×3, and 3 convolution kernels. The channel attention unit is a channel attention mechanism operation module, and the spatial attention unit is a spatial attention mechanism operation module.
Channel attention is an attention mechanism that focuses on the channel dimension of a feature map; its goal is to weight the different channels of the feature map so as to enhance channels carrying useful information and weaken channels carrying irrelevant or redundant information. Spatial attention is an attention mechanism that focuses on the spatial position dimension of a feature map; it assigns different importance to different positions of the feature map so that the network can focus on specific regions of the image, improving perception and localization accuracy.
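The two mechanisms can be summarized with the following minimal PyTorch sketch. These are generic squeeze-and-excitation style and mean/max-descriptor style gates used only to illustrate the idea; they are not the specific channel attention and spatial attention modules of the brightness adjustment network.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention: one weight in [0, 1] per channel, applied to every position."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # squeeze: global average per channel
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):                                  # x: (N, C, H, W)
        return x * self.fc(x)                              # emphasize informative channels

class SpatialAttention(nn.Module):
    """Spatial attention: one weight in [0, 1] per position, applied to every channel."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                                  # x: (N, C, H, W)
        descriptor = torch.cat([x.mean(dim=1, keepdim=True),
                                x.max(dim=1, keepdim=True).values], dim=1)
        return x * torch.sigmoid(self.conv(descriptor))    # emphasize informative regions
```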
It is worth mentioning that, although the background light and the transmissivity are accurately restored in the initial restored image, problems of insufficient contrast and over-dark or over-bright regions may still exist; the brightness adjustment network can effectively identify the pixels in the initial restored image whose brightness needs to be adjusted and adjust them, thereby further improving the underwater image restoration effect.
The above brightness adjustment network is described below with reference to a specific example.
As shown in fig. 4, RGB2HSV in the figure represents the HSV conversion unit; CBP represents the feature extraction unit, whose structure is a convolution layer Conv, a normalization layer BN and a PReLU activation function connected in sequence; CAModel represents the channel attention unit and SAModel represents the spatial attention unit; the Mul connected to the output end of the channel attention unit is the channel multiplication unit, and the Mul connected to the output end of the spatial attention unit is the spatial multiplication unit; CBS is the image fusion unit, whose structure is a convolution layer Conv, a normalization layer BN and a Sigmoid activation function connected in sequence; the add connected to the input end of CBS is the feature addition unit; HSV2RGB represents the HSV inverse conversion unit, and the add connected to the output end of the HSV inverse conversion unit is the image addition unit. In the figure, K3 indicates a convolution kernel size of 3×3, C16 indicates 16 convolution kernels, C3 indicates 3 convolution kernels, S1 indicates a convolution stride of 1, and P1 indicates a padding of 1.
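The wiring of fig. 4 can be sketched in PyTorch as follows. This is an illustrative approximation under several assumptions: the RGB↔HSV conversions use the kornia library, the CAModel and SAModel units are replaced by very small gating layers, the rescaling between kornia's hue convention and the [0, 1] output of the Sigmoid fusion layer is glossed over, and all names are invented for the sketch.

```python
import torch
import torch.nn as nn
import kornia.color as kc   # assumed dependency for the RGB <-> HSV round trip

class BrightnessAdjustNet(nn.Module):
    """RGB2HSV -> CBP -> channel/spatial attention branches -> add -> CBS -> HSV2RGB -> residual add."""
    def __init__(self, ch=16):
        super().__init__()
        self.cbp = nn.Sequential(                  # feature extraction unit (K3, S1, P1, C16)
            nn.Conv2d(3, ch, 3, 1, 1), nn.BatchNorm2d(ch), nn.PReLU())
        self.ca = nn.Sequential(                   # CAModel stand-in: per-channel gate
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        self.sa = nn.Sequential(                   # SAModel stand-in: per-pixel gate
            nn.Conv2d(ch, 1, 3, 1, 1), nn.Sigmoid())
        self.cbs = nn.Sequential(                  # image fusion unit (K3, S1, P1, C3)
            nn.Conv2d(ch, 3, 3, 1, 1), nn.BatchNorm2d(3), nn.Sigmoid())

    def forward(self, x):                          # x: initial restored image, (N, 3, H, W) in [0, 1]
        hsv = kc.rgb_to_hsv(x)                     # HSV conversion unit
        f = self.cbp(hsv)                          # initial restoration feature
        fused = f * self.ca(f) + f * self.sa(f)    # channel Mul + spatial Mul, then Add
        rgb = kc.hsv_to_rgb(self.cbs(fused))       # image fusion unit, then HSV inverse conversion
        return x + rgb                             # image addition unit -> final restored image
```

A quick shape check: `BrightnessAdjustNet()(torch.rand(1, 3, 224, 224))` returns a (1, 3, 224, 224) tensor, matching the 3×224×224 final restored image described above.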
The above underwater image restoration method is described below with reference to a specific example.
As shown in fig. 5, the working flow of the underwater image restoration system provided in a specific example of the present application is as follows: the original underwater image (namely the RGB image to be restored described above) enters the background light estimation network to obtain the background light image; the background light image and the original underwater image are input together into the transmission map estimation network to obtain the transmission map; the transmission map and the background light image are input together into the IFM imaging model to obtain the initial restored image; and the initial restored image is input into the brightness adjustment network to obtain the restored underwater image (namely the final restored image described above).
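Putting the pieces together, the workflow of fig. 5 reduces to a few lines. The function and network names below are placeholders for whatever implementations of the three networks are used, and the IFM step reuses the inversion formula given earlier (with an assumed lower bound on the transmission).

```python
import torch

def restore_underwater_image(rgb, bl_net, tm_net, br_net, t_min=0.1):
    """Fig. 5 workflow: background light -> transmission map -> IFM inversion -> brightness adjustment.

    rgb:    original underwater image, tensor of shape (N, 3, H, W) in [0, 1]
    bl_net: background light estimation (and fusion) step, returning a (N, 3, H, W) background light image
    tm_net: transmission map estimation network, taking the 6-channel concatenation as input
    br_net: brightness adjustment network, refining the initial restored image
    """
    B = bl_net(rgb)                            # background light image
    t = tm_net(torch.cat([rgb, B], dim=1))     # transmission map from the 6-channel stack
    J = (rgb - B) / t.clamp(min=t_min) + B     # IFM inversion -> initial restored image
    return br_net(J).clamp(0.0, 1.0)           # restored underwater image (final restored image)
```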
As shown in fig. 6, which presents final restored images obtained with the above underwater image restoration system, a1, b1, c1 and d1 are RGB images to be restored, a2 is the final restored image corresponding to a1, b2 is the final restored image corresponding to b1, c2 is the final restored image corresponding to c1, and d2 is the final restored image corresponding to d1.
Therefore, the underwater image restoration method provided by the present application can restore underwater images well, effectively adjusting the background light, brightness, transmissivity and the like in the underwater image.
An exemplary description of the underwater image restoration device provided in the present application follows.
As shown in fig. 7, an embodiment of the present application provides an underwater image restoration device 700 including:
the extraction module 701 extracts a red channel image, a green channel image and a blue channel image of an RGB image to be restored;
the background light estimating module 702 is configured to perform background light estimation on the red channel image, the green channel image, and the blue channel image, respectively, to obtain a red background light estimated value, a green background light estimated value, and a blue background light estimated value;
the fusion module 703 is configured to fuse the red background light estimated value, the green background light estimated value and the blue background light estimated value to obtain a background light image;
the transmission diagram estimation module 704 performs transmission diagram estimation on the RGB image to be restored and the background light image to obtain a transmission diagram of the RGB image to be restored;
a generating module 705 that generates an initial restoration image of the RGB image to be restored based on the background light image and the transmission map;
the brightness adjustment module 706 adjusts the brightness of the pixels meeting the preset condition in the initial restored image to obtain a final restored image; the preset condition is that the brightness of the pixel is larger than the preset value of the upper limit of the brightness or smaller than the preset value of the lower limit of the brightness.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
As shown in fig. 8, an embodiment of the present application provides a terminal device, a terminal device D10 of which includes: at least one processor D100 (only one processor is shown in fig. 8), a memory D101 and a computer program D102 stored in the memory D101 and executable on the at least one processor D100, the processor D100 implementing the steps in any of the various method embodiments described above when executing the computer program D102.
Specifically, when the processor D100 executes the computer program D102, it extracts the red channel image, the green channel image and the blue channel image of the RGB image to be restored; performs background light estimation on the red channel image, the green channel image and the blue channel image respectively to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value; fuses the RGB image to be restored, the red background light estimated value, the green background light estimated value and the blue background light estimated value to obtain a background light image; performs transmission map estimation on the RGB image to be restored and the background light image to obtain a transmission map of the RGB image to be restored; generates an initial restored image of the RGB image to be restored based on the background light image and the transmission map; and finally adjusts the brightness of the pixels meeting the preset condition in the initial restored image to obtain a final restored image. By estimating the background light of the three color channels of the RGB image to be restored to obtain the red, green and blue background light estimated values, the accuracy with which the background light image describes the background light of the RGB image to be restored is improved; the transmission map obtained from this high-accuracy background light image in turn improves the quality of the initial restored image generated from the background light image and the transmission map; and adjusting the brightness of the pixels meeting the preset condition effectively adjusts the overall brightness of the initial restored image, further improving the underwater image restoration effect.
The processor D100 may be a central processing unit (CPU, central Processing Unit), the processor D100 may also be other general purpose processors, digital signal processors (DSP, digital Signal Processor), application specific integrated circuits (ASIC, application Specific Integrated Circuit), off-the-shelf programmable gate arrays (FPGA, field-Programmable Gate Array) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory D101 may in some embodiments be an internal storage unit of the terminal device D10, for example a hard disk or a memory of the terminal device D10. The memory D101 may also be an external storage device of the terminal device D10 in other embodiments, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the terminal device D10. Further, the memory D101 may also include both an internal storage unit and an external storage device of the terminal device D10. The memory D101 is used for storing an operating system, an application program, a boot loader (BootLoader), data, other programs, etc., such as program codes of the computer program. The memory D101 may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application also provide a computer readable storage medium storing a computer program which, when executed by a processor, implements steps that may implement the various method embodiments described above.
The present embodiments provide a computer program product which, when run on a terminal device, causes the terminal device to perform steps that enable the respective method embodiments described above to be implemented.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the methods of the above embodiments of the present application may be implemented by a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium, and the computer program, when executed by a processor, may implement the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include at least: any entity or device capable of carrying the computer program code to the underwater image restoration apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electric carrier signal, a telecommunication signal, and a software distribution medium, such as a U-disk, a removable hard disk, a magnetic disk or an optical disk. In some jurisdictions, in accordance with legislation and patent practice, computer readable media may not include electric carrier signals and telecommunication signals.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not described or detailed in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The foregoing is a preferred embodiment of the present application and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present application and are intended to be comprehended within the scope of the present application.

Claims (6)

1. An underwater image restoration method, comprising:
extracting a red channel image, a green channel image and a blue channel image of an RGB image to be restored;
Respectively carrying out background light estimation on the red channel image, the green channel image and the blue channel image to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value;
fusing the RGB image to be restored, the red background light estimated value, the green background light estimated value and the blue background light estimated value to obtain a background light image;
performing transmission map estimation on the RGB image to be restored and the background light image to obtain a transmission map of the RGB image to be restored;
generating an initial restoration image of the RGB image to be restored based on the background light image and the transmission image;
adjusting the brightness of pixels meeting preset conditions in the initial restored image to obtain a final restored image; the preset condition is that the brightness of the pixel is larger than the preset value of the upper limit of the brightness or smaller than the preset value of the lower limit of the brightness;
the step of performing background light estimation on the red channel image, the green channel image and the blue channel image to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value comprises the following steps:
respectively carrying out background light estimation on the red channel image, the green channel image and the blue channel image by using a background light estimation network to obtain a red background light estimated value, a green background light estimated value and a blue background light estimated value;
The background light estimation network comprises three background light estimation units;
the three background light estimation units are in one-to-one correspondence with the red channel image, the green channel image and the blue channel image;
the backlight estimation unit includes: the device comprises a first shallow feature extraction module, a second shallow feature extraction module, a channel deep feature extraction module, a feature fusion module, a pixel weight determination module and a global average pooling module which are sequentially connected; the output end of the second shallow layer feature extraction module is connected with the input end of the channel deep layer feature extraction module and the input end of the feature fusion module, the output end of the channel deep layer feature extraction module is connected with the input end of the feature fusion module, the output end of the feature fusion module is connected with the input end of the pixel weight determination module, and the output end of the pixel weight determination module is connected with the input end of the global averaging pooling module;
the channel deep feature extraction module is a channel attention mechanism operation module;
the input end of a first shallow feature extraction module of a background light estimation unit corresponding to the red channel image receives the red channel image, and a global average pooling module of the background light estimation unit corresponding to the red channel image outputs a red background light estimation value of the red channel image;
The method comprises the steps that an input end of a first shallow feature extraction module of a background light estimation unit corresponding to a green channel image receives the green channel image, and a global average pooling module of the background light estimation unit corresponding to the green channel image outputs a green background light estimation value of the green channel image;
the input end of a first shallow feature extraction module of a background light estimation unit corresponding to the blue channel image receives the blue channel image, and a global average pooling module of the background light estimation unit corresponding to the blue channel image outputs a blue background light estimation value of the blue channel image;
the first shallow feature extraction module and the second shallow feature extraction module both comprise a feature extraction module;
the feature extraction module comprises a feature extraction convolution layer, a normalization layer and a PReLU activation function which are sequentially connected; the input end of the feature extraction convolution layer is the input end of the feature extraction module, and the output end of the PReLU activation function is the output end of the feature extraction module;
the feature fusion module comprises: a multiplication submodule, an addition submodule and a fusion submodule; the output end of the second shallow feature extraction module is connected with the input ends of the multiplication submodule and the addition submodule, the output end of the channel deep feature extraction module is connected with the input end of the multiplication submodule, the output end of the multiplication submodule is connected with the input end of the addition submodule, and the output end of the addition submodule is connected with the input end of the fusion submodule;
The fusion submodule comprises: the fusion convolution layer, the normalization layer and the PReLU activation function are sequentially connected; the input end of the fusion convolution layer is the input end of the fusion submodule, and the output end of the PReLU activation function is the output end of the fusion submodule;
the pixel weight determining module comprises a weight determining convolution layer, a normalization layer and a Sigmoid activating function which are sequentially connected; the input end of the weight determination convolution layer is the input end of the pixel weight determination module, and the output end of the Sigmoid activation function is the output end of the pixel weight determination module;
the step of estimating the transmission map of the RGB image to be restored and the background light image to obtain the transmission map of the RGB image to be restored includes:
splicing the RGB image to be restored with the background light image to obtain a background light restoration image;
performing transmission map estimation on the background light restoration image by using a transmission map estimation network to obtain a transmission map of an RGB image to be restored;
the transmission map estimation network comprises: the device comprises a first layer of feature extraction unit, a first addition unit, a first layer of feature extraction fusion unit, a transmission diagram estimation unit, a first downsampling unit, a second layer of feature extraction unit, a second addition unit, a second layer of feature extraction fusion unit, a first upsampling unit, a second downsampling unit, a third layer of feature extraction fusion unit and a second upsampling unit;
The background light restoration image is respectively input into the input ends of the feature extraction unit and the first downsampling unit of the first layer, the output end of the transmission image estimation unit is the output end of the transmission image estimation network, and the transmission image of the RGB image to be restored is output;
the output end of the first layer of feature extraction unit is connected with the input end of the first addition unit, the output end of the first downsampling unit is connected with the input ends of the second downsampling unit and the second layer of feature extraction unit, the output end of the second layer of feature extraction unit is connected with the input end of the second addition unit, the output end of the second downsampling unit is connected with the input end of the third layer of feature extraction fusion unit, the output end of the third layer of feature extraction fusion unit is connected with the input end of the second upsampling unit, the output end of the second up-sampling unit is connected with the input end of the second adding unit, the output end of the second adding unit is connected with the input end of the second layer of feature extraction fusion unit, the output end of the second layer of feature extraction fusion unit is connected with the input end of the first up-sampling unit, the output end of the first up-sampling unit is connected with the input end of the first adding unit, the output end of the first adding unit is connected with the input end of the first layer of feature extraction fusion unit, and the output end of the first layer of feature extraction fusion unit is connected with the input end of the transmission diagram estimation unit;
The feature extraction unit comprises a feature extraction module;
the feature extraction fusion unit includes:
the device comprises a first shallow feature extraction module, a second shallow feature extraction module, a deep feature extraction module, a multiplication module and an addition module which are connected in sequence; the output end of the second shallow feature extraction module is connected with the input ends of the multiplication module and the addition module, the output end of the deep feature extraction module is connected with the input end of the multiplication module, and the output end of the multiplication module is connected with the input end of the addition module;
the deep feature extraction module is a spatial attention mechanism operation module;
the input end of the first shallow feature extraction module is the input end of the feature extraction fusion unit, and the output end of the addition module is the output end of the feature extraction fusion unit;
the transmission map estimation unit comprises a transmission map estimation convolution layer, a normalization layer and a Sigmoid activation function which are connected in sequence;
the input end of the transmission diagram estimation convolution layer is the input end of the transmission diagram estimation unit, and the output end of the Sigmoid activation function is the output end of the transmission diagram estimation unit.
2. The underwater image restoration method according to claim 1, wherein the generating an initial restoration image of the RGB image to be restored based on the background light image and the transmission map includes:
acquiring an initial restoration image J_c of the RGB image to be restored by the formula:

J_c(x) = (I_c(x) − B_c) / t_c(x) + B_c

wherein t_c represents the transmission map, I_c represents the RGB image to be restored, x represents the position of a pixel, and B_c represents the background light image, B_c = {B_r, B_g, B_b}, where B_r represents the red background light estimated value, B_g represents the green background light estimated value, and B_b represents the blue background light estimated value.
3. The underwater image restoration method according to claim 1, wherein the adjusting the brightness of the pixels satisfying a preset condition in the initial restoration image to obtain a final restoration image includes:
extracting features of the initial restoration image by using a brightness adjustment network, determining pixels meeting preset conditions in the initial restoration image, and adjusting brightness of the pixels meeting the preset conditions in the initial restoration image to obtain the final restoration image;
the brightness adjustment network includes: the device comprises an HSV conversion unit, a feature extraction unit, a channel attention unit, a channel multiplication unit, a space attention unit, a space multiplication unit, a feature addition unit, an image fusion unit, an HSV reverse conversion unit and an image addition unit;
The input ends of the HSV conversion unit and the image addition unit are the input ends of the brightness adjustment network, the initial restoration image is received, the output end of the image addition unit is the output end of the brightness adjustment network, and the final restoration image is output;
the output end of the HSV conversion unit is connected with the input end of the feature extraction unit, the output end of the feature extraction unit is connected with the input ends of the channel attention unit, the space attention unit, the channel multiplication unit and the space multiplication unit, the output end of the channel attention unit is connected with the input end of the channel multiplication unit, the output end of the space attention unit is connected with the input end of the space multiplication unit, the output end of the channel multiplication unit and the output end of the space multiplication unit are both connected with the input end of the feature addition unit, the output end of the feature addition unit is connected with the input end of the image fusion unit, the output end of the image fusion unit is connected with the input end of the HSV inversion unit, and the output end of the HSV inversion unit is connected with the input end of the image addition unit;
The image fusion unit comprises a fusion convolution layer, a normalization layer and a Sigmoid activation function which are sequentially connected;
the input end of the fusion convolution layer is the input end of the image fusion unit, and the output end of the Sigmoid activation function is the output end of the image fusion unit.
4. A method of underwater image restoration as claimed in claim 3, wherein said adjusting the brightness of the pixels satisfying a preset condition in the initial restoration image includes:
if the brightness of the pixel is larger than the brightness upper limit preset value, adjusting the brightness of the pixel to the brightness upper limit preset value;
and if the brightness of the pixel is smaller than the preset brightness lower limit value, adjusting the brightness of the pixel to the preset brightness lower limit value.
5. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the underwater image restoration method as claimed in any of claims 1 to 4 when executing the computer program.
6. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the underwater image restoration method according to any one of claims 1 to 4.
CN202311256209.XA 2023-09-27 2023-09-27 Underwater image restoration method, device, equipment and medium Active CN117218033B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311256209.XA CN117218033B (en) 2023-09-27 2023-09-27 Underwater image restoration method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311256209.XA CN117218033B (en) 2023-09-27 2023-09-27 Underwater image restoration method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN117218033A CN117218033A (en) 2023-12-12
CN117218033B true CN117218033B (en) 2024-03-12

Family

ID=89050926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311256209.XA Active CN117218033B (en) 2023-09-27 2023-09-27 Underwater image restoration method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117218033B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180050832A (en) * 2016-11-07 2018-05-16 한국과학기술원 Method and system for dehazing image using convolutional neural network
CN111415304A (en) * 2020-02-26 2020-07-14 中国农业大学 Underwater vision enhancement method and device based on cascade deep network
WO2021046752A1 (en) * 2019-09-11 2021-03-18 Covidien Lp Systems and methods for neural-network based color restoration
CN114511480A (en) * 2022-01-25 2022-05-17 江苏科技大学 Underwater image enhancement method based on fractional order convolution neural network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017175231A1 (en) * 2016-04-07 2017-10-12 Carmel Haifa University Economic Corporation Ltd. Image dehazing and restoration

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180050832A (en) * 2016-11-07 2018-05-16 한국과학기술원 Method and system for dehazing image using convolutional neural network
WO2021046752A1 (en) * 2019-09-11 2021-03-18 Covidien Lp Systems and methods for neural-network based color restoration
CN111415304A (en) * 2020-02-26 2020-07-14 中国农业大学 Underwater vision enhancement method and device based on cascade deep network
CN114511480A (en) * 2022-01-25 2022-05-17 江苏科技大学 Underwater image enhancement method based on fractional order convolution neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Multi-scale enhancement fusion for underwater sea cucumber images based on human visual system modelling;Pengfei Guo 等;《Computers and Electronics in Agriculture》;20200710;第175卷;1-13 *
双背景光自适应融合与透射图精准估计水下图像复原;郑建华等;《农业工程学报》;20220731;第38卷(第14期);174-182 *
基于背景光修正成像模型的水下图像复原;周妍 等;《电子与信息学报》;20221031;第44卷(第10期);3363-3371 *

Also Published As

Publication number Publication date
CN117218033A (en) 2023-12-12

Similar Documents

Publication Publication Date Title
Raveendran et al. Underwater image enhancement: a comprehensive review, recent trends, challenges and applications
Li et al. Underwater image enhancement via medium transmission-guided multi-color space embedding
WO2020259118A1 (en) Method and device for image processing, method and device for training object detection model
CN111209952B (en) Underwater target detection method based on improved SSD and migration learning
CN111402146B (en) Image processing method and image processing apparatus
CN109753878B (en) Imaging identification method and system under severe weather
CN111079764B (en) Low-illumination license plate image recognition method and device based on deep learning
Wang et al. Deep learning-based visual detection of marine organisms: A survey
CN114972107A (en) Low-illumination image enhancement method based on multi-scale stacked attention network
Wu et al. FW-GAN: Underwater image enhancement using generative adversarial network with multi-scale fusion
Ding et al. Jointly adversarial network to wavelength compensation and dehazing of underwater images
Moghimi et al. Real-time underwater image resolution enhancement using super-resolution with deep convolutional neural networks
CN111160293A (en) Small target ship detection method and system based on characteristic pyramid network
Zhang et al. Hierarchical attention aggregation with multi-resolution feature learning for GAN-based underwater image enhancement
CN116681636A (en) Light infrared and visible light image fusion method based on convolutional neural network
CN115035010A (en) Underwater image enhancement method based on convolutional network guided model mapping
CN116863320A (en) Underwater image enhancement method and system based on physical model
Chang et al. A self-adaptive single underwater image restoration algorithm for improving graphic quality
CN117218033B (en) Underwater image restoration method, device, equipment and medium
Liu et al. Learning multiscale pipeline gated fusion for underwater image enhancement
CN115880176A (en) Multi-scale unpaired underwater image enhancement method
Li et al. Multi-scale fusion framework via retinex and transmittance optimization for underwater image enhancement
CN114926359A (en) Underwater image enhancement method combining bicolor space recovery and multistage decoding structure
Deluxni et al. A Scrutiny on Image Enhancement and Restoration Techniques for Underwater Optical Imaging Applications
Niu et al. Underwater Waste Recognition and Localization Based on Improved YOLOv5.

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant