CN113538702A - Method for generating underwater scene panoramic image of marine culture area

Method for generating underwater scene panoramic image of marine culture area

Info

Publication number
CN113538702A
CN113538702A CN202110738641.7A CN202110738641A CN113538702A CN 113538702 A CN113538702 A CN 113538702A CN 202110738641 A CN202110738641 A CN 202110738641A CN 113538702 A CN113538702 A CN 113538702A
Authority
CN
China
Prior art keywords
feature map
mask
feature
underwater
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110738641.7A
Other languages
Chinese (zh)
Other versions
CN113538702B (en)
Inventor
付先平
姚冰
苏知青
袁国良
王辉兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Maritime University
Original Assignee
Dalian Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Maritime University
Priority to CN202110738641.7A
Publication of CN113538702A
Application granted
Publication of CN113538702B
Active legal-status Current
Anticipated expiration legal-status Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/003Navigation within 3D models or images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/32Indexing scheme for image data processing or generation, in general involving image mosaicing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/80Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in fisheries management
    • Y02A40/81Aquaculture, e.g. of fish

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computer Hardware Design (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for generating a panoramic image of an underwater scene in a marine culture area. A walking path is planned for an underwater robot, which acquires underwater images; marine product target features are extracted from the acquired images to obtain a local feature map and a global feature map. Target features are extracted from the local feature map by a target detection algorithm model to obtain a local mask feature map with strong robustness; mask processing of the target feature map yields a target mask feature map, and a matrix mask operation between the target mask feature map and the global feature map yields the representative feature map of the image. The images are exposure-processed and checked for whether they show the same object, and images of the same object are stitched with a splicing algorithm to obtain an underwater stitched image. After the robot completes the stitching of the underwater images in the current target area, the observation effect is clearer and the obtained data more accurate and precise.

Description

Method for generating underwater scene panoramic image of marine culture area
Technical Field
The invention relates to the technical field of underwater image acquisition, in particular to a method for generating a panoramic image of an underwater scene in a marine culture area.
Background
At present, when an underwater robot carries out close-range underwater target identification, the condition of the target needs to be observed clearly in real time. Owing to the dynamic change of underwater target distribution, the severe attenuation and scattering of underwater light, and the limited visible range of an underwater camera, a single image acquired at a limited distance and viewing angle can hardly record the identified target completely, so observers find it difficult to grasp the seabed environment comprehensively.
Disclosure of Invention
The invention provides a method for generating a panoramic view of an underwater scene in a marine culture area.
The technical means adopted by the invention are as follows:
a method for generating a panoramic view of an underwater scene in a marine culture area comprises the following steps:
step 1, planning a walking route of an underwater robot, and controlling the motion direction of the underwater robot according to the walking route;
step 2, acquiring a plurality of images of a visible area of the underwater robot through a plurality of high-definition cameras arranged on the underwater robot;
step 3, extracting marine product target features of the collected images by using a local feature extractor and a global feature extractor to obtain a local feature map and a global feature map;
step 4, extracting target features from the local feature map through a target detection algorithm model to obtain a local mask feature map with strong robustness, performing mask processing on the target feature map through a softmax function to obtain a target mask feature map, and performing matrix mask operation on the target mask feature map and the global feature map to obtain a representative feature map in the image;
step 5, sending the representative feature map into a DCP network for local guidance operation, increasing the weights of feature-map channels carrying more information and reducing the weights of unimportant channels carrying less information;
step 6, connecting the global feature map with the representative feature map to obtain a fusion feature map;
step 7, carrying out target re-identification detection on the multiple images in the visible area, and if the same object appears in two or more images, carrying out exposure processing on the images in which the same object is detected to obtain exposure processing images;
step 8, judging whether the exposure processing images have objects with the same characteristics, if not, returning to the step 2, and if so, executing the step 9;
step 9, carrying out image splicing on the images of the detected same object by using a splicing algorithm;
step 10, obstacle avoidance is carried out by using acoustic equipment;
step 11, whether an obstacle is encountered or not is judged, if yes, the step 1 is returned, and if not, the step 12 is executed;
step 12, storing the spliced image;
and step 13, judging whether the process is finished or not, and if not, returning to the step 1.
Further, the backbone network of the global feature extractor comprises two standard 3 × 3 convolutions for global feature extraction, a ReLU function for preventing overfitting when training the network, and a non-local attention mechanism network for increasing the correlation between pixels;
the local feature extractor includes a yolov4 single stage detector network for focusing on regions in the underwater image with obvious objects and a mask network for detecting the resulting local region feature map.
Further, the step 4 comprises the following steps:
step 40, setting candidate frames with high detection precision and high confidence, wherein the candidate frames contain the relevant attributes of the underwater image;
step 41, after selecting the top-D candidate regions with the highest detection precision, an index i ∈ {1, 2, ..., D} denotes each selected representative feature, and the spatial region covered by the i-th representative feature is denoted A_i; for each candidate region A_i, a binary feature matrix is obtained by assigning 1 to the pixels inside the region and 0 to the remaining pixels, with the formula:
M_i(x, y) = { 1, if (x, y) ∈ A_i; 0, otherwise }
wherein M_i ∈ {0, 1}^(C×K) and all A_i lie within the C×K range to ensure that all partial mask regions are inside the image region; after the global feature F_g and the partial masks {M_i}_{i=1}^{D} are obtained, the partial masks are mapped onto the global feature to obtain a set of partial-mask-based feature maps {F_i}_{i=1}^{D};
step 42, retaining only the most representative features by using the mask network to obtain the mask features, and then combining them with the global feature F_g; for each partial region i, the mask feature map F_i is obtained as:
F_i = M_i ⊙ F_g
wherein ⊙ denotes an element-wise operation applied to each channel of the global feature F_g, F_i is the mapping of the partial mask features of part i, only the partial mask feature of part i is activated in F_i, and F_i ∈ R^(C×K×G); the global feature F_g and each mask feature map F_i are connected in the channel dimension to supplement the information of the underwater target lost during convolution, yielding multiple features with strong robustness.
Further, the step 5 comprises the following steps:
step 50, applying global average pooling to each mask feature map F_i, so that each channel of F_i corresponds to a different feature of the image;
step 51, performing a Softmax operation on the maximum value of the channel pixels of each mask feature map F_i to obtain a weight vector w representing the importance of each representative feature, and normalizing w so that its elements sum to 1, which makes the relative importance among different features more evident;
step 52, adding the global feature F_g to enhance the importance of the representative feature regions, thereby obtaining the most representative feature:
F_p = F_g + Σ_{i=1}^{D} w_i ⊙ F_i
wherein w = [w_1, ..., w_D] is the weight vector of step 51, which can be predicted by:
w_i = softmax_i( μ( mgap(F_i); σ_μ ) )
where μ(·) denotes a learning function, σ_μ is the parameter of μ(·), and mgap(·) denotes global average pooling.
Further, the planning of the walking route of the underwater robot adopts an obstacle avoidance algorithm or a PID algorithm;
the advancing direction of the propellers is controlled through an actor-critic algorithm and the motion trajectory of the robot is planned, with the algorithm formulated as follows:
(1) the critic is defined as the state-action function Q:
Q^π(x_t, u_t) = E[ r(x_t, u_t) + γ Q^π(x_{t+1}, μ(x_{t+1})) ]
where Q is the state-action function, π is the reward-penalty strategy, γ ∈ [0, 1] is the decay factor, and u_t is the action taken at time t, which can be learned from the transition from state x_t to x_{t+1}; the state x_t comprises the current movement direction and the current position of the underwater robot;
(2) when the target strategy is fixed, Q can be learned offline; the formula for updating Q is:
Q_w(x_t, u_t) ← Q_w(x_t, u_t) + α[ r(x_t, u_t) + γ Q_w(x_{t+1}, u_{t+1}) − Q_w(x_t, u_t) ]
Q_w is updated until Q_w ≈ Q^π, where Q_w corresponds to the specific position coordinate point of the robot movement;
(3) the actor is defined as the state value function:
τ(μ_θ) = ∫ ρ^μ r(x_t, μ) dx = E[ r(x_t, μ_θ(x_t)) ]
in the neural network, τ(μ_θ) is optimized by minimizing the loss function:
L(w) = (1/N) Σ_{i=1}^{N} ( y_i − Q_w(x_i, u_i) )²
where L(w) is a simple mean-square-error function, N denotes the time range of the samples, and y_i is the target state-action value obtained from the target deep neural network Q, wherein
y_i = r(x_i, u'_i) − γ Q'_w(x_i, u_i),  u'_i = μ(x_i | θ')
the gradient of the loss function is:
∇_w L(w) = (2/N) Σ_{i=1}^{N} ( Q_w(x_i, u_i) − y_i ) ∇_w Q_w(x_i, u_i)
if the actor is represented by a neural network with parameter θ, then τ(μ_θ) = Q_w(x_i, μ(x_i | θ) | w), and the gradient of τ is:
∇_θ τ(μ_θ) ≈ (1/N) Σ_{i=1}^{N} ∇_θ μ(x_i | θ) ∇_u Q_w(x_i, u) |_{u = μ(x_i | θ)}
compared with the prior art, the method for generating the underwater scene panoramic image of the marine culture area can provide a clear and complete underwater target panoramic image, so that remote control personnel on the water surface can better observe the comprehensive condition of an underwater target, and the method has very important practical significance for the underwater robot to realize target grabbing and the automatic and autonomous operation of an underwater operation robot.
Drawings
FIG. 1 is a flow chart of a method for generating a panoramic view of an underwater scene in a marine culture area, which is disclosed by the invention;
FIG. 2 is a schematic diagram of a movement path planning of an underwater robot;
FIG. 3 is a schematic diagram of an underwater target re-identification algorithm;
FIG. 4 is a schematic diagram of an underwater panoramic image generation algorithm.
Detailed Description
As shown in fig. 1 to 4, the method for generating a panoramic view of an underwater scene in a marine culture area disclosed by the invention comprises the following steps:
step 1, planning a walking route of an underwater robot, and controlling the motion direction of the underwater robot according to the walking route;
step 2, acquiring a plurality of images of a visible area of the underwater robot through a plurality of high-definition cameras arranged on the underwater robot;
step 3, extracting marine product target features of the collected images by using a local feature extractor and a global feature extractor to obtain a local feature map and a global feature map;
step 4, extracting target features from the local feature map through a target detection algorithm model to obtain a local mask feature map with strong robustness, performing mask processing on the target feature map through a softmax function to obtain a target mask feature map, and performing matrix mask operation on the target mask feature map and the global feature map to obtain a representative feature map in the image;
step 5, sending the representative feature map into a DCP network for local guidance operation, increasing the weights of feature-map channels carrying more information and reducing the weights of unimportant channels carrying less information;
step 6, connecting the global feature map with the representative feature map to obtain a fusion feature map;
step 7, carrying out target re-identification detection on the multiple images in the visible area, and if the same object appears in two or more images, carrying out exposure processing on the images in which the same object is detected to obtain exposure processing images;
step 8, judging whether the exposure processing images have objects with the same characteristics, if not, returning to the step 2, and if so, executing the step 9;
step 9, carrying out image splicing on the images of the detected same object by using a splicing algorithm;
step 10, obstacle avoidance is carried out by using acoustic equipment;
step 11, whether an obstacle is encountered or not is judged, if yes, the step 1 is returned, and if not, the step 12 is executed;
step 12, storing the spliced image;
and step 13, judging whether the process is finished or not, and if not, returning to the step 1.
Specifically, a robot driving route is planned and the movement direction is adjusted. In the invention, an obstacle avoidance algorithm or a PID algorithm is used to control the robot and plan its running path. The underwater robot has 8 propellers, realizes 6 degrees of freedom and supplies enough power to the robot. The advancing direction of the propellers is controlled through an actor-critic algorithm, and the motion trajectory of the robot is planned. The algorithm is formulated as follows:
(1) the critic is defined as the state-action function Q:
Q^π(x_t, u_t) = E[ r(x_t, u_t) + γ Q^π(x_{t+1}, μ(x_{t+1})) ]
Q is the state-action function, π is the strategy taken, γ ∈ [0, 1] is the decay factor, and u_t is the action taken at time t, which can be learned from the transition from state x_t to x_{t+1}.
(2) When the target strategy is fixed, Q can be learned offline; the formula for updating Q is:
Q_w(x_t, u_t) ← Q_w(x_t, u_t) + α[ r(x_t, u_t) + γ Q_w(x_{t+1}, u_{t+1}) − Q_w(x_t, u_t) ]
Q_w is updated until Q_w ≈ Q^π.
(3) the actor is defined as the state value function:
τ(μ_θ) = ∫ ρ^μ r(x_t, μ) dx = E[ r(x_t, μ_θ(x_t)) ]
In the neural network, τ(μ_θ) is optimized by minimizing the loss function:
L(w) = (1/N) Σ_{i=1}^{N} ( y_i − Q_w(x_i, u_i) )²
where L(w) is a simple mean-square-error function, N denotes the time range of the samples, and y_i is the target state-action value obtained from the target deep neural network Q, wherein
y_i = r(x_i, u'_i) − γ Q'_w(x_i, u_i),  u'_i = μ(x_i | θ')
the gradient of the loss function is:
∇_w L(w) = (2/N) Σ_{i=1}^{N} ( Q_w(x_i, u_i) − y_i ) ∇_w Q_w(x_i, u_i)
if the actor is represented by a neural network with parameter θ, then τ(μ_θ) = Q_w(x_i, μ(x_i | θ) | w), and the gradient of τ is:
∇_θ τ(μ_θ) ≈ (1/N) Σ_{i=1}^{N} ∇_θ μ(x_i | θ) ∇_u Q_w(x_i, u) |_{u = μ(x_i | θ)}
the underwater high-definition camera is used for acquiring the image of the visual area of the robot, and the camera can only capture a part of light reflected from an object due to the attenuation of the light under water and the scattering effect of suspended particles, so that the underwater image has the visualization problems of blurring, color cast and the like. The resolution of underwater images acquired by a common underwater camera is insufficient, and the accuracy of underwater target detection is influenced. Therefore, a high-quality underwater image of the visible area of the robot is acquired by using the underwater high-definition camera. Simultaneously, a plurality of high-definition cameras are installed on the robot, so that the underwater environment can be observed at multiple angles.
Further, the backbone network of the global feature extractor comprises two standard 3 × 3 convolutions for global feature extraction, a ReLU function for preventing overfitting when training the network, and a non-local attention mechanism network for increasing the correlation between pixels;
the local feature extractor comprises a yolov4 single-stage detector network for paying attention to the region with the obvious target in the underwater image and a mask network for detecting the obtained local region feature map, wherein the yolov4 single-stage detector network only pays attention to the region with the obvious target in the underwater image (the target can be fish and plankton in the sea, and the features (size, color, shape and the like) are obvious.
Specifically, feature blocks are extracted for marine product objects in an image of a visible area of the underwater robot.
(1) First, a backbone network (Global Feature Module) is composed of two standard 3 × 3 convolutions, ReLU functions and a non-local attention mechanism network; the network structure is shown in FIG. 3. The two standard convolution blocks are used for extracting the global features of the underwater image; after the global features are extracted, the attention mechanism network is used for increasing the correlation among pixels, and a global feature F_g with strong robustness is extracted. The global features are then used as input for the training and final optimization of the subsequent local feature network.
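A minimal PyTorch sketch of such a Global Feature Module is given below; the channel count and the simplified embedded non-local attention block are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Simplified non-local attention: every position attends to every other position."""
    def __init__(self, channels):
        super().__init__()
        self.theta = nn.Conv2d(channels, channels // 2, 1)
        self.phi = nn.Conv2d(channels, channels // 2, 1)
        self.g = nn.Conv2d(channels, channels // 2, 1)
        self.out = nn.Conv2d(channels // 2, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)        # (b, hw, c/2)
        k = self.phi(x).flatten(2)                           # (b, c/2, hw)
        v = self.g(x).flatten(2).transpose(1, 2)             # (b, hw, c/2)
        attn = torch.softmax(q @ k, dim=-1)                  # pairwise pixel correlation
        y = (attn @ v).transpose(1, 2).reshape(b, c // 2, h, w)
        return x + self.out(y)                               # residual connection

class GlobalFeatureModule(nn.Module):
    """Two standard 3x3 convolutions with ReLU, followed by non-local attention -> F_g."""
    def __init__(self, in_ch=3, ch=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.attn = NonLocalBlock(ch)

    def forward(self, img):
        return self.attn(self.conv(img))                     # global feature map F_g
```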
Further, the step 4 comprises the following steps:
step 40, setting candidate frames with high detection precision and high confidence, the candidate frames containing the relevant attributes of the underwater image; specifically, the detector encloses each detected object with a detection frame, and detection precision and confidence complement each other: the confidence refers to a threshold value, whether this threshold is set properly indirectly influences the detection precision, and the threshold is a probability whose range is greater than or equal to 0.5 and less than or equal to 1;
step 41, after selecting the top-D candidate regions with the highest detection precision, an index i ∈ {1, 2, ..., D} denotes each selected representative feature, and the spatial region covered by the i-th representative feature is denoted A_i; for each candidate region A_i, a binary feature matrix is obtained by assigning 1 to the pixels inside the region and 0 to the remaining pixels, with the formula:
M_i(x, y) = { 1, if (x, y) ∈ A_i; 0, otherwise }
wherein M_i ∈ {0, 1}^(C×K) and all A_i lie within the C×K range to ensure that all partial mask regions are inside the image region; after the global feature F_g and the partial masks {M_i}_{i=1}^{D} are obtained, the partial masks are mapped onto the global feature to obtain a set of partial-mask-based feature maps {F_i}_{i=1}^{D}. The mask area refers to a region of the image containing an obvious target (the target may be fish or marine plankton, generally visible and tangible, with distinctive features), detected by the detector;
step 42, retaining only the most representative features by using the mask network to obtain the mask features, and then combining them with the global feature F_g; for each partial region i, the mask feature map F_i is obtained as:
F_i = M_i ⊙ F_g
wherein ⊙ denotes an element-wise operation applied to each channel of the global feature F_g, F_i is the mapping of the partial mask features of part i, only the partial mask feature of part i is activated in F_i, and F_i ∈ R^(C×K×G); the global feature F_g and each mask feature map F_i are connected in the channel dimension to supplement the information of the underwater target lost during convolution, yielding multiple features with strong robustness.
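The following sketch illustrates steps 41 and 42 under assumed tensor shapes (a global feature of shape (C, H, W) and top-D boxes given in integer feature-map coordinates): binary masks M_i are built from the candidate regions and applied channel-wise to F_g.

```python
import torch

def mask_feature_maps(f_g, boxes):
    """f_g: global feature map of shape (C, H, W); boxes: top-D candidate regions A_i
    as (x1, y1, x2, y2). Returns the masks M_i and the masked features F_i = M_i * F_g."""
    c, h, w = f_g.shape
    masks, feats = [], []
    for (x1, y1, x2, y2) in boxes:
        m = torch.zeros(h, w)
        m[y1:y2, x1:x2] = 1.0                 # 1 inside candidate region A_i, 0 elsewhere
        masks.append(m)
        feats.append(m.unsqueeze(0) * f_g)    # mask applied to every channel of F_g
    return torch.stack(masks), torch.stack(feats)   # (D, H, W) and (D, C, H, W)

# Example: F_g with 64 channels on a 32x32 grid, two detected regions
f_g = torch.randn(64, 32, 32)
masks, feats = mask_feature_maps(f_g, [(2, 3, 10, 12), (15, 5, 28, 20)])
```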
Further, the step 5 comprises the following steps:
step 50, applying global average pooling to each mask feature map F_i; multiple features are extracted for the image according to the maximum value of the channel pixels, and each channel corresponds to a different feature of the image;
step 51, performing a Softmax operation on the maximum value of the channel pixels of each mask feature map F_i to obtain a weight vector w representing the importance of each representative feature, and normalizing w so that its elements sum to 1, which makes the relative importance among different features more evident;
step 52, adding the global feature F_g to enhance the importance of the representative feature regions, thereby obtaining the most representative feature:
F_p = F_g + Σ_{i=1}^{D} w_i ⊙ F_i
wherein w = [w_1, ..., w_D] is the weight vector of step 51, which can be predicted by:
w_i = softmax_i( μ( mgap(F_i); σ_μ ) )
where μ(·) denotes a learning function, σ_μ is the parameter of μ(·), and mgap(·) denotes global average pooling. The importance of each feature is highlighted by its weight, and the result is added to the global feature map F_g so as to further highlight the most representative features of underwater targets. Finally, because the global feature and the local features compensate each other, the feature fusion module connects the global feature F_g and the most representative feature F_p in the channel dimension to obtain the fusion feature F_f. Since the global feature and the representative-feature-based part features provide complementary information, the global feature and the most representative feature are concatenated in the channel dimension, with the formula:
F_f = concat(F_g, F_p),  F_f ∈ R^(H×W×2C)
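A compact sketch of the weighting and fusion described above, under assumed shapes and with the learning function μ(·) approximated by a single linear layer, is given below.

```python
import torch
import torch.nn as nn

def fuse(f_g, f_parts, mu):
    """f_g: (C, H, W) global feature; f_parts: (D, C, H, W) mask feature maps F_i;
    mu: small learnable layer mapping a C-dim pooled vector to a scalar importance score."""
    d = f_parts.shape[0]
    pooled = f_parts.mean(dim=(2, 3))                     # mgap(F_i): global average pooling -> (D, C)
    scores = torch.stack([mu(pooled[i]) for i in range(d)]).squeeze(-1)   # (D,)
    w = torch.softmax(scores, dim=0)                      # importance weights, sum to 1
    f_p = f_g + (w.view(d, 1, 1, 1) * f_parts).sum(0)     # most representative feature F_p
    return torch.cat([f_g, f_p], dim=0)                   # F_f = concat(F_g, F_p) -> (2C, H, W)

# Example usage with assumed channel count C = 64 and D = 4 parts
f_g = torch.randn(64, 32, 32)
f_parts = torch.randn(4, 64, 32, 32)
f_f = fuse(f_g, f_parts, nn.Linear(64, 1))
```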
the specific process of exposing the image of the same object in step 6 is as follows: the method comprises the steps that a camera collects multiple images in real time, the images are sent to an underwater image re-recognition network, in the network training process, the network extracts features of underwater image target objects such as sea cucumbers, scallops and the like through a local module and a global module, then feature recombination is carried out on the local features and the global features to obtain robust features, then the features are put into a DCP network to further enhance the features, finally the output features and the global features of the network are connected in channel dimensions, whether the images shot by the camera are the same object or not is judged through the features, if yes, the next step is carried out, and otherwise, the step 2 is carried out in a circulating mode. The step realizes the function of detecting whether the multiple pictures contain the same object, realizes whether the multiple pictures have the same object mark, namely whether each picture contains the same object, and considers that the multiple pictures contain the same target if the contact ratio is high (the threshold value is more than or equal to 0.5), otherwise, the steps are opposite.
In step 8, image splicing is performed by adopting a Multi-Band algorithm.
When fusing macroscopic features, a large smooth transition region is adopted, and when fusing local details, a small smooth transition region is adopted; the image is decomposed into a weighted sum of components of different frequency bands, where the macroscopic features of the image lie in its low-frequency bands and the local details in its high-frequency bands. The image is expanded by frequency into a pyramid, the high- and low-frequency components are smoothly weighted and superposed in different ways, and the frequency-band components are summed again to obtain the final fusion result.
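The following OpenCV/NumPy sketch illustrates such Multi-Band (Laplacian-pyramid) blending for two already-aligned overlapping images and a 0/1 seam mask; the pyramid depth and data handling are illustrative assumptions.

```python
import cv2
import numpy as np

def multiband_blend(img_a, img_b, mask, levels=5):
    """Blend two aligned images: low-frequency bands are mixed with a wide smooth
    transition, high-frequency bands with a narrow one (the mask is blurred per level)."""
    ga, gb, gm = [img_a.astype(np.float32)], [img_b.astype(np.float32)], [mask.astype(np.float32)]
    for _ in range(levels):
        ga.append(cv2.pyrDown(ga[-1]))
        gb.append(cv2.pyrDown(gb[-1]))
        gm.append(cv2.pyrDown(gm[-1]))
    blended = None
    for lvl in range(levels, -1, -1):
        if lvl == levels:
            la, lb = ga[lvl], gb[lvl]                                  # coarsest level: low frequency
        else:
            size = (ga[lvl].shape[1], ga[lvl].shape[0])
            la = ga[lvl] - cv2.pyrUp(ga[lvl + 1], dstsize=size)        # Laplacian band of image A
            lb = gb[lvl] - cv2.pyrUp(gb[lvl + 1], dstsize=size)        # Laplacian band of image B
        m = gm[lvl]
        if m.ndim == 2 and la.ndim == 3:
            m = m[..., None]
        band = la * m + lb * (1.0 - m)                                 # weighted sum per frequency band
        if blended is None:
            blended = band
        else:
            size = (band.shape[1], band.shape[0])
            blended = cv2.pyrUp(blended, dstsize=size) + band          # re-add the band components
    return np.clip(blended, 0, 255).astype(np.uint8)
```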
Finally, after the robot completes the splicing of the underwater images in the current target area, super-resolution reconstruction technology is used to sharpen the panoramic image, so that the observation effect is clearer and the obtained underwater data are more accurate and precise.
The above description covers only the preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto; any equivalent replacement or modification of the technical solutions and inventive concept of the present invention that a person skilled in the art can readily conceive within the technical scope disclosed herein shall fall within the scope of protection of the present invention.

Claims (5)

1. A method for generating a panoramic image of an underwater scene in a marine culture area, characterized in that the method comprises the following steps:
step 1, planning a walking route of an underwater robot, and controlling the motion direction of the underwater robot according to the walking route;
step 2, acquiring a plurality of images of a visible area of the underwater robot through a plurality of high-definition cameras arranged on the underwater robot;
step 3, extracting marine product target features of the collected images by using a local feature extractor and a global feature extractor to obtain a local feature map and a global feature map;
step 4, extracting target features from the local feature map through a target detection algorithm model to obtain a local mask feature map with strong robustness, performing mask processing on the target feature map through a softmax function to obtain a target mask feature map, and performing matrix mask operation on the target mask feature map and the global feature map to obtain a representative feature map in the image;
step 5, sending the representative feature map into a DCP network for local guidance operation, increasing the weights of feature-map channels carrying more information and reducing the weights of unimportant channels carrying less information;
step 6, connecting the global feature map with the representative feature map to obtain a fusion feature map;
step 7, carrying out target re-identification detection on the multiple images in the visible area, and if the same object appears in two or more images, carrying out exposure processing on the images in which the same object is detected to obtain exposure processing images;
step 8, judging whether the exposure processing images have objects with the same characteristics, if not, returning to the step 2, and if so, executing the step 9;
step 9, carrying out image splicing on the images of the detected same object by using a splicing algorithm;
step 10, obstacle avoidance is carried out by using acoustic equipment;
step 11, whether an obstacle is encountered or not is judged, if yes, the step 1 is returned, and if not, the step 12 is executed;
step 12, storing the spliced image;
and step 13, judging whether the process is finished or not, and if not, returning to the step 1.
2. The method for generating the underwater scene panorama of the mariculture area as claimed in claim 1, wherein: the backbone network of the global feature extractor comprises two standard 3 × 3 convolutions for global feature extraction, a ReLU function for preventing overfitting when training the network, and a non-local attention mechanism network for increasing the correlation between pixels;
the local feature extractor includes a yolov4 single stage detector network for focusing on regions in the underwater image with obvious objects and a mask network for detecting the resulting local region feature map.
3. The method for generating the underwater scene panorama of the mariculture area as claimed in claim 1, wherein: the step 4 comprises the following steps:
step 40, setting candidate frames with high detection precision and high confidence, wherein the candidate frames contain the relevant attributes of the underwater image;
step 41, after selecting the top-D candidate regions with the highest detection precision, an index i ∈ {1, 2, ..., D} denotes each selected representative feature, and the spatial region covered by the i-th representative feature is denoted A_i; for each candidate region A_i, a binary feature matrix is obtained by assigning 1 to the pixels inside the region and 0 to the remaining pixels, with the formula:
M_i(x, y) = { 1, if (x, y) ∈ A_i; 0, otherwise }
wherein M_i ∈ {0, 1}^(C×K) and all A_i lie within the C×K range to ensure that all partial mask regions are inside the image region; after the global feature F_g and the partial masks {M_i}_{i=1}^{D} are obtained, the partial masks are mapped onto the global feature to obtain a set of partial-mask-based feature maps {F_i}_{i=1}^{D};
step 42, retaining only the most representative features by using the mask network to obtain the mask features, and then combining them with the global feature F_g; for each partial region i, the mask feature map F_i is obtained as:
F_i = M_i ⊙ F_g
wherein ⊙ denotes an element-wise operation applied to each channel of the global feature F_g, F_i is the mapping of the partial mask features of part i, only the partial mask feature of part i is activated in F_i, and F_i ∈ R^(C×K×G); the global feature F_g and each mask feature map F_i are connected in the channel dimension to supplement the information of the underwater target lost during convolution, yielding multiple features with strong robustness.
4. The method for generating the panoramic view of the underwater scene in the mariculture area as claimed in claim 3, wherein: the step 5 comprises the following steps:
step 50, applying global average pooling to each mask feature map F_i, so that each channel of F_i corresponds to a different feature of the image;
step 51, performing a Softmax operation on the maximum value of the channel pixels of each mask feature map F_i to obtain a weight vector w representing the importance of each representative feature, and normalizing w so that its elements sum to 1, which makes the relative importance among different features more evident;
step 52, adding the global feature F_g to enhance the importance of the representative feature regions, thereby obtaining the most representative feature:
F_p = F_g + Σ_{i=1}^{D} w_i ⊙ F_i
wherein w = [w_1, ..., w_D] is the weight vector of step 51, which can be predicted by:
w_i = softmax_i( μ( mgap(F_i); σ_μ ) )
where μ(·) denotes a learning function, σ_μ is the parameter of μ(·), and mgap(·) denotes global average pooling.
5. The method for generating the underwater scene panorama of the mariculture area as claimed in claim 4, wherein:
the planning of the walking route of the underwater robot adopts an obstacle avoidance algorithm or a PID algorithm;
the advancing direction of the propellers is controlled through an actor-critic algorithm and the motion trajectory of the robot is planned, with the algorithm formulated as follows:
(1) the critic is defined as the state-action function Q:
Q^π(x_t, u_t) = E[ r(x_t, u_t) + γ Q^π(x_{t+1}, μ(x_{t+1})) ]
where Q is the state-action function, π is the reward-penalty strategy, γ ∈ [0, 1] is the decay factor, and u_t is the action taken at time t, which can be learned from the transition from state x_t to x_{t+1}; the state x_t comprises the current movement direction and the current position of the underwater robot;
(2) when the target strategy is fixed, Q can be learned offline; the formula for updating Q is:
Q_w(x_t, u_t) ← Q_w(x_t, u_t) + α[ r(x_t, u_t) + γ Q_w(x_{t+1}, u_{t+1}) − Q_w(x_t, u_t) ]
Q_w is updated until Q_w ≈ Q^π, where Q_w corresponds to the specific position coordinate point of the robot movement;
(3) the actor is defined as the state value function:
τ(μ_θ) = ∫ ρ^μ r(x_t, μ) dx = E[ r(x_t, μ_θ(x_t)) ]
in the neural network, τ(μ_θ) is optimized by minimizing the loss function:
L(w) = (1/N) Σ_{i=1}^{N} ( y_i − Q_w(x_i, u_i) )²
where L(w) is a simple mean-square-error function, N denotes the time range of the samples, and y_i is the target state-action value obtained from the target deep neural network Q, wherein
y_i = r(x_i, u'_i) − γ Q'_w(x_i, u_i),  u'_i = μ(x_i | θ')
the gradient of the loss function is:
∇_w L(w) = (2/N) Σ_{i=1}^{N} ( Q_w(x_i, u_i) − y_i ) ∇_w Q_w(x_i, u_i)
if the actor is represented by a neural network with parameter θ, then τ(μ_θ) = Q_w(x_i, μ(x_i | θ) | w), and the gradient of τ is:
∇_θ τ(μ_θ) ≈ (1/N) Σ_{i=1}^{N} ∇_θ μ(x_i | θ) ∇_u Q_w(x_i, u) |_{u = μ(x_i | θ)}
CN202110738641.7A 2021-06-30 2021-06-30 Ocean cultivation area underwater scene panorama generation method Active CN113538702B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110738641.7A CN113538702B (en) 2021-06-30 2021-06-30 Ocean cultivation area underwater scene panorama generation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110738641.7A CN113538702B (en) 2021-06-30 2021-06-30 Ocean cultivation area underwater scene panorama generation method

Publications (2)

Publication Number Publication Date
CN113538702A true CN113538702A (en) 2021-10-22
CN113538702B CN113538702B (en) 2023-05-23

Family

ID=78097402

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110738641.7A Active CN113538702B (en) 2021-06-30 2021-06-30 Ocean cultivation area underwater scene panorama generation method

Country Status (1)

Country Link
CN (1) CN113538702B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013116100A1 (en) * 2012-01-30 2013-08-08 Google Inc. Apparatus and method for acquiring underwater images
US20140192144A1 (en) * 2013-01-05 2014-07-10 Patrick A. St. Clair Spherical panoramic image camera rig
US20170214836A1 (en) * 2016-01-27 2017-07-27 Kyocera Corporation Electronic apparatus, method for controlling electronic apparatus, and non-transitory computer readable recording medium
CN109993091A (en) * 2019-03-25 2019-07-09 浙江大学 A kind of monitor video object detection method eliminated based on background

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114332566A (en) * 2021-12-28 2022-04-12 中国航天空气动力技术研究院 Target detection method, system and device for underwater image
CN115359378A (en) * 2022-10-22 2022-11-18 长岛国家海洋公园管理中心(庙岛群岛海豹省级自然保护区管理中心) Ocean fishing equipment for determining fishing path based on offshore marine garbage distribution
CN117094895A (en) * 2023-09-05 2023-11-21 杭州一隅千象科技有限公司 Image panorama stitching method and system
CN117094895B (en) * 2023-09-05 2024-03-26 杭州一隅千象科技有限公司 Image panorama stitching method and system

Also Published As

Publication number Publication date
CN113538702B (en) 2023-05-23

Similar Documents

Publication Publication Date Title
CN113538702A (en) Method for generating underwater scene panoramic image of marine culture area
WO2021142902A1 (en) Danet-based unmanned aerial vehicle coastline floating garbage inspection system
Wang et al. Real-time underwater onboard vision sensing system for robotic gripping
KR102530691B1 (en) Device and method for monitoring a berthing
CN114248893B (en) Operation type underwater robot for sea cucumber fishing and control method thereof
CN110889844B (en) Coral distribution and health condition assessment method based on deep clustering analysis
CN109919026B (en) Surface unmanned ship local path planning method
Levy et al. Automated analysis of marine video with limited data
KR20210007767A (en) Autonomous navigation ship system for removing sea waste based on deep learning-vision recognition
CN109859202B (en) Deep learning detection method based on USV water surface optical target tracking
CN110941996A (en) Target and track augmented reality method and system based on generation of countermeasure network
CN110335245A (en) Cage netting damage monitoring method and system based on monocular space and time continuous image
CN116255908B (en) Underwater robot-oriented marine organism positioning measurement device and method
CN112927264B (en) Unmanned aerial vehicle tracking shooting system and RGBD tracking method thereof
Rasmussen et al. Deep census: AUV-based scallop population monitoring
CN111723823B (en) Underwater target detection method based on third party transfer learning
CN110866548A (en) Infrared intelligent matching identification and distance measurement positioning method and system for insulator of power transmission line
CN113591592B (en) Overwater target identification method and device, terminal equipment and storage medium
CN116659516B (en) Depth three-dimensional attention visual navigation method and device based on binocular parallax mechanism
JP7375901B2 (en) Display device, display method, program and image projection system
Chen et al. Robust autonomous landing of UAVs in non-cooperative environments based on comprehensive terrain understanding
CN116824319A (en) Fusion method, device and storage medium of infrared image and visible light image
Liu et al. Deep underwater monocular depth estimation with single-beam echosounder
CN110393165A (en) A kind of off-lying sea cultivation net cage bait-throwing method based on Autoamtic bait putting ship
Xin et al. ULL-SLAM: underwater low-light enhancement for the front-end of visual SLAM

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant