CN113052885B - Underwater environment safety assessment method based on optical flow and depth estimation - Google Patents

Underwater environment safety assessment method based on optical flow and depth estimation

Info

Publication number
CN113052885B
Authority
CN
China
Prior art keywords
optical flow
underwater
source image
depth estimation
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110332203.0A
Other languages
Chinese (zh)
Other versions
CN113052885A (en)
Inventor
王楠
张兴
辛国玲
杨学文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ocean University of China
Original Assignee
Ocean University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ocean University of China
Priority to CN202110332203.0A
Publication of CN113052885A
Application granted
Publication of CN113052885B
Legal status: Active (Current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of computer vision and discloses an underwater environment safety assessment method based on optical flow and depth estimation. The method first collects underwater source images of various underwater dynamic scenes with a monocular camera to form a data set (step S1); then performs distance estimation on each underwater source image in the data set to obtain a corresponding depth estimation map (step S2); performs optical flow analysis on each underwater source image in the data set to obtain an optical flow map (step S3); and finally fuses the depth estimation map and the optical flow map to obtain an underwater environment safety assessment map for each underwater source image (step S4). The resulting safety assessment map provides important support for subsequent autonomous tasks such as path planning, autonomous obstacle avoidance and grasping, and can guide the robot toward optimal behavior at a higher level.

Description

Underwater environment safety assessment method based on optical flow and depth estimation
Technical Field
The invention relates to the technical field of computer vision, and in particular to an underwater environment safety assessment method based on optical flow and depth estimation.
Background
The ocean is an important resource on which humanity's future depends. To secure greater benefit from the ocean, large numbers of underwater robots are continuously being deployed for ocean surveys, scientific experiments, resource exploration and similar tasks. Because the underwater application environment is highly complex and strongly time-varying, underwater robots face many challenges in key technologies such as accurate navigation, autonomous hazard avoidance and cooperative control. Whatever the type of underwater robot, its mission can be completed smoothly only if navigation safety is guaranteed. Safety assessment of the marine environment can therefore provide an effective reference for underwater robots to navigate safely and complete their tasks. Compared with the traditional artificial potential field method, behavior decomposition method, optimization-based algorithms and the like, environmental safety assessment can guide the robot toward optimal behavior at a higher level. However, current research on marine environment safety assessment is still at the exploratory stage.
Disclosure of Invention
The invention provides an underwater environment safety assessment method based on optical flow and depth estimation, which addresses the following technical problem: the factors currently considered for underwater robot operation are not comprehensive enough, and no complete underwater environment safety assessment method exists to provide a safety guarantee for reasonable path planning, autonomous obstacle avoidance, grasping and similar tasks.
To solve this technical problem, the invention provides an underwater environment safety assessment method based on optical flow and depth estimation, comprising the following steps:
S1: collecting underwater source images of various underwater dynamic scenes with a monocular camera to generate an underwater source image data set;
S2: performing distance estimation on each underwater source image in the data set to obtain a corresponding depth estimation map;
S3: performing optical flow analysis on each underwater source image in the data set to obtain an optical flow map;
S4: fusing the depth estimation map and the optical flow map to obtain an underwater environment safety assessment map for each underwater source image.
Further, in step S4, the process of fusing one underwater source image specifically includes the steps of:
S41: dividing the depth estimation map of the underwater source image obtained in step S2 into sub-regions and calculating the mean depth μ(D) in each sub-region; dividing the optical flow map of the underwater source image obtained in step S3 into the same sub-regions and calculating the mean optical flow magnitude μ(F) in each sub-region;
S42: computing the histogram distribution of the optical flow vector directions within each sub-region of each optical flow map obtained in step S3, and calculating the variance σ²_H of the optical flow direction histogram in each sub-region;
S43: calculating the covariance of the optical flow direction histogram of each sub-region over a preset time period, and the mean μ_cov of that covariance over the preset time period;
S44: calculating a safety assessment value for each sub-region from the depth estimate mean μ(D), the optical flow magnitude mean μ(F), the optical flow direction histogram variance σ²_H and the optical flow histogram covariance mean μ_cov;
S45: normalizing the safety assessment values of the sub-regions to obtain the underwater environment safety assessment map of the underwater source image.
Further, in step S44, the safety assessment value S_value(i, t) is calculated as a weighted combination of the depth estimate mean μ(D_i), the optical flow magnitude mean μ(F_i), the optical flow direction histogram variance σ²_H(i) and the mean covariance μ_cov(i, t) of the optical flow direction histogram over the time interval τ → t, where i denotes the i-th sub-region, t denotes the t-th time instant within the preset time period, and ω_1, ω_2, ω_3, ω_4 denote the respective weights of these four terms.
Preferably, ω_1 = ω_2 = ω_3 = ω_4.
Preferably, ω_1 = ω_2 = ω_3 = ω_4 = 0.25.
Further, in step S2, a maximum attenuation recognition algorithm is used for distance estimation.
Further, step S2 specifically includes the steps of:
S21: restoring the underwater source image I with a maximum attenuation recognition algorithm to obtain a restored image J;
S22: extracting the red channel J_R of the restored image J and the red channel I_R of the underwater source image I, and computing a distance coefficient d from J_R and I_R;
S23: normalizing the distance coefficient d to obtain the depth estimation map.
Further, in step S3, optical flow analysis is performed with a trained MaskFlownet network, where the MaskFlownet network comprises a MaskFlownet-S network and a double-pyramid cascaded network.
Further, the process of training the MaskFlownet network includes the steps of:
(1) Training the MaskFlownet-S network
S31: training the MaskFlownet-S model on the FlyingChairs dataset, with the learning rate starting at 0.0001 and halved at 0.4M, 0.6M, 0.8M and 1M iterations, a batch size of 8 and a run of 1000k iterations;
S32: optimizing the network parameters of the model trained in step S31 on the FlyingThings3D dataset, while excluding image pairs larger than 1000 pixels, with a batch size of 4;
S33: fine-tuning the model trained in step S32 on the Sintel and KITTI datasets; when fine-tuning on Sintel, horizontal flipping is added and additive noise is removed from the data augmentation, with a batch size of 4; when fine-tuning on KITTI, the amount of rotation, scaling and squeezing in the data augmentation is reduced, with a batch size of 4;
(2) Training the double-pyramid cascaded network
S34: fixing all parameters of the MaskFlownet-S model trained in step S33, and then training the double-pyramid cascaded network following the training schedule of the MaskFlownet-S model, except that the run on the FlyingChairs dataset is 800k iterations.
The invention provides an underwater environment safety assessment method based on optical flow and depth estimation. The method first collects underwater source images of various underwater dynamic scenes with a monocular camera to form a data set (step S1); then performs distance estimation on each underwater source image in the data set to obtain a corresponding depth estimation map (step S2); performs optical flow analysis on each underwater source image in the data set to obtain an optical flow map (step S3); and finally fuses the depth estimation map and the optical flow map to obtain an underwater environment safety assessment map for each underwater source image (step S4). The resulting safety assessment map provides important support for subsequent autonomous tasks such as path planning, autonomous obstacle avoidance and grasping, and can guide the robot toward optimal behavior at a higher level.
Drawings
FIG. 1 is a flowchart of the steps of an underwater environment safety assessment method based on optical flow and depth estimation according to an embodiment of the present invention;
FIG. 2 is an architecture diagram of the MaskFlownet network according to an embodiment of the present invention;
FIG. 3 is an architecture diagram of the PWC-Net network according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The embodiments and drawings are given solely for the purpose of illustration and are not to be construed as limiting the scope of the invention, since many variations are possible without departing from its spirit and scope.
To guide a robot in performing underwater operation tasks such as path planning, autonomous obstacle avoidance and grasping, an embodiment of the invention provides an underwater environment safety assessment method based on optical flow and depth estimation, whose flow chart is shown in FIG. 1. The method specifically comprises the following steps:
S1: collecting underwater source images of various underwater dynamic scenes with a monocular camera to generate an underwater source image data set;
S2: performing distance estimation on each underwater source image in the data set to obtain a corresponding depth estimation map;
S3: performing optical flow analysis on each underwater source image in the data set to obtain an optical flow map;
S4: fusing the depth estimation map and the optical flow map to obtain an underwater environment safety assessment map for each underwater source image.
Step S2 specifically includes steps S21 to S23:
S21: restoring the underwater source image I with the maximum attenuation recognition algorithm to obtain a restored image J; this specifically comprises the following steps:
S211: estimating the global background light A:
1) Filtering the R channel of the underwater source image I with a maximum filter of adjustable window size to obtain a corresponding depth image;
2) For each image block, finding the pixels whose brightness lies in the lowest 10% of the corresponding depth image, locating the corresponding pixels in the underwater source image I, and obtaining the background light of that image block from these pixels, where the number of image blocks is v × w (2 × 2 in this embodiment);
3) Integrating the background lights of all image blocks to estimate the global background light A_R(x) of the R channel;
4) Following the same principle and steps as for the R channel, estimating the global background light A_G(x) of the G channel and A_B(x) of the B channel of the underwater source image I;
S212: estimating the propagation coefficient ξ:
1) Estimating the propagation coefficient ξ_R(x) of the R channel of the underwater source image according to
ξ_R(x) = ( max_{y∈Ω(x)} I_R(y) − A_R(x) ) / ( 1 − A_R(x) ),
where Ω(x) denotes a local region, y the position of a pixel within the local region, and I_R(y) the R-channel value of that pixel;
2) Calculating the propagation coefficient ξ_G(x) of the G channel and ξ_B(x) of the B channel of the underwater source image following the same principle and steps as for the R channel;
S213: obtaining the restored image J according to the underwater light propagation model I(x) = J(x)ξ(x) + A(1 − ξ(x)), where x is the position of a pixel in the underwater source image I and the restored image J.
It should also be noted that the above formula for ξ_R(x) is obtained as follows.
The classical light scattering model is I(x) = J(x)ξ(x) + A(x)(1 − ξ(x)). Since water absorbs and scatters light of different colors to different degrees, light of each color attenuates differently under water, so the propagation coefficients of the R, G and B channels must be considered separately:
I_R(x) = J_R(x)ξ_R(x) + A_R(x)(1 − ξ_R(x))
I_G(x) = J_G(x)ξ_G(x) + A_G(x)(1 − ξ_G(x))
I_B(x) = J_B(x)ξ_B(x) + A_B(x)(1 − ξ_B(x))
Taking the maximum over the local region Ω(x) on both sides of each equation, and assuming that the propagation coefficient ξ(x) and the background light A(x) are constant within Ω(x), gives:
max_{y∈Ω(x)} I_R(y) = ξ_R(x) · max_{y∈Ω(x)} J_R(y) + A_R(x)(1 − ξ_R(x))
max_{y∈Ω(x)} I_G(y) = ξ_G(x) · max_{y∈Ω(x)} J_G(y) + A_G(x)(1 − ξ_G(x))
max_{y∈Ω(x)} I_B(y) = ξ_B(x) · max_{y∈Ω(x)} J_B(y) + A_B(x)(1 − ξ_B(x))
Taking the R channel as an example and rearranging:
max_{y∈Ω(x)} I_R(y) − A_R(x) = ξ_R(x) · ( max_{y∈Ω(x)} J_R(y) − A_R(x) )
so that
ξ_R(x) = ( max_{y∈Ω(x)} I_R(y) − A_R(x) ) / ( max_{y∈Ω(x)} J_R(y) − A_R(x) ).
Because light attenuates even over short distances, the underwater background light A is usually dark, especially in the deep sea, whereas for a suitable window size the closer an object is to the camera, the brighter its region and the closer the maximum of J_R is to 1. Substituting max_{y∈Ω(x)} J_R(y) ≈ 1 then yields
ξ_R(x) = ( max_{y∈Ω(x)} I_R(y) − A_R(x) ) / ( 1 − A_R(x) ).
S22: extracting the red channel J_R of the restored image J and the red channel I_R of the underwater source image I, and computing a distance coefficient d from J_R and I_R;
S23: normalizing the distance coefficient d to obtain the depth estimation map.
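To make step S2 concrete, the following is a minimal NumPy sketch of the restoration and depth estimation described above. It uses the formula for ξ reconstructed above; the 2 × 2 block layout, the "lowest 10% brightness" criterion, the independent per-channel background-light estimation, the RGB channel ordering, and in particular the use of the ratio I_R/J_R as the distance coefficient are illustrative assumptions (the patent gives the exact distance-coefficient formula only as a figure), and all function names are hypothetical.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def estimate_background_light(channel, blocks=(2, 2), low_frac=0.10, window=15):
    """Estimate a per-channel global background light A(x) (S211, illustrative).

    For each image block, average the source pixels whose max-filtered response
    falls in the lowest `low_frac` of that block.
    """
    depth_like = maximum_filter(channel, size=window)          # S211-1: max filter on the channel
    h, w = channel.shape
    A = np.zeros_like(channel, dtype=np.float64)
    bh, bw = h // blocks[0], w // blocks[1]
    for bi in range(blocks[0]):
        for bj in range(blocks[1]):
            ys = slice(bi * bh, (bi + 1) * bh)
            xs = slice(bj * bw, (bj + 1) * bw)
            block = depth_like[ys, xs]
            thresh = np.quantile(block, low_frac)               # lowest 10% brightness (S211-2)
            mask = block <= thresh
            A[ys, xs] = channel[ys, xs][mask].mean()            # block-wise background light (S211-3)
    return A

def estimate_transmission(channel, A, window=15):
    """Propagation coefficient xi(x) = (max_{Omega(x)} I(y) - A(x)) / (1 - A(x))  (S212)."""
    local_max = maximum_filter(channel, size=window)
    xi = (local_max - A) / np.clip(1.0 - A, 1e-6, None)
    return np.clip(xi, 1e-3, 1.0)

def depth_from_image(img_rgb, window=15):
    """Step S2: restore the image (S21/S213) and derive a normalized depth map (S22/S23)."""
    img = img_rgb.astype(np.float64) / 255.0
    J = np.zeros_like(img)
    for c in range(3):                                          # treat R, G, B separately
        A = estimate_background_light(img[..., c], window=window)
        xi = estimate_transmission(img[..., c], A, window=window)
        # invert the light propagation model I(x) = J(x) xi(x) + A(x)(1 - xi(x))  (S213)
        J[..., c] = (img[..., c] - A * (1.0 - xi)) / xi
    J = np.clip(J, 0.0, 1.0)
    # S22/S23: distance coefficient from the red channels. The ratio below is an
    # assumed stand-in; the patent's exact formula is given only as a figure.
    d = img[..., 0] / np.clip(J[..., 0], 1e-6, None)
    depth_map = (d - d.min()) / (d.max() - d.min() + 1e-12)
    return depth_map, J
```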
In step S3, a trained MaskFlownet network is used for optical flow analysis. As shown in FIG. 2, the MaskFlownet network consists of a MaskFlownet-S network and a double-pyramid cascaded network, where the dotted lines between the pyramids indicate shared weights. The MaskFlownet-S network inherits the architecture of PWC-Net, shown in FIG. 3, and comprises a pyramid feature extractor, a warping layer, a cost volume layer, an optical flow estimator and a context network. Unlike PWC-Net, the feature matching module (FMM) at each level of the pyramid is replaced with an asymmetric occlusion-aware feature matching module (AsymOFMM) that learns an occlusion mask.
The workflow of the MaskFlownet-S network is as follows (a simplified sketch of the warping and cost-volume computations is given after this list):
1) Pyramid feature extractor: for two consecutive frames, a 6-level shared feature pyramid extracts the corresponding features;
2) Warping layer: takes the features of the second frame and the upsampled optical flow as input, and performs the warping operation with bilinear interpolation;
3) Cost volume layer: takes the features of the first frame and the warped features of the second frame as input, and stores the matching cost between each pixel and its corresponding pixel in the next frame;
4) Optical flow estimator: a multi-level CNN that takes the cost volume, the features of the first frame and the upsampled optical flow as input and outputs the optical flow at the corresponding level; DenseNet connections are used to strengthen the estimator;
5) Context network: receives the estimated optical flow and the features of the penultimate layer from the optical flow estimator and outputs a refined optical flow.
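The sketch below illustrates the warping (bilinear interpolation) and cost-volume (correlation) steps of this workflow in PyTorch. It is an illustration only, not the MaskFlownet implementation; the maximum displacement, the (u, v) flow convention and the tensor layout are assumptions.

```python
import torch
import torch.nn.functional as F

def warp(feat2, flow):
    """Warp the second frame's features toward the first frame by bilinear interpolation.

    feat2: (N, C, H, W) features of the second frame
    flow:  (N, 2, H, W) optical flow in pixels, channel 0 = u (x), channel 1 = v (y)
    """
    n, _, h, w = feat2.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(feat2.device)       # (2, H, W) base coordinates
    coords = grid.unsqueeze(0) + flow                                   # displaced sampling positions
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0                 # normalize to [-1, 1]
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid_norm = torch.stack((coords_x, coords_y), dim=-1)               # (N, H, W, 2)
    return F.grid_sample(feat2, grid_norm, mode="bilinear", align_corners=True)

def cost_volume(feat1, feat2_warped, max_disp=4):
    """Correlation-style cost volume over a (2*max_disp+1)^2 search window."""
    n, c, h, w = feat1.shape
    pad = F.pad(feat2_warped, [max_disp] * 4)
    costs = []
    for dy in range(2 * max_disp + 1):
        for dx in range(2 * max_disp + 1):
            shifted = pad[:, :, dy:dy + h, dx:dx + w]
            costs.append((feat1 * shifted).mean(dim=1, keepdim=True))   # matching cost per pixel
    return torch.cat(costs, dim=1)                                      # (N, (2*max_disp+1)^2, H, W)
```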
The second stage, the double-pyramid cascaded network, has a structure similar to that of the first stage, except that in this stage a correlation layer serves as the occlusion-aware feature matching module within the pyramid feature extractor.
The process of training the MaskFlownet network comprises the following steps:
(1) Training the MaskFlownet-S network
S31: training the MaskFlownet-S model on the FlyingChairs dataset, with the learning rate starting at 0.0001 and halved at 0.4M, 0.6M, 0.8M and 1M iterations, a batch size of 8 and a run of 1000k iterations;
S32: optimizing the network parameters of the model trained in step S31 on the FlyingThings3D dataset, while excluding image pairs larger than 1000 pixels, with a batch size of 4;
S33: fine-tuning the model trained in step S32 on the Sintel and KITTI datasets; when fine-tuning on Sintel, horizontal flipping is added and additive noise is removed from the data augmentation, with a batch size of 4; when fine-tuning on KITTI, the amount of rotation, scaling and squeezing in the data augmentation is reduced, with a batch size of 4;
(2) Training the double-pyramid cascaded network
S34: fixing all parameters of the MaskFlownet-S model trained in step S33, and then training the double-pyramid cascaded network following the training schedule of the MaskFlownet-S model, except that the run on the FlyingChairs dataset is 800k iterations.
In the asymmetric occlusion-aware feature matching module (AsymOFMM), deformable convolution is introduced asymmetrically: an additional convolution is applied while the target feature map is deformed according to the current flow field, which breaks the symmetry between the original and target feature maps. A learnable occlusion mask predicted by the network is then applied to the deformed feature map to filter out the interference caused by ghosting, yielding a masked feature map. Finally, because the information originally carried by the occluded regions is lost after filtering, a trade-off term μ is added to compensate for this loss.
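A minimal sketch of this masking-and-compensation idea might look as follows. It illustrates the idea rather than the MaskFlownet code: the use of a plain convolution in place of a true deformable convolution and the per-channel form of μ are assumptions, and the module name is hypothetical.

```python
import torch
import torch.nn as nn

class MaskedMatching(nn.Module):
    """Apply a learnable occlusion mask to warped target features and add a trade-off term mu."""

    def __init__(self, channels):
        super().__init__()
        self.extra_conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)  # stand-in for deformable conv
        self.mask_head = nn.Conv2d(channels, 1, kernel_size=3, padding=1)          # predicts the occlusion mask
        self.mu = nn.Parameter(torch.zeros(1, channels, 1, 1))                     # learnable trade-off term

    def forward(self, warped_target_feat):
        deformed = self.extra_conv(warped_target_feat)
        mask = torch.sigmoid(self.mask_head(deformed))       # soft occlusion mask in [0, 1]
        return deformed * mask + self.mu                      # filter ghosting, compensate lost information
```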
In step S4, the process of fusing one underwater source image specifically comprises the following steps:
S41: dividing the depth estimation map of the underwater source image obtained in step S2 into v × w sub-regions (2 × 2 in this embodiment; adjustable according to the actual situation) and calculating the mean depth μ(D) in each sub-region; dividing the optical flow map of the underwater source image obtained in step S3 into the same sub-regions and calculating the mean optical flow magnitude μ(F) in each sub-region;
S42: computing the histogram distribution of the optical flow vector directions within each sub-region of each optical flow map obtained in step S3, and calculating the variance σ²_H of the optical flow direction histogram in each sub-region;
S43: a single optical flow map only reflects how disordered the motion of objects is at the current moment; to improve the accuracy of the safety assessment, calculating the covariance of the optical flow direction histogram of each sub-region over a preset time period and the mean μ_cov of that covariance over the period;
S44: calculating a safety assessment value for each sub-region from the depth estimate mean μ(D), the optical flow magnitude mean μ(F), the optical flow direction histogram variance σ²_H and the optical flow histogram covariance mean μ_cov;
S45: normalizing the safety assessment values of the sub-regions to obtain the underwater environment safety assessment map of the underwater source image.
In step S44, the safety assessment value S_value(i, t) of sub-region i at time t within the preset time period is calculated as a weighted combination of the depth estimate mean μ(D_i), the optical flow magnitude mean μ(F_i), the optical flow direction histogram variance σ²_H(i) and the mean covariance μ_cov(i, t) of the optical flow direction histogram over the interval τ → t, with ω_1, ω_2, ω_3 and ω_4 denoting the respective weights of these four terms. Preferably, in this embodiment, ω_1 = ω_2 = ω_3 = ω_4 = 0.25 (adjustable according to the actual situation).
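The fusion of steps S41–S45 can be sketched as follows in NumPy. Because the patent presents the exact combination rule for S_value only as a figure, the weighted linear form used below (larger depth raises the score; larger flow magnitude, histogram variance and covariance lower it) is an assumption made for illustration, as is the reading of "covariance over the time period" as the covariance of the current histogram with earlier ones; all helper names are hypothetical.

```python
import numpy as np

def region_views(arr, grid=(2, 2)):
    """Split a 2-D array into grid[0] x grid[1] sub-regions (row-major order)."""
    h, w = arr.shape
    bh, bw = h // grid[0], w // grid[1]
    return [arr[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            for i in range(grid[0]) for j in range(grid[1])]

def region_views_flow(flow, grid=(2, 2)):
    """Split an (H, W, 2) flow field into sub-regions, matching region_views' ordering."""
    h, w = flow.shape[:2]
    bh, bw = h // grid[0], w // grid[1]
    return [flow[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            for i in range(grid[0]) for j in range(grid[1])]

def direction_histogram(flow_region, bins=8):
    """Normalized histogram of optical flow directions within one sub-region (S42)."""
    angles = np.arctan2(flow_region[..., 1], flow_region[..., 0])
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)

def safety_map(depth_map, flow_maps, grid=(2, 2), weights=(0.25, 0.25, 0.25, 0.25)):
    """Fuse a depth map with a short sequence of (H, W, 2) flow maps into a safety map (S41-S45)."""
    w1, w2, w3, w4 = weights
    flow_now = flow_maps[-1]
    depth_regions = region_views(depth_map, grid)
    mag_regions = region_views(np.linalg.norm(flow_now, axis=-1), grid)

    scores = []
    for idx, (d_reg, m_reg) in enumerate(zip(depth_regions, mag_regions)):
        mu_d = d_reg.mean()                                  # S41: mean depth of sub-region i
        mu_f = m_reg.mean()                                  # S41: mean flow magnitude of sub-region i
        hists = [direction_histogram(region_views_flow(f, grid)[idx]) for f in flow_maps]
        var_h = np.var(hists[-1])                            # S42: direction-histogram variance at time t
        # S43: mean covariance of the current histogram with earlier ones (assumed reading)
        cov_mean = (np.mean([np.cov(hists[-1], h)[0, 1] for h in hists[:-1]])
                    if len(hists) > 1 else 0.0)
        # S44: assumed weighted form of S_value(i, t)
        scores.append(w1 * mu_d - w2 * mu_f - w3 * var_h - w4 * cov_mean)

    scores = np.array(scores)
    scores = (scores - scores.min()) / (scores.max() - scores.min() + 1e-12)   # S45: normalize
    return scores.reshape(grid)
```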
To sum up, the underwater environment safety assessment method based on optical flow and depth estimation provided by the embodiment of the invention first collects underwater source images of various underwater dynamic scenes with a monocular camera to form a data set (step S1); then performs distance estimation on each underwater source image in the data set to obtain a corresponding depth estimation map (step S2); performs optical flow analysis on each underwater source image in the data set to obtain an optical flow map (step S3); and finally fuses the depth estimation map and the optical flow map to obtain an underwater environment safety assessment map for each underwater source image (step S4). The resulting safety assessment map provides important support for subsequent autonomous tasks such as path planning, autonomous obstacle avoidance and grasping, and can guide the robot toward optimal behavior at a higher level.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited thereto; any change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent and is intended to fall within the scope of the present invention.

Claims (7)

1. An underwater environment safety assessment method based on optical flow and depth estimation is characterized by comprising the following steps:
S1: collecting underwater source images of various underwater dynamic scenes with a monocular camera to generate an underwater source image data set;
S2: performing distance estimation on each underwater source image in the data set to obtain a corresponding depth estimation map;
S3: performing optical flow analysis on each underwater source image in the data set to obtain an optical flow map;
S4: fusing the depth estimation map and the optical flow map to obtain an underwater environment safety assessment map for each underwater source image;
in step S4, the process of fusing one underwater source image specifically includes the steps of:
S41: dividing the depth estimation map of the underwater source image obtained in step S2 into sub-regions and calculating the mean depth μ(D) in each sub-region; dividing the optical flow map of the underwater source image obtained in step S3 into the same sub-regions and calculating the mean optical flow magnitude μ(F) in each sub-region;
S42: computing the histogram distribution of the optical flow vector directions within each sub-region of each optical flow map obtained in step S3, and calculating the variance σ²_H of the optical flow direction histogram in each sub-region;
S43: calculating the covariance of the optical flow direction histogram of each sub-region over a preset time period and the mean μ_cov of that covariance over the period;
S44: calculating a safety assessment value for each sub-region from the depth estimate mean μ(D), the optical flow magnitude mean μ(F), the optical flow direction histogram variance σ²_H and the optical flow histogram covariance mean μ_cov; in step S44, the safety assessment value S_value(i, t) of sub-region i at time t within the preset time period is calculated as a weighted combination of the depth estimate mean μ(D_i), the optical flow magnitude mean μ(F_i), the optical flow direction histogram variance σ²_H(i) of the i-th sub-region and the mean covariance μ_cov(i, t) of the optical flow direction histogram over the interval τ → t, where ω_1, ω_2, ω_3 and ω_4 denote the respective weights of these four terms;
S45: normalizing the safety assessment values of the sub-regions to obtain the underwater environment safety assessment map of the underwater source image.
2. The underwater environment safety assessment method based on optical flow and depth estimation as claimed in claim 1, wherein:
ω_1 = ω_2 = ω_3 = ω_4.
3. The underwater environment safety assessment method based on optical flow and depth estimation as claimed in claim 2, wherein:
ω_1 = ω_2 = ω_3 = ω_4 = 0.25.
4. An underwater environment safety assessment method based on optical flow and depth estimation as claimed in any one of claims 1 to 3, characterized in that: in step S2, a maximum attenuation recognition algorithm is adopted for distance estimation.
5. The underwater environment safety assessment method based on optical flow and depth estimation as claimed in claim 4, wherein the step S2 specifically comprises the steps of:
S21: restoring the underwater source image I with a maximum attenuation recognition algorithm to obtain a restored image J;
S22: extracting the red channel J_R of the restored image J and the red channel I_R of the underwater source image I, and computing a distance coefficient d from J_R and I_R;
S23: normalizing the distance coefficient d to obtain the depth estimation map.
6. An underwater environment safety assessment method based on optical flow and depth estimation as claimed in any one of claims 1 to 3, characterized in that: in step S3, a trained MaskFlownet network is used for optical flow analysis, the MaskFlownet network comprising a MaskFlownet-S network and a double-pyramid cascaded network.
7. The underwater environment safety assessment method based on optical flow and depth estimation as claimed in claim 6, wherein the process of training the MaskFlownet network comprises the steps of:
(1) Training the MaskFlownet-S network
S31: training the MaskFlownet-S model on the FlyingChairs dataset, with the learning rate starting at 0.0001 and halved at 0.4M, 0.6M, 0.8M and 1M iterations, a batch size of 8 and a run of 1000k iterations;
S32: optimizing the network parameters of the model trained in step S31 on the FlyingThings3D dataset, while excluding image pairs larger than 1000 pixels, with a batch size of 4;
S33: fine-tuning the model trained in step S32 on the Sintel and KITTI datasets; when fine-tuning on Sintel, horizontal flipping is added and additive noise is removed from the data augmentation, with a batch size of 4; when fine-tuning on KITTI, the amount of rotation, scaling and squeezing in the data augmentation is reduced, with a batch size of 4;
(2) Training the double-pyramid cascaded network
S34: fixing all parameters of the MaskFlownet-S model trained in step S33, and then training the double-pyramid cascaded network following the training schedule of the MaskFlownet-S model, except that the run on the FlyingChairs dataset is 800k iterations.
CN202110332203.0A 2021-03-29 2021-03-29 Underwater environment safety assessment method based on optical flow and depth estimation Active CN113052885B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110332203.0A CN113052885B (en) 2021-03-29 2021-03-29 Underwater environment safety assessment method based on optical flow and depth estimation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110332203.0A CN113052885B (en) 2021-03-29 2021-03-29 Underwater environment safety assessment method based on optical flow and depth estimation

Publications (2)

Publication Number Publication Date
CN113052885A CN113052885A (en) 2021-06-29
CN113052885B true CN113052885B (en) 2023-02-03

Family

ID=76516345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110332203.0A Active CN113052885B (en) 2021-03-29 2021-03-29 Underwater environment safety assessment method based on optical flow and depth estimation

Country Status (1)

Country Link
CN (1) CN113052885B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538527B (en) * 2021-07-08 2023-09-26 上海工程技术大学 Efficient lightweight optical flow estimation method, storage medium and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750711A (en) * 2012-06-04 2012-10-24 清华大学 Binocular video depth map obtaining method based on image segmentation and motion estimation
CN109508684A (en) * 2018-11-21 2019-03-22 中山大学 A kind of method of Human bodys' response in video

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104091167A (en) * 2014-07-11 2014-10-08 电子科技大学 Feature extraction method based on human body activity recognition of motion sensing camera
CN105590096B (en) * 2015-12-18 2019-05-28 运城学院 Physical activity identification feature expression based on depth map
CN106845351A (en) * 2016-05-13 2017-06-13 苏州大学 It is a kind of for Activity recognition method of the video based on two-way length mnemon in short-term
CN109271904B (en) * 2018-09-03 2022-02-15 东南大学 Black smoke vehicle detection method based on pixel adaptive segmentation and Bayesian model
CN112347900B (en) * 2020-11-04 2022-10-14 中国海洋大学 Monocular vision underwater target automatic grabbing method based on distance estimation
CN112329615B (en) * 2020-11-04 2022-04-15 中国海洋大学 Environment situation evaluation method for autonomous underwater visual target grabbing

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102750711A (en) * 2012-06-04 2012-10-24 清华大学 Binocular video depth map obtaining method based on image segmentation and motion estimation
CN109508684A (en) * 2018-11-21 2019-03-22 中山大学 A kind of method of Human bodys' response in video

Also Published As

Publication number Publication date
CN113052885A (en) 2021-06-29

Similar Documents

Publication Publication Date Title
CN110335290B (en) Twin candidate region generation network target tracking method based on attention mechanism
CN108986136B (en) Binocular scene flow determination method and system based on semantic segmentation
CN109118446B (en) Underwater image restoration and denoising method
Greene et al. Flame: Fast lightweight mesh estimation using variational smoothing on delaunay graphs
CN110246151B (en) Underwater robot target tracking method based on deep learning and monocular vision
CN111508013B (en) Stereo matching method
JP6901803B2 (en) A learning method and learning device for removing jittering from video generated by a swaying camera using multiple neural networks for fault tolerance and fracture robustness, and a test method and test device using it.
WO2020088766A1 (en) Methods for optical flow estimation
CN112329615B (en) Environment situation evaluation method for autonomous underwater visual target grabbing
CN112347900B (en) Monocular vision underwater target automatic grabbing method based on distance estimation
CN113724155B (en) Self-lifting learning method, device and equipment for self-supervision monocular depth estimation
CN113052885B (en) Underwater environment safety assessment method based on optical flow and depth estimation
CN114429555A (en) Image density matching method, system, equipment and storage medium from coarse to fine
Moghimi et al. Real-time underwater image resolution enhancement using super-resolution with deep convolutional neural networks
CN112270691B (en) Monocular video structure and motion prediction method based on dynamic filter network
Verma et al. FCNN: fusion-based underwater image enhancement using multilayer convolution neural network
KR102057395B1 (en) Video generation method using video extrapolation based on machine learning
CN110738699A (en) unsupervised absolute scale calculation method and system
CN116630641A (en) Long-time target tracking method based on attention mechanism
Anantrasirichai et al. Fast depth estimation for view synthesis
Mamta et al. GA based Blind Deconvolution Technique of Image Restoration using Cepstrum Domain of Motion Blur
CN111539988B (en) Visual odometer implementation method and device and electronic equipment
Alves et al. Vision-based navigation solution for autonomous underwater vehicles
Nicholas Greene et al. FLaME: Fast lightweight mesh estimation using variational smoothing on delaunay graphs
CN117057994A (en) Remote sensing image super-resolution reconstruction method based on priori knowledge

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant