CN112183255A - Underwater target visual identification and attitude estimation method based on deep learning - Google Patents

Underwater target visual identification and attitude estimation method based on deep learning Download PDF

Info

Publication number
CN112183255A
CN112183255A CN202010970281.9A CN202010970281A CN112183255A CN 112183255 A CN112183255 A CN 112183255A CN 202010970281 A CN202010970281 A CN 202010970281A CN 112183255 A CN112183255 A CN 112183255A
Authority
CN
China
Prior art keywords
network
feature
underwater
straight line
scale
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010970281.9A
Other languages
Chinese (zh)
Inventor
Li Le
Zhang Chen
Jing Xinkang
Zhao Haitao
Zhang Meijie
Liu Weidong
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202010970281.9A priority Critical patent/CN112183255A/en
Publication of CN112183255A publication Critical patent/CN112183255A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an underwater target visual recognition and attitude estimation method based on deep learning. First, two prediction layers are added to the network structure of the basic YOLOv3 algorithm, enabling rapid recognition and efficient extraction of target features at different dimensions. Then, according to the overall structure of the target, several feature marks of two dimensions whose center points lie on a common straight line are selected; the slope of that line is estimated by the least-squares method from the coordinates of the feature marks in the target image, from which the attitude angle of the target is obtained. The method achieves rapid recognition and attitude estimation of common underwater work tools and provides a basis for an underwater robot manipulator to grasp a work tool and carry out underwater operations.

Description

Underwater target visual identification and attitude estimation method based on deep learning
Technical Field
The invention belongs to the field of target recognition, and particularly relates to an underwater target recognition and attitude estimation method.
Background
A work-class underwater robot is one of the key pieces of equipment for the exploration and development of marine resources and is generally fitted with an underwater manipulator. Rapid and efficient recognition and attitude estimation of underwater targets is a prerequisite for underwater manipulator operations. Visual image processing and feature extraction are common approaches to underwater target recognition and attitude estimation. However, constrained by complex underwater conditions (such as dim light and turbidity), traditional underwater visual recognition methods suffer from low recognition efficiency and unsatisfactory attitude estimation. With the development of deep learning theory, training deep convolutional neural networks on samples has become an effective technical route for recognizing underwater cooperative targets, but its attitude estimation performance remains unsatisfactory. Therefore, how to combine the structural characteristics of underwater targets with the advantages of deep learning to achieve rapid recognition and efficient attitude estimation of common underwater work tools is one of the key technologies that must be mastered before a work-class underwater robot manipulator can operate efficiently.
Disclosure of Invention
In order to overcome the shortcomings of the prior art, the invention provides an underwater target visual recognition and attitude estimation method based on deep learning. First, two prediction layers are added to the network structure of the basic YOLOv3 algorithm, enabling rapid recognition and efficient extraction of target features at different dimensions. Then, according to the overall structure of the target, several feature marks of two dimensions whose center points lie on a common straight line are selected; the slope of that line is estimated by the least-squares method from the coordinates of the feature marks in the target image, from which the attitude angle of the target is obtained. The method achieves rapid recognition and attitude estimation of common underwater work tools and provides a basis for an underwater robot manipulator to grasp a work tool and carry out underwater operations.
The technical solution adopted by the invention to solve this technical problem comprises the following steps:
Step 1: for each work tool in the underwater work tool kit, manually select N feature marks on the tool, where N is greater than or equal to 3; one feature mark is a whole-tool (global) dimension feature formed by the work tool itself; the remaining N-1 feature marks are local dimension features formed by local structures or components of the work tool; the center points of the N feature marks lie on the same straight line;
Step 2: work tool target recognition:
Step 2-1: modify the YOLOv3 network; keep the front-end feature extraction network of the YOLOv3 network unchanged, this network outputting feature maps at 5 scales: 13 × 13, 26 × 26, 52 × 52, 104 × 104 and 208 × 208; add two layers behind the back-end prediction network of the YOLOv3 network, namely: a first added layer, which is a convolutional neural network that upsamples the 52 × 52 feature map output by the front-end feature extraction network to 104 × 104 and fuses it with the 104 × 104 feature map output by the front-end feature extraction network; and a second added layer, which is a convolutional neural network that upsamples the 104 × 104 feature map output by the front-end feature extraction network to 208 × 208 and fuses it with the 208 × 208 feature map output by the front-end feature extraction network; this yields the improved YOLOv3 network;
Step 2-2: collect A images of underwater work tools as a training set, process the training set with the k-means clustering algorithm to obtain anchor boxes at several scales, and train the improved YOLOv3 network obtained in step 2-1 for B iterations, yielding the trained improved YOLOv3 network;
Step 2-3: input the underwater work tool image into the improved YOLOv3 network trained in step 2-2 and extract work tool image features through its front-end feature extraction network to obtain feature maps at the 5 scales 13 × 13, 26 × 26, 52 × 52, 104 × 104 and 208 × 208;
Step 2-4: detect and fuse the 5-scale feature maps obtained in step 2-3 with the back-end prediction network of the trained improved YOLOv3 network to obtain, at any scale, the center-point coordinates (t_x, t_y), width t_w and height t_h of the bounding box corresponding to any of the feature marks manually selected on the work tool in step 1;
Step 2-5: coordinate (t) of the center point of the boundary frame obtained in the step 2-4x,ty) Width twAnd height thThe normalization processing is performed by using the following formulas (1) to (4), specifically as follows:
bx=σ(tx)+cx (1)
by=σ(ty)+cy (2)
Figure BDA0002683806470000021
Figure BDA0002683806470000022
wherein: sigma is a sigmoid function, and the value range is (0, 1); (b)x,by) As coordinates of the center point of the bounding box after normalization, bwFor the width of the bounding box after normalization, bhTo the normalized bounding box height, cx,cyValues are all 1; p is a radical ofw,phMapping width and height in the feature map for the anchor box on the corresponding scale;
Step 2-6: repeat steps 2-4 and 2-5 to obtain, through the back-end prediction network of the trained improved YOLOv3 network, the center-point coordinates, widths and heights of all bounding boxes corresponding to the N feature marks at the 5 scales, and normalize them;
Step 2-7: group the normalized bounding-box center-point coordinates, widths and heights from step 2-6 by feature mark and apply non-maximum suppression to obtain the final recognition and feature extraction result, namely the bounding-box center-point coordinates (x_1, y_1), (x_2, y_2), …, (x_N, y_N), widths w_1, w_2, …, w_N and heights h_1, h_2, …, h_N of the N feature marks on the work tool;
Step 3: fit a straight line to the N center points (x_1, y_1), (x_2, y_2), …, (x_N, y_N) by the least-squares method and estimate the slope k of the line by the following equation (5);
let the equation of the line be y = kx + b; the slope k of the line is estimated as:
k = [ Σ_{i=1}^{N} (x_i − x̄) · y_i ] / [ Σ_{i=1}^{N} (x_i − x̄)² ]    (5)
where x̄ is the mean of x_1, x_2, …, x_N, (x_i, y_i) are the center-point coordinates of the i-th bounding box, and b is the intercept of the line;
Step 4: compute the attitude angle of the underwater work tool in the image plane from the slope k of the straight line obtained in step 3.
Preferably, N = 3.
Preferably, A = 500 and B = 10000.
The invention has the beneficial effects that:
1. Compared with traditional underwater target visual recognition methods, the underwater target recognition algorithm based on the improved YOLOv3 network achieves rapid recognition and effective extraction of multi-dimensional feature information of underwater targets, improving both the accuracy and the speed of underwater target recognition.
2. The underwater target attitude estimation algorithm fusing multi-dimensional feature marks uses the image coordinates of several feature marks of two dimensions whose center points lie on a straight line, estimates the slope of that line by the least-squares method, and thereby obtains the attitude angle of the target.
Drawings
Fig. 1 is a schematic diagram of the improved YOLOv3 network structure according to the present invention.
Fig. 2 illustrates the selection of wrench feature marks according to an embodiment of the present invention.
Fig. 3 shows the feature mark recognition result according to an embodiment of the present invention.
In the figures: 1 - wrench body, 2 - wrench head, 3 - wrench tail, 4 - improved YOLOv3 network, 5 - front-end feature extraction network, 6 - back-end prediction network, 7 - wrench tail feature mark bounding box, 8 - wrench head feature mark bounding box, 9 - wrench body feature mark bounding box.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
As shown in fig. 1, the invention provides a method for underwater target visual recognition and attitude estimation based on deep learning, which comprises the following steps:
Step 1: for each work tool in the underwater work tool kit, manually select N feature marks on the tool, where N is greater than or equal to 3; one feature mark is a whole-tool (global) dimension feature formed by the work tool itself; the remaining N-1 feature marks are local dimension features formed by local structures or components of the work tool; the center points of the N feature marks lie on the same straight line;
Step 2: work tool target recognition:
Step 2-1: modify the YOLOv3 network; keep the front-end feature extraction network of the YOLOv3 network unchanged, this network outputting feature maps at 5 scales: 13 × 13, 26 × 26, 52 × 52, 104 × 104 and 208 × 208; add two layers behind the back-end prediction network of the YOLOv3 network, namely: a first added layer, which is a convolutional neural network that upsamples the 52 × 52 feature map output by the front-end feature extraction network to 104 × 104 and fuses it with the 104 × 104 feature map output by the front-end feature extraction network; and a second added layer, which is a convolutional neural network that upsamples the 104 × 104 feature map output by the front-end feature extraction network to 208 × 208 and fuses it with the 208 × 208 feature map output by the front-end feature extraction network; this yields the improved YOLOv3 network;
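By way of illustration only, the following is a minimal sketch of the two added prediction branches, assuming a PyTorch implementation; the channel counts and the fusion-by-concatenation choice are assumptions made for the example and are not specified above.

```python
# Minimal sketch of the two added prediction branches of step 2-1 (assumed
# PyTorch form; channel counts and concatenation-based fusion are illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExtraPredictionBranch(nn.Module):
    """Upsamples a coarser feature map and fuses it with a finer backbone map."""
    def __init__(self, coarse_ch, fine_ch, out_ch):
        super().__init__()
        self.reduce = nn.Conv2d(coarse_ch, fine_ch, kernel_size=1)
        self.fuse = nn.Conv2d(fine_ch * 2, out_ch, kernel_size=3, padding=1)

    def forward(self, coarse, fine):
        x = self.reduce(coarse)                               # match channel count
        x = F.interpolate(x, scale_factor=2, mode="nearest")  # e.g. 52x52 -> 104x104
        x = torch.cat([x, fine], dim=1)                       # fuse with backbone map
        return self.fuse(x)

# Example: fuse the 52x52 map into the 104x104 map, then that result into the 208x208 map.
branch_104 = ExtraPredictionBranch(coarse_ch=256, fine_ch=128, out_ch=128)
branch_208 = ExtraPredictionBranch(coarse_ch=128, fine_ch=64, out_ch=64)
f52 = torch.randn(1, 256, 52, 52)
f104 = torch.randn(1, 128, 104, 104)
f208 = torch.randn(1, 64, 208, 208)
p104 = branch_104(f52, f104)   # 104x104 prediction features
p208 = branch_208(p104, f208)  # 208x208 prediction features
```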
Step 2-2: collect 500 images of underwater work tools as a training set, process the training set with the k-means clustering algorithm to obtain anchor boxes at several scales, and train the improved YOLOv3 network obtained in step 2-1 for 10000 iterations, yielding the trained improved YOLOv3 network;
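The anchor-box clustering of step 2-2 could be sketched as follows; using scikit-learn's KMeans on raw (width, height) pairs and taking 15 clusters (3 anchors per scale × 5 scales) are assumptions for illustration, since only the use of k-means is stated above.

```python
# Hedged sketch of the anchor selection in step 2-2: k-means clustering of the
# annotated bounding-box widths and heights. The (w, h) feature space and the
# cluster count are assumptions; the description only names k-means clustering.
import numpy as np
from sklearn.cluster import KMeans

def cluster_anchors(box_wh, num_anchors=15):
    """box_wh: (M, 2) array of annotated box widths/heights in pixels."""
    km = KMeans(n_clusters=num_anchors, n_init=10, random_state=0).fit(box_wh)
    anchors = km.cluster_centers_
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]  # sorted by area

# Example with synthetic annotations (3 anchors per scale x 5 scales = 15 anchors).
wh = np.abs(np.random.randn(500, 2)) * 60 + 20
print(cluster_anchors(wh))
```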
Step 2-3: input the underwater work tool image into the improved YOLOv3 network trained in step 2-2 and extract work tool image features through its front-end feature extraction network to obtain feature maps at the 5 scales 13 × 13, 26 × 26, 52 × 52, 104 × 104 and 208 × 208;
Step 2-4: detect and fuse the 5-scale feature maps obtained in step 2-3 with the back-end prediction network of the trained improved YOLOv3 network to obtain, at any scale, the center-point coordinates (t_x, t_y), width t_w and height t_h of the bounding box corresponding to any of the feature marks manually selected on the work tool in step 1;
Step 2-5: coordinate (t) of the center point of the boundary frame obtained in the step 2-4x,ty) Width twAnd height thThe normalization processing is performed by using the following formulas (1) to (4), specifically as follows:
bx=σ(tx)+cx (1)
by=σ(ty)+cy (2)
Figure BDA0002683806470000051
Figure BDA0002683806470000052
wherein: sigma is a sigmoid function, and the value range is (0, 1); (b)x,by) As coordinates of the center point of the bounding box after normalization, bwFor the width of the bounding box after normalization, bhTo the normalized bounding box height, cx,cyValues are all 1; p is a radical ofw,phMapping width and height in the feature map for the anchor box on the corresponding scale;
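Equations (1) to (4) translate directly into code; the sketch below is an assumed NumPy transcription given only to make the decoding step concrete, with illustrative anchor values.

```python
# Direct transcription of equations (1)-(4): decoding one raw network output
# (t_x, t_y, t_w, t_h) into a normalized box (b_x, b_y, b_w, b_h).
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def decode_box(tx, ty, tw, th, pw, ph, cx=1.0, cy=1.0):
    bx = sigmoid(tx) + cx   # eq. (1)
    by = sigmoid(ty) + cy   # eq. (2)
    bw = pw * np.exp(tw)    # eq. (3)
    bh = ph * np.exp(th)    # eq. (4)
    return bx, by, bw, bh

# Example: anchor box of width/height (p_w, p_h) = (3.2, 1.1) on the chosen scale.
print(decode_box(0.4, -0.2, 0.1, 0.3, pw=3.2, ph=1.1))
```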
Step 2-6: repeat steps 2-4 and 2-5 to obtain, through the back-end prediction network of the trained improved YOLOv3 network, the center-point coordinates, widths and heights of all bounding boxes corresponding to the N feature marks at the 5 scales, and normalize them;
Step 2-7: group the normalized bounding-box center-point coordinates, widths and heights from step 2-6 by feature mark and apply non-maximum suppression to obtain the final recognition and feature extraction result, namely the bounding-box center-point coordinates (x_1, y_1), (x_2, y_2), …, (x_N, y_N), widths w_1, w_2, …, w_N and heights h_1, h_2, …, h_N of the N feature marks on the work tool;
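A per-feature-mark non-maximum suppression of the kind used in step 2-7 could look like the following sketch; the IoU threshold of 0.5 is an assumed value not given above.

```python
# Illustrative per-feature-mark non-maximum suppression for step 2-7.
import numpy as np

def iou(a, b):
    """IoU of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter + 1e-9)

def nms_per_mark(boxes, scores, marks, iou_thr=0.5):
    """Suppress overlapping detections separately within each feature-mark class."""
    keep = []
    for m in set(marks):
        idx = sorted((i for i in range(len(boxes)) if marks[i] == m),
                     key=lambda i: scores[i], reverse=True)
        while idx:
            best = idx.pop(0)
            keep.append(best)
            idx = [i for i in idx if iou(boxes[best], boxes[i]) < iou_thr]
    return keep
```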
Step 3: fit a straight line to the N center points (x_1, y_1), (x_2, y_2), …, (x_N, y_N) by the least-squares method and estimate the slope k of the line by the following equation (5);
let the equation of the line be y = kx + b; the slope k of the line is estimated as:
k = [ Σ_{i=1}^{N} (x_i − x̄) · y_i ] / [ Σ_{i=1}^{N} (x_i − x̄)² ]    (5)
where x̄ is the mean of x_1, x_2, …, x_N, (x_i, y_i) are the center-point coordinates of the i-th bounding box, and b is the intercept of the line;
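Equation (5) is the standard least-squares slope estimate; a minimal NumPy sketch is shown below (np.polyfit of degree 1 would give the same result), with illustrative center points.

```python
# Least-squares fit of equation (5) to the N bounding-box center points.
import numpy as np

def fit_line(xs, ys):
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    x_mean = xs.mean()
    k = np.sum((xs - x_mean) * ys) / np.sum((xs - x_mean) ** 2)  # eq. (5)
    b = ys.mean() - k * x_mean                                   # line intercept
    return k, b

# Example with three center points lying roughly on a line.
print(fit_line([120, 240, 360], [80, 150, 222]))
```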
Step 4: compute the attitude angle of the underwater work tool in the image plane from the slope k of the straight line obtained in step 3.
The specific embodiment is as follows:
1. According to the overall structure of the work tool, three feature marks of two dimensions whose center points lie on the same straight line are selected. The first feature mark is the work tool itself (global dimension); the second and third feature marks are local structures or components of the work tool (local dimension). For a common wrench, as shown in fig. 2, three feature marks are selected according to the particular mechanical structure of the wrench: the wrench body 1, the wrench head 2 and the wrench tail 3.
2. The underwater work tool image is input into the improved YOLOv3 underwater target recognition algorithm to obtain the three bounding boxes corresponding to the three feature marks and their center-point coordinates (x_1, y_1), (x_2, y_2) and (x_3, y_3).
3. A straight line is fitted to the three center points (x_1, y_1), (x_2, y_2) and (x_3, y_3) by the least-squares method, and its slope k is estimated with equation (5).
4. In the image-plane coordinate system, with true south taken as the 0° direction and angles measured clockwise over 0-360°, the attitude angle of the underwater target in the image plane is obtained from the relation between the slope k of the line and its inclination angle under this angular convention, providing a basis for the underwater robot manipulator to grasp the work tool and carry out underwater operations.
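As a hedged sketch of this last step: the mapping below from the fitted slope k to an angle in the stated south-referenced, clockwise 0-360° convention is an assumption for illustration (no formula is given above), and since a slope fixes the line only up to 180°, the relative position of the head and tail feature marks is assumed to resolve the remaining ambiguity.

```python
# Hedged sketch of the slope-to-attitude-angle conversion. The choice of "south"
# as the +y image direction and the measurement of the angle from +y toward +x
# are assumptions made only for this example.
import math

def attitude_angle_from_slope(k, points_toward_positive_x=True):
    # Direction vector of the fitted line; flipped if the tool points toward -x.
    dx, dy = (1.0, k) if points_toward_positive_x else (-1.0, -k)
    # atan2(dx, dy) is the angle of (dx, dy) measured from the +y axis toward +x.
    return math.degrees(math.atan2(dx, dy)) % 360.0

# Example: slope value is illustrative, not taken from the embodiment.
print(attitude_angle_from_slope(1.2))
```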

Claims (3)

1. An underwater target visual recognition and attitude estimation method based on deep learning, characterized by comprising the following steps:
Step 1: for each work tool in the underwater work tool kit, manually select N feature marks on the tool, where N is greater than or equal to 3; one feature mark is a whole-tool (global) dimension feature formed by the work tool itself; the remaining N-1 feature marks are local dimension features formed by local structures or components of the work tool; the center points of the N feature marks lie on the same straight line;
Step 2: work tool target recognition:
Step 2-1: modify the YOLOv3 network; keep the front-end feature extraction network of the YOLOv3 network unchanged, this network outputting feature maps at 5 scales: 13 × 13, 26 × 26, 52 × 52, 104 × 104 and 208 × 208; add two layers behind the back-end prediction network of the YOLOv3 network, namely: a first added layer, which is a convolutional neural network that upsamples the 52 × 52 feature map output by the front-end feature extraction network to 104 × 104 and fuses it with the 104 × 104 feature map output by the front-end feature extraction network; and a second added layer, which is a convolutional neural network that upsamples the 104 × 104 feature map output by the front-end feature extraction network to 208 × 208 and fuses it with the 208 × 208 feature map output by the front-end feature extraction network; this yields the improved YOLOv3 network;
Step 2-2: collect A images of underwater work tools as a training set, process the training set with the k-means clustering algorithm to obtain anchor boxes at several scales, and train the improved YOLOv3 network obtained in step 2-1 for B iterations, yielding the trained improved YOLOv3 network;
Step 2-3: input the underwater work tool image into the improved YOLOv3 network trained in step 2-2 and extract work tool image features through its front-end feature extraction network to obtain feature maps at the 5 scales 13 × 13, 26 × 26, 52 × 52, 104 × 104 and 208 × 208;
Step 2-4: detect and fuse the 5-scale feature maps obtained in step 2-3 with the back-end prediction network of the trained improved YOLOv3 network to obtain, at any scale, the center-point coordinates (t_x, t_y), width t_w and height t_h of the bounding box corresponding to any of the feature marks manually selected on the work tool in step 1;
Step 2-5: coordinate (t) of the center point of the boundary frame obtained in the step 2-4x,ty) Width twAnd height thThe normalization processing is performed by using the following formulas (1) to (4), specifically as follows:
bx=σ(tx)+cx (1)
by=σ(ty)+cy (2)
Figure FDA0002683806460000021
Figure FDA0002683806460000022
wherein: sigma is a sigmoid function, and the value range is (0, 1); (b)x,by) As coordinates of the center point of the bounding box after normalization, bwFor the width of the bounding box after normalization, bhTo the normalized bounding box height, cx,cyValues are all 1; p is a radical ofw,phMapping width and height in the feature map for the anchor box on the corresponding scale;
Step 2-6: repeat steps 2-4 and 2-5 to obtain, through the back-end prediction network of the trained improved YOLOv3 network, the center-point coordinates, widths and heights of all bounding boxes corresponding to the N feature marks at the 5 scales, and normalize them;
Step 2-7: group the normalized bounding-box center-point coordinates, widths and heights from step 2-6 by feature mark and apply non-maximum suppression to obtain the final recognition and feature extraction result, namely the bounding-box center-point coordinates (x_1, y_1), (x_2, y_2), …, (x_N, y_N), widths w_1, w_2, …, w_N and heights h_1, h_2, …, h_N of the N feature marks on the work tool;
Step 3: fit a straight line to the N center points (x_1, y_1), (x_2, y_2), …, (x_N, y_N) by the least-squares method and estimate the slope k of the line by the following equation (5);
let the equation of the line be y = kx + b; the slope k of the line is estimated as:
k = [ Σ_{i=1}^{N} (x_i − x̄) · y_i ] / [ Σ_{i=1}^{N} (x_i − x̄)² ]    (5)
where x̄ is the mean of x_1, x_2, …, x_N, (x_i, y_i) are the center-point coordinates of the i-th bounding box, and b is the intercept of the line;
Step 4: compute the attitude angle of the underwater work tool in the image plane from the slope k of the straight line obtained in step 3.
2. The deep learning-based underwater target visual recognition and attitude estimation method according to claim 1, wherein N = 3.
3. The deep learning-based underwater target visual recognition and attitude estimation method according to claim 1, wherein A = 500 and B = 10000.
CN202010970281.9A 2020-09-15 2020-09-15 Underwater target visual identification and attitude estimation method based on deep learning Pending CN112183255A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010970281.9A CN112183255A (en) 2020-09-15 2020-09-15 Underwater target visual identification and attitude estimation method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010970281.9A CN112183255A (en) 2020-09-15 2020-09-15 Underwater target visual identification and attitude estimation method based on deep learning

Publications (1)

Publication Number Publication Date
CN112183255A true CN112183255A (en) 2021-01-05

Family

ID=73921185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010970281.9A Pending CN112183255A (en) 2020-09-15 2020-09-15 Underwater target visual identification and attitude estimation method based on deep learning

Country Status (1)

Country Link
CN (1) CN112183255A (en)

Patent Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104318233A (en) * 2014-10-19 2015-01-28 温州大学 Method for horizontal tilt correction of number plate image
CN105678296A (en) * 2015-12-30 2016-06-15 小米科技有限责任公司 Method and apparatus for determining angle of inclination of characters
CN107609453A (en) * 2016-07-11 2018-01-19 北京君正集成电路股份有限公司 A kind of license plate image correction, registration number character dividing method and equipment
KR101885839B1 (en) * 2017-03-14 2018-08-06 중앙대학교 산학협력단 System and Method for Key point Selecting for Object Tracking
CN107590498A (en) * 2017-09-27 2018-01-16 哈尔滨工业大学 A kind of self-adapted car instrument detecting method based on Character segmentation level di- grader
CN108764178A (en) * 2018-05-31 2018-11-06 中国民航大学 A kind of three modal characteristics image posture synchronous acquisition device of finger and control method
CN108961221A (en) * 2018-06-15 2018-12-07 哈尔滨工业大学 A kind of aviation plug situ static image detection algorithm
CN109508710A (en) * 2018-10-23 2019-03-22 东华大学 Based on the unmanned vehicle night-environment cognitive method for improving YOLOv3 network
CN111104940A (en) * 2018-10-26 2020-05-05 深圳怡化电脑股份有限公司 Image rotation correction method and device, electronic equipment and storage medium
CN110210320A (en) * 2019-05-07 2019-09-06 南京理工大学 The unmarked Attitude estimation method of multiple target based on depth convolutional neural networks
CN110458798A (en) * 2019-06-20 2019-11-15 长沙理工大学 Damper defective vision detection method, system and medium based on critical point detection
CN110472573A (en) * 2019-08-14 2019-11-19 北京思图场景数据科技服务有限公司 A kind of human body behavior analysis method, equipment and computer storage medium based on body key point
CN110796168A (en) * 2019-09-26 2020-02-14 江苏大学 Improved YOLOv 3-based vehicle detection method
CN110909738A (en) * 2019-11-15 2020-03-24 杭州远鉴信息科技有限公司 Automatic reading method of pointer instrument based on key point detection
CN111401148A (en) * 2020-02-27 2020-07-10 江苏大学 Road multi-target detection method based on improved multilevel YO L Ov3
CN111353544A (en) * 2020-03-05 2020-06-30 天津城建大学 Improved Mixed Pooling-Yolov 3-based target detection method
CN111488874A (en) * 2020-04-03 2020-08-04 中国农业大学 Method and system for correcting inclination of pointer instrument
CN111553406A (en) * 2020-04-24 2020-08-18 上海锘科智能科技有限公司 Target detection system, method and terminal based on improved YOLO-V3

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Changyou Li et al., "Skew detection of track images based on wavelet transform and linear least square fitting", 2009 International Conference on Information and Automation *
Fan Li et al., "An improved real-time pedestrian detection algorithm based on the YOLOv3 model", Journal of Shanxi University (Natural Science Edition) *
Lu Yahan, "Research on detection and reading recognition algorithms for electric power meters in substations", China Master's Theses Full-text Database, Engineering Science and Technology II *

Similar Documents

Publication Publication Date Title
CN111775152B (en) Method and system for guiding mechanical arm to grab scattered stacked workpieces based on three-dimensional measurement
CN108665496B (en) End-to-end semantic instant positioning and mapping method based on deep learning
CN111080693A (en) Robot autonomous classification grabbing method based on YOLOv3
WO2022017131A1 (en) Point cloud data processing method and device, and intelligent driving control method and device
CN112509063A (en) Mechanical arm grabbing system and method based on edge feature matching
CN111862201A (en) Deep learning-based spatial non-cooperative target relative pose estimation method
CN109048918B (en) Visual guide method for wheelchair mechanical arm robot
CN103413352A (en) Scene three-dimensional reconstruction method based on RGBD multi-sensor fusion
CN110533716B (en) Semantic SLAM system and method based on 3D constraint
CN112734844B (en) Monocular 6D pose estimation method based on octahedron
CN110135277B (en) Human behavior recognition method based on convolutional neural network
Arain et al. Improving underwater obstacle detection using semantic image segmentation
CN113888461A (en) Method, system and equipment for detecting defects of hardware parts based on deep learning
CN113034600A (en) Non-texture planar structure industrial part identification and 6D pose estimation method based on template matching
CN114140418A (en) Seven-degree-of-freedom grabbing posture detection method based on RGB image and depth image
CN111914832B (en) SLAM method of RGB-D camera under dynamic scene
Sun et al. Accurate lane detection with atrous convolution and spatial pyramid pooling for autonomous driving
CN113343875A (en) Driving region sensing method for robot
CN116030130A (en) Hybrid semantic SLAM method in dynamic environment
CN113295171B (en) Monocular vision-based attitude estimation method for rotating rigid body spacecraft
CN106650814B (en) Outdoor road self-adaptive classifier generation method based on vehicle-mounted monocular vision
CN112183255A (en) Underwater target visual identification and attitude estimation method based on deep learning
CN107248171B (en) Triangulation-based monocular vision odometer scale recovery method
CN113554705A (en) Robust positioning method for laser radar in changing scene
CN113592947A (en) Visual odometer implementation method of semi-direct method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210105