CN117848345A

CN117848345A - Stepping type unmanned ship path planning method adopting optimization

Info

Publication number: CN117848345A
Application number: CN202410029928.6A
Authority: CN
Inventors: 李梓甜; 罗显涛; 杨立鑫; 刘畅; 黄增鸿; 徐雍
Original assignee: Guangdong University of Technology
Current assignee: Guangdong University of Technology
Priority date: 2024-01-08
Filing date: 2024-01-08
Publication date: 2024-04-09

Abstract

The invention is suitable for the technical field of unmanned boats, and particularly relates to a stepping type unmanned boat path planning method adopting optimization. According to the invention, the depth information and the environment information are acquired through the sensor carried by the unmanned ship, and the depth information and the environment information are fused to obtain an environment map; converting the environment map into a grid map, establishing an initial dual-depth Q network model, and training the grid map as input of the initial dual-depth Q network model to obtain a dual-depth Q network model; acquiring current position information of an unmanned ship; taking the current position information as input of the dual-depth Q network model to carry out path planning to obtain a planned path; and carrying out local path planning according to the planned path and the environmental information acquired by the unmanned ship in real time. The invention effectively improves the path planning speed of the unmanned ship, and ensures that the unmanned ship can quickly find a correct and safe path in a complex environment.

Description

Stepping type unmanned ship path planning method adopting optimization

Technical Field

The invention is suitable for the technical field of unmanned boats, and particularly relates to a stepping type unmanned boat path planning method adopting optimization.

Background

An unmanned ship is a small offshore platform with environment-sensing and autonomous-navigation capabilities that can complete assigned tasks autonomously. Because of these characteristics, an unmanned ship performing tasks in unknown sea areas must be able to plan quickly and navigate comfortably, which helps it complete its work tasks.

The global algorithm generally used in practical unmanned-ship applications is the RRT algorithm, whose main function is to plan a path that serves as the reference path for local planning. In a highly complex environment, the conventional RRT algorithm needs a great deal of time to search for a path, and the planned path can contain extreme cases, such as V-shaped segments with a very small included angle. The large time overhead is detrimental to autonomous planning itself: while planning is in progress, the unmanned surface vessel (USV) remains stationary and can drift with the ocean current. An extreme path can lead to very poor control performance, or even to a USV that cannot be controlled, so that autonomous navigation fails.

Therefore, a new step-by-step optimization unmanned ship path planning method is needed to solve the above technical problems.

Disclosure of Invention

The invention provides a stepping type unmanned ship path planning method, which aims to improve the path planning speed of an unmanned ship, improve algorithm efficiency and ensure that the unmanned ship can quickly find a correct and safe path in a complex environment.

The stepping optimization unmanned ship path planning method comprises the following steps:

S1, acquiring depth information and environment information through a sensor carried by an unmanned ship, and fusing the depth information and the environment information to obtain an environment map; wherein the environment map includes obstacle position information and passable area information;

S2, converting the environment map into a grid map, establishing an initial dual-depth Q network model, and training the grid map as input of the initial dual-depth Q network model to obtain a dual-depth Q network model;

S3, acquiring current position information of the unmanned ship;

S4, taking the current position information as input of the dual-depth Q network model to carry out path planning to obtain a planned path;

S5, planning a local path according to the planned path and the environmental information acquired by the unmanned ship in real time.

Preferably, step S2 comprises the sub-steps of:

S21, setting parameters of the initial dual depth Q network model;

S22, inputting the grid map into the initial dual depth Q network model for training, and obtaining the dual depth Q network model.

Preferably, the parameters of the initial dual depth Q network model include a learning rate, a discount factor, an activation function, and an action space.

Preferably, step S22 comprises the sub-steps of:

S221, setting a starting point and different target points in the grid map;

S222, inputting the grid map into the initial dual depth Q network model and training through an RMSProp algorithm.

Preferably, the step S4 includes the following substeps:

S41, converting the current position information into grid coordinate information, and inputting the grid coordinate information into the dual depth Q network model to obtain a sampling action;

S42, randomly sampling a point within a preset range of the sampling action to obtain a sampling point;

S43, performing collision detection on the sampling point through a preset calculation formula; if the sampling point passes, taking the sampling point as a new node and selecting a father node of the new node according to a preset method; if not, returning to step S42;

S44, judging whether the new node is a target point; if so, storing the new node; if not, storing the new node and returning to step S42.

Preferably, in step S43, the preset calculation formula is:

wherein x and y represent the x-axis and y-axis coordinates of the sampling point, respectively; xl and yl represent the minimum boundaries of the x-axis and y-axis, respectively; xu and yu represent the maximum boundaries of the x-axis and y-axis, respectively; isfree (x, y) indicates whether the sampling point belongs to a non-obstacle region, if so, the return value is 1, and if not, the return value is 0.

Preferably, in step S43, the preset method is as follows:

selecting a node as a first node in a preset range of the new node, and defining a father node of the first node as a second node;

the first node and the second node form a first vector, the first node and the new node form a second vector, whether an included angle formed between the first vector and the second vector is smaller than 90 degrees is judged, and if yes, the first node is used as a father node of the new node.

Preferably, whether the new node is a target point satisfies the following relation:

wherein N_new represents the new node; isgol represents a function that determines whether the new node is a target point; x_new and y_new represent the x-axis and y-axis coordinates of the new node, respectively; and golx and goly represent the x-axis and y-axis coordinates of the target point, respectively.

Compared with the prior art, the method has the advantages that the depth information and the environment information are acquired through the sensor carried by the unmanned ship, and the depth information and the environment information are fused to obtain the environment map; the environment map comprises barrier position information and passable area information; converting the environment map into a grid map, establishing an initial dual-depth Q network model, and training the grid map as input of the initial dual-depth Q network model to obtain a dual-depth Q network model; acquiring current position information of an unmanned ship; taking the current position information as input of the dual-depth Q network model to carry out path planning to obtain a planned path; and carrying out local path planning according to the planned path and the environmental information acquired by the unmanned ship in real time. Therefore, the invention effectively improves the path planning speed of the unmanned ship, and ensures that the unmanned ship can quickly find a correct and safe path in a complex environment.

Drawings

The present invention will be described in detail with reference to the accompanying drawings. The foregoing and other aspects of the invention will become more apparent and more readily appreciated from the following detailed description taken in conjunction with the accompanying drawings. In the accompanying drawings:

FIG. 1 is a flow chart of a stepwise approach to optimized unmanned boat path planning provided by an embodiment of the present invention;

FIG. 2 is a schematic diagram of a grid map using an optimized unmanned boat path planning method in a step-by-step manner, provided by an embodiment of the present invention;

FIG. 3 is a representation of an approximate Q representation of a stepwise approach to optimized unmanned boat path planning provided by an embodiment of the present invention;

FIG. 4 is a schematic path diagram of a step-by-step approach to optimized unmanned boat path planning provided by an embodiment of the present invention;

FIG. 5 is a sector diagram of current position information for a stepwise approach to optimized unmanned boat path planning provided by an embodiment of the present invention;

FIG. 6 is a vector diagram of a stepwise approach to optimized unmanned boat path planning provided by an embodiment of the present invention;

FIG. 7 is a parent node update diagram of a stepwise approach to optimized unmanned boat path planning provided by an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

Referring to fig. 1 to 7, the present invention provides a step-by-step optimization unmanned ship path planning method, which includes the following steps:

In the embodiment of the invention, the unmanned ship acquires depth information of the surrounding environment from a depth camera and environment information scanned by a laser radar, and fuses the two sets of data to generate an environment map. The map contains obstacle positions, passable areas and other information used in training the network.

S2, converting the environment map into a grid map, establishing an initial dual-depth Q Network model, and training the grid map as input of the initial dual-depth Q Network model to obtain a dual-depth Q Network model (Double Deep Q-Network);

in an embodiment of the present invention, step S2 comprises the sub-steps of:

S21, setting parameters of the initial dual depth Q network model; parameters of the initial dual depth Q network model include a learning rate, a discount factor, an activation function, and an action space.

Step S22 comprises the following sub-steps:

S221, setting a starting point and different target points in the grid map;

Specifically, the dual depth Q network model includes a Q network and a Q_target network. Both networks have the same structure: a five-layer neural network with 3 hidden layers of 20, 40 and 20 neurons, respectively. The learning rate is 0.0001 and the discount factor γ is 0.9. The activation function is shown in formula 1, and the greedy strategy (ε-greedy) adopted in the training process is shown in formula 2. The action space includes 8 actions (up, down, left, right, up-left, down-left, up-right and down-right), denoted by the symbols D1 to D8, as shown in formula 3. The reward function R is shown in formula 4.
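As an illustration of the network shape described above, the following is a minimal pure-Python sketch of one Q network with hidden layers of 20, 40 and 20 neurons and 8 outputs, together with ε-greedy action selection. Formulas 1 to 4 are not reproduced in this text, so the ReLU activation, the two-element grid-coordinate input, the weight initialisation and all function names are assumptions made for illustration only; ε = 0.1 is the value given in the description.

```python
import random

# D1..D8 in the order listed in the description
ACTIONS = ["up", "down", "left", "right",
           "up_left", "down_left", "up_right", "down_right"]
# input layer (assumed: 2 grid coordinates), 3 hidden layers, output layer
LAYER_SIZES = [2, 20, 40, 20, len(ACTIONS)]


def init_params(rng):
    """Create (weights, biases) per layer; the initialisation is an assumption."""
    params = []
    for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)]
                   for _ in range(n_in)]
        params.append((weights, [0.0] * n_out))
    return params


def q_values(params, state):
    """Forward pass: ReLU on hidden layers (assumed), linear output layer."""
    h = [float(v) for v in state]
    for i, (weights, bias) in enumerate(params):
        h = [sum(h[j] * weights[j][k] for j in range(len(h))) + bias[k]
             for k in range(len(bias))]
        if i < len(params) - 1:
            h = [max(v, 0.0) for v in h]
    return h


def epsilon_greedy(params, state, rng, epsilon=0.1):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    q = q_values(params, state)
    return max(range(len(q)), key=q.__getitem__)


rng = random.Random(0)
q_net = init_params(rng)
# Q_target starts as a copy with the same structure
q_target_net = [([row[:] for row in w], b[:]) for w, b in q_net]
action_index = epsilon_greedy(q_net, (2, 2), rng)
```

Keeping the two networks as separate parameter copies is what lets the target network lag behind the Q network during training; how often the copy is refreshed is not stated in the text.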

Equation 1:

Equation 2:

Equation 3:

Equation 4:

On the grid map, n is selected (n is determined by the actual map) and 4 different target points are selected. As shown in FIG. 2, n is 18; the gray point in the second row and second column from the left is the starting point in every map, and the other gray point is the target point. The RMSProp algorithm is used to train the dual depth Q network model.

After the 4 sets of training data are input into the dual depth Q network model for training, the Q value is calculated during training as shown in formula 5. A specific Q-value table needs to be calculated in actual training; for example, the approximate Q-value table corresponding to the first map in FIG. 2 is shown in FIG. 3, where a larger Q value indicates a greater likelihood of being selected. It can be seen from FIG. 4 that the values in the Q table meet the requirements of actual planning.

Equation 5: $Q(s, a) = r + \gamma \, Q_{\mathrm{target}}\big(s', \arg\max_{a} Q(s', a; \theta); \theta'\big)$

where Q(s, a) is the Q value of selecting action a in state s; r is the reward obtained after executing action a; s′ is the next state entered after executing action a; γ is the discount factor that adjusts the importance of future rewards; Q_target is the output of the target Q network; argmax_a selects the action with the largest Q value; θ is the parameter set of the current policy network (Q network); and θ′ is the parameter set of the target network (Q_target network).
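The target in formula 5 can be worked through numerically. The sketch below is illustrative only: the function name, the `done` flag for terminal states and the sample Q values are assumptions, not taken from the patent. It shows the defining feature of the double-network update: the Q network chooses the action and the Q_target network evaluates it.

```python
def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Return r + gamma * Q_target(s', argmax_a Q(s', a; theta); theta')."""
    if done:  # assumption: no bootstrapping from a terminal state
        return reward
    # Action selection uses the online Q network ...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... while its value is read from the target network.
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q values for the next state s' (three actions for brevity)
q_online_next = [0.2, 1.5, 0.7]   # Q(s', a; theta)
q_target_next = [0.3, 1.0, 2.0]   # Q_target(s', a; theta')

target = double_dqn_target(reward=-1.0, gamma=0.9,
                           q_online_next=q_online_next,
                           q_target_next=q_target_next)
# The online network picks action 1, so the target is -1.0 + 0.9 * 1.0,
# about -0.1. Taking the maximum of the target network directly would use
# 2.0 instead and give about 0.8.
```

Decoupling selection from evaluation in this way is what reduces the over-estimation that a single-network maximum tends to produce.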

The approximate Q table obtained through the above steps yields the optimal path from the starting point to the end point, as shown in FIG. 4. The well-trained network outputs correct and effective actions according to the current position information of the unmanned ship, which meets the requirement of guiding the planning and sampling.

S3, acquiring current position information of the unmanned ship;

in the embodiment of the present invention, the step S4 includes the following substeps:

in step S43, the preset calculation formula is:

Whether the new node is a target point satisfies the following relation:

Specifically, the selected action is denoted as D, and Q is the action evaluation value calculated by the network; the input value is in and the output value is out; ε in formula 2 is 0.1. The current position information of the unmanned ship is N(x, y), the new node (sampling point) is N_new(x_new, y_new), the sampling action output by the Double DQN is denoted as D, and the sector radius is L (which can be designed according to actual needs).

Container k stores the path; container q stores the midpoints; container w stores the nodes with an angle larger than 90 degrees;

the target point (endpoint) coordinates are gol (golx, goly).

The minimum boundary xy coordinates of the map are (xl, yl), the maximum boundary xy coordinates are (xu, yu), and the starting point is first stored in the container k at the beginning of the planning.

After the current position information N(x, y) is converted into grid coordinates N(x_nn, y_nn) according to formula 7, N(x_nn, y_nn) is input into the dual depth Q network model to obtain the sampling action D. Taking N(x, y) as the center and the direction D as the center line, the region is expanded by 30 degrees to the left and to the right to form a sector of radius L (the size of L is determined according to actual needs) with an angle of 60 degrees. The starting angle of the sector is θ1 = θ0 − 30° and the ending angle is θ2 = θ0 + 30°, where θ0 is the included angle between the vector NN1 and the x-axis, and N1(x1, y1) denotes the point at distance L from N(x, y) along the direction D, as shown in FIG. 5. An angle θ_r between the starting angle and the ending angle is randomly generated according to formula 6, and a point is then randomly taken according to formula 8 to obtain the sampling point N_new(x_new, y_new).

Equation 6: $\theta_r = \theta_1 + \mathrm{random}() \cdot (\theta_2 - \theta_1)$

Equation 7: $x_{nn} = \mathrm{round}\big((x R_x + y R_y) / X_s\big)$

$y_{nn} = \mathrm{round}\big((x R_y + y R_z) / Y_s\big)$

where R_x, R_y and R_z are the rotation-matrix terms of the radar scan data, X_s and Y_s are the resolutions of the scan map, and round(·) denotes rounding.
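A small worked instance of formula 7 follows. The function name and the sample values are hypothetical; an identity-like rotation (R_x = 1, R_y = 0, R_z = 1) and a 0.5-unit resolution are assumed purely to make the arithmetic easy to follow. Note that Python's built-in `round` rounds exact halves to the nearest even integer, whereas the text does not say how ties are handled.

```python
def to_grid(x, y, r_x, r_y, r_z, x_s, y_s):
    """Convert a map position (x, y) to grid coordinates per formula 7."""
    x_nn = round((x * r_x + y * r_y) / x_s)
    y_nn = round((x * r_y + y * r_z) / y_s)
    return x_nn, y_nn


# Hypothetical values: no rotation, 0.5 map units per grid cell.
cell = to_grid(3.2, 1.3, r_x=1.0, r_y=0.0, r_z=1.0, x_s=0.5, y_s=0.5)
# 3.2 / 0.5 = 6.4 rounds to 6, and 1.3 / 0.5 = 2.6 rounds to 3
print(cell)  # (6, 3)
```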

Equation 8:

where u1 and u2 are different uniform random numbers in [0, 1];
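The sector sampling described above can be sketched as follows. Formula 8 is not reproduced in this text, so the radial rule used here (distance L·sqrt(u), which spreads points evenly over the sector area) is an assumption and may differ from the patent's formula; the angle follows formula 6. The function name and its parameters are hypothetical, and θ0 is given in degrees.

```python
import math
import random


def sample_in_sector(x, y, theta0_deg, radius_l, rng, half_angle_deg=30.0):
    """Draw one point in the 60-degree sector centred on direction theta0."""
    theta1 = theta0_deg - half_angle_deg          # starting angle
    theta2 = theta0_deg + half_angle_deg          # ending angle
    theta_r = theta1 + rng.random() * (theta2 - theta1)   # formula 6
    # Assumed radial law (formula 8 is unavailable): area-uniform distance.
    dist = radius_l * math.sqrt(rng.random())
    x_new = x + dist * math.cos(math.radians(theta_r))
    y_new = y + dist * math.sin(math.radians(theta_r))
    return x_new, y_new


rng = random.Random(42)
# Hypothetical call: vessel at (10, 5), action pointing "up-right" (45 degrees)
x_new, y_new = sample_in_sector(10.0, 5.0, theta0_deg=45.0, radius_l=3.0, rng=rng)
```

Restricting samples to a sector around the network's suggested action is what distinguishes this step from plain RRT, which samples the whole map.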

The sampling point N_new obtained in step S42 undergoes collision detection according to formula 9. If the collision detection condition is not met, the method returns to step S42 and updates the coordinate values of N_new. If the condition is met, the sampling point becomes the new node N_new, and a father node is then selected for N_new: candidate father nodes are screened according to the following comfort conditions (namely, the preset method), and the qualifying father node N_newparent(xp, yp) is selected.

Equation 9:
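The formula itself is not reproduced in this text. Based only on the symbol definitions given for it (boundaries xl, yl, xu, yu and the function isfree returning 1 for a non-obstacle region), one plausible reading is that a sampling point passes when it lies inside the map boundary and isfree returns 1. The sketch below encodes that reading; the grid contents, the rounding of coordinates to cells and the function names are hypothetical.

```python
# Hypothetical occupancy grid: GRID[row][col], 1 marks an obstacle cell.
GRID = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]


def is_free(x, y):
    """Return 1 if the cell containing (x, y) is obstacle-free, else 0."""
    return 1 if GRID[int(round(y))][int(round(x))] == 0 else 0


def passes_collision_check(x, y, xl, yl, xu, yu, is_free_fn):
    """Assumed reading of formula 9: inside the boundary and in free space."""
    in_bounds = xl <= x <= xu and yl <= y <= yu
    # is_free_fn is only evaluated for in-bounds points (short-circuit).
    return in_bounds and is_free_fn(x, y) == 1


print(passes_collision_check(0, 2, 0, 0, 2, 2, is_free))  # True: free cell
print(passes_collision_check(1, 1, 0, 0, 2, 2, is_free))  # False: obstacle
print(passes_collision_check(3, 0, 0, 0, 2, 2, is_free))  # False: off the map
```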

the comfort conditions are specifically as follows:

The main evaluation index of comfort in the invention is that the included angle between the two vectors formed by every three consecutive points is not smaller than 90 degrees. This mainly avoids paths with extreme curvature, which would prevent the generated path from meeting the added dynamic constraints and model constraints.

With the new node N_new(x_new, y_new) as the center, a circle of radius r is drawn. Assuming there are n tree nodes in the circle, all of them form the set F (node 1, ..., node n_i, ..., node n).

Let node n_i(x_i, y_i) in the circle be point B(xb, yb), the first node. The parent node N_preparent(xp, yp) of point B is point A(xa, ya), the second node, and the new node N_new(x_new, y_new) is point C. These form the BA vector and the BC vector (i.e., the first vector and the second vector), and the included angle between BA and BC is recorded, as shown in FIG. 6. All nodes of set F are traversed in turn, and the corresponding included angles are stored in the container w.

Included angles in the container w whose BC segment collides with an obstacle are rejected.

The container w is traversed to obtain the maximum included angle in the container; the corresponding node is n_i(x_i, y_i).

If the maximum included angle is larger than 90 degrees, the node n_i(x_i, y_i) is selected as the parent node N_newparent(xp, yp) of the new node N_new(x_new, y_new), and the existing child node of n_i(x_i, y_i) is deleted, which guarantees the uniqueness of the child node of each node.
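The angle test used for parent selection can be sketched as follows. This is a simplified illustration under stated assumptions: nodes are (x, y) tuples, the tree is a hypothetical dictionary mapping each node B to its parent A, nodes without a parent are skipped, and the rejection of BC segments that hit obstacles is left out. The function names are not from the patent.

```python
import math


def included_angle_deg(a, b, c):
    """Angle at B between the vectors BA and BC, in degrees."""
    bax, bay = a[0] - b[0], a[1] - b[1]
    bcx, bcy = c[0] - b[0], c[1] - b[1]
    norm = math.hypot(bax, bay) * math.hypot(bcx, bcy)
    if norm == 0.0:
        raise ValueError("A and C must both differ from B")
    cos_angle = max(-1.0, min(1.0, (bax * bcx + bay * bcy) / norm))
    return math.degrees(math.acos(cos_angle))


def pick_parent(candidates, parents, new_node):
    """Return (best B, its angle) if the largest angle exceeds 90 degrees."""
    best, best_angle = None, -1.0
    for b in candidates:
        a = parents.get(b)
        if a is None:  # assumption: nodes without a parent are skipped
            continue
        angle = included_angle_deg(a, b, new_node)
        if angle > best_angle:
            best, best_angle = b, angle
    return (best, best_angle) if best_angle > 90.0 else (None, best_angle)


# Hypothetical tree fragment: B -> A
parents = {(1.0, 0.0): (0.0, 0.0), (1.0, 2.0): (1.0, 3.0)}
best, angle = pick_parent(list(parents), parents, new_node=(2.0, 0.0))
# (1.0, 0.0) continues in a straight line (180 degrees), so it is chosen.
```

An included angle of 180 degrees means the path continues straight through B, while angles near 0 degrees correspond to the V-shaped reversals that the comfort condition is meant to exclude.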

If the maximum included angle is smaller than 90 degrees, the midpoint of the line connecting the midpoints of the BA and BC vectors corresponding to node n_i(x_i, y_i) is taken as the new point B; the value of point B is updated to that midpoint, and the updated point B is stored in the container q. This is repeated until the included angle is greater than 90°, as shown in FIG. 7. If a suitable midpoint is found, it is selected as the parent node of the current new node N_new(x_new, y_new), the child node of the corresponding point A and the child node of the corresponding point B are deleted, and the container q is emptied. If no point meeting the condition can be found, the point N_mid in the container q whose included angle is closest to 90° is taken as the parent node N_newparent(xp, yp) of the new node N_new(x_new, y_new); the child node of the corresponding point A is pointed to N_mid and the parent node of N_new is pointed to N_mid, which guarantees the uniqueness of the child node of each node.

Whether the new node N_new obtained above is the target node is judged according to formula 10. If so, the new node and the end point are stored in the container k in turn, and the search ends. If not, the new node is stored in the container k and execution jumps back to step S42, continuing until the new node is the target node and the path is found, which completes the search.

Equation 10:
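The formula itself is not reproduced in this text. Using only the symbols defined for it (isgol, x_new, y_new, golx, goly), a minimal sketch of such a test is given below. Whether the patent requires an exact coordinate match or allows a distance threshold is not stated, so the `tol` parameter is a hypothetical addition; with its default of 0.0 the test reduces to exact equality.

```python
import math


def isgol(x_new, y_new, golx, goly, tol=0.0):
    """Return 1 if the new node lies within tol of the target point, else 0."""
    return 1 if math.hypot(x_new - golx, y_new - goly) <= tol else 0


print(isgol(5.0, 5.0, 5.0, 5.0))           # 1: same coordinates
print(isgol(5.0, 5.2, 5.0, 5.0))           # 0: exact match required by default
print(isgol(5.0, 5.2, 5.0, 5.0, tol=0.5))  # 1: within the assumed threshold
```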

In the embodiment of the invention, the path planned in step S4 and the environmental information transmitted by the sensor in real time are passed to a local planning part for further local planning; the generated control instructions are then sent to the control system part, which drives the unmanned ship along the planned track.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

While the embodiments of the present invention have been illustrated and described with reference to the drawings in connection with what is presently considered to be the most practical and preferred embodiments, it is to be understood that the invention is not limited to the disclosed embodiments but, on the contrary, is intended to cover various equivalent modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims

1. The step-by-step optimization unmanned ship path planning method is characterized by comprising the following steps of:

S3, acquiring current position information of the unmanned ship;

2. The stepwise approach to optimized unmanned boat path planning method according to claim 1, wherein step S2 comprises the sub-steps of:

S21, setting parameters of the initial dual depth Q network model;

3. The stepwise approach to optimized unmanned boat path planning method of claim 2, wherein the parameters of the initial dual depth Q network model include learning rate, discount factor, activation function, and action space.

4. The stepwise approach to optimized unmanned boat path planning method according to claim 2, wherein step S22 comprises the sub-steps of:

S221, setting a starting point and different target points in the grid map;

5. The stepwise approach to optimized unmanned boat path planning method according to claim 1, wherein step S4 comprises the sub-steps of:

6. The stepping adoption optimization unmanned ship path planning method according to claim 5, wherein in step S43, the preset calculation formula is:

7. The stepping adoption optimization unmanned ship path planning method according to claim 5, wherein in step S43, the preset method is as follows:

8. The stepwise approach to optimized unmanned boat path planning of claim 5, wherein whether the new node is a target point satisfies the following relationship:

wherein N_new represents the new node; isgol represents a function that determines whether the new node is a target point; x_new and y_new represent the x-axis and y-axis coordinates of the new node, respectively; and golx and goly represent the x-axis and y-axis coordinates of the target point, respectively.