CN112650246B

CN112650246B - Ship autonomous navigation method and device

Info

Publication number: CN112650246B
Application number: CN202011535524.2A
Authority: CN
Inventors: 杨杰; 刘今栋; 尚午晟; 梁奇; 韦港文
Original assignee: Wuhan University of Technology WUT
Current assignee: Wuhan University of Technology WUT
Priority date: 2020-12-23
Filing date: 2020-12-23
Publication date: 2022-12-09
Anticipated expiration: 2040-12-23
Also published as: CN112650246A

Abstract

The invention relates to a ship autonomous navigation method, which comprises the following steps: constructing a ship motion model, and constructing a navigation water area model based on a navigation water area map; based on the navigation water area model, performing global track planning by adopting an ant colony optimization algorithm to generate a global reference track; based on the ship motion model, performing local risk collision avoidance planning by adopting a Q-learning algorithm to generate a real-time risk collision avoidance strategy; and realizing the autonomous navigation of the ship by combining the global reference track and the local risk collision avoidance strategy. The invention can realize real-time and accurate autonomous navigation when the ship navigates in a large water area.

Description

Ship autonomous navigation method and device

Technical Field

The invention relates to the technical field of ship control and decision, in particular to a ship autonomous navigation method, a ship autonomous navigation device and a computer storage medium.

Background

With the vigorous development of high and new technologies such as artificial intelligence and the internet of things, the ship is highly valued by researchers as a carrier widely applied to cargo circulation, military cruising and public traffic. Because the motion of the ship is different from land and air vehicles, the ship is greatly influenced by the environment, and a great deal of uncertainty and risk exist in navigation. In order to realize the normal navigation of the ship, two key problems of global track planning and local risk collision avoidance of the ship need to be solved.

For the global track planning problem, various algorithms have been applied to research on autonomous ship navigation, such as the Dijkstra algorithm, the A* algorithm, the artificial potential field method, the particle swarm optimization algorithm, and the ant colony optimization algorithm. However, these classical offline methods require complete prior knowledge of the environment, which is currently difficult to obtain in full and also makes environment modeling harder, so they are not suitable for real-time autonomous navigation of a ship when used alone.

For the local risk collision avoidance problem, mainstream methods focus on modeling the environment in real time. For example, one approach clusters the obstacle information within a certain range of the hull by complexity and combines it with an ant colony optimization algorithm to realize dynamic obstacle avoidance; it avoids obstacles well, but it tends to generate redundant tracks when the starting point is far from the target. Other researchers reduce the complexity of environment modeling by trial and error: taking the influence of wind and waves at sea into account, they propose a Q-learning-based autonomous navigation algorithm that realizes intelligent collision avoidance of the robot in an unknown environment, and an artificial potential field method has been introduced into the Q-learning algorithm to further improve its convergence speed. However, the grid-based Q-learning algorithm is strongly affected by the size of the water area: autonomous navigation in a large water area increases track randomness, which hinders track convergence.

Disclosure of Invention

In view of this, it is necessary to provide a ship autonomous navigation method and apparatus, so as to solve the problems that global track planning cannot be applied to real-time navigation, local risk avoidance is not applicable to large water areas, and track convergence is slow.

The invention provides an autonomous navigation method for a ship, which comprises the following steps:

constructing a ship motion model, and constructing a navigation water area model based on a navigation water area map;

based on the navigation water area model, carrying out global track planning by adopting an ant colony optimization algorithm to generate a global reference track;

based on the ship motion model, performing local risk collision avoidance planning by adopting a Q-learning algorithm to generate a real-time risk collision avoidance strategy;

and combining the global reference track and the local risk collision avoidance strategy to realize autonomous navigation of the ship.

Further, constructing a ship motion model, specifically:

establishing a ship model;

the motion of the ship is decomposed into translation with the center point and rotation about the center point, and is simplified into motion on a horizontal plane, so that the kinematic description of the ship is obtained;

setting a steering threshold value, and restraining the steering of the ship;

setting a safe area of the ship by taking the central point as a circle center and the safe distance as a radius;

and combining the ship model, the kinematics description, the steering threshold value and the safety region to obtain the ship motion model.

Further, constructing a navigation water area model based on the navigation water area map specifically comprises:

acquiring a navigation water area map, and carrying out edge extraction on obstacles in the navigation water area map;

extracting convex hulls of edges by using a Graham scanning method, setting a minimum side length threshold value and a minimum perimeter threshold value, and filtering the convex hulls of which the side length is smaller than the minimum side length threshold value or the perimeter is smaller than the minimum perimeter threshold value;

scaling the convex hulls equidistantly so that each convex hull completely surrounds its corresponding obstacle;

constructing a MAKLINK global connectivity graph from the vertices of the convex hulls, and selecting the midpoints, the starting points and the end points of all MAKLINK lines in the MAKLINK global connectivity graph as network nodes;

and connecting adjacent network nodes to obtain an undirected network graph for track planning, and taking the undirected network graph as the navigation water area model.

Further, based on the navigation water area model, performing global track planning by adopting an ant colony optimization algorithm to generate a global reference track, specifically:

the navigation water area model is an undirected network graph, and the undirected network graph comprises a plurality of connecting lines;

acquiring a starting point and an end point, and constructing an initial track from the starting point to the end point based on the undirected network graph;

establishing an objective function of node optimization:

$$L = \sum_{i} \mathrm{Length}\big(V_i(h_i),\ V_{i+1}(h_{i+1})\big);$$

wherein $L$ is the length of the track, the sum runs over the consecutive nodes of the track, $\mathrm{Length}(V_i(h_i), V_{i+1}(h_{i+1}))$ denotes the distance between the $i$-th node and the $(i+1)$-th node in the track, $V_i(h_i)$ represents any point on the $i$-th connecting line through which the track passes and is recorded as the $i$-th node in the track, $V_{i+1}(h_{i+1})$ represents any point on the $(i+1)$-th connecting line through which the track passes and is recorded as the $(i+1)$-th node in the track, $d$ is the number of connecting lines through which the track passes, and $h_i$ is a proportionality coefficient, $h_i \in [0,1]$;

the proportionality coefficient is used to adjust the specific position of $V_i(h_i)$ on the $i$-th connecting line:

$$V_i(h_i) = V_i^{(0)} + h_i\,\big(V_i^{(1)} - V_i^{(0)}\big);$$

wherein $V_i^{(0)}$ and $V_i^{(1)}$ are the two end points of the $i$-th connecting line through which the track passes;

obtaining a plurality of paths by adjusting the proportional coefficients to form a solution set of feasible paths;

and optimizing the feasible path by adopting an ant colony optimization algorithm to obtain a shortest path as the global reference track.
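As a worked illustration of how the proportionality coefficient places a node on its connecting line, take hypothetical end points $V_i^{(0)} = (2, 0)$ and $V_i^{(1)} = (6, 4)$ (these numbers are assumptions chosen for the example, not values from the embodiment):

$$V_i(0.25) = (2, 0) + 0.25\,\big((6, 4) - (2, 0)\big) = (3, 1), \qquad V_i(0.5) = (2, 0) + 0.5\,(4, 4) = (4, 2).$$

With $h_i = 0.5$ the node is the midpoint of the connecting line; $h_i = 0$ and $h_i = 1$ give the two end points, so varying $h_i$ over $[0,1]$ slides the node along the whole line.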

Further, optimizing the feasible path by adopting an ant colony optimization algorithm to obtain a shortest path as the global reference track, specifically:

constructing a node set of a path according to the undirected network graph, and initializing parameters of an ant colony optimization algorithm;

selecting a next path node for each ant from the node set according to the pheromone;

performing single step updating on the corresponding path selected by the ants:

$$\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \rho\,\tau_0;$$

wherein $\tau_{ij}(t)$ denotes the pheromone between the $i$-th node on the previous connecting line and the $j$-th node on the next connecting line, $\tau_{ij}(t+1)$ is the pheromone after $\tau_{ij}(t)$ is updated, $\tau_0$ is the initial pheromone, $\rho$ is the volatilization coefficient of the pheromone, and $\rho \in [0,1]$;

after each ant completes one round of path selection, the shortest path is selected for the round update of the pheromone, and the round update formula is:

$$\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \rho\,\Delta\tau_{ij}(t);$$

$$\Delta\tau_{ij}(t) = 1/L^{*};$$

wherein $L^{*}$ is the length of the selected path, $\Delta\tau_{ij}(t)$ represents the sum of pheromones released by all ants between the two points, and $t$ represents the round;

and checking whether a termination condition is reached, if so, outputting the shortest path in the paths taken by the ants as the global reference track, and otherwise, performing next path optimization.
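The optimization steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: each proportionality coefficient is restricted to a small grid of candidate values, ants pick candidates with probability proportional to pheromone alone, the termination condition is a fixed number of rounds, and the default `tau0` is chosen below `1/L` so that the round update reinforces the best path. The function name `aco_optimize` and all parameters are hypothetical.

```python
import math
import random


def aco_optimize(start, end, links, levels=11, ants=10, rounds=50,
                 rho=0.1, tau0=None, seed=0):
    """Search the coefficients h_i on a discretized grid (levels should be odd)."""
    rng = random.Random(seed)
    grid = [j / (levels - 1) for j in range(levels)]   # candidate h values

    def length(choice):
        nodes = [start]
        for (v0, v1), j in zip(links, choice):
            h = grid[j]
            nodes.append((v0[0] + h * (v1[0] - v0[0]), v0[1] + h * (v1[1] - v0[1])))
        nodes.append(end)
        return sum(math.dist(a, b) for a, b in zip(nodes, nodes[1:]))

    # Initial track: the midpoint of every connecting line.
    best_choice = [(levels - 1) // 2] * len(links)
    best_len = length(best_choice)
    if tau0 is None:
        tau0 = 1.0 / (levels * best_len)               # assumption, see lead-in
    # tau[k][i][j]: pheromone from candidate i on the previous line to candidate j
    # on line k (for k = 0 the previous node is the start point, index 0).
    tau = [[[tau0] * levels for _ in range(levels)] for _ in links]

    for _ in range(rounds):
        round_choice, round_len = None, math.inf
        for _ in range(ants):
            choice, prev = [], 0
            for k in range(len(links)):
                j = rng.choices(range(levels), weights=tau[k][prev])[0]
                # Single-step update on the edge just selected.
                tau[k][prev][j] = (1 - rho) * tau[k][prev][j] + rho * tau0
                choice.append(j)
                prev = j
            cur = length(choice)
            if cur < round_len:
                round_choice, round_len = choice, cur
        # Round update along the shortest path of this round: delta_tau = 1 / L*.
        prev = 0
        for k, j in enumerate(round_choice):
            tau[k][prev][j] = (1 - rho) * tau[k][prev][j] + rho / round_len
            prev = j
        if round_len < best_len:
            best_choice, best_len = round_choice, round_len
    return [grid[j] for j in best_choice], best_len
```

Because the midpoint track is used as the starting best solution, the returned length is never longer than that of the initial track; a finer grid or more rounds trades run time for a shorter track.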

Further, based on the navigation water area model, an initial track is constructed, specifically:

in the constructed undirected network graph, the midpoint of a MAKLINK line is used as an intermediate node, and a Dijkstra algorithm is adopted to search the shortest path from a starting point to an end point to be used as the initial track.

Further, based on the ship motion model, performing local risk collision avoidance planning by using a Q-learning algorithm to generate a real-time risk collision avoidance strategy, specifically:

initializing a state set, an action set and a reward strategy of the ship; the state set comprises a plurality of states of the ship, wherein each state comprises the relative position information between the ship and the obstacles in its safe area, together with the current course angle of the ship; the action set comprises a plurality of actions available to the ship in each state, and each action comprises translation information and rotation information; the reward strategy comprises the collision conditions and target arrival conditions fed back by interaction with the environment during navigation;

calculating the Q value of each action in the action set, and using the Q value to represent the action value;

selecting the action with the highest action value from the action set by adopting an epsilon-greedy strategy;

updating the Q value of each action in the current state by adopting a Q-learning algorithm:

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha\big[r_t + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t)\big];$$

$$s_t \leftarrow s_{t+1};$$

wherein $s_t$ is the current state, $a_t$ is the selected action, $s_{t+1}$ is the next state after action $a_t$ is performed, $r_t$ is the reward fed back in the current state, $\alpha$ is the learning rate, $\gamma$ is the discount factor, $Q(s_t, a_t)$ is the Q value of action $a_t$ in state $s_t$, $Q(s_{t+1}, a)$ is the Q value of action $a$ in state $s_{t+1}$, $\max$ denotes taking the maximum value over the actions $a$, and $\leftarrow$ denotes updating;

and judging whether the difference between the Q values before and after updating is smaller than a set difference value, if so, outputting the action with the highest Q value as the real-time risk collision avoidance strategy, and otherwise, updating the Q value next time.
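A single update can be traced by hand. The numbers below are hypothetical and chosen only to illustrate the arithmetic: $\alpha = 0.5$, $\gamma = 0.9$, current value $Q(s_t, a_t) = 0.2$, reward $r_t = 0$, and a largest Q value of $1.0$ in the next state:

$$Q(s_t, a_t) \leftarrow 0.2 + 0.5\,\big[0 + 0.9 \times 1.0 - 0.2\big] = 0.2 + 0.5 \times 0.7 = 0.55.$$

The Q value changes by $0.35$. Under the convergence test described above, this change would be compared with the set difference value, and updating would continue for as long as the change stays larger than it.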

Further, the vessel autonomous navigation is realized by combining the global reference track and the local risk collision avoidance strategy, and the method specifically comprises the following steps:

acquiring a starting point, a target point and the corresponding global reference track;

according to the nodes and the node sequence contained in the global reference track, sequentially setting each node as a child target point and navigating to each child target point in sequence;

acquiring obstacle information in the current navigation area in real time during navigation;

judging whether the ship has reached the current target point; if so, further judging whether the current target point is the end point, finishing navigation if it is and updating the current target point if it is not; if the ship has not reached the current target point, proceeding to the next step;

and judging whether the obstacle information lies within the safe area of the ship; if so, generating a corresponding real-time risk collision avoidance strategy based on the obstacle information and avoiding the risk according to that strategy; otherwise, adjusting the course of the ship to navigate toward the current target point, and continuing to monitor whether obstacle information lies within the safe area of the ship.

The invention also provides a ship autonomous navigation device, which comprises a processor and a memory, wherein the memory is stored with a computer program, and the computer program is executed by the processor to realize the ship autonomous navigation method.

The present invention also provides a computer storage medium having a computer program stored thereon, which, when executed by a processor, implements the ship autonomous navigation method.

Advantageous effects: the invention provides a ship autonomous navigation method based on an ant colony optimization algorithm and Q-learning. First, a ship motion model is constructed, and a navigation water area model is constructed based on an original map. Next, a global track planning strategy based on the ant colony optimization algorithm is designed to generate a global reference track, and track segmentation nodes are extracted to reduce the path randomness caused by Q-learning in the subsequent local risk collision avoidance. A local risk collision avoidance strategy based on Q-learning is then designed to guide real-time risk collision avoidance. Finally, the global track planning and the local risk collision avoidance strategy are combined to generate an autonomous navigation track for the navigation water area. The model parameters can be adjusted to suit the actual model, so that the method can be ported to existing ship autonomous navigation control. Because the local risk collision avoidance strategy is combined with the global track planning, the method can be used for real-time navigation; and because the local risk collision avoidance strategy is carried out on the basis of the global track planning, the method is suitable for large water areas.

Drawings

FIG. 1 is a flowchart of a first embodiment of a method for autonomous navigation of a ship according to the present invention;

FIG. 2 is a diagram of a ship motion model according to a first embodiment of the autonomous navigation method for a ship provided by the present invention;

FIG. 3 is a schematic view of a sailing water area of a first embodiment of the autonomous navigation method for a ship provided by the present invention;

FIG. 4 is a diagram of a model of a sailing water area of a first embodiment of the autonomous navigation method for a ship according to the present invention;

FIG. 5a is a length convergence curve in the global reference track optimizing process of the first embodiment of the autonomous navigation method for a ship according to the present invention;

FIG. 5b is a schematic view of a global reference track of the autonomous navigation method for a ship according to the first embodiment of the present invention;

FIG. 6a is an accumulated reward convergence curve in the optimization process of the real-time risk collision avoidance strategy of the autonomous navigation method of a ship according to the first embodiment of the present invention;

FIG. 6b is a schematic diagram of a real-time risk collision avoidance strategy according to the first embodiment of the autonomous navigation method for a ship provided by the present invention;

FIG. 7a is a diagram illustrating an autonomous navigation track obtained by a first embodiment of the method for autonomous navigation of a ship under a first navigation condition according to the present invention;

FIG. 7b is an autonomous navigation track obtained by the first embodiment of the autonomous navigation method for a ship under the second navigation condition.

Detailed Description

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate preferred embodiments of the invention and together with the description, serve to explain the principles of the invention and not to limit the scope of the invention.

Example 1

As shown in fig. 1, embodiment 1 of the present invention provides a ship autonomous navigation method, including the steps of:

S1, constructing a ship motion model, and constructing a navigation water area model based on a navigation water area map;

S2, performing global track planning by adopting an ant colony optimization algorithm based on the navigation water area model to generate a global reference track;

S3, based on the ship motion model, performing local risk collision avoidance planning by adopting a Q-learning algorithm to generate a real-time risk collision avoidance strategy;

S4, realizing autonomous navigation of the ship by combining the global reference track and the local risk collision avoidance strategy.

This embodiment provides a ship autonomous navigation method based on an ant colony optimization algorithm and Q-learning. First, a ship motion model is constructed, and a navigation water area model is constructed based on an original map. Next, a global track planning strategy based on the ant colony optimization algorithm is designed to generate a global reference track, and track segmentation nodes are extracted to reduce the path randomness caused by Q-learning in the subsequent local risk collision avoidance. A local risk collision avoidance strategy based on Q-learning is then designed to guide real-time risk collision avoidance. Finally, the global track planning and the local risk collision avoidance strategy are combined to generate an autonomous navigation track for the navigation water area. The model parameters can be adjusted to suit the actual model, so that the method can be ported to existing ship autonomous navigation control.

The ship autonomous navigation method based on the ant colony optimization algorithm and Q-learning can provide effective reference for ship autonomous navigation control.

Preferably, a ship motion model is constructed, specifically:

establishing a ship model;

setting a steering threshold value, and restraining the steering of the ship;

When modeling the ship, the dimensionality of the ship motion is first reduced. The motion of the ship is decomposed into two parts: translation with the center point (surge, sway and heave) and rotation about the center point (roll, pitch and yaw); superposing the two parts gives the six-degree-of-freedom kinematic description of the ship. The navigation planning problem concerns the motion of the ship on the horizontal plane, so the autonomous navigation problem is simplified into a two-dimensional path planning problem on the horizontal plane;

a steering threshold $\delta_{max}$ is set: since the motion of the ship is not simple particle motion and is subject to certain constraints, the steering angle $\delta$ of the ship is kept smaller than the threshold $\delta_{max}$;

a safety area of the ship is set: for the sake of navigation safety, a certain safety margin is reserved for the ship. A circle with the center point as its center and the safe distance as its radius therefore encloses the whole hull, and when the minimum distance between an obstacle and the center point is smaller than the safe distance, the hull is considered to be in danger of collision.
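The three ingredients above (planar kinematics, a steering threshold and a circular safety area) can be sketched as a small Python class. This is a minimal sketch under assumptions: the ship is treated as a point with a heading that turns and then advances at a given speed, and the class name `ShipModel`, its fields and the time step are hypothetical rather than taken from the embodiment.

```python
import math
from dataclasses import dataclass


@dataclass
class ShipModel:
    """Planar ship state: position of the center point plus heading angle."""
    x: float
    y: float
    heading: float        # radians, measured in the water area coordinate system
    delta_max: float      # steering threshold (hypothetical value set by the caller)
    safe_distance: float  # radius l of the circular safety area

    def step(self, speed: float, delta: float, dt: float = 1.0) -> None:
        """Rotate about the center point, then translate along the new heading."""
        # Constrain the requested steering angle to [-delta_max, delta_max].
        delta = max(-self.delta_max, min(self.delta_max, delta))
        self.heading = (self.heading + delta) % (2.0 * math.pi)
        self.x += speed * dt * math.cos(self.heading)
        self.y += speed * dt * math.sin(self.heading)

    def in_danger(self, obstacles) -> bool:
        """True if any obstacle point is closer to the center than the safe distance."""
        return any(math.hypot(ox - self.x, oy - self.y) < self.safe_distance
                   for ox, oy in obstacles)
```

For example, a ship created with `delta_max=math.radians(30)` that is asked to turn by 90 degrees in one step only turns by 30 degrees, which is how the steering constraint keeps the simulated motion from behaving like a free particle.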

Specifically, fig. 2 shows the ship motion model established by the present embodiment, in which: XYZO denotes a navigation water area coordinate system, XYZO denotes a ship coordinate system, a solid line denotes a ship model, a dotted line denotes a safety area, and l denotes a safety distance.

Preferably, the navigation water area model is constructed based on a navigation water area map, specifically:

extracting a convex hull of the edge by using a Graham scanning method, setting a minimum side length threshold value and a minimum perimeter threshold value, and filtering the convex hull of which the side length is less than the minimum side length threshold value or the perimeter is less than the minimum perimeter threshold value;

Specifically, as shown in FIG. 3, in the embodiment of the present invention, a map of the navigation water area is extracted from a height map as the original map, in which the navigable area is shown in gray and the obstacle area in white; small obstacles that the ship cannot obtain directly from the map are then added manually and marked in black.

The original map is preprocessed by adopting methods of edge extraction, convex hull extraction and convex polygon fitting, so that large obstacles are simplified, small obstacles are filtered, an MAKLINK global connectivity graph is constructed, and a water area environment model is simplified;

the embodiment uses Canny operator to carry out edge extraction on the map of the navigation water area. The edge extraction specifically comprises the following steps: smoothing the map of the navigation water area by using a Gaussian filter, calculating the gradient amplitude and the direction of the map of the navigation water area by using first-order partial derivative finite difference, performing non-maximum suppression on the gradient amplitude, and detecting and connecting edges by using a dual-threshold algorithm.

Extracting a convex hull by using a Graham scanning method, setting the minimum side length threshold of a convex polygon (namely the convex hull) as 1/32 of the side length of a map, setting the minimum perimeter threshold as 1/16 of the side length of the map, filtering out small islands and reefs through the minimum side length threshold and the minimum perimeter threshold, and abstracting a large island into the convex polygon;

scaling the convex polygons equidistantly so that each convex polygon completely surrounds its obstacle;

constructing a MAKLINK global connectivity graph from the vertices of the convex polygons: the midpoints, the starting points and the end points of all MAKLINK lines are selected, and adjacent points are connected to form an undirected network graph for track planning, as shown in FIG. 4. In FIG. 4, the black filled blocks are convex polygons, the dotted lines are MAKLINK lines, all the dotted lines together form the MAKLINK global connectivity graph, the solid lines are connecting lines, and all the solid lines together form the undirected network graph.
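The convex hull extraction and small-obstacle filtering steps above can be sketched as follows. This is a minimal Graham scan over a list of edge points, assuming the points are given as tuples; the filter interprets "side length" as the longest edge of the hull, which is an assumption, and the function names `graham_scan` and `keep_hull` are hypothetical. The default ratios 1/32 and 1/16 are the thresholds stated in this embodiment.

```python
import math
from functools import cmp_to_key


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def graham_scan(points):
    """Return the convex hull of 2-D point tuples in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    pivot = min(pts, key=lambda p: (p[1], p[0]))   # lowest point, leftmost on ties

    def by_angle(a, b):
        turn = cross(pivot, a, b)
        if turn != 0:
            return -1 if turn > 0 else 1
        # Collinear with the pivot: put the nearer point first.
        da, db = math.dist(pivot, a), math.dist(pivot, b)
        return (da > db) - (da < db)

    rest = sorted((p for p in pts if p != pivot), key=cmp_to_key(by_angle))
    hull = [pivot]
    for p in rest:
        # Pop points that would make a clockwise or straight turn.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def keep_hull(hull, map_side, side_ratio=1 / 32, perimeter_ratio=1 / 16):
    """Keep a hull only if its longest edge and its perimeter reach the thresholds."""
    n = len(hull)
    sides = [math.dist(hull[i], hull[(i + 1) % n]) for i in range(n)]
    return (max(sides) >= side_ratio * map_side
            and sum(sides) >= perimeter_ratio * map_side)
```

A hull that fails `keep_hull` corresponds to a small island or reef that is filtered out of the global model and left to the local collision avoidance stage.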

Preferably, based on the navigation water area model, an ant colony optimization algorithm is adopted to perform global track planning, and a global reference track is generated, specifically:

establishing an objective function of node optimization:

$$L = \sum_{i} \mathrm{Length}\big(V_i(h_i),\ V_{i+1}(h_{i+1})\big);$$

wherein $L$ is the length of the track, the sum runs over the consecutive nodes of the track, $\mathrm{Length}(V_i(h_i), V_{i+1}(h_{i+1}))$ denotes the distance between the $i$-th node and the $(i+1)$-th node in the track, $V_i(h_i)$ represents any point on the $i$-th connecting line through which the track passes and is recorded as the $i$-th node in the track, $V_{i+1}(h_{i+1})$ represents any point on the $(i+1)$-th connecting line through which the track passes and is recorded as the $(i+1)$-th node in the track, $d$ is the number of connecting lines through which the track passes, and $h_i$ is a proportionality coefficient, $h_i \in [0,1]$;

$$V_i(h_i) = V_i^{(0)} + h_i\,\big(V_i^{(1)} - V_i^{(0)}\big);$$

wherein $V_i^{(0)}$ and $V_i^{(1)}$ are the two end points of the $i$-th connecting line through which the track passes;

And constructing an initial track in the constructed undirected network graph.

Then, establishing a node optimization function:

the connecting line on which the $i$-th intermediate node of the initial track lies is denoted $L_i$, and its end points are denoted $V_i^{(0)}$ and $V_i^{(1)}$; any point on the connecting line can then be expressed as:

$$V_i(h_i) = V_i^{(0)} + h_i\,\big(V_i^{(1)} - V_i^{(0)}\big)$$

wherein $h_i$ is a proportionality coefficient, $h_i \in [0,1]$, $i = 1, 2, \ldots, d$, and $d$ is the number of nodes passed through;

the node optimization function is established as follows:

$$L = \sum_{i} \mathrm{Length}\big(V_i(h_i),\ V_{i+1}(h_{i+1})\big)$$

wherein $\mathrm{Length}(V_i(h_i), V_{i+1}(h_{i+1}))$ represents the distance between two consecutive nodes; when $i = 0$, $V_i(h_i)$ denotes the starting point, and when $i = d$, $V_i(h_i)$ denotes the end point;

optimizing the track nodes: adjusting the proportionality coefficients $h_i$ yields a solution set $(h_1, h_2, \ldots, h_d)$ of feasible paths; the ant colony optimization algorithm is used to search for the optimal parameter set, the optimized node positions are obtained from the proportionality coefficient set $(h_1, h_2, \ldots, h_d)$, and the starting point, the end point and each node are connected to form the global track of the ship.
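The node optimization function can be evaluated directly once the connecting lines and coefficients are known. The sketch below assumes each connecting line is given as a pair of 2-D end points and that the track visits the lines in list order; the names `node_on_link` and `track_length` are hypothetical.

```python
import math


def node_on_link(v0, v1, h):
    """Point V(h) = V0 + h * (V1 - V0) on the connecting line from v0 to v1."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("proportionality coefficient must lie in [0, 1]")
    return (v0[0] + h * (v1[0] - v0[0]), v0[1] + h * (v1[1] - v0[1]))


def track_length(start, end, links, hs):
    """Total length of the track start -> V_1(h_1) -> ... -> V_d(h_d) -> end."""
    nodes = [start]
    nodes += [node_on_link(v0, v1, h) for (v0, v1), h in zip(links, hs)]
    nodes.append(end)
    return sum(math.dist(a, b) for a, b in zip(nodes, nodes[1:]))
```

This function is the quantity an optimizer would minimize over the coefficient set: with a single vertical connecting line from (2, -2) to (2, 2) between a start at (0, 0) and an end at (4, 0), the midpoint coefficient 0.5 gives the straight track of length 4, while the coefficient 1.0 forces a detour through (2, 2).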

Preferably, the ant colony optimization algorithm is adopted to optimize the feasible path, and the shortest path is obtained as the global reference track, specifically:

and updating the pheromone on the corresponding path selected by the ant in a single step:

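$$\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \rho\,\tau_0;$$

wherein $\tau_{ij}(t)$ represents the pheromone between the $i$-th node on the previous connecting line $L_k$ and the $j$-th node on the next connecting line $L_{k+1}$, $\tau_{ij}(t+1)$ is the pheromone after $\tau_{ij}(t)$ is updated, $\tau_0$ is the initial pheromone, $\rho$ is the volatilization coefficient of the pheromone, and $\rho \in [0,1]$;

after each round, the pheromone on the shortest path is updated by the round update formula:

$$\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \rho\,\Delta\tau_{ij}(t);$$

$$\Delta\tau_{ij}(t) = 1/L^{*};$$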
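The effect of the two updates can be checked numerically. The values below are hypothetical and serve only as an illustration: $\rho = 0.1$, $\tau_0 = 0.01$, a current pheromone $\tau_{ij}(t) = 0.05$ and a shortest path of length $L^{*} = 10$.

$$\text{single-step update:}\quad \tau_{ij}(t+1) = 0.9 \times 0.05 + 0.1 \times 0.01 = 0.046$$

$$\text{round update:}\quad \Delta\tau_{ij}(t) = 1/10 = 0.1, \qquad \tau_{ij}(t+1) = 0.9 \times 0.05 + 0.1 \times 0.1 = 0.055$$

The single-step update pulls the pheromone on a visited edge back toward $\tau_0$, which discourages later ants in the same round from repeating the same choice, while the round update raises the pheromone on the shortest path whenever $1/L^{*}$ exceeds the current value.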

Preferably, based on the navigation water area model, an initial track is constructed, specifically:

The midpoints of the MAKLINK lines are taken as intermediate nodes, and the Dijkstra algorithm is used to search for the shortest path from the starting point to the end point. By combining the traditional Dijkstra algorithm with the ant colony optimization algorithm in global track planning, this embodiment drives the global track toward the shortest one and reduces path redundancy, which benefits the transportation efficiency and economy of the ship.
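The initial track search can be sketched with a standard heap-based Dijkstra implementation. The sketch assumes the undirected network graph is stored as a dictionary that maps each node to its neighbors and edge lengths, with every edge listed in both directions; the node labels and the function name `dijkstra` are hypothetical.

```python
import heapq
import math


def dijkstra(graph, source, target):
    """Shortest path in a weighted graph given as {node: {neighbor: weight}}.

    Returns (path, length), or (None, inf) if the target cannot be reached.
    Node labels must be mutually comparable because they break ties in the heap.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                      # stale heap entry
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, math.inf
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]
```

The intermediate nodes of the returned path identify which connecting lines the track crosses, and therefore which proportionality coefficients the subsequent ant colony optimization has to tune.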

Preferably, based on the ship motion model, a Q-learning algorithm is adopted to perform local risk collision avoidance planning, and a real-time risk collision avoidance strategy is generated, specifically:

initializing a state set, an action set and a reward strategy of the ship; the state set comprises a plurality of states of the ship, wherein each state comprises the relative position information between the ship and the obstacles in its safe area, together with the current course angle of the ship; the action set comprises a plurality of actions available to the ship in each state, and each action comprises translation information and rotation information; the reward strategy comprises the collision conditions and target arrival conditions fed back by interaction with the environment during navigation;

updating the Q value of each action in the current state by adopting the Q-learning algorithm:

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha\big[r_t + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t)\big];$$

$$s_t \leftarrow s_{t+1};$$

wherein $s_t$ is the current state, $a_t$ is the selected action, $s_{t+1}$ is the next state after action $a_t$ is performed, $r_t$ is the reward fed back in the current state, $\alpha$ is the learning rate, $\gamma$ is the discount factor, $Q(s_t, a_t)$ is the Q value of action $a_t$ in state $s_t$, $Q(s_{t+1}, a)$ is the Q value of action $a$ in state $s_{t+1}$, $\max$ denotes taking the maximum value over the actions $a$, and $\leftarrow$ denotes updating;

Correspondingly, the local real-time risk avoidance needs to consider the filtered small-sized obstacles and the avoidance of unidentified obstacles in the navigation water area map, so that a Q-learning algorithm is introduced to the local risk avoidance, and an obstacle avoidance strategy is learned through interaction with the environment, so that the effect of further reducing the difficulty of environment modeling is achieved.

Specifically, a state set, an action set and a reward strategy are initialized. In the prior art, the ship state set is generally established in a geodetic coordinate system, so a result obtained by training in one environment cannot be transferred to a new environment; in addition, a state set based on the global environment grows sharply as the sea area grows, which is unfavorable for storage and convergence. The state set definition proposed in this embodiment differs as follows. To construct the ship state set, a circle is drawn with the position of the ship as its center and a set threshold as its radius; the obstacles within this circle are taken, and the relative position information between each obstacle and the ship, combined with the current course angle of the ship, is recorded as a state, denoted s. The actions comprise advancing and steering, denoted a. The action in each state is initialized, and the value of an action is represented by its Q value: the larger the Q value of an action, the higher the value of that action in that state. In addition, a reward strategy is set as value guidance according to the collision conditions and target arrival conditions fed back by interaction with the environment during navigation.
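One way to turn this ship-centered state definition into a finite, hashable state is to discretize directions into sectors. The sketch below is an assumption about the encoding, which the embodiment does not specify: obstacle bearings (measured in the water area frame relative to the ship position) are binned into `n_sectors` sectors, the course angle is binned into `n_headings` bins, and the function name `encode_state` is hypothetical.

```python
import math


def encode_state(ship_xy, heading, obstacles, radius, n_sectors=8, n_headings=8):
    """Return (occupied sector flags, heading bin) for obstacles within the radius."""
    two_pi = 2.0 * math.pi
    occupied = [0] * n_sectors
    for ox, oy in obstacles:
        dx, dy = ox - ship_xy[0], oy - ship_xy[1]
        if math.hypot(dx, dy) <= radius:
            # Bearing of the obstacle as seen from the ship, in [0, 2*pi).
            bearing = math.atan2(dy, dx) % two_pi
            sector = min(int(bearing / two_pi * n_sectors), n_sectors - 1)
            occupied[sector] = 1
    heading_bin = min(int((heading % two_pi) / two_pi * n_headings), n_headings - 1)
    return tuple(occupied), heading_bin
```

Because only offsets from the ship enter the state, the same obstacle layout produces the same state anywhere on the map, which reflects the portability argument made above, and the number of states depends on the two bin counts rather than on the size of the water area.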

Action selection: an ε-greedy strategy is adopted, in which the action with the highest value is selected with a certain probability ε and an action is selected at random otherwise. The mathematical expression is:

$$a_t = \begin{cases} \arg\max_{a \in A} Q(s_t, a), & p < \varepsilon \\ \text{random } a \in A, & \text{otherwise} \end{cases}$$

wherein $a_t$ is the selected action, $Q(s_t, a)$ is the Q value of action $a$ in state $s_t$, $\arg\max$ denotes the action for which $Q(s_t, a)$ is maximal, $A$ represents the set of selectable actions, $p$ is the selection probability, $\varepsilon$ is the greedy degree, and $\varepsilon \in [0,1]$.
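The selection rule can be written as a short Python function. Note the convention used in this document: ε is the probability of acting greedily, which is the reverse of the common convention in which ε is the exploration probability. The function name `select_action` and the dictionary layout of the Q values are hypothetical.

```python
import random


def select_action(q_row, epsilon, rng=random):
    """Pick an action from q_row, a dict {action: Q value} for the current state.

    With probability epsilon the highest-valued action is chosen; otherwise an
    action is drawn uniformly at random from the selectable set.
    """
    if rng.random() < epsilon:
        return max(q_row, key=q_row.get)
    return rng.choice(list(q_row))
```

Setting `epsilon=1.0` makes the choice purely greedy, which suits deployment of a trained policy, while a smaller value leaves room for exploring actions whose Q values are still poorly estimated.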

Action value update: the Q value of the action is updated by the Q-learning algorithm.

Through iteration, the Q values converge, and the final strategy $\pi(a \mid s) = \arg\max_{a} Q(s, a)$ is output.
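The iteration can be sketched end to end on a toy problem. Everything about the environment here is a stand-in assumption rather than the ship environment of the embodiment: the world is a one-dimensional chain of cells in which cell 0 is a collision (reward -1), cell 5 is the target (reward +1) and the two actions move one cell back or forward. The names `q_update` and `train` and all parameter values are hypothetical.

```python
import random

ACTIONS = (-1, +1)      # toy actions: move one cell back / one cell forward
WALL, GOAL = 0, 5       # hypothetical collision cell and target cell


def q_update(q, s, a, r, s_next, done, alpha, gamma):
    """Apply one Q-learning update and return the absolute change of Q(s, a)."""
    best_next = 0.0 if done else max(q[s_next].values())
    change = alpha * (r + gamma * best_next - q[s][a])
    q[s][a] += change
    return abs(change)


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.8, tol=1e-4, seed=0):
    """Learn Q values by interaction; stop early once updates fall below tol."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(WALL + 1, GOAL)}
    for _ in range(episodes):
        s, done, largest = rng.randint(WALL + 1, GOAL - 1), False, 0.0
        while not done:
            # Greedy with probability epsilon, random otherwise.
            if rng.random() < epsilon:
                a = max(q[s], key=q[s].get)
            else:
                a = rng.choice(ACTIONS)
            s_next = s + a
            done = s_next in (WALL, GOAL)
            r = 1.0 if s_next == GOAL else (-1.0 if s_next == WALL else 0.0)
            largest = max(largest, q_update(q, s, a, r, s_next, done, alpha, gamma))
            s = s_next
        if largest < tol:
            break       # Q values changed by less than the set difference
    return q
```

The final strategy is read off the table as `max(q[s], key=q[s].get)` for each state `s`. In a ship setting the states would come from the ship-centered encoding and the rewards from collision and target-arrival feedback, but the update and stopping logic would keep the same shape.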

Preferably, the vessel autonomous navigation is realized by combining the global reference track and the local risk collision avoidance strategy, and specifically, the method comprises the following steps:

And after the global track planning and the local risk collision avoidance strategy planning are completed, the connection between the model parameters is properly adjusted according to the difference of the actual model by combining the global track planning and the local risk collision avoidance strategy. Based on a ship model, collision-free autonomous navigation from a starting point to a terminal point can be completed under the condition of acquiring priori knowledge of a navigation water area map, and the specific process comprises the following steps:

S41, inputting a starting point and a target point, and starting the algorithm;

S42, starting from the starting point, obtaining the intermediate nodes passed from the starting point to the end point through the global track planning algorithm, setting the intermediate nodes as sub-target points, and keeping their order of passage unchanged;

S43, detecting obstacle information in the navigation area through the ship sensing system; if the current target point is judged to have been reached, proceeding to step S46; if the situation is judged to be dangerous, proceeding to step S44; if the situation is judged to be safe, proceeding to step S45;

S44, adopting the local risk collision avoidance strategy to avoid the current risk, and then returning to step S43 to judge whether the danger has been cleared;

S45, adjusting the course to sail toward the current target point, and then returning to step S43 to judge the danger situation;

S46, if the current target point is the end point, proceeding to step S47; otherwise, updating the current target point to the next target point and returning to step S43;

S47, ending the algorithm and outputting the track sailed by the ship.
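Steps S41 to S47 amount to a small control loop, sketched below under stated assumptions. The function `navigate` and its callable parameters `plan_global`, `sense`, `avoid` and `steer_toward` are hypothetical placeholders for the global planner, the ship sensing system, the learned collision avoidance strategy and the course controller; `sense` is assumed to return one of the strings "reached", "danger" or "safe", and `max_steps` is an added safeguard not mentioned in the text.

```python
def navigate(start, goal, plan_global, sense, avoid, steer_toward, max_steps=10_000):
    """Run steps S41-S47 and return the list of positions the ship visited."""
    # S41/S42: sub-target points in their order of passage, ending at the goal.
    targets = list(plan_global(start, goal)) + [goal]
    position, track = start, [start]
    index = 0
    for _ in range(max_steps):
        status = sense(position, targets[index])              # S43
        if status == "reached":
            if index == len(targets) - 1:                     # S46 -> S47
                return track
            index += 1                                        # S46: next target
            continue
        if status == "danger":
            position = avoid(position)                        # S44
        else:
            position = steer_toward(position, targets[index])  # S45
        track.append(position)
    raise RuntimeError("step budget exhausted before reaching the goal")
```

Passing the planner, sensing and control functions in as arguments keeps the loop independent of any particular ship model or map representation.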

In the ship autonomous navigation model described above, the original map is preprocessed by edge extraction, convex hull extraction and convex polygon fitting, so that large obstacles are simplified and small obstacles are filtered out; a MAKLINK global connection graph is then constructed, which simplifies the water area environment model. For global track planning, the traditional Dijkstra algorithm is combined with the ant colony optimization algorithm so that the global track tends toward the shortest path and path redundancy is reduced, which benefits the transportation efficiency and economy of the ship. Local real-time risk collision avoidance, in turn, must handle the small obstacles filtered out of the original map as well as obstacles not identified on the map; a Q-learning algorithm is therefore introduced for local risk collision avoidance, and the collision avoidance strategy is learned through interaction with the environment, which further reduces the difficulty of environment modeling. By combining global track planning with the local risk collision avoidance strategy, the model achieves collision-free autonomous navigation of the ship from the starting point to the end point, and can therefore be applied to the digital construction of large-scale ports.
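The Dijkstra stage of global track planning can be sketched on a small weighted graph. This is a generic textbook implementation, not the patented procedure; the adjacency-dictionary representation of the MAKLINK network and the function name `dijkstra` are assumptions, and the subsequent ant colony refinement is not shown.

```python
import heapq

def dijkstra(graph, source, target):
    """Return (length, path) of the shortest path in a weighted graph.

    `graph` maps each node to a dict {neighbour: edge_length}.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for nxt, w in graph.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        raise ValueError("target unreachable")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])  # walk predecessors back to the source
    return dist[target], path[::-1]
```

In this setting the nodes would be the starting point, the end point and the MAKLINK line midpoints, and the returned path would serve as the initial track that the ant colony optimization then shortens.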

1. The ship autonomous navigation model based on the ant colony optimization algorithm and Q-learning remedies a defect of the methods in the background art, namely that real-time collision avoidance cannot be achieved for submerged reefs in the water area that are not identified on the map; it requires little prior knowledge of the environment, which reduces the modeling difficulty.

2. Compared with the prior art mentioned in the background art, the ship autonomous navigation model based on the ant colony optimization algorithm and Q-learning does not require the ship to build complex real-time models of obstacles and places low demands on the ship sensing system. Sensing systems with different capabilities can each establish a state set suited to their capabilities and learn from the types of information they acquire, so the model has a degree of universality.

3. When applied to a grid map, the traditional Q-learning algorithm suffers from a large state space, slow convergence, and highly random planned tracks, and using the artificial potential field method mentioned in the background art as the reward strategy for value guidance still does not solve these problems well. The ship autonomous navigation model based on the ant colony optimization algorithm and Q-learning provided by the invention retains the high degree of navigational freedom that the grid method offers for local obstacle avoidance, while ensuring that the global track always tends toward the shortest path, which helps improve the transportation efficiency and economic benefit of ships.

Based on the above reasons, the ship autonomous navigation model based on the ant colony optimization algorithm and Q-learning provided by the invention can provide effective reference for ship autonomous navigation control.

To check the effectiveness of the ship autonomous navigation model based on the ant colony optimization algorithm and Q-learning, a verification experiment is set up in this embodiment to test the performance of the model.

Verification experiment:

The original map is mapped to a sailing water area, with the distance between adjacent pixels taken as the minimum unit and assumed to correspond to 1 km in the water area. The parameter values of the ship model, the global track planning strategy and the local risk collision avoidance strategy are set as shown in Tables 1, 2 and 3, respectively.

Table 1: parameter values of the ship model

Table 2: parameter values of the global track planning strategy

Fig. 5a shows that the total path length has converged to 650 km by the 120th iteration, after which convergence slows. In Fig. 5b, the initial path generated by the Dijkstra algorithm is drawn as a short-dashed line and has a length of 671.36 km; after 500 iterations, the shortest path, drawn as a long-dashed line, has a length of 631.53 km and serves as the reference track.
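As a quick check of what these two reported lengths imply, the shortening achieved over the initial Dijkstra path can be worked out directly; the symbol $\Delta L$ is introduced here only for this calculation.

```latex
\Delta L = 671.36\ \text{km} - 631.53\ \text{km} = 39.83\ \text{km},
\qquad
\frac{\Delta L}{671.36\ \text{km}} \approx 5.9\%
```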

Table 3: parameter values of the local risk collision avoidance strategy

As shown in Fig. 6a, after 100 learning cycles the algorithm has substantially converged and is able to avoid obstacles independently in the local area shown in Fig. 6b.

The ship autonomous navigation model based on the ant colony optimization algorithm and Q-learning provided by the invention was verified by simulation experiments in two different original water areas. In Fig. 7a, the track can be roughly divided into 6 sections. In the 5th section, the ship detects surrounding obstacles and adjusts its track to bypass them; once the danger is cleared, it adjusts its course again to sail toward the current target point, finally achieving collision-free navigation from the starting point to the target point. In Fig. 7b, the positions of the starting point and the target point are changed, and a similar result is obtained.

Example 2

Embodiment 2 of the present invention provides a ship autonomous navigation apparatus, including a processor and a memory, where the memory stores a computer program, and when the computer program is executed by the processor, the ship autonomous navigation apparatus implements the ship autonomous navigation method provided in embodiment 1.

The ship autonomous navigation device provided by this embodiment is used to implement the ship autonomous navigation method; it therefore also has the technical effects of that method, which are not described again here.

Example 3

Embodiment 3 of the present invention provides a computer storage medium having stored thereon a computer program that, when executed by a processor, implements the vessel autonomous navigation method provided in embodiment 1.

The computer storage medium provided by this embodiment is used to implement the ship autonomous navigation method; it therefore also has the technical effects of that method, which are not described again here.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.

Claims

1. A ship autonomous navigation method is characterized by comprising the following steps:

based on the navigation water area model, performing global flight path planning by adopting an ant colony optimization algorithm to generate a global reference flight path;

the global reference track and the local risk collision avoidance strategy are combined to realize autonomous navigation of the ship;

based on the ship motion model, performing local risk collision avoidance planning by adopting a Q-learning algorithm to generate a real-time risk collision avoidance strategy, which specifically comprises the following steps:

by using an ε-greedy strategy, selecting the action with the highest action value from the action set, specifically: action selection adopts the ε-greedy strategy, in which the action with the highest value is selected with a certain probability $\varepsilon$ and an action is otherwise selected at random, the mathematical expression being as follows:

$$a_t = \begin{cases} \arg\max_{a \in A} Q(s_t, a), & p < \varepsilon \\ \text{a random action } a \in A, & \text{otherwise} \end{cases}$$

wherein $a_t$ is the selected action, $Q(s_t, a)$ is the Q value of action $a$ in state $s_t$, $\arg\max$ denotes the action at which $Q(s_t, a)$ takes its maximum value, $A$ represents the set of selectable actions, $p$ is the selection probability, $\varepsilon$ is the greedy degree, and $\varepsilon \in [0, 1]$;

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a \in A} Q(s_{t+1}, a) - Q(s_t, a_t) \right];$$

wherein $s_t$ is the current state, $a_t$ is the selected action, $s_{t+1}$ is the next state after performing action $a_t$, $r_t$ is the reward fed back in the current state, $\alpha$ is the learning rate, $\gamma$ is the discount factor, $Q(s_t, a_t)$ is the Q value of action $a_t$ in state $s_t$, $Q(s_{t+1}, a)$ indicates the Q value of action $a$ in state $s_{t+1}$, $\max$ indicates taking the maximum value, and $\leftarrow$ represents an update;

judging whether the difference between the Q values before and after updating is smaller than a set difference value, if so, outputting the action with the highest Q value as the real-time risk collision avoidance strategy, and otherwise, updating the Q value next time;

the state set is defined as follows: for the construction of the ship state set, a circle is formed with the position of the ship as its center and a set threshold as its radius, and the relative position information between the obstacles in the circle and the ship, combined with the current course angle of the ship, is recorded as a state, denoted $s$; the actions comprise advancing and steering, denoted $a$; the actions in each state are initialized, the value of an action is represented by a Q value, and a larger Q value of an action indicates a higher value of that action in that state.

2. The vessel autonomous navigation method according to claim 1, characterized by constructing a vessel motion model, specifically:

establishing a ship model;

setting a steering threshold value, and restraining the steering of the ship;

3. The vessel autonomous navigation method according to claim 1, wherein the navigation water area model is constructed based on a navigation water area map, specifically:

constructing a MAKLINK global connected graph from the convex hull vertices, and selecting the midpoints of all MAKLINK lines in the MAKLINK global connected graph, together with the starting point and the end point, as network nodes;

4. The ship autonomous navigation method according to claim 1, wherein based on the model of the sailing water area, an ant colony optimization algorithm is used to perform global track planning and generate a global reference track, specifically:

establishing an objective function of node optimization:

；

wherein

as to the length of the flight path,

indicating the first in the track

From node to node

The distance between the individual nodes is such that,

indicating the first track passed

Any point on the connecting line is marked as the second point in the track

The number of the nodes is equal to the number of the nodes,

indicating the first track passed

Any point on the strip connecting line is marked as the first point in the flight path

The number of the nodes is one,

the number of connecting lines through which the flight path passes,

is a coefficient of proportionality that is,

；

the proportionality coefficient being used for regulation

In the first place

Specific positions on the bar connection lines:

；

wherein

、

respectively the first track passed

Two end points of the bar connection line;

5. The ship autonomous navigation method according to claim 4, wherein an ant colony optimization algorithm is used to optimize feasible paths, and a shortest path is obtained as the global reference track, specifically:

；

wherein

indicating the previous connection line

From node to the next connection

The pheromone between the individual nodes is,

is a pair of

The updated pheromone is carried out,

is the initial pheromone and is a new pheromone,

is the volatilization coefficient of the pheromone and is,

；

；

；

wherein

for the length of the selected path or paths,

represents the sum of the pheromones released by all ants between the two points,

representing a round;

6. The vessel autonomous navigation method according to claim 4, wherein an initial track is constructed based on the model of the sailing waters, specifically:

7. The vessel autonomous navigation method according to claim 1, wherein vessel autonomous navigation is realized by combining the global reference track and the local risk collision avoidance strategy, and specifically, the method includes:

8. A ship autonomous navigation apparatus comprising a processor and a memory, the memory having stored thereon a computer program which, when executed by the processor, implements the ship autonomous navigation method according to any one of claims 1 to 7.

9. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the vessel autonomous navigation method according to any of claims 1-7.