CN116578095A - Energy-saving obstacle avoidance method for ocean energy driven robot - Google Patents


Info

Publication number
CN116578095A
Authority
CN
China
Prior art keywords
energy
ocean
obstacle avoidance
robot
driven robot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310681658.2A
Other languages
Chinese (zh)
Inventor
廖煜雷
李可
赵永波
刘骁锋
李晔
史健
王博
张强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University filed Critical Harbin Engineering University
Priority to CN202310681658.2A priority Critical patent/CN116578095A/en
Publication of CN116578095A publication Critical patent/CN116578095A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0231 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
    • G05D1/0246 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means
    • G05D1/0253 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means extracting relative motion information from a plurality of images taken successively, e.g. visual odometry, optical flow
    • G05D1/0212 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
    • G05D1/0214 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory in accordance with safety or protection criteria, e.g. avoiding hazardous areas
    • G05D1/0221 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
    • G05D1/0257 Control of position or course in two dimensions specially adapted to land vehicles using a radar

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Electromagnetism (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The energy-saving obstacle avoidance method for the ocean energy driven robot addresses the problem of how to improve the endurance of the ocean energy driven robot, and belongs to the technical field of marine robots. The invention comprises the following steps: determining the state space S of the ocean energy driven robot in the marine environment, together with an approach target reward function, an obstacle avoidance reward function and an energy-saving reward function; training the ocean energy driven robot obstacle avoidance network with the determined reward functions to obtain a trained obstacle avoidance network; and obtaining the position information of the initial point, the target point and the obstacles of the ocean energy driven robot, then completing the energy-saving obstacle avoidance task with the trained obstacle avoidance network. In the invention, the reward functions are designed according to the energy consumption and ocean energy capture of the ocean energy driven robot and the marine environmental factors, the path planning network of the ocean energy driven robot is trained accordingly, and a safe, feasible and energy-saving obstacle avoidance path is planned.

Description

Energy-saving obstacle avoidance method for ocean energy driven robot
Technical Field
The invention relates to an energy-saving obstacle avoidance method for a marine energy driven robot, and belongs to the technical field of marine robots.
Background
As a new type of marine robot, the ocean energy driven robot can either use ocean energy directly as thrust through its onboard drive equipment, or convert ocean energy into chemical energy stored in a battery through its onboard ocean energy capture equipment. By making effective use of ocean energy, the ocean energy driven robot can operate at sea for long periods and has broad application prospects in both the military and civil fields.
Local collision avoidance is one of the key technologies of the ocean energy driven robot; it guarantees the safety and working efficiency of the ocean energy driven robot when it navigates in an unknown environment. Compared with a conventionally powered marine robot, the ocean energy driven robot can continuously capture ocean energy while under way, so capturing ocean energy effectively and reducing energy consumption during an obstacle avoidance task help to improve its endurance. Existing reinforcement-learning-based obstacle avoidance patents and methods are designed only for safety, feasibility and the shortest distance, and usually ignore the robot's use of the surrounding energy; such an approach is not well suited to an ocean energy driven robot that works over long voyages.
Disclosure of Invention
Aiming at the problem of how to improve the endurance of the ocean energy driven robot, the invention provides an energy-saving obstacle avoidance method of the ocean energy driven robot.
The energy-saving obstacle avoidance method for the ocean energy driven robot of the invention comprises the following steps:
s1, determining a state space S of a marine energy driving robot in a marine environment;
s2, determining an approaching target rewarding function, an obstacle avoidance rewarding function and an energy-saving rewarding function;
s3, training the ocean energy driven robot obstacle avoidance network by using the determined approach target reward function, the obstacle avoidance reward function and the energy-saving reward function to obtain a trained ocean energy driven robot obstacle avoidance network;
the input of the marine energy driving robot obstacle avoidance network is a state space S of the marine energy driving robot in a marine environment, and the output is an optimal strategy, and the input comprises a safe, feasible and energy-saving path from an initial point to a target point of the marine energy driving robot;
s4, acquiring position information of an initial point, a target point and an obstacle of the ocean energy driven robot, and adopting a trained ocean energy driven robot obstacle avoidance network to complete an energy-saving obstacle avoidance task of the ocean energy driven robot.
Preferably, the energy-saving reward function r_energy:
where k is a penalty coefficient, α and β are sea-current resistance fitting coefficients, E(t) is the energy consumed within time Δt, and summing the energy consumption gives the accumulated energy consumption; V_nsv denotes the navigation speed vector of the ocean energy driven robot; u_cur(t) and v_cur(t) are the velocity components of the ocean current in the X-axis and Y-axis directions of the geodetic coordinate system at time t; u_wind(t) and v_wind(t) are the velocity components of the sea wind in the X-axis and Y-axis directions of the geodetic coordinate system at time t; ψ_nsv(t) is the heading angle of the ocean energy driven robot at time t; P_1 denotes the total power of the control system, P_2 the power of the steering engine, P_4 the generation power of the wind turbine, and P_5 the generation power of the photovoltaic panel.
Preferably, the obstacle avoidance reward function r_obs:
d_safe = 0.002893L² + 0.303744L
where d_k(t) = d_Obs(t) - d_Obs(t-1); dis_obs is the obstacle avoidance distance; d_Obs(t) denotes the shortest distance between the ocean energy driven robot and the obstacle at time t; d_max is the maximum distance between the ocean energy driven robot and the obstacle, used for normalization; d_safe is the safe distance of the ocean energy driven robot; and L is the ship length of the ocean energy driven robot.
Preferably, the approach target reward function r_goal:
where dis_gol is the arrival distance; d_max is the maximum distance; (x_start, y_start) denotes the coordinates of the navigation starting point in the X-axis and Y-axis directions of the geodetic coordinate system; x_nsv(t) denotes the position coordinate of the ocean energy driven robot in the X-axis direction of the geodetic coordinate system at time t; and y_nsv(t) denotes the position coordinate of the ocean energy driven robot in the Y-axis direction of the geodetic coordinate system at time t.
Preferably, the state space S is:
S = [x_nsv, y_nsv, ψ_nsv, d_Obs, θ_Obs, u_cur, v_cur, u_wind, v_wind, s_solar]
where ψ_nsv denotes the heading angle of the ocean energy driven robot; (x_nsv, y_nsv) denotes the position of the ocean energy driven robot; d_Obs denotes the shortest distance between the ocean energy driven robot and an obstacle, and θ_Obs the relative azimuth between the ocean energy driven robot and the obstacle; u_cur and v_cur denote the projections of the ocean current on the X and Y axes at the current moment; u_wind and v_wind denote the projections of the sea wind on the X and Y axes at the current moment; and s_solar denotes the illumination intensity.
Preferably, in the step S3, an A3C algorithm is adopted to train the marine energy driven robot obstacle avoidance network.
Preferably, S3 includes:
s31, initializing the structure and parameters of the marine energy driving robot obstacle avoidance network, rasterizing a marine environment map, keeping the starting point and the target point of the marine energy driving robot unchanged in the training process, and randomly generating different state spaces for training the marine energy driving robot obstacle avoidance network in each training round;
s32: according to the current state of the ocean energy driven robot and the approach target, obstacle avoidance and energy-saving reward functions, execute a navigation action to complete the state transition and obtain the immediate reward and the state at the next moment;
s33: if the number of iteration steps reaches the maximum or the ocean energy driven robot is in a termination state, update the public network parameters and share them with the training networks in the other threads; otherwise, go to S32;
s34: judge whether the ocean energy driven robot has reached a termination state; if not, go to S32; otherwise, increase the round count by one and go to S31 for the next round of network training;
s35: when the number of training rounds of the ocean energy driven robot obstacle avoidance network reaches the maximum, save the ocean energy driven robot obstacle avoidance network.
The method has the beneficial effects that the reward functions are designed according to the energy consumption and ocean energy capture of the ocean energy driven robot and the marine environmental factors, the path planning network of the ocean energy driven robot is trained accordingly, and a safe, feasible and energy-saving obstacle avoidance path is planned.
Drawings
FIG. 1 is a diagram of an A3C algorithm training framework;
FIG. 2 is a state diagram of a marine energy driven robot sailing;
FIG. 3 is a schematic diagram of a marine energy driven robot obstacle avoidance network;
FIG. 4 is a flowchart of an energy-saving A3C obstacle avoidance algorithm of the ocean energy driven robot;
FIG. 5 is a cumulative reward diagram of the training process;
FIG. 6 is an energy-saving obstacle avoidance planning result;
FIG. 7 is a graph of the number of iterative steps of the energy-saving A3C obstacle avoidance algorithm.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
The invention is further described below with reference to the drawings and specific examples, which are not intended to be limiting.
The ocean energy driven robot energy saving and obstacle avoiding method of the embodiment comprises the following steps:
step 1, as shown in fig. 2, determining a state space S of the ocean energy driven robot in an ocean environment:
S = [x_nsv, y_nsv, ψ_nsv, d_Obs, θ_Obs, u_cur, v_cur, u_wind, v_wind, s_solar]
where ψ_nsv denotes the heading angle of the ocean energy driven robot; (x_nsv, y_nsv) denotes the position of the ocean energy driven robot; d_Obs denotes the shortest distance between the ocean energy driven robot and an obstacle, and θ_Obs the relative azimuth between the ocean energy driven robot and the obstacle; u_cur and v_cur denote the projections of the ocean current on the X and Y axes at the current moment; u_wind and v_wind denote the projections of the sea wind on the X and Y axes at the current moment; and s_solar denotes the illumination intensity.
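For illustration only, the 10-dimensional state vector can be assembled as in the following Python sketch; the build_state helper and the numerical values in the example are hypothetical and are not part of the invention.

    import numpy as np

    def build_state(x_nsv, y_nsv, psi_nsv, d_obs, theta_obs,
                    u_cur, v_cur, u_wind, v_wind, s_solar):
        """Assemble the state S = [x_nsv, y_nsv, psi_nsv, d_Obs, theta_Obs,
        u_cur, v_cur, u_wind, v_wind, s_solar] used as network input."""
        return np.array([x_nsv, y_nsv, psi_nsv, d_obs, theta_obs,
                         u_cur, v_cur, u_wind, v_wind, s_solar],
                        dtype=np.float32)

    # Hypothetical example: robot at (120 m, 45 m), heading 0.3 rad, nearest
    # obstacle 80 m away at a relative bearing of -0.5 rad, current (0.4, 0.1) m/s,
    # wind (3.0, -1.2) m/s, illumination 650 W/m^2.
    s = build_state(120.0, 45.0, 0.3, 80.0, -0.5, 0.4, 0.1, 3.0, -1.2, 650.0)
    print(s.shape)  # (10,)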
Step 2, determining an approaching target rewarding function, an obstacle avoidance rewarding function and an energy-saving rewarding function;
step 3, training the ocean energy driven robot obstacle avoidance network by using the determined approach target reward function, the obstacle avoidance reward function and the energy-saving reward function to obtain a trained ocean energy driven robot obstacle avoidance network;
the input of the marine energy driving robot obstacle avoidance network is a state space S of the marine energy driving robot in a marine environment, and the output is an optimal strategy, and the input comprises a safe, feasible and energy-saving path from an initial point to a target point of the marine energy driving robot;
and 4, acquiring position information of an initial point, a target point and an obstacle of the ocean energy driven robot, and completing an energy-saving obstacle avoidance task of the ocean energy driven robot by adopting a trained ocean energy driven robot obstacle avoidance network.
By taking ocean energy capture into consideration, the invention solves the problem of energy-saving obstacle avoidance for the ocean energy driven robot in an unknown environment. Obstacle information is obtained through the onboard vision module; the position, speed and heading of the ocean energy driven robot are obtained through the integrated navigation module; and information such as the speed and direction of the ocean current, the speed and direction of the wind, and the illumination intensity is obtained through sensor modules such as the current meter, the weather station and the solar radiation sensor. From this information a safe, feasible and energy-saving path is computed for the ocean energy driven robot to avoid unknown obstacles, so that the marine environmental information is used effectively and the endurance of the ocean energy driven robot is improved.
The energy-saving reward function r_energy of this embodiment:
where k is a penalty coefficient, α and β are sea-current resistance fitting coefficients, E(t) is the energy consumed within time Δt, and summing the energy consumption gives the accumulated energy consumption; V_nsv denotes the navigation speed vector of the ocean energy driven robot; u_cur(t) and v_cur(t) are the velocity components of the ocean current in the X-axis and Y-axis directions of the geodetic coordinate system at time t; u_wind(t) and v_wind(t) are the velocity components of the sea wind in the X-axis and Y-axis directions of the geodetic coordinate system at time t; ψ_nsv(t) is the heading angle of the ocean energy driven robot at time t; P_1 denotes the total power of the control system, P_2 the power of the steering engine, P_4 the generation power of the wind turbine, and P_5 the generation power of the photovoltaic panel.
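The following Python sketch is not the patented r_energy formula; it only illustrates, under assumed placeholder coefficients, one way of penalizing consumed energy and rewarding captured wind and solar energy using the quantities k, α, β, E(t) and P_1 to P_5 defined above.

    def r_energy_sketch(E_consumed, P1, P2, P4, P5, dt,
                        k=0.01, alpha=1.0, beta=0.5):
        """Illustrative energy-saving reward; NOT the patented formula.
        E_consumed: propulsion energy consumed during dt (J); P1, P2: control
        system and steering engine power (W); P4, P5: wind turbine and
        photovoltaic generation power (W); k, alpha, beta: assumed coefficients."""
        consumed = E_consumed + (P1 + P2) * dt   # energy spent in this step
        captured = (P4 + P5) * dt                # energy harvested in this step
        # Penalize net consumption and reward net capture.
        return -k * (alpha * consumed - beta * captured)

    # Hypothetical example: 500 J of propulsion energy over 10 s, 15 W control
    # system, 40 W steering engine, 60 W wind turbine, 30 W photovoltaic panel.
    print(r_energy_sketch(500.0, 15.0, 40.0, 60.0, 30.0, 10.0))  # -6.0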
The obstacle avoidance reward function r_obs of this embodiment:
d_safe = 0.002893L² + 0.303744L
where d_k(t) = d_Obs(t) - d_Obs(t-1); dis_obs is the obstacle avoidance distance, and when d_Obs(t) <= dis_obs the ocean energy driven robot is considered to have collided with the obstacle; d_Obs(t) denotes the shortest distance between the ocean energy driven robot and the obstacle at time t; d_max is the maximum distance between the ocean energy driven robot and the obstacle, used for normalization; d_safe is the safe distance of the ocean energy driven robot; and L is the ship length of the ocean energy driven robot.
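Only the d_safe formula and the collision condition d_Obs(t) <= dis_obs follow the text above; the shaping term based on d_k(t) in the sketch below is an assumption rather than the patented r_obs.

    def safe_distance(L):
        """d_safe = 0.002893*L^2 + 0.303744*L, with L the ship length in metres."""
        return 0.002893 * L ** 2 + 0.303744 * L

    def r_obs_sketch(d_obs_t, d_obs_prev, L, dis_obs=2.0, d_max=200.0):
        """Illustrative obstacle avoidance reward; only the collision test and
        d_safe follow the text, the shaping term is assumed."""
        if d_obs_t <= dis_obs:            # collision with the obstacle
            return -10.0
        d_k = d_obs_t - d_obs_prev        # d_k(t) = d_Obs(t) - d_Obs(t-1)
        if d_obs_t < safe_distance(L):
            return d_k / d_max            # inside the safety zone: reward opening the range
        return 0.0                        # far from obstacles

    # Hypothetical example: 5 m vessel, obstacle now 12 m away, previously 15 m.
    print(round(safe_distance(5.0), 3))   # 1.591
    print(r_obs_sketch(12.0, 15.0, 5.0))  # 0.0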
The approach target reward function r_goal of this embodiment:
where dis_gol is the arrival distance, and when d(t) <= dis_gol the ocean energy driven robot is considered to have reached the target point; d_max is the maximum distance; (x_start, y_start) denotes the coordinates of the navigation starting point in the X-axis and Y-axis directions of the geodetic coordinate system; x_nsv(t) denotes the position coordinate of the ocean energy driven robot in the X-axis direction of the geodetic coordinate system at time t; and y_nsv(t) denotes the position coordinate of the ocean energy driven robot in the Y-axis direction of the geodetic coordinate system at time t.
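Similarly, only the arrival condition d(t) <= dis_gol is explicit; the progress-based shaping normalized by d_max in the sketch below is an assumed illustration of the approach target reward r_goal.

    import math

    def r_goal_sketch(x_t, y_t, x_prev, y_prev, goal, dis_gol=2.0, d_max=500.0):
        """Illustrative approach target reward; the arrival test follows the
        text, the progress shaping is assumed."""
        gx, gy = goal
        d_t = math.hypot(gx - x_t, gy - y_t)        # distance to the target now
        if d_t <= dis_gol:                          # target point reached
            return 10.0
        d_prev = math.hypot(gx - x_prev, gy - y_prev)
        return (d_prev - d_t) / d_max               # reward for closing the distance

    # Hypothetical example: robot moved from (0, 0) to (3, 4), target at (30, 40).
    print(r_goal_sketch(3.0, 4.0, 0.0, 0.0, (30.0, 40.0)))  # 0.01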
In step 3 of the embodiment, an A3C algorithm is adopted to train the obstacle avoidance network of the ocean energy driven robot.
The A3C training process comprises several threads that train networks synchronously and a public network that is not trained directly; the training networks in the threads have the same structure as the public network. The parameters obtained by the parallel training networks are pushed to the public network, and the public network shares the updated parameters with the training networks in the other threads. In this embodiment, as shown in fig. 1, A3C creates several parallel environments (environment 1, environment 2, environment 3 and environment 4), so that several ocean energy driven robots, each carrying a training network, can update the parameters of the public network simultaneously in these parallel environments. A3C alleviates the non-convergence problem of the actor-critic (AC) method: the parallel ocean energy driven robots do not interfere with each other, and because the updates submitted by the worker structures arrive asynchronously, the correlation between successive updates of the public network is reduced and convergence is improved. As shown in fig. 4, step 3 of this embodiment includes the following sub-steps; a schematic worker loop is sketched after step 35.
step 31, initializing the structure and parameters of the marine energy driving robot obstacle avoidance network, rasterizing a marine environment map, keeping the starting point and the target point of the marine energy driving robot unchanged in the training process, and randomly generating different state spaces for training the marine energy driving robot obstacle avoidance network in each training round; in the training process, the ocean energy driven robot position is updated as follows:
where x_nsv(t) denotes the position coordinate of the ocean energy driven robot in the X-axis direction of the geodetic coordinate system at time t, y_nsv(t) denotes the position coordinate of the ocean energy driven robot in the Y-axis direction of the geodetic coordinate system at time t, ψ_nsv(t) is the heading angle of the ocean energy driven robot at time t, V_nsv denotes the navigation speed vector of the ocean energy driven robot, and a denotes the discretized action applied to ψ_nsv(t).
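Because the position update equations are not reproduced here, the following sketch assumes a standard unicycle-style update in which the discretized action a changes the heading ψ_nsv and the robot then advances at speed V_nsv, drifted by the current; it is an illustration, not the exact patented kinematics.

    import math

    def step_position(x, y, psi, a, V_nsv, dt, u_cur=0.0, v_cur=0.0):
        """Assumed kinematic position update (not the exact patented equations):
        the action a changes the heading, then the robot advances at V_nsv and
        is drifted by the current components (u_cur, v_cur)."""
        psi_next = psi + a                                   # apply the discretized heading action
        x_next = x + (V_nsv * math.cos(psi_next) + u_cur) * dt
        y_next = y + (V_nsv * math.sin(psi_next) + v_cur) * dt
        return x_next, y_next, psi_next

    # Hypothetical example: 1.5 m/s robot, 10 s step, +15 degree heading action,
    # 0.2 m/s current along the X axis.
    print(step_position(0.0, 0.0, 0.0, math.radians(15), 1.5, 10.0, 0.2, 0.0))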
Step 32: as shown in fig. 3, according to the current state of the ocean energy driven robot, according to the approaching target point rewarding function and the obstacle avoidance rewarding function, the energy-saving rewarding function executes navigation action to complete state transition, and instant rewarding and the next time state are obtained;
step 33: if the iteration step number reaches the maximum step number or the ocean energy driven robot is in a termination state, updating public network parameters and sharing the public network parameters to training networks in other threads; otherwise go to step 32;
step 34: judging whether the ocean energy driven robot reaches a termination state, and switching to step 32 if the ocean energy driven robot does not reach the termination state; otherwise, the number of rounds is increased by one, the next round of network training is carried out, and the step 31 is carried out;
step 35: the training rounds of the marine energy driving robot obstacle avoidance network reach the maximum rounds, and the marine energy driving robot obstacle avoidance network is saved.
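A minimal, structure-only sketch of the worker loop of steps 31 to 35 is given below; DummyEnv and DummyNet are invented stand-ins with no real neural network or gradient sharing, and the sketch only mirrors the control flow described above (act and collect the reward, synchronize with the public network every MAX_STEPS steps or on termination, and count rounds up to MAX_ROUNDS).

    import random

    MAX_STEPS, MAX_ROUNDS = 200, 50        # placeholder limits

    class DummyEnv:
        """Stand-in environment: random rewards and random termination."""
        def reset(self):
            return [0.0] * 10              # 10-dimensional state S
        def step(self, action):
            reward = random.uniform(-1.0, 1.0)       # r_goal + r_obs + r_energy
            done = random.random() < 0.02            # goal reached or collision
            return [random.random() for _ in range(10)], reward, done

    class DummyNet:
        """Stand-in for the public network and the worker training networks."""
        def act(self, state):
            return random.choice([-1, 0, 1])         # discretized heading action a
        def push_to_public(self, public):
            pass                                     # gradients would be pushed here
        def pull_from_public(self, public):
            pass                                     # updated parameters pulled back

    def worker(public_net):
        env, net = DummyEnv(), DummyNet()            # step 31: initialize
        for episode in range(MAX_ROUNDS):            # steps 34-35: round counting
            state, done, steps = env.reset(), False, 0
            while not done:
                action = net.act(state)              # step 32: act in the environment
                state, reward, done = env.step(action)   # reward would feed the advantage estimate
                steps += 1
                if steps % MAX_STEPS == 0 or done:   # step 33: sync with the public network
                    net.push_to_public(public_net)
                    net.pull_from_public(public_net)
        return public_net                            # step 35: save the trained network

    worker(DummyNet())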
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims. It should be understood that the different dependent claims and the features described herein may be combined in ways other than as described in the original claims. It is also to be understood that features described in connection with separate embodiments may be used in other described embodiments.

Claims (10)

1. The energy-saving obstacle avoidance method for the ocean energy driven robot is characterized by comprising the following steps of:
s1, determining a state space S of a marine energy driving robot in a marine environment;
s2, determining an approaching target rewarding function, an obstacle avoidance rewarding function and an energy-saving rewarding function;
s3, training the ocean energy driven robot obstacle avoidance network by using the determined approach target reward function, the obstacle avoidance reward function and the energy-saving reward function to obtain a trained ocean energy driven robot obstacle avoidance network;
the input of the marine energy driving robot obstacle avoidance network is a state space S of the marine energy driving robot in a marine environment, and the output is an optimal strategy, and the input comprises a safe, feasible and energy-saving path from an initial point to a target point of the marine energy driving robot;
s4, acquiring position information of an initial point, a target point and an obstacle of the ocean energy driven robot, and adopting a trained ocean energy driven robot obstacle avoidance network to complete an energy-saving obstacle avoidance task of the ocean energy driven robot.
2. The energy-saving obstacle avoidance method for the ocean energy driven robot of claim 1, wherein the energy-saving reward function r_energy:
where k is a penalty coefficient, α and β are sea-current resistance fitting coefficients, E(t) is the energy consumed within time Δt, and summing the energy consumption gives the accumulated energy consumption; V_nsv denotes the navigation speed vector of the ocean energy driven robot; u_cur(t) and v_cur(t) are the velocity components of the ocean current in the X-axis and Y-axis directions of the geodetic coordinate system at time t; u_wind(t) and v_wind(t) are the velocity components of the sea wind in the X-axis and Y-axis directions of the geodetic coordinate system at time t; ψ_nsv(t) is the heading angle of the ocean energy driven robot at time t; P_1 denotes the total power of the control system, P_2 the power of the steering engine, P_4 the generation power of the wind turbine, and P_5 the generation power of the photovoltaic panel.
3. The energy-saving obstacle avoidance method for the ocean energy driven robot of claim 1, wherein the obstacle avoidance reward function r_obs:
d_safe = 0.002893L² + 0.303744L
where d_k(t) = d_Obs(t) - d_Obs(t-1); dis_obs is the obstacle avoidance distance; d_Obs(t) denotes the shortest distance between the ocean energy driven robot and the obstacle at time t; d_max is the maximum distance between the ocean energy driven robot and the obstacle, used for normalization; d_safe is the safe distance of the ocean energy driven robot; and L is the ship length of the ocean energy driven robot.
4. The energy-saving obstacle avoidance method for the ocean energy driven robot of claim 1, wherein the approach target reward function r_goal:
where dis_gol is the arrival distance; d_max is the maximum distance; (x_start, y_start) denotes the coordinates of the navigation starting point in the X-axis and Y-axis directions of the geodetic coordinate system; x_nsv(t) denotes the position coordinate of the ocean energy driven robot in the X-axis direction of the geodetic coordinate system at time t; and y_nsv(t) denotes the position coordinate of the ocean energy driven robot in the Y-axis direction of the geodetic coordinate system at time t.
5. The energy-saving obstacle avoidance method of marine energy driven robot of claim 1, wherein,
the state space S is:
S = [x_nsv, y_nsv, ψ_nsv, d_Obs, θ_Obs, u_cur, v_cur, u_wind, v_wind, s_solar]
where ψ_nsv denotes the heading angle of the ocean energy driven robot; (x_nsv, y_nsv) denotes the position of the ocean energy driven robot; d_Obs denotes the shortest distance between the ocean energy driven robot and an obstacle, and θ_Obs the relative azimuth between the ocean energy driven robot and the obstacle; u_cur and v_cur denote the projections of the ocean current on the X and Y axes at the current moment; u_wind and v_wind denote the projections of the sea wind on the X and Y axes at the current moment; and s_solar denotes the illumination intensity.
6. The energy-saving obstacle avoidance method of ocean-energy-driven robots of claim 1 wherein the A3C algorithm is used in S3 to train the ocean-energy-driven robot obstacle avoidance network.
7. The energy-saving obstacle avoidance method of marine energy-driven robots of claim 6 wherein S3 comprises:
s31, initializing the structure and parameters of the marine energy driving robot obstacle avoidance network, rasterizing a marine environment map, keeping the starting point and the target point of the marine energy driving robot unchanged in the training process, and randomly generating different state spaces for training the marine energy driving robot obstacle avoidance network in each training round;
s32: according to the current state of the ocean energy driven robot and the approach target, obstacle avoidance and energy-saving reward functions, execute a navigation action to complete the state transition and obtain the immediate reward and the state at the next moment;
s33: if the number of iteration steps reaches the maximum or the ocean energy driven robot is in a termination state, update the public network parameters and share them with the training networks in the other threads; otherwise, go to S32;
s34: judge whether the ocean energy driven robot has reached a termination state; if not, go to S32; otherwise, increase the round count by one and go to S31 for the next round of network training;
s35: when the number of training rounds of the ocean energy driven robot obstacle avoidance network reaches the maximum, save the ocean energy driven robot obstacle avoidance network.
8. The method of energy saving and obstacle avoidance for a marine-energy driven robot of claim 6 wherein during training, the position of the marine-energy driven robot is updated as:
where x_nsv(t) denotes the position coordinate of the ocean energy driven robot in the X-axis direction of the geodetic coordinate system at time t, y_nsv(t) denotes the position coordinate of the ocean energy driven robot in the Y-axis direction of the geodetic coordinate system at time t, ψ_nsv(t) is the heading angle of the ocean energy driven robot at time t, V_nsv denotes the navigation speed vector of the ocean energy driven robot, and a denotes the discretized action applied to ψ_nsv(t).
9. A computer readable storage device storing a computer program, characterized in that the computer program when executed implements the marine energy driven robot energy saving obstacle avoidance method according to any of claims 1 to 8.
10. An energy-saving obstacle avoidance apparatus for a marine-energy-driven robot comprising a storage device, a processor and a computer program stored in the storage device and executable on the processor, wherein execution of the computer program by the processor implements the energy-saving obstacle avoidance method for a marine-energy-driven robot according to any one of claims 1 to 8.
CN202310681658.2A 2023-06-09 2023-06-09 Energy-saving obstacle avoidance method for ocean energy driven robot Pending CN116578095A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310681658.2A CN116578095A (en) 2023-06-09 2023-06-09 Energy-saving obstacle avoidance method for ocean energy driven robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310681658.2A CN116578095A (en) 2023-06-09 2023-06-09 Energy-saving obstacle avoidance method for ocean energy driven robot

Publications (1)

Publication Number Publication Date
CN116578095A true CN116578095A (en) 2023-08-11

Family

ID=87539714

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310681658.2A Pending CN116578095A (en) 2023-06-09 2023-06-09 Energy-saving obstacle avoidance method for ocean energy driven robot

Country Status (1)

Country Link
CN (1) CN116578095A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117148842A (en) * 2023-09-21 2023-12-01 哈尔滨工程大学 Wind energy driven robot time-saving global path planning method based on environment forecast
CN117666593A (en) * 2024-02-01 2024-03-08 厦门蓝旭科技有限公司 Walking control optimization method for photovoltaic cleaning robot

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109682392A (en) * 2018-12-28 2019-04-26 山东大学 Vision navigation method and system based on deeply study
CN111880535A (en) * 2020-07-23 2020-11-03 上海交通大学 Unmanned ship hybrid sensing autonomous obstacle avoidance method and system based on reinforcement learning
CN112015174A (en) * 2020-07-10 2020-12-01 歌尔股份有限公司 Multi-AGV motion planning method, device and system
CN112241176A (en) * 2020-10-16 2021-01-19 哈尔滨工程大学 Path planning and obstacle avoidance control method of underwater autonomous vehicle in large-scale continuous obstacle environment
CN115790608A (en) * 2023-01-31 2023-03-14 天津大学 AUV path planning algorithm and device based on reinforcement learning

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109682392A (en) * 2018-12-28 2019-04-26 山东大学 Vision navigation method and system based on deeply study
CN112015174A (en) * 2020-07-10 2020-12-01 歌尔股份有限公司 Multi-AGV motion planning method, device and system
CN111880535A (en) * 2020-07-23 2020-11-03 上海交通大学 Unmanned ship hybrid sensing autonomous obstacle avoidance method and system based on reinforcement learning
CN112241176A (en) * 2020-10-16 2021-01-19 哈尔滨工程大学 Path planning and obstacle avoidance control method of underwater autonomous vehicle in large-scale continuous obstacle environment
CN115790608A (en) * 2023-01-31 2023-03-14 天津大学 AUV path planning algorithm and device based on reinforcement learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李可 (Li Ke): "Research on energy-optimal path planning methods for natural-energy-driven unmanned surface vehicles" (自然能驱动无人艇的能源最优路径规划方法研究), Master's thesis, Harbin Engineering University, 5 May 2023 (2023-05-05), pages 74-80 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117148842A (en) * 2023-09-21 2023-12-01 哈尔滨工程大学 Wind energy driven robot time-saving global path planning method based on environment forecast
CN117666593A (en) * 2024-02-01 2024-03-08 厦门蓝旭科技有限公司 Walking control optimization method for photovoltaic cleaning robot
CN117666593B (en) * 2024-02-01 2024-04-09 厦门蓝旭科技有限公司 Walking control optimization method for photovoltaic cleaning robot

Similar Documents

Publication Publication Date Title
CN116578095A (en) Energy-saving obstacle avoidance method for ocean energy driven robot
CN111780777B (en) Unmanned vehicle route planning method based on improved A-star algorithm and deep reinforcement learning
CN108388250B (en) Water surface unmanned ship path planning method based on self-adaptive cuckoo search algorithm
CN110262492A (en) A kind of Realtime collision free and method for tracking target of unmanned boat
CN113485371B (en) Underwater multi-AUV path planning method based on improved sparrow search algorithm
CN112925315A (en) Crawler path planning method based on improved ant colony algorithm and A-star algorithm
US20220214688A1 (en) Method and system for controlling multi-unmanned surface vessel collaborative search
CN109613921A (en) Based on the unmanned ship local paths planning method for fast moving glowworm swarm algorithm
CN109857117B (en) Unmanned ship cluster formation method based on distributed pattern matching
CN113110464A (en) Intelligent full-electric ship path planning method capable of reducing energy consumption
CN116448115B (en) Unmanned ship probability distance map construction method based on navigation radar and photoelectricity
CN115167398A (en) Unmanned ship path planning method based on improved A star algorithm
CN110333723B (en) Unmanned ship collaborative formation method based on dual communication equipment
CN113325852B (en) Leader follower mode-based multi-agent formation change control method in advancing process
Gao et al. Constrained path-planning control of unmanned surface vessels via ant-colony optimization
Gao et al. An optimized path planning method for container ships in Bohai bay based on improved deep Q-learning
Guan et al. Robot dynamic path planning based on improved A* and DWA algorithms
CN114943168B (en) Method and system for combining floating bridges on water
Chen et al. Dynamic path planning of USV with towed safety boundary in complex ocean environment
Liu et al. UUV path planning method based on QPSO
Ren et al. Intelligent path planning and obstacle avoidance algorithms for autonomous vehicles based on enhanced rrt algorithm
CN114879694A (en) Unmanned ship automatic collision avoidance method based on probability game theory framework
Zhang et al. Local path planning of unmanned underwater vehicle based on improved APF and rolling window method
Kong et al. Multi-uavs targets search based on distributed cooperative heuristic algorithm
CN113655786B (en) Unmanned boat group control method based on African bee intelligent algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination