CN114089762A - Water-air amphibious unmanned aircraft path planning method based on reinforcement learning

Water-air amphibious unmanned aircraft path planning method based on reinforcement learning

Info

Publication number
CN114089762A
Authority
CN
China
Prior art keywords
unmanned aircraft
amphibious unmanned
path planning
point
grid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111381994.2A
Other languages
Chinese (zh)
Other versions
CN114089762B (en)
Inventor
杨晓飞
史逸伦
叶辉
杜昭平
佘宏伟
严鑫
刘伟
冯北镇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University of Science and Technology
Original Assignee
Jiangsu University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University of Science and Technology filed Critical Jiangsu University of Science and Technology
Priority to CN202111381994.2A priority Critical patent/CN114089762B/en
Publication of CN114089762A publication Critical patent/CN114089762A/en
Application granted granted Critical
Publication of CN114089762B publication Critical patent/CN114089762B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02Control of position or course in two dimensions
    • G05D1/0206Control of position or course in two dimensions specially adapted to water vehicles
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/10Simultaneous control of position or course in three dimensions
    • G05D1/101Simultaneous control of position or course in three dimensions specially adapted for aircraft

Landscapes

  • Engineering & Computer Science (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a reinforcement-learning-based path planning method for a water-air amphibious unmanned aircraft. The method comprises the following steps: S1, selecting a region S in which the amphibious unmanned aircraft executes a path planning task, and extracting the electronic chart data corresponding to region S to perform three-dimensional environment modeling; S2, constructing a Markov Decision Process (MDP) for amphibious unmanned aircraft path planning; and S3, given a starting point and a target point, completing global path planning for the different working scenes of the amphibious unmanned aircraft based on a Deep Q Network (DQN) algorithm, according to the MDP for amphibious unmanned aircraft path planning. Compared with existing environment modeling methods for amphibious unmanned aircraft path planning, the planning range of the method is extended to tens of kilometers, the motion characteristics of the amphibious unmanned aircraft are effectively considered, and, combined with the DQN algorithm, an optimal path that fits the working scene of the amphibious unmanned aircraft can be found faster and more effectively.

Description

Water-air amphibious unmanned aircraft path planning method based on reinforcement learning
Technical Field
The invention belongs to the technical field of autonomous path planning, and particularly relates to an intelligent path planning method for a water-air amphibious unmanned aircraft.
Background
A water-air amphibious unmanned aircraft can both navigate on water and fly in the air; compared with an ordinary unmanned ship, it reaches a task point faster and has a wider search field of view. It can effectively remedy the slow response, high cost and low sortie frequency of conventional water search and rescue, which relies on rescuers driving patrol boats to the scene of the incident. Path planning is one of the key technologies for making the amphibious unmanned aircraft autonomous. The performance of the path planning module directly determines the quality of the selected route and the smoothness of travel, and whether the aircraft can meet indexes such as minimum energy consumption and fastest arrival while executing a task.
Patent CN109871022A introduces an intelligent path planning and local obstacle avoidance method for an amphibious unmanned aircraft: a three-dimensional grid map is built from working environment information acquired in real time, and an improved A* algorithm performs the path planning. Patent CN112698646A introduces an aircraft path planning method based on reinforcement learning, which constructs a virtual force field from the obstacle information accessed in an electronic chart and sets a virtual-force-field reward function for path planning.
Because they construct the three-dimensional grid map in real time, existing path planning methods for the amphibious unmanned aircraft apply only to local planning within a few tens of meters around the vehicle; the working radius of the amphibious unmanned aircraft, however, can reach tens of kilometers, which such methods cannot cover. Traditional search methods such as A* also cannot exploit the cross-medium motion of the amphibious unmanned aircraft to find the optimal path. Existing reinforcement-learning path planning for the amphibious unmanned aircraft generally plans on a self-built grid environment model, whose disadvantages are a large algorithm search space and a mismatch with the real environment, so it cannot be applied to actual planning tasks; and where the environment model is built from an actual map such as an electronic chart, the chart is not digitally modeled, which harms the training efficiency of the reinforcement-learning path planning algorithm.
Disclosure of Invention
Purpose of the invention: existing path planning methods for the amphibious unmanned aircraft cannot cope with planning tasks that match the aircraft's large working radius, do not effectively consider the aircraft's motion characteristics and so cannot find the optimal path, and, when path planning uses an electronic chart without digitally modeling it, the training efficiency of the reinforcement learning algorithm suffers. To overcome these defects, the invention provides a reinforcement-learning-based path planning method for a water-air amphibious unmanned aircraft.
The method extracts the data of an electronic chart in S-57 format and, combined with actual digital elevation data, establishes an electronic-chart-based environment model for amphibious unmanned aircraft path planning. A reward function is established based on the risk of the aircraft colliding with obstacles and several other rules. Repeated training is then performed following the Deep Q Network (DQN) algorithm. After sufficient training, a path planning agent is obtained that can find a meaningful and reasonable path for the different working scenes of the aircraft.
The technical scheme is as follows: in order to achieve the purpose, the invention adopts the following specific technical scheme:
A water-air amphibious unmanned aircraft path planning method based on reinforcement learning comprises the following steps:
S1, selecting the region S in which the amphibious unmanned aircraft executes the path planning task, and extracting the electronic chart data corresponding to region S to perform three-dimensional environment modeling;
S2, constructing a Markov Decision Process (MDP) for amphibious unmanned aircraft path planning;
S3, given a starting point and a target point, completing global path planning for the different working scenes of the amphibious unmanned aircraft based on a Deep Q Network (DQN) algorithm, according to the MDP for amphibious unmanned aircraft path planning;
In a further improvement of the present invention, step S1 specifically includes:
and S101, selecting a region S in which the amphibious unmanned aircraft needs to execute a task, namely the longitude and latitude range of the task execution region.
S102, according to region S, i.e. according to its longitude and latitude, selecting the S-57 format electronic chart file of the corresponding area, and extracting the electronic chart data required for path planning with S as the extraction range.
S103, extracting the data of the corresponding region S from the electronic chart: selecting the object types needed for path planning, usually land, reefs and the like; obtaining the layer numbers corresponding to these object types by referring to the official electronic chart document IHO S-57 (ENC); and extracting the required chart data with the geospatial vector data open source library (OGR);
S104, extracting the required electronic chart data through the geospatial vector data open source library (OGR): the S-57 format chart file is opened through a function of OGRSFDriver in OGR, and the S57reader is called to read the chart data layer by layer according to the layer numbers of the object types to be extracted.
S105, storing the land, submerged reef and other chart data read layer by layer into an extensible markup language (xml) file, organized as layer (Layer), element (feature), field (field) and geometric object (geometry); the geometric object (geometry) holds the geometric attribute of the element, indicating that the element's type is one of a point (point), a line (line) or a polygon (polygon).
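As an illustration of steps S102 to S105, a minimal sketch using the Python bindings of GDAL/OGR follows. The file name EA200001.000 and the S-57 object class names LNDARE (land area), UWTROC (underwater rock) and DEPCNT (depth contour) are assumptions for this example; the method itself addresses the layers by number.

    # Minimal sketch: read an S-57 electronic chart layer by layer with OGR.
    # File name and layer names are assumed for illustration only.
    from osgeo import ogr

    ds = ogr.Open("EA200001.000")                # GDAL's S57 driver reads .000 cells
    for name in ("LNDARE", "UWTROC", "DEPCNT"):  # land, underwater rock, depth contour
        layer = ds.GetLayerByName(name)
        if layer is None:
            continue
        for feature in layer:
            geom = feature.GetGeometryRef()
            # each geometry is a point, line or polygon, matching the
            # layer/element/field/geometry hierarchy stored in the xml file
            print(name, geom.GetGeometryName())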
S106, performing three-dimensional environment modeling on the data of region S: after the electronic chart data has been extracted and stored as an extensible markup language (xml) file, establish a grid matrix with (elong-wlong)/squaresize+1 columns and (hlati-lati)/squaresize+1 rows according to the longitude range (wlong, elong) and latitude range (hlati, lati) of region S and the grid size squaresize. The longitude and latitude of the center point of row i, column j of the grid matrix can be represented as:
centerpointlon = wlong + j*squaresize, centerpointlat = hlati - i*squaresize (1)
The latitude and longitude of the four vertices a, b, c, d of this grid can be expressed as:
a = (centerpointlon - 0.5*squaresize, centerpointlat + 0.5*squaresize)
b = (centerpointlon + 0.5*squaresize, centerpointlat + 0.5*squaresize)
c = (centerpointlon + 0.5*squaresize, centerpointlat - 0.5*squaresize)
d = (centerpointlon - 0.5*squaresize, centerpointlat - 0.5*squaresize) (2)
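A short sketch of the S106 grid construction under the embodiment's bounds and grid size; the indexing convention (row i counted from the north edge, column j from the west edge) is an assumption consistent with equations (1) and (2).

    # Sketch of the S106 grid matrix (bounds from the embodiment; indexing assumed).
    import numpy as np

    wlong, elong = 119.5325, 119.7325     # west/east longitude bounds of region S
    hlati, lati = 23.729659, 23.529659    # north/south latitude bounds of region S
    squaresize = 0.002

    cols = round((elong - wlong) / squaresize) + 1    # 101
    rows = round((hlati - lati) / squaresize) + 1     # 101
    grid = np.full((rows, cols), np.nan)              # NaN = not yet assigned

    def center(i, j):
        # equation (1): center longitude/latitude of cell (i, j)
        return wlong + j * squaresize, hlati - i * squaresize

    def corners(i, j):
        # equation (2): vertices a, b, c, d of cell (i, j)
        lon, lat = center(i, j)
        h = 0.5 * squaresize
        return ((lon - h, lat + h), (lon + h, lat + h),
                (lon + h, lat - h), (lon - h, lat - h))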
and S107, obtaining an extensible markup language (xml) file by extracting the electronic chart data of the region S, traversing the xml, and accessing each node in sequence. The method mainly comprises two tasks: 1. establishing navigable points and non-navigable points of a grid map; 2. fill the water depth value of each grid.
S107-1, the first task establishes the navigable and non-navigable points of the grid map. When an element lies under the land or submerged reef layer and its type is a polygon (Polygon) or line (line), its child node point set (waypoints) is accessed, the longitude and latitude coordinates of all coordinate point (waypoint) nodes under the point set (waypoints) node are stored, the stored coordinate points are filled into a polygon with a Polygon function, and a polygon intersection test is performed against the rectangle formed by points a, b, c and d; if the two intersect, the grid where points a, b, c, d lie is set as a non-navigable grid. When the element type is a point (point), the longitude and latitude coordinates of the coordinate point (waypoint) nodes under its point set (waypoints) are obtained, and whether each coordinate lies in the rectangular grid formed by a, b, c and d is judged; if so, the grid where the current points a, b, c, d lie is set as a non-navigable grid.
S107-2, the second task fills the water depth value of each grid. When an element child node lies under the isobath layer node, the longitude and latitude coordinates of the coordinate point (waypoint) nodes under its point set (waypoints) are likewise obtained, and whether each coordinate point lies in the rectangular grid formed by a, b, c and d is judged; if it does and the grid is navigable, the water depth value (depth) is assigned to that grid. Because the geometry of an isobath is a line (line), not all navigable grids obtain depth values; the remaining unassigned grids are interpolated incrementally between isobaths, i.e. grids near a small-depth isobath are assigned small values and grids near a large-depth isobath are assigned large values. In this way all navigable grids can be assigned depth values.
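The two traversal tasks reduce to polygon-intersection and point-in-cell tests; a sketch using the shapely library (an assumption, any polygon-clipping routine would do) is given below, where points holds the (longitude, latitude) pairs of one feature's waypoint nodes.

    # Sketch of the S107 navigability test and depth fill; shapely is assumed.
    from shapely.geometry import LineString, Point, Polygon

    def blocks_cell(points, geom_type, corners):
        # task 1: does a land/reef feature make the cell a, b, c, d non-navigable?
        cell = Polygon(corners)
        if geom_type == "polygon":
            return Polygon(points).intersects(cell)
        if geom_type == "line":
            return LineString(points).intersects(cell)
        return any(cell.contains(Point(p)) for p in points)   # point feature

    def fill_depths(grid, contours, locate):
        # task 2: write each isobath's depth into the cells its vertices fall in;
        # locate maps (lon, lat) to a grid index or None (an assumed helper).
        # Cells left unassigned are afterwards interpolated between isobaths.
        for depth, points in contours:
            for p in points:
                idx = locate(*p)
                if idx is not None:
                    grid[idx] = depth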
S108, finally obtaining a grid matrix with actual geographic information; the cells with numerical values in the matrix represent navigable areas with depth values.
S109, acquiring the elevation data of region S, generally in tif format.
S110, acquiring the elevation data of region S: when the elevation data of region S is clipped, the longitude and latitude coordinates of the upper-left vertex and the size of the two-dimensional array are obtained. From the resolution of the elevation data, the longitude and latitude of each pixel can therefore be calculated, and the value of each pixel is an elevation value. Having obtained the two-dimensional elevation array and the elevation value and longitude and latitude coordinates of each cell, the non-navigable area of the grid matrix of step S108 is assigned elevation values according to these coordinates by comparison and assignment, yielding a grid matrix with elevation data.
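A sketch of S109 and S110 with GDAL (the file name dem.tif is an assumption): the geotransform gives the upper-left corner and the per-pixel resolution, from which the longitude, latitude and elevation of every pixel follow; land cells of the S108 grid matrix are then assigned the elevation of the matching pixel.

    # Sketch of reading tif elevation data and locating each pixel; GDAL assumed.
    from osgeo import gdal

    dem = gdal.Open("dem.tif")
    gt = dem.GetGeoTransform()   # (ulx, xres, 0, uly, 0, yres) for a north-up raster
    elev = dem.ReadAsArray()     # two-dimensional array of elevation values

    def pixel_lonlat(row, col):
        lon = gt[0] + (col + 0.5) * gt[1]
        lat = gt[3] + (row + 0.5) * gt[5]   # gt[5] is negative for north-up rasters
        return lon, lat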
In a further improvement of the invention, the specific method for constructing the MDP for amphibious unmanned aircraft path planning in step S2 comprises the following steps:
S201, to construct the Markov Decision Process (MDP) for amphibious unmanned aircraft path planning, first determine the state space of the amphibious unmanned aircraft, defined as its position coordinates (x, y) and height z; the position coordinates (x, y) form a two-dimensional continuous space and, to simplify the training process, the height z is a one-dimensional discrete space. The state space of the amphibious unmanned aircraft is represented as
[(x_1, y_1, z_1), (x_2, y_2, z_2), ..., (x_n, y_n, z_n)] (3)
S202, to construct the MDP for amphibious unmanned aircraft path planning, next determine the action space of the amphibious unmanned aircraft. Considering that the amphibious unmanned aircraft is characterized by both on-water navigation and air flight, its actions are discretized into six: up, down, left, right, takeoff and landing, i.e. action space A = [up, down, left, right, fly, descend].
S203, the moving distance of the four actions up, down, left and right is considered for two situations, navigation and flight. In the navigation situation, a laboratory test of the craft's sailing speed gives the displacement covered in one minute as the moving distance of the up, down, left and right actions (d_sail); in the flight situation, a test of the craft's flight speed likewise gives the one-minute displacement as the moving distance (d_flight). The moving distance of the takeoff and landing actions is simplified: after the takeoff action is executed, the amphibious unmanned aircraft takes off vertically to the maximum height it can reach (h_max), and after the landing action is executed, it lands vertically onto the water surface at height 0. According to the defined state and action spaces, the state transition under a given action can be expressed as
[x' y' z'] =
[x, y+d, z], action = up
[x, y-d, z], action = down
[x-d, y, z], action = left
[x+d, y, z], action = right
[x, y, h_max], action = fly
[x, y, 0], action = descend
with d = d_flight if z > 0 and d = d_sail if z = 0 (4)
where [x' y' z'] is the next state and [x y z] is the current state.
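A sketch of this transition function in Python; the numeric values of d_sail, d_flight and h_max below are placeholders, since the method derives them from laboratory speed tests.

    # Sketch of the S202/S203 action space and state transition (values assumed).
    D_SAIL, D_FLIGHT, H_MAX = 0.002, 0.006, 50.0

    def step(state, action):
        x, y, z = state
        d = D_FLIGHT if z > 0 else D_SAIL    # flight covers more distance per minute
        if action == "fly":
            return (x, y, H_MAX)             # vertical takeoff to h_max
        if action == "descend":
            return (x, y, 0.0)               # vertical landing onto the water surface
        dx, dy = {"up": (0, d), "down": (0, -d),
                  "left": (-d, 0), "right": (d, 0)}[action]
        return (x + dx, y + dy, z)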
S204, constructing the MDP for amphibious unmanned aircraft path planning also requires determining the reward function of the path planning.
S204-1, target zone reward (r_terminal). To improve training efficiency, the task is regarded as completed once the amphibious unmanned aircraft reaches the area of the target point.
S204-2, distance reward function (r_distance). It strengthens the influence of the target area and constrains the amphibious unmanned aircraft to move toward the target area more quickly.
r_distance = λ_distance * (DistanceNow - DistanceFuture) (5)
where DistanceNow represents the distance between the amphibious unmanned aircraft and the target point in the current state, DistanceFuture represents that distance in the next state, and λ_distance is a distance weight coefficient.
S204-3, energy consumption reward function (r_power). When the amphibious unmanned aircraft moves, its flight and navigation states consume different amounts of energy; so that the proportion of flight to navigation in the planned route can meet the requirements of different working scenes, an energy consumption reward function r_power is used. Laboratory energy consumption tests of the amphibious unmanned aircraft give the one-minute flight energy consumption and the one-minute navigation energy consumption, whose ratio is λ_flight/λ_sail, so the energy consumption reward function can be expressed as
r_power = -α * λ_flight/λ_sail, flight state (z > 0)
r_power = -α, navigation state (z = 0) (6)
where α is a proportionality coefficient; whether in the flight state or the navigation state, every action the amphibious unmanned aircraft takes generates a negative energy consumption reward.
S204-4, water depth reward (r_depth). In the environment model parsed from the electronic chart, every coordinate point has a corresponding water depth. Unlike other works, the distance of the amphibious unmanned aircraft from large obstacles such as land, islands and reefs is represented by the depth value Depth of the coordinate point: normally, places with greater water depth are farther from land, and places with smaller water depth are closer to land. The water depth reward function r_depth can be expressed as:
[Equations (7) and (8), the piecewise definition of r_depth in terms of λ_1 to λ_6 and the obstacle flag, are rendered as images in the published document.]
where λ_1~λ_6 are the numerical values of the reward function. To better ensure the safety of the amphibious unmanned aircraft and the appropriateness of the takeoff timing, an obstacle flag bit (obstacle) is used: a 3 x 3 square area surrounding the amphibious unmanned aircraft serves as its detection area, and if an obstacle lies in this area, obstacle is output as 1.
S204-5, collision reward function (r_obstacle). The collision reward is intended to prevent the amphibious unmanned aircraft from colliding with obstacles. During training of the reinforcement learning algorithm, once the amphibious unmanned aircraft collides with an obstacle, the collision reward function returns a large negative reward. The collision reward function can be expressed as:
r_obstacle = -λ_obstacle (Depth > 0 and z = 0) (9)
where λ_obstacle represents the negative reward value returned by the collision reward; when the water depth value at the aircraft's next-state coordinate is positive and the aircraft is not in the flight state, it is considered to have collided with an obstacle, and the collision reward is generated.
S205, the total reward function can be expressed as:
r_total = λ_a*r_terminal + λ_b*r_distance + λ_c*r_power + λ_d*r_depth + λ_e*r_obstacle (10)
where λ_a, λ_b, λ_c, λ_d and λ_e are weight coefficients.
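The reward terms compose into a single routine; a sketch under the expressions above, in which every weight value is a placeholder and the piecewise r_depth term is passed in precomputed, since its exact definition appears only as images.

    # Sketch of the S204/S205 reward composition; all weights are placeholders.
    LAM = dict(a=1.0, b=0.5, c=0.3, d=0.2, e=1.0, distance=0.1, obstacle=10.0)

    def distance_reward(dist_now, dist_future):
        # equation (5): positive when the next state is closer to the target point
        return LAM["distance"] * (dist_now - dist_future)

    def collision_reward(depth, z):
        # equation (9): large negative reward on collision with an obstacle
        return -LAM["obstacle"] if (depth > 0 and z == 0) else 0.0

    def total_reward(r_terminal, r_distance, r_power, r_depth, r_obstacle):
        # equation (10): weighted sum of the five reward terms
        return (LAM["a"] * r_terminal + LAM["b"] * r_distance +
                LAM["c"] * r_power + LAM["d"] * r_depth +
                LAM["e"] * r_obstacle)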
In a further improvement of the invention, step S3, in which a starting point and a target point are given and global path planning is completed for the different working scenes of the amphibious unmanned aircraft based on a Deep Q Network (DQN) algorithm according to the MDP for amphibious unmanned aircraft path planning, specifically comprises:
S301, giving the starting point and target point, i.e. the longitude and latitude coordinates of the starting point and target point of the selected path planning task;
and S302, the MDP for planning the path of the amphibious unmanned aircraft, namely the Markov decision process for constructing the amphibious unmanned aircraft, which is described in S2, comprises a state space, an action space and a reward and punishment function. The method is characterized in that a Depth Q Network (DQN) algorithm is selected as a path planning algorithm, values of Batch size (Batch _ size) (the size of data to be learned by the amphibious unmanned aircraft each time), Learning rate (Learning rate), training times (episode), attenuation factor (ga mma) and memory playback unit size (memory _ size) are set, the number of layers of a Q prediction network is set, and training is performed according to the MDP of the amphibious unmanned aircraft at S2 and the three-dimensional environment model at S1.
S303, setting three different working scenes according to the different working scenarios of the amphibious unmanned aircraft: scene one, an emergency occurs and the task must be executed urgently, requiring the target location to be reached at top speed; scene two, a routine work task, requiring an energy reserve margin; scene three, more than half the energy reserve has been consumed, requiring a return voyage for recharging. The path planning tasks of the different working scenes are realized by modifying the different weight coefficients of the reward function in S205.
Compared with the prior art, the invention has the following remarkable advantages:
1. The electronic-chart-based three-dimensional environment modeling method is simple and effective; it effectively combines the electronic chart with the unmanned aircraft, and uses the chart's rich geographic information to plan routes over a range of tens of kilometers, effectively overcoming the small planning range of the prior art.
2. The invention plans the path of the amphibious unmanned aircraft with reinforcement learning and sets an action space in which takeoff and landing are independent actions, effectively accounting for the motion characteristics of the amphibious unmanned aircraft.
3. The reward function is set in consideration of the working scene of the amphibious unmanned aircraft; the water depth information extracted from the electronic chart effectively constrains its flight and navigation; and the weights of the reward components realize the requirements of different task scenes.
4. The trained model generalizes well: when the area of the path planning task map changes, the existing A* algorithm must search again from scratch, whereas the trained model uses effective prior knowledge to find a suitable path quickly, saving a search period compared with the existing method.
Drawings
FIG. 1 is a logic step diagram of the reinforcement-learning-based amphibious unmanned aircraft path planning method of the invention;
FIG. 2 is a diagram of the structure of the extensible markup language (xml) file saved after parsing the electronic chart;
FIG. 3 is a grid matrix diagram, containing elevation information, of a portion of the electronic-chart-based three-dimensional environment model of the invention;
FIG. 4 is a schematic diagram of the electronic-chart-based three-dimensional environment modeling of the invention;
FIG. 5 is a flow chart of the Deep Q Network (DQN) algorithm of the invention;
FIG. 6 is a schematic view of the working scenes of the amphibious unmanned aircraft.
Detailed Description
In order to enhance the understanding of the present invention, the present invention will be described in further detail with reference to the following examples, which are provided for the purpose of illustration only and are not intended to limit the scope of the present invention.
As shown in FIG. 1, the reinforcement-learning-based path planning method of the invention includes the following three steps.
S1, selecting the region S in which the amphibious unmanned aircraft executes the path planning task, and extracting the electronic chart data corresponding to region S to perform three-dimensional environment modeling.
In this embodiment, S1, selecting the region S in which the amphibious unmanned aircraft performs the path planning task and extracting the corresponding electronic chart data of region S for three-dimensional environment modeling, is implemented as follows:
step1, selecting an amphibious unmanned aircraft to perform a path planning work area S, wherein the longitude range of the S is (wlong ═ 119.5325, elong ═ 119.7325), the latitude range is (hlati ═ 23.729659, lati ═ 23.529659), and the work area S is an island area of the penghua of china on an actual map.
Step2, according to the longitude and latitude coordinates of area S, selecting the S-57 format electronic chart covering area S: the South China Sea electronic chart EA200001. Referring to IHO S-57 (ENC), the layer numbers of the objects to be parsed from the chart are determined; in this embodiment the land, island reef and isobath objects in chart area S are parsed. Lookup shows the land layer number is 71, the reef layer number is 153 and the isobath layer number is 43. The chart data is read layer by layer according to the layer numbers of the object types to be extracted by calling the S57reader class of the geospatial vector data open source library (OGR), and stored in an extensible markup language (xml) file with the structure shown in FIG. 2.
Step3, in this embodiment, the grid size squaresize is chosen as 0.002 according to the maximum one-minute cruising distance of the amphibious unmanned aircraft, generating a grid matrix with 101 rows (row) and 101 columns (col) over the longitude and latitude range of area S. Through the longitude and latitude coordinates of each grid's vertices a, b, c, d and the extensible markup language (xml) file of Step2, polygon intersection tests over the elements (feature) and their geometry (geometry) give each grid its navigable or non-navigable attribute. For example, when an element (feature) in the xml lies under the land (land) layer, its geometry is a polygon composed of several point sets (waypoints); a polygon intersection test is performed in the same longitude and latitude area as the self-built grid map according to the coordinates of the point sets (waypoints), and as long as the polygon composed by the element intersects a grid, that grid is marked non-navigable. Meanwhile, the navigable grids are assigned values according to the coordinate information of the point sets (waypoints) under the isobath layer, and the unassigned grids are interpolated from the surrounding grids. The elevation information of the corresponding area S is obtained through GIS software as a two-dimensional array, and elevation values are assigned to the land area of the grid map according to the corresponding longitude and latitude coordinates. This yields a grid matrix with elevation data and actual geographic information; FIG. 3 shows such a grid matrix for a portion of area S, and FIG. 4 visualizes the three-dimensional environment model.
S2, constructing the Markov Decision Process (MDP) for amphibious unmanned aircraft path planning.
In this embodiment, the specific implementation steps of S2, constructing the Markov Decision Process (MDP) for amphibious unmanned aircraft path planning, are as follows:
step1, the state space is defined as the position coordinate (x, y) and the height z of the amphibious unmanned vehicle, the position coordinate (x, y) is represented as a two-dimensional continuous space, and the height z is represented as a one-dimensional discrete space for simplifying the training process. The state space of the amphibious unmanned aircraft is represented as
[(x_1, y_1, z_1), (x_2, y_2, z_2), ..., (x_n, y_n, z_n)] (1)
Step2, discretizing the motion of the amphibious unmanned aircraft into six actions: up, down, left, right, takeoff and landing, i.e. action space A = [up, down, left, right, fly, descend].
Step3, in the navigation situation, a laboratory test of the craft's sailing speed gives the displacement covered in one minute as the moving distance d_sail of the up, down, left and right actions; in the flight situation, a test of the craft's flight speed gives the one-minute displacement as the moving distance d_flight of those actions. The moving distance of the takeoff and landing actions is simplified: after the takeoff action is executed, the amphibious unmanned aircraft takes off vertically to the maximum height it can reach, h_max, and after the landing action is executed, it lands vertically onto the water surface at height 0. According to the defined state and action spaces, the state transition under a given action can be expressed as
[x' y' z'] =
[x, y+d, z], action = up
[x, y-d, z], action = down
[x-d, y, z], action = left
[x+d, y, z], action = right
[x, y, h_max], action = fly
[x, y, 0], action = descend
with d = d_flight if z > 0 and d = d_sail if z = 0 (2)
where [x' y' z'] is the next state and [x y z] is the current state.
Step4, the reward function is represented as:
r_total = λ_a*r_terminal + λ_b*r_distance + λ_c*r_power + λ_d*r_depth + λ_e*r_obstacle (3)
where λ_a, λ_b, λ_c, λ_d and λ_e are the weight coefficients.
Target zone reward (r_terminal). To improve training efficiency, the task is regarded as completed once the amphibious unmanned aircraft reaches the area of the target point.
Distance reward function (r_distance). It strengthens the influence of the target area and constrains the amphibious unmanned aircraft to move toward the target area more quickly.
r_distance = λ_distance * (DistanceNow - DistanceFuture) (4)
where DistanceNow represents the distance between the amphibious unmanned aircraft and the target point in the current state, DistanceFuture represents that distance in the next state, and λ_distance is a distance weight coefficient.
Energy consumption reward function (r_power). When the amphibious unmanned aircraft moves, its flight and navigation states consume different amounts of energy; so that the proportion of flight to navigation in the planned route can meet the requirements of different working scenes, an energy consumption reward function r_power is used. Laboratory energy consumption tests of the amphibious unmanned aircraft give the one-minute flight energy consumption λ_flight and the one-minute navigation energy consumption λ_sail, whose ratio is λ_flight/λ_sail, so the energy consumption reward function can be expressed as
r_power = -α * λ_flight/λ_sail, flight state (z > 0)
r_power = -α, navigation state (z = 0) (5)
where α is a proportionality coefficient; whether in the flight state or the navigation state, every action the amphibious unmanned aircraft takes generates a negative energy consumption reward.
Water depth reward (r_depth). In the environment model parsed from the electronic chart, every coordinate point has a corresponding water depth. Unlike other works, the distance of the amphibious unmanned aircraft from large obstacles such as land, islands and reefs is represented by the depth value Depth of the coordinate point: normally, places with greater water depth are farther from land, and places with smaller water depth are closer to land. The water depth reward function r_depth can be expressed as:
[Equations (6) and (7), the piecewise definition of r_depth in terms of λ_1 to λ_6 and the obstacle flag, are rendered as images in the published document.]
where λ_1~λ_6 are the numerical values of the reward function. To better ensure the safety of the amphibious unmanned aircraft and the appropriateness of the takeoff timing, an obstacle flag bit (obstacle) is used: a 3 x 3 square area surrounding the amphibious unmanned aircraft serves as its detection area, and if an obstacle lies in this area, obstacle is output as 1.
Collision reward function (r_obstacle). The collision reward is intended to prevent the amphibious unmanned aircraft from colliding with obstacles. During training of the reinforcement learning algorithm, once the amphibious unmanned aircraft collides with an obstacle, the collision reward function returns a large negative reward. The collision reward function can be expressed as:
r_obstacle = -λ_obstacle (Depth > 0 and z = 0) (8)
where λ_obstacle represents the negative reward value returned by the collision reward; when the water depth value at the aircraft's next-state coordinate is positive and the aircraft is not in the flight state, it is considered to have collided with an obstacle, and the collision reward is generated.
S3, given a starting point and a target point, completing global path planning for the different working scenes of the amphibious unmanned aircraft based on a Deep Q Network (DQN) algorithm, according to the MDP for amphibious unmanned aircraft path planning;
In this embodiment, based on the environment modeling of S1 and the MDP construction of S2, S3, completing global path planning for the different working scenes of the amphibious unmanned aircraft with the Deep Q Network (DQN) algorithm, is implemented as follows:
step1, starting point and target point of given path plan
Step2, importing the environment model established in S1 and selecting the Deep Q Network (DQN) algorithm; FIG. 5 shows the flow chart of the DQN algorithm for amphibious unmanned aircraft path planning. As the path planning algorithm, the batch size (Batch_size) is set to 32, the learning rate (Learning rate) to 0.01, the number of training episodes (episode) to 5000, the discount factor (gamma) to 0.9 and the memory replay unit size (memory_size) to 20000; the number of layers of the Q network is set to 3; and training is performed on the MDP of the amphibious unmanned aircraft from S2 and the three-dimensional environment model from S1.
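For concreteness, a sketch of this setup with the embodiment's hyperparameters, assuming PyTorch; it outlines a standard DQN update (with the target network folded into the online network for brevity) rather than reproducing the patented training code.

    # Sketch of a DQN setup with the embodiment's hyperparameters; PyTorch assumed.
    import random
    from collections import deque

    import torch
    import torch.nn as nn

    BATCH, LR, EPISODES, GAMMA, MEMORY = 32, 0.01, 5000, 0.9, 20000

    q_net = nn.Sequential(             # 3-layer Q prediction network
        nn.Linear(3, 64), nn.ReLU(),   # input: state (x, y, z)
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 6),              # output: Q value of each of the six actions
    )
    optimizer = torch.optim.SGD(q_net.parameters(), lr=LR)
    memory = deque(maxlen=MEMORY)      # memory replay unit

    def learn():
        # one gradient step on a random minibatch of (s, a, r, s', done) samples
        if len(memory) < BATCH:
            return
        s, a, r, s2, done = map(torch.tensor, zip(*random.sample(memory, BATCH)))
        q = q_net(s.float())[torch.arange(BATCH), a]
        with torch.no_grad():
            target = r.float() + GAMMA * q_net(s2.float()).max(1).values * (1 - done.float())
        loss = nn.functional.mse_loss(q, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()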
Step3, setting the three different working scenes of the amphibious unmanned aircraft shown in FIG. 6: scene one, an emergency occurs, the task must be executed urgently and the target location must be reached at top speed; this is realized mainly by adjusting the reward function r_total, setting the weight coefficients λ_c and λ_d equal to 0, i.e. disregarding the energy reserve and water depth limitations;
Scene two is a routine work task requiring an energy reserve margin; all the weight coefficients in the reward function r_total are adjusted.
Scene three: more than half the energy reserve has been consumed and a return voyage for recharging is needed; in the reward function r_total, the weight coefficients λ_b and λ_d are set equal to 0, i.e. the water depth and distance limitations are disregarded.
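The three scenes therefore amount to three weight configurations of r_total; in the sketch below only the zeroed coefficients follow the embodiment, and the nonzero values are placeholders.

    # Sketch of the three working-scene weight configurations (values assumed).
    SCENES = {
        "emergency": dict(a=1.0, b=1.0, c=0.0, d=0.0, e=1.0),  # lambda_c = lambda_d = 0
        "routine":   dict(a=1.0, b=0.5, c=0.5, d=0.5, e=1.0),  # all weights active
        "return":    dict(a=1.0, b=0.0, c=1.0, d=0.0, e=1.0),  # lambda_b = lambda_d = 0
    }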
The foregoing shows and describes the general principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are given by way of illustration of the principles of the present invention, and that various changes and modifications may be made without departing from the spirit and scope of the invention as defined by the appended claims. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (7)

1. A water-air amphibious unmanned aircraft path planning method based on reinforcement learning is characterized by comprising the following steps:
S1, selecting the region S in which the amphibious unmanned aircraft executes the path planning task, and extracting the electronic chart data corresponding to region S to perform three-dimensional environment modeling;
S2, constructing a Markov Decision Process (MDP) for amphibious unmanned aircraft path planning;
and S3, given a starting point and a target point, completing global path planning for the different working scenes of the amphibious unmanned aircraft based on a Deep Q Network (DQN) algorithm, according to the MDP for amphibious unmanned aircraft path planning.
2. The reinforcement learning-based water-air amphibious unmanned aircraft path planning method according to claim 1, characterized in that: the specific content and steps of selecting the area S of the amphibious unmanned aircraft for executing the path planning task in the S1 and extracting the data of the area S corresponding to the electronic chart according to the area S are as follows:
(1) selecting a region S required by the amphibious unmanned aircraft to execute tasks;
(2) selecting an electronic chart with an S-57 format of a corresponding area according to the longitude and latitude of the area S, and extracting the electronic chart data required by path planning by taking the area S as a range for extracting the electronic chart data;
(3) selecting object types which need to be extracted from the electronic chart and are used for path planning, wherein the object types are usually land and reef islands, and acquiring layer numbers corresponding to the object types by referring to official documents IHO S-57(ENC) of the electronic chart;
(4) reading the chart data layer by layer according to the layer number of the type of the object to be extracted through a geospatial vector data open source library (OGR);
(5) storing the land and submerged reef chart data read layer by layer into an extensible markup language (xml) file, organized as layer (Layer), element (feature), field (field) and geometric object (geometry); wherein the geometric object (geometry) holds the geometric attribute of the element, indicating that the element's type is one of a point (point), a line (line) or a polygon (polygon).
3. The reinforcement learning-based water-air amphibious unmanned aircraft path planning method according to claim 1, characterized in that: the specific contents and steps for extracting the data of the corresponding area S in the electronic chart and performing three-dimensional environment modeling in the S1 are as follows:
(1) establishing a grid matrix with (elong-wlong)/squaresize+1 columns and (hlati-lati)/squaresize+1 rows according to the longitude range (wlong, elong), the latitude range (hlati, lati) and the grid size squaresize of region S; the longitude and latitude of the center point of row i, column j of the grid matrix can be represented as:
centerpointlon = wlong + j*squaresize, centerpointlat = hlati - i*squaresize (1)
the latitude and longitude of the four vertices a, b, c, d of the grid is expressed as:
a=(centerpointlon-0.5*squaresize,centerpointlat+0.5*squaresize)
b=(centerpointlon+0.5*squaresize,centerpointlat+0.5*squaresize)
c=(centerpointlon+0.5*squaresize,centerpointlat-0.5*squaresize)
d=(centerpointlon-0.5*squaresize,centerpointlat-0.5*squaresize) (2)
(2) obtaining an extensible markup language (xml) file by extracting the electronic chart data of region S, traversing the xml, and accessing each node in sequence; this is divided into two tasks: task one, establishing the navigable points and non-navigable points of the grid map; task two, filling the water depth value of each grid; the specific method is as follows:
(2.1) the first task establishes the navigable and non-navigable points of the grid map: when an element lies under the land or submerged reef layer and its type is a polygon (Polygon) or line (line), its child node point set (waypoints) is accessed, the longitude and latitude coordinates of all coordinate point (waypoint) nodes under the point set (waypoints) node are stored, the stored coordinate points are filled into a polygon with a Polygon function, and a polygon intersection test is performed against the rectangle formed by points a, b, c and d; if the two intersect, the grid where points a, b, c, d lie is set as a non-navigable grid; when the element type is a point (point), the longitude and latitude coordinates of the coordinate point (waypoint) nodes under its point set (waypoints) are obtained, whether each coordinate lies in the rectangular grid formed by a, b, c and d is judged, and if so, the grid where the current points a, b, c, d lie is set as a non-navigable grid;
(2.2) the second task fills the water depth value of each grid: when an element child node lies under the isobath layer node, the longitude and latitude coordinates of the coordinate point (waypoint) nodes under its point set (waypoints) are likewise obtained, and whether each coordinate point lies in the rectangular grid formed by a, b, c and d, the grid being navigable, is judged; if so, the water depth (depth) value is assigned to the grid;
(3) finally, a grid matrix with actual geographic information can be obtained, and the cells with numerical values in the matrix represent navigable areas with depth values;
(4) obtaining elevation data of the area S, wherein the elevation data is generally in a tif format;
(5) acquiring the elevation data of the area S, namely acquiring longitude and latitude coordinates of a vertex at the upper left corner and the size of a two-dimensional array when intercepting the elevation data of the area S; calculating longitude and latitude information of each pixel point according to the resolution of the elevation data, wherein the value of each pixel point is an elevation value; therefore, after the two-dimensional array of the elevation data and the elevation value and the longitude and latitude coordinates of each unit are obtained, the non-navigable area of the grid matrix in the step (3) is assigned with the elevation value according to the longitude and latitude coordinates by a comparison assignment method, and thus the grid matrix with the elevation data can be obtained.
4. The reinforcement learning-based water-air amphibious unmanned aircraft path planning method according to claim 1, wherein the Markov Decision Process (MDP) for constructing the amphibious unmanned aircraft path planning in step S2 is defined by the following details with respect to the action space and the state space of the amphibious unmanned aircraft:
(1) the state space of the amphibious unmanned aircraft is defined as a position coordinate (x, y) and a height z of the amphibious unmanned aircraft, the position coordinate (x, y) is expressed as a two-dimensional continuous space, and the height z is expressed as a one-dimensional discrete space in order to simplify the training process; the state space of the amphibious unmanned aircraft is thus represented as
[(x_1, y_1, z_1), (x_2, y_2, z_2), ..., (x_n, y_n, z_n)] (3)
(2) considering that the amphibious unmanned aircraft is characterized by both on-water navigation and air flight, its actions are discretized into six: up, down, left, right, takeoff and landing, i.e. action space A = [up, down, left, right, fly, descend];
(3) in the navigation situation, a laboratory test of the craft's sailing speed gives the displacement covered in one minute as the moving distance (d_sail) of the up, down, left and right actions; in the flight situation, a test of the craft's flight speed gives the one-minute displacement as the moving distance (d_flight) of those actions; the moving distance of the takeoff and landing actions is simplified: after the takeoff action is executed, the amphibious unmanned aircraft takes off vertically to the maximum height it can reach (h_max), and after the landing action is executed, it lands vertically onto the water surface at height 0; according to the defined state and action spaces, the state transition under a given action can be expressed as
[x' y' z'] =
[x, y+d, z], action = up
[x, y-d, z], action = down
[x-d, y, z], action = left
[x+d, y, z], action = right
[x, y, h_max], action = fly
[x, y, 0], action = descend
with d = d_flight if z > 0 and d = d_sail if z = 0 (4)
where [x' y' z'] is the next state and [x y z] is the current state.
5. The reinforcement learning-based water-air amphibious unmanned aircraft path planning method according to claim 1, wherein the Markov Decision Process (MDP) for constructing the amphibious unmanned aircraft path planning in step S2 is defined by the following specific contents in terms of amphibious unmanned aircraft reward function:
(1) target zone reward (r_terminal): to improve training efficiency, the task is regarded as completed once the amphibious unmanned aircraft reaches the area of the target point;
(2) distance reward function (r_distance): it strengthens the influence of the target area and constrains the amphibious unmanned aircraft to move toward the target area more quickly;
r_distance = λ_distance * (DistanceNow - DistanceFuture) (5)
where DistanceNow represents the distance between the amphibious unmanned aircraft and the target point in the current state, DistanceFuture represents that distance in the next state, and λ_distance is a distance weight coefficient;
(3) energy consumption reward function (r_power): when the amphibious unmanned aircraft moves, its flight and navigation states consume different amounts of energy; so that the proportion of flight to navigation in the planned route can meet the requirements of different working scenes, an energy consumption reward function r_power is used; laboratory energy consumption tests of the amphibious unmanned aircraft give the one-minute flight energy consumption λ_flight and the one-minute navigation energy consumption λ_sail, whose ratio is λ_flight/λ_sail, so the energy consumption reward function can be expressed as
r_power = -α * λ_flight/λ_sail, flight state (z > 0)
r_power = -α, navigation state (z = 0) (6)
where α is a proportionality coefficient; whether in the flight state or the navigation state, every action taken generates a negative energy consumption reward;
(4) water depth reward (r_depth): in the environment model parsed from the electronic chart, every coordinate point has a corresponding water depth; unlike other works, the distance of the amphibious unmanned aircraft from large obstacles such as land, islands and reefs is represented by the depth value (Depth) of the coordinate point; normally, places with greater water depth are farther from land and places with smaller water depth are closer to land; the water depth reward function r_depth can be expressed as:
[Equations (7) and (8), the piecewise definition of r_depth in terms of λ_1 to λ_6 and the obstacle flag, are rendered as images in the published document.]
where λ_1~λ_6 are the numerical values of the reward function; to better ensure the safety of the amphibious unmanned aircraft and the appropriateness of the takeoff timing, an obstacle flag bit (obstacle) is used: a 3 x 3 square area surrounding the amphibious unmanned aircraft serves as its detection area, and if an obstacle lies in this area, obstacle is output as 1;
(5) collision reward function (r_obstacle): the collision reward is intended to prevent the amphibious unmanned aircraft from colliding with obstacles; during training of the reinforcement learning algorithm, once the amphibious unmanned aircraft collides with an obstacle, the collision reward function returns a large negative reward; the collision reward function can be expressed as:
r_obstacle = -λ_obstacle (Depth > 0 and z = 0) (9)
where λ_obstacle represents the negative reward value returned by the collision reward; when the water depth value at the aircraft's next-state coordinate is positive and the aircraft is not in the flight state, it is considered to have collided with an obstacle and the collision reward is generated;
(6) the overall reward function may be expressed as:
r_total = λ_a*r_terminal + λ_b*r_distance + λ_c*r_power + λ_d*r_depth + λ_e*r_obstacle (10)
where λ_a, λ_b, λ_c, λ_d and λ_e are weight coefficients.
6. The reinforcement-learning-based water-air amphibious unmanned aircraft path planning method according to claim 1, wherein in step S3 the starting point and the target point are given, and, according to the MDP for amphibious unmanned aircraft path planning, the specific process based on the Deep Q Network (DQN) algorithm is as follows:
(1) a starting point and a target point of the given path plan;
(2) importing the environment model established in S1, selecting the Deep Q Network (DQN) algorithm as the path planning algorithm, setting the batch size (Batch_size) to 32, the learning rate (Learning rate) to 0.01, the number of training episodes (episode) to 5000, the discount factor (gamma) to 0.9 and the memory replay unit size (memory_size) to 20000, setting the number of layers of the Q network to 3, and training on the MDP of the amphibious unmanned aircraft from S2 and the three-dimensional environment model from S1.
7. The reinforcement-learning-based water-air amphibious unmanned aircraft path planning method according to claim 1, wherein in step S3 the working scenes of the amphibious unmanned aircraft for global path planning are set as follows:
(1) setting three different working scenes of the amphibious unmanned aircraft: scene one, an emergency occurs, the task must be executed urgently and the target location must be reached at top speed; this is realized mainly by adjusting the reward function r_total, setting the weight coefficients λ_c and λ_d equal to 0, i.e. disregarding the energy reserve and water depth limitations;
(2) scene two is a routine work task requiring an energy reserve margin; all the weight coefficients in the reward function r_total are adjusted;
(3) scene three: more than half the energy reserve has been consumed and a return voyage for recharging is needed; in the reward function r_total, the weight coefficients λ_b and λ_d are set equal to 0, i.e. the water depth and distance limitations are disregarded.
CN202111381994.2A 2021-11-22 2021-11-22 Water-air amphibious unmanned aircraft path planning method based on reinforcement learning Active CN114089762B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111381994.2A CN114089762B (en) 2021-11-22 2021-11-22 Water-air amphibious unmanned aircraft path planning method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111381994.2A CN114089762B (en) 2021-11-22 2021-11-22 Water-air amphibious unmanned aircraft path planning method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN114089762A true CN114089762A (en) 2022-02-25
CN114089762B CN114089762B (en) 2024-06-21

Family

ID=80302350

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111381994.2A Active CN114089762B (en) 2021-11-22 2021-11-22 Water-air amphibious unmanned aircraft path planning method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN114089762B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114721409A (en) * 2022-06-08 2022-07-08 山东大学 Underwater vehicle docking control method based on reinforcement learning
CN114924587A (en) * 2022-05-27 2022-08-19 江苏科技大学 Unmanned aerial vehicle path planning method
CN115206157A (en) * 2022-08-05 2022-10-18 白杨时代(北京)科技有限公司 Unmanned underwater vehicle path finding training method and device and unmanned underwater vehicle
CN115657683A (en) * 2022-11-14 2023-01-31 中国电子科技集团公司第十研究所 Unmanned and cableless submersible real-time obstacle avoidance method capable of being used for inspection task
CN115855226A (en) * 2023-02-24 2023-03-28 青岛科技大学 Multi-AUV cooperative underwater data acquisition method based on DQN and matrix completion
CN116880551A (en) * 2023-07-13 2023-10-13 之江实验室 Flight path planning method, system and storage medium based on random event capturing

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130151107A1 (en) * 2011-12-13 2013-06-13 Daniel Nikovski Method for Optimizing Run Curve of Vehicles
CN103900573A (en) * 2014-03-27 2014-07-02 哈尔滨工程大学 Underwater vehicle multi-constrained path planning method based on S57 standard electronic chart
US20190005828A1 (en) * 2017-06-29 2019-01-03 The Boeing Company Method and system for autonomously operating an aircraft
CN108507575A (en) * 2018-03-20 2018-09-07 华南理工大学 A kind of unmanned boat sea paths planning method and system based on RRT algorithms
JP2020030796A (en) * 2018-08-23 2020-02-27 タタ コンサルタンシー サービシズ リミテッドTATA Consultancy Services Limited Systems and methods for predicting structure and properties of atoms and atomic alloy materials
US20200359297A1 (en) * 2018-12-28 2020-11-12 Beijing University Of Posts And Telecommunications Method of Route Construction of UAV Network, UAV and Storage Medium thereof
US20200250486A1 (en) * 2019-01-31 2020-08-06 StradVision, Inc. Learning method and learning device for supporting reinforcement learning by using human driving data as training data to thereby perform personalized path planning
CN109871022A (en) * 2019-03-18 2019-06-11 江苏科技大学 A kind of intelligent path planning and barrier-avoiding method towards amphibious unmanned rescue device
CN110110028A (en) * 2019-05-09 2019-08-09 浪潮软件集团有限公司 A kind of method and system showing map by self defined area towards OGC standard
KR20210063791A (en) * 2019-11-25 2021-06-02 한국기술교육대학교 산학협력단 System for mapless navigation based on dqn and slam considering characteristic of obstacle and processing method thereof
CN112198870A (en) * 2020-06-01 2021-01-08 西北工业大学 Unmanned aerial vehicle autonomous guiding maneuver decision method based on DDQN
CN112698646A (en) * 2020-12-05 2021-04-23 西北工业大学 Aircraft path planning method based on reinforcement learning
KR102529331B1 (en) * 2021-12-29 2023-05-09 서울대학교산학협력단 Method for communication based on UAV(unmanned aerial vehicle) BS(base station) using reinforcement learning and apparatus for performing the method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
张晓路; 李斌; 常健; 唐敬阁: "Reinforcement learning method for gliding control of an underwater gliding snake-like robot", Robot (机器人), no. 03, 26 March 2019 (2019-03-26) *
董超; 沈赟; 屈毓锛: "A survey of UAV-based edge intelligent computing", Chinese Journal of Intelligent Science and Technology (智能科学与技术学报), no. 03, 15 September 2020 (2020-09-15) *
赵玉新; 金娜; 刘厂: "Multi-constraint route planning method for AUVs based on electronic charts", Navigation of China (中国航海), no. 02, 25 June 2016 (2016-06-25) *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114924587A (en) * 2022-05-27 2022-08-19 江苏科技大学 Unmanned aerial vehicle path planning method
CN114924587B (en) * 2022-05-27 2024-03-19 江苏科技大学 Unmanned aerial vehicle path planning method
CN114721409A (en) * 2022-06-08 2022-07-08 山东大学 Underwater vehicle docking control method based on reinforcement learning
CN115206157A (en) * 2022-08-05 2022-10-18 白杨时代(北京)科技有限公司 Unmanned underwater vehicle path finding training method and device and unmanned underwater vehicle
CN115657683A (en) * 2022-11-14 2023-01-31 中国电子科技集团公司第十研究所 Unmanned and cableless submersible real-time obstacle avoidance method capable of being used for inspection task
CN115855226A (en) * 2023-02-24 2023-03-28 青岛科技大学 Multi-AUV cooperative underwater data acquisition method based on DQN and matrix completion
CN115855226B (en) * 2023-02-24 2023-05-30 青岛科技大学 Multi-AUV cooperative underwater data acquisition method based on DQN and matrix completion
CN116880551A (en) * 2023-07-13 2023-10-13 之江实验室 Flight path planning method, system and storage medium based on random event capturing
CN116880551B (en) * 2023-07-13 2024-06-14 之江实验室 Flight path planning method, system and storage medium based on random event capturing

Also Published As

Publication number Publication date
CN114089762B (en) 2024-06-21

Similar Documents

Publication Publication Date Title
CN114089762B (en) Water-air amphibious unmanned aircraft path planning method based on reinforcement learning
Tang et al. Geometric A-star algorithm: An improved A-star algorithm for AGV path planning in a port environment
CN110108284B (en) Unmanned aerial vehicle three-dimensional flight path rapid planning method considering complex environment constraint
Xiaofei et al. Global path planning algorithm based on double DQN for multi-tasks amphibious unmanned surface vehicle
LU102400B1 (en) Path planning method and system for unmanned surface vehicle based on improved genetic algorithm
CN108459503B (en) Unmanned surface vehicle track planning method based on quantum ant colony algorithm
CN108564202B (en) Unmanned ship route optimization method based on environment forecast information
CN109871022A (en) A kind of intelligent path planning and barrier-avoiding method towards amphibious unmanned rescue device
CN111679692A (en) Unmanned aerial vehicle path planning method based on improved A-star algorithm
CN107816999A (en) A kind of unmanned boat navigation path contexture by self method based on ant group algorithm
CN113505431B (en) Method, device, equipment and medium for searching targets of maritime unmanned aerial vehicle based on ST-DQN
CN111222701A (en) Marine environment map layer-based automatic planning and evaluation method for ship route
Guo et al. An improved a-star algorithm for complete coverage path planning of unmanned ships
CN110889198A (en) Multi-factor joint learning-based dead reckoning probability distribution prediction method and system
CN111665846B (en) Water surface unmanned ship path planning method based on rapid scanning method
Lan et al. Improved RRT algorithms to solve path planning of multi-glider in time-varying ocean currents
CN112859864A (en) Unmanned ship-oriented geometric path planning method
Du et al. An optimized path planning method for coastal ships based on improved DDPG and DP
Wang et al. A novel maritime autonomous navigation decision-making system: Modeling, integration, and real ship trial
CN117193296A (en) Improved A star unmanned ship path planning method based on high safety
Gao et al. An optimized path planning method for container ships in Bohai bay based on improved deep Q-learning
Li et al. Dynamic route planning for a USV-UAV multi-robot system in the rendezvous task with obstacles
Lee et al. Generation of Ship’s passage plan using data-driven shortest path algorithms
Williams et al. A rapid method for planning paths in three dimensions for a small aerial robot
Zhang et al. A MILP model on coordinated coverage path planning system for UAV-ship hybrid team scheduling software

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant