CN114564039A - Flight path planning method based on deep Q network and fast search random tree algorithm - Google Patents

Flight path planning method based on deep Q network and fast search random tree algorithm

Info

Publication number
CN114564039A
Authority
CN
China
Prior art keywords
space
network
tree
algorithm
state
Prior art date
Legal status
Granted
Application number
CN202210089643.2A
Other languages
Chinese (zh)
Other versions
CN114564039B (en)
Inventor
李昭莹
石若凌
欧一鸣
Current Assignee
Beihang University
Original Assignee
Beihang University
Priority date
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202210089643.2A priority Critical patent/CN114564039B/en
Publication of CN114564039A publication Critical patent/CN114564039A/en
Application granted granted Critical
Publication of CN114564039B publication Critical patent/CN114564039B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/10Simultaneous control of position or course in three dimensions
    • G05D1/101Simultaneous control of position or course in three dimensions specially adapted for aircraft
    • G05D1/106Change initiated in response to external conditions, e.g. avoidance of elevated terrain or of no-fly zones

Landscapes

  • Engineering & Computer Science (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a flight path planning method based on a deep Q network and a fast search random tree algorithm. First, the Markov decision process in the RRT algorithm is abstracted: because of the randomness of RRT growth, each expansion step can be regarded as a Markov process, and an MDP model of the RRT can be built. A deep Q network is then trained, and the optimal action corresponding to each state is obtained by querying the network; improved RRT path planning is performed after introducing this optimal action. The invention gives the RRT-GoalBias algorithm stronger obstacle avoidance capability and increases the chance of greedy expansion, thereby improving the efficiency and stability of the algorithm.

Description

Flight path planning method based on deep Q network and fast search random tree algorithm
Technical Field
The invention relates to the field of flight path planning, in particular to a flight path planning method combining a deep Q network and a fast search random tree algorithm.
Background
Flight path planning is one of the key links in unmanned aerial vehicle intelligence and has developed rapidly, driven by computer, information, and artificial intelligence technologies. The path planning algorithm is the core of flight path planning; its aim is to find, in a model space, a path from a starting point to a target point that satisfies certain constraints and meets certain performance indexes (path length, time, energy consumption, and the like) according to actual needs. The path therefore not only needs to satisfy various platform constraints but must also ensure that the agent does not collide with obstacles when moving along it. The Rapidly-exploring Random Tree (RRT) algorithm has the advantages of requiring no preprocessing of the state space and having a simple procedure, but it suffers from high randomness, redundant and inefficient search, and non-optimal path quality, which restricts its application. The RRT-GoalBias algorithm is a variant of RRT that adds target guidance; it is simple, efficient, and fast to converge, but its randomness is still high and it lacks obstacle avoidance capability, which reduces its efficiency to some extent.
Disclosure of Invention
To address the reduced efficiency and unstable operation caused by the high randomness and poor obstacle avoidance capability of the RRT-GoalBias algorithm, the invention provides a deep-Q-network-based RRT-GoalBias path planning optimization algorithm (DQN-RRTGoalBias) that improves algorithm efficiency and stability.
The invention relates to a flight path planning method based on a deep Q network and a fast search random tree algorithm, which comprises the following steps:
Step 1: model the Markov decision process of the RRT algorithm.
Step 2: train a deep Q network.
The purpose of using the DQN algorithm is to obtain a function that can evaluate the value of a state-action pair (s_t, a_t), called the target Q value and denoted Q(s_t, a_t); a neural network is used as a function approximator to approximate Q(s_t, a_t), the error is minimized by gradient descent, and the approximating function is Q(s, a; w), where w are the fitting parameters.
Step 3: plan the path according to the deep Q network.
According to the deep Q network, the optimal action a_opt corresponding to each state can be obtained:
a_opt = argmax_a Q(s, a; w)
The improved RRT path planning process introducing the optimal action is as follows:
1) Add the starting point S to the random tree table X_tree;
2) Sample a tree node P_samp in the state space X; the sampling method is:
P_samp = P_goal, with probability p;  P_samp = P_rand, with probability 1 − p
where P_goal is the end point, P_rand is a random point in the state space, and p is a constant (0 < p < 1);
3) In the random tree table X_tree, find the tree node P_near nearest to the sampled node P_samp;
4) If P_samp ≠ P_goal, extend from the nearest tree node P_near toward the random node P_rand by the step length d_0 to obtain a new tree node P_new; otherwise, execute the optimal action a_opt of the current state to obtain the new tree node P_new;
5) Judge whether the new tree node P_new and the new branch P_near P_new lie in the free space X_free; if so, add P_new to the random tree table X_tree; if not, return to step 2);
6) Repeat steps 2) to 5) until the random tree extends to the end point.
The invention has the advantages that:
1. The flight path planning method based on the deep Q network and the fast search random tree algorithm gives the RRT-GoalBias algorithm stronger obstacle avoidance capability and increases the probability of greedy expansion, thereby improving the efficiency and stability of the algorithm.
2. The method models the node exploration process of RRT as a Markov Decision Process (MDP) model and embodies the exploration preference of the RRT algorithm through the design of the environment-feedback reward function.
3. The method designs a new exploration mechanism based on the RRT algorithm that improves exploration efficiency while preserving the probabilistic completeness of the algorithm.
Drawings
Fig. 1 is a flow chart of the flight path planning method based on the deep Q network and the fast search random tree algorithm according to the present invention.
Fig. 2 is a schematic diagram of the complex-domain variable-step-size obstacle avoidance strategy.
Fig. 3 is a structure diagram of the BP neural network.
Fig. 4 is map a.
Fig. 5 is map b.
Fig. 6 is the optimal-action visualization for map a.
Fig. 7 is the optimal-action visualization for map b.
Fig. 8 shows the simulation example setup.
Fig. 9 compares algorithm performance in each example.
Fig. 10 shows the algorithm running time in example 1.
Fig. 11 shows the algorithm running time in example 2.
Fig. 12 shows the algorithm running time in example 3.
Detailed Description
The present invention will be described in further detail with reference to examples.
The invention relates to a flight path planning method based on a deep Q network and a fast search random tree algorithm; the specific steps, shown in Fig. 1, are as follows:
Step 1: Markov decision process modeling of the RRT algorithm
Reinforcement learning generally uses the Markov Decision Process (MDP) as its basic framework. In an MDP, the agent senses the current system state, selects and executes an action from the action space according to the optimal strategy, thereby changing the environment and its own state and receiving feedback (a reward) from the environment. To introduce a reinforcement learning algorithm, the Markov decision process in the RRT algorithm must first be abstracted. Because of the randomness of RRT growth, each expansion step can be regarded as a Markov process, and an MDP model of the RRT can be built. The elements of the MDP model in the RRT algorithm according to the present invention are defined as follows.
① State space
The area specified by the path planning task is called the "planning space" and can be described as:
X = {(x, y) | x_min ≤ x ≤ x_max, y_min ≤ y ≤ y_max}
where X is the planning space; x and y are the two-dimensional coordinates of the agent's position; x_min and y_min are the minimum values of the two coordinates; x_max and y_max are the maximum values of the two coordinates.
The planning space can be divided into "free space" and "obstacle space": the free space represents the area the agent can pass through, and the obstacle space represents the area the agent cannot pass through, so that:
X = X_free + X_obs
where X_free is the free space and X_obs is the obstacle space.
For convenience of calculation, a binary grid map is often used in path planning to discretely represent the planning space model: a grid cell with value 0 represents a free-space node and a grid cell with value 1 represents an obstacle-space node, that is:
map(x, y) = 0, if (x, y) ∈ X_free;  map(x, y) = 1, if (x, y) ∈ X_obs
where map(x, y) is the binary grid map.
On a binary grid map, free space can be defined as:
X_free = {(x, y) | map(x, y) = 0}
accordingly, the obstacle space may be defined as:
X_obs = {(x, y) | map(x, y) = 1}
In the present invention, the two-dimensional space composed of the coordinates of all points in the free space X_free is called the state space, represented as:
S = {(x, y) | map(x, y) = 0}
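To make the state-space definitions above concrete, the following is a minimal sketch (not part of the patent) of a binary grid map with a free-space membership test; the map size, layout, and function names are illustrative assumptions.

```python
import numpy as np

# Illustrative 10 x 10 binary grid map: 0 = free-space node, 1 = obstacle-space node.
# The layout is an assumption for demonstration, not the patent's map a or map b.
grid = np.zeros((10, 10), dtype=int)
grid[3:7, 4] = 1          # a short vertical wall of obstacle cells

def in_free_space(x, y, grid_map):
    """Return True if the integer cell (x, y) lies in X_free = {(x, y) | map(x, y) = 0}."""
    h, w = grid_map.shape
    if not (0 <= x < w and 0 <= y < h):
        return False       # outside the planning space X
    return grid_map[y, x] == 0

print(in_free_space(2, 2, grid))   # True: free-space node
print(in_free_space(4, 5, grid))   # False: obstacle-space node
```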
② Action space
To enable the RRT to avoid obstacles autonomously by adjusting its growth direction, the action in the random tree exploration process is designed as the growth angle of the branch, and a complex-domain variable-step-size obstacle avoidance strategy is introduced, in which the complex step is defined as:
d = d_0 e^(jθ)
where d_0 denotes the branch length and is a positive real constant; j is the imaginary unit; θ denotes the rotation angle of the new branch relative to the direction toward the target point P_goal, with value range (−π, π). As shown in Fig. 2, the action space designed by the present invention is a set of 5 complex steps, that is, the growth direction of each branch has five choices, with the following expression:
Figure BDA0003488677860000051
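The complex step d = d_0 e^(jθ) can be read as rotating, by the angle θ, the unit vector that points from P_near toward the target P_goal, then scaling it by the branch length d_0. The sketch below illustrates this; since the patent's expression for the five-action set is not reproduced above, the angle values are an assumption for illustration only.

```python
import numpy as np

d0 = 5.0                                                              # branch length (positive real constant)
ACTION_ANGLES = [0.0, np.pi / 4, -np.pi / 4, np.pi / 2, -np.pi / 2]   # assumed 5-action set

def expand(p_near, p_goal, theta, d0=d0):
    """Grow a new branch of length d0, rotated by theta from the direction toward P_goal."""
    p_near, p_goal = complex(*p_near), complex(*p_goal)
    heading = (p_goal - p_near) / abs(p_goal - p_near)   # unit vector toward the target point
    step = d0 * np.exp(1j * theta) * heading             # complex step d = d0 * e^(j*theta)
    p_new = p_near + step
    return (p_new.real, p_new.imag)

# Executing action a = d0 * e^(j0) grows the branch straight toward the goal:
print(expand((0.0, 0.0), (10.0, 0.0), 0.0))              # approximately (5.0, 0.0)
```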
③ Reward function
In the RRT algorithm, generally, the fewer the collisions with obstacles and the faster the approach to the target point, the higher the path planning efficiency. Therefore, the reward function is designed to measure whether the current state-action pair causes the random tree to encounter an obstacle and whether it moves closer to the target point. Meanwhile, the obstacle avoidance of the random tree needs a certain degree of foresight, that is, obstacle-avoidance actions should already be taken while P_near is still some distance from the obstacle. The reward function is thus designed as:
Figure BDA0003488677860000052
where c_1, c_2, c_3 and k are positive constants, and |arg(a)| denotes the argument of action a. Condition P is expressed as: from the current state s_t, take action a = d_0 e^(j0) to expand the random tree to the next state s_{t+1}; based on the result of the reinforcement learning training, the optimal action corresponding to state s_{t+1} is
Figure BDA0003488677860000053
Or
Figure BDA0003488677860000054
Step 2: Training the deep Q network
The purpose of using the DQN algorithm is to obtain a function that can evaluate the value of a state-action pair (s_t, a_t), called the target Q value and denoted Q(s_t, a_t). The invention uses a neural network as a function approximator to approximate Q(s_t, a_t) and minimizes the error by gradient descent; the approximating function is Q(s, a; w), where w are the fitting parameters.
The invention designs the neural network using the back propagation algorithm. The input is the current state s_t and the output is the Q value of each of the five actions in the action space for that state, so a 2-input, 5-output feedforward neural network is used; its structure is shown in Fig. 3.
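As a concrete illustration of the 2-input, 5-output feedforward structure, the sketch below defines such a network in PyTorch; the hidden-layer width and activation are assumptions, since Fig. 3 is not reproduced here.

```python
import torch.nn as nn

class QNetwork(nn.Module):
    """2 inputs (state coordinates x, y) -> 5 outputs (Q value of each action in the action space)."""
    def __init__(self, hidden=64):            # hidden width is an assumption
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 5),
        )

    def forward(self, state):
        return self.net(state)                # shape (..., 5): one Q value per action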
The main process of training is as follows (a code sketch is given after the list):
(1) Initialize the replay buffer D, the prediction network Q(s, a; w), and the target network Q(s, a; w^-), where w^- = w is a random weight;
(2) Initialize the state s_t;
(3) Use the prediction network to return the Q values of the five actions in the current state, and select and perform the action a_t with the maximum Q value;
(4) After performing the action, compute the reward R_{t+1} using the reward function, transition to the new state s_{t+1}, and update the corresponding Q value according to:
Q(s_t, a_t) = R_{t+1} + λ max_a Q(s_{t+1}, a; w^-)
wherein λ is a discount factor;
(5) Store the transition <s_t, a_t, R_{t+1}, s_{t+1}> in the replay buffer D;
(6) Randomly sample a batch of transitions from the replay buffer D and compute the loss function:
L(w) = [R_{t+1} + λ max_a Q(s_{t+1}, a; w^-) − Q(s_t, a_t; w)]^2
And updating w by using a gradient descent method;
(7) Repeat steps (3) to (6) C times, then set w^- = w;
(8) Repeat steps (3) to (7) until the final state is reached (that is, the random tree has extended to the end point).
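Steps (1) to (8) can be sketched as the loop below. It assumes the QNetwork class from the previous sketch, a hypothetical environment object `env` whose `reset()` returns a state and whose `step(action)` returns `(next_state, reward, done)` (implementing the reward function above), and illustrative hyperparameters; the `(1 - done)` terminal mask is a standard detail added here for correctness and is not stated in the patent.

```python
import random
from collections import deque

import torch

q_net = QNetwork()                                   # prediction network Q(s, a; w)
target_net = QNetwork()                              # target network Q(s, a; w^-)
target_net.load_state_dict(q_net.state_dict())       # (1) w^- = w
optimizer = torch.optim.SGD(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)                        # replay buffer D
lam, C, batch_size = 0.95, 100, 32                   # discount factor, sync period, batch size (assumed)

def train_episode(env):
    state = env.reset()                              # (2) initial state s_t
    done, step = False, 0
    while not done:
        with torch.no_grad():                        # (3) greedy action: maximum predicted Q value
            q_values = q_net(torch.tensor(state, dtype=torch.float32))
        action = int(q_values.argmax())
        next_state, reward, done = env.step(action)  # (4) reward R_{t+1} and new state s_{t+1}
        replay.append((state, action, reward, next_state, done))   # (5) store the transition
        if len(replay) >= batch_size:                # (6) sample a batch, one gradient step on L(w)
            s, a, r, s2, d = zip(*random.sample(replay, batch_size))
            s, s2 = (torch.tensor(v, dtype=torch.float32) for v in (s, s2))
            a = torch.tensor(a, dtype=torch.int64)
            r, d = (torch.tensor(v, dtype=torch.float32) for v in (r, d))
            with torch.no_grad():                    # target: R_{t+1} + lam * max_a Q(s_{t+1}, a; w^-)
                target = r + lam * (1.0 - d) * target_net(s2).max(dim=1).values
            pred = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
            loss = ((target - pred) ** 2).mean()     # squared TD error L(w)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        step += 1
        if step % C == 0:                            # (7) periodically sync the target network
            target_net.load_state_dict(q_net.state_dict())
        state = next_state
```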
Step 3: Path planning according to the deep Q network
According to the deep Q network, the optimal action a_opt corresponding to each state can be obtained:
a_opt = argmax_a Q(s, a; w)
The improved RRT path planning process that introduces the optimal action is as follows (a code sketch follows the list):
1) Add the starting point S to the random tree table X_tree;
2) Sample a tree node P_samp in the state space X; the sampling method is:
P_samp = P_goal, with probability p;  P_samp = P_rand, with probability 1 − p
where P_goal is the end point, P_rand is a random point in the state space, and p is a constant (0 < p < 1);
3) In the random tree table X_tree, find the tree node P_near nearest to the sampled node P_samp;
4) If P_samp ≠ P_goal, extend from the nearest tree node P_near toward the random node P_rand by the step length d_0 to obtain a new tree node P_new; otherwise, execute the optimal action a_opt of the current state to obtain the new tree node P_new;
5) Judge whether the new tree node P_new and the new branch P_near P_new lie in the free space X_free; if so, add P_new to the random tree table X_tree; if not, return to step 2);
6) Repeat steps 2) to 5) until the random tree extends to the end point.
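Putting steps 1) to 6) together, the sketch below shows one way the DQN-guided expansion loop could look. It assumes the `in_free_space` and `expand` helpers and the grid map from the earlier sketches, plus a hypothetical `optimal_action(state)` lookup that returns the angle a_opt learned by the deep Q network for that state; the goal-bias probability, step length, iteration limit, and collision-check granularity are illustrative.

```python
import random
import numpy as np

def branch_is_free(p_a, p_b, grid, n_checks=10):
    """Sample points along the branch P_near-P_new and test each against the grid map."""
    for t in np.linspace(0.0, 1.0, n_checks):
        x = p_a[0] + t * (p_b[0] - p_a[0])
        y = p_a[1] + t * (p_b[1] - p_a[1])
        if not in_free_space(int(x), int(y), grid):
            return False
    return True

def plan(start, goal, grid, optimal_action, p=0.3, d0=5.0, max_iter=5000):
    """DQN-guided RRT-GoalBias expansion loop (illustrative sketch, not the patent's code)."""
    tree = [start]                                   # 1) random tree table X_tree
    parents = {start: None}
    for _ in range(max_iter):
        # 2) goal-biased sampling: P_goal with probability p, otherwise a random point
        if random.random() < p:
            p_samp = goal
        else:
            p_samp = (random.uniform(0, grid.shape[1]), random.uniform(0, grid.shape[0]))
        # 3) nearest tree node P_near
        p_near = min(tree, key=lambda q: (q[0] - p_samp[0]) ** 2 + (q[1] - p_samp[1]) ** 2)
        # 4) straight step toward the sample, or the learned optimal action toward the goal
        if p_samp != goal:
            p_new = expand(p_near, p_samp, 0.0, d0)
        else:
            p_new = expand(p_near, goal, optimal_action(p_near), d0)
        # 5) keep P_new only if the new node and the new branch lie in free space
        if branch_is_free(p_near, p_new, grid):
            tree.append(p_new)
            parents[p_new] = p_near
            # 6) stop once the tree has extended to the end point
            if (p_new[0] - goal[0]) ** 2 + (p_new[1] - goal[1]) ** 2 <= d0 ** 2:
                return tree, parents
    return tree, parents
```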
To verify the effect of the algorithm, two 500 × 500 maps, labeled map a and map b, are selected, as shown in Figs. 4 and 5. The optimal action table is trained and extracted on the MATLAB platform, as shown in Figs. 6 and 7, and a path planning simulation experiment is performed according to this table. Three examples are set up in the simulation, as shown in Fig. 8; 1000 runs are performed for each example and the planning time of each run is recorded. The results are shown in Figs. 9 to 12. The simulation results show that the DQN-RRTGoalBias algorithm improves efficiency and time-performance stability and achieves the expected effect under different conditions.
By designing a complex-domain variable-step-size obstacle avoidance strategy, the deep-Q-network-based RRT-GoalBias path planning optimization algorithm combines the relatively mechanical RRT-GoalBias algorithm, which has poor obstacle avoidance capability, with a deep Q network, giving the algorithm the ability to flexibly avoid obstacles according to learned experience. Simulation shows that the optimized algorithm improves the efficiency and time-performance stability of the algorithm.

Claims (4)

1. A flight path planning method based on a deep Q network and a fast search random tree algorithm is characterized in that:
step 1: modeling a Markov decision process of the RRT algorithm;
step 2: training a deep Q network;
the purpose of using the DQN algorithm is to obtain a function that can evaluate the value of a state-action pair (s_t, a_t), called the target Q value and denoted Q(s_t, a_t); a neural network is used as a function approximator to approximate Q(s_t, a_t), the error is minimized by gradient descent, and the approximating function is Q(s, a; w), where w are the fitting parameters;
and step 3: planning a path according to the deep Q network;
according to the deep Q network, the optimal action a_opt corresponding to each state can be obtained:
a_opt = argmax_a Q(s, a; w)
The improved RRT path planning process introducing the optimal action is as follows:
1) adding the starting point S to the random tree table X_tree;
2) sampling a tree node P_samp in the state space X, the sampling method being:
P_samp = P_goal, with probability p;  P_samp = P_rand, with probability 1 − p
wherein P_goal is the end point, P_rand is a random point in the state space, and p is a constant (0 < p < 1);
3) in the random tree table X_tree, finding the tree node P_near nearest to the sampled node P_samp;
4) if P_samp ≠ P_goal, extending from the nearest tree node P_near toward the random node P_rand by the step length d_0 to obtain a new tree node P_new; otherwise, executing the optimal action a_opt of the current state to obtain the new tree node P_new;
5) judging whether the new tree node P_new and the new branch P_near P_new lie in the free space X_free; if so, adding P_new to the random tree table X_tree; if not, returning to step 2);
6) repeating steps 2) to 5) until the random tree extends to the end point.
2. The flight path planning method based on the deep Q network and the fast search random tree algorithm as claimed in claim 1, characterized in that:
the elements of the markov decision process model are defined as:
① state space
The area specified by the path planning task is called a planning space, and can be described as follows:
X = {(x, y) | x_min ≤ x ≤ x_max, y_min ≤ y ≤ y_max}
in the formula, X is the planning space; x and y are the two-dimensional coordinates of the agent's position; x_min and y_min are the minimum values of the two coordinates; x_max and y_max are the maximum values of the two coordinates;
the planning space can be divided into free space and obstacle space; the free space represents the region the agent can pass through, and the obstacle space represents the region the agent cannot pass through, so that:
X = X_free + X_obs
in the formula, X_free is the free space and X_obs is the obstacle space;
for convenience of calculation, a binary grid map is often used in path planning to discretely represent the planning space model; a grid cell with value 0 represents a free-space node and a grid cell with value 1 represents an obstacle-space node, that is:
map(x, y) = 0, if (x, y) ∈ X_free;  map(x, y) = 1, if (x, y) ∈ X_obs
in the formula, map(x, y) is the binary grid map;
on a binary grid map, free space can be defined as:
X_free = {(x, y) | map(x, y) = 0}
accordingly, the obstacle space may be defined as:
X_obs = {(x, y) | map(x, y) = 1}
the two-dimensional space composed of the coordinates of all points in the free space X_free is called the state space, represented as:
S = {(x, y) | map(x, y) = 0}
② action space
in order to enable the RRT to avoid obstacles autonomously by adjusting its growth direction, the action in the random tree exploration process is designed as the growth angle of the branch, and a complex-domain variable-step-size obstacle avoidance strategy is introduced, in which the complex step is defined as:
d = d_0 e^(jθ)
wherein d_0 denotes the branch length and is a positive real constant; j is the imaginary unit; θ denotes the rotation angle of the new branch relative to the direction toward the target point P_goal, with value range (−π, π);
③ reward function
The reward function is designed as:
Figure FDA0003488677850000031
wherein c_1, c_2, c_3 and k are positive constants, and |arg(a)| denotes the argument of action a; condition P is expressed as: from the current state s_t, taking action a = d_0 e^(j0) to expand the random tree to the next state s_{t+1}; based on the result of the reinforcement learning training, the optimal action corresponding to state s_{t+1} is
Figure FDA0003488677850000032
Or
Figure FDA0003488677850000033
3. The flight path planning method based on the deep Q network and the fast search random tree algorithm as claimed in claim 2, characterized in that: the action space is designed as a set of 5 complex steps, with the following expression:
Figure FDA0003488677850000034
4. The flight path planning method based on the deep Q network and the fast search random tree algorithm as claimed in claim 1, characterized in that: in step 2, a neural network is designed using the back propagation algorithm; the input is the current state s_t and the output is the Q value of each action in the action space for that state, so a 2-input, 5-output feedforward neural network is used; the deep Q network training method is as follows:
(1) initializing the replay buffer D, the prediction network Q(s, a; w) and the target network Q(s, a; w^-), wherein w^- = w is a random weight;
(2) initializing the state s_t;
(3) using the prediction network to return the Q values of all actions in the current state, and selecting and performing the action a_t with the maximum Q value;
(4) after performing the action, computing the reward R_{t+1} using the reward function, transitioning to the new state s_{t+1}, and updating the corresponding Q value according to:
Q(s_t, a_t) = R_{t+1} + λ max_a Q(s_{t+1}, a; w^-)
Wherein λ is a discounting factor;
(5) storing the transition <s_t, a_t, R_{t+1}, s_{t+1}> in the replay buffer D;
(6) randomly sampling a batch of transitions from the replay buffer D and computing the loss function:
L(w) = [R_{t+1} + λ max_a Q(s_{t+1}, a; w^-) − Q(s_t, a_t; w)]^2
updating w by using a gradient descent method;
(7) repeating steps (3) to (6) C times, then setting w^- = w;
(8) repeating steps (3) to (7) until the final state is reached.
CN202210089643.2A 2022-01-25 2022-01-25 Flight path planning method based on deep Q network and rapid search random tree algorithm Active CN114564039B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210089643.2A CN114564039B (en) 2022-01-25 2022-01-25 Flight path planning method based on deep Q network and rapid search random tree algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210089643.2A CN114564039B (en) 2022-01-25 2022-01-25 Flight path planning method based on deep Q network and rapid search random tree algorithm

Publications (2)

Publication Number Publication Date
CN114564039A true CN114564039A (en) 2022-05-31
CN114564039B CN114564039B (en) 2024-08-02

Family

ID=81713754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210089643.2A Active CN114564039B (en) 2022-01-25 2022-01-25 Flight path planning method based on deep Q network and rapid search random tree algorithm

Country Status (1)

Country Link
CN (1) CN114564039B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117970931A (en) * 2024-03-29 2024-05-03 青岛科技大学 Robot dynamic path planning method, equipment and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085437A (en) * 2017-03-20 2017-08-22 浙江工业大学 A kind of unmanned aerial vehicle flight path planing method based on EB RRT
CN111487992A (en) * 2020-04-22 2020-08-04 北京航空航天大学 Unmanned aerial vehicle sensing and obstacle avoidance integrated method and device based on deep reinforcement learning
CN111752306A (en) * 2020-08-14 2020-10-09 西北工业大学 Unmanned aerial vehicle route planning method based on fast-expanding random tree
CN112799420A (en) * 2021-01-08 2021-05-14 南京邮电大学 Real-time track planning method based on multi-sensor unmanned aerial vehicle
US20210165405A1 (en) * 2019-12-03 2021-06-03 University-Industry Cooperation Group Of Kyung Hee University Multiple unmanned aerial vehicles navigation optimization method and multiple unmanned aerial vehicles system using the same
CN113110592A (en) * 2021-04-23 2021-07-13 南京大学 Unmanned aerial vehicle obstacle avoidance and path planning method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085437A (en) * 2017-03-20 2017-08-22 浙江工业大学 A kind of unmanned aerial vehicle flight path planing method based on EB RRT
US20210165405A1 (en) * 2019-12-03 2021-06-03 University-Industry Cooperation Group Of Kyung Hee University Multiple unmanned aerial vehicles navigation optimization method and multiple unmanned aerial vehicles system using the same
CN111487992A (en) * 2020-04-22 2020-08-04 北京航空航天大学 Unmanned aerial vehicle sensing and obstacle avoidance integrated method and device based on deep reinforcement learning
CN111752306A (en) * 2020-08-14 2020-10-09 西北工业大学 Unmanned aerial vehicle route planning method based on fast-expanding random tree
CN112799420A (en) * 2021-01-08 2021-05-14 南京邮电大学 Real-time track planning method based on multi-sensor unmanned aerial vehicle
CN113110592A (en) * 2021-04-23 2021-07-13 南京大学 Unmanned aerial vehicle obstacle avoidance and path planning method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
吴健发; 王宏伦; 刘一恒; 姚鹏: "Review of UAV obstacle avoidance route planning methods" (无人机避障航路规划方法研究综述), 无人系统技术 (Unmanned Systems Technology), no. 01, 15 January 2020 (2020-01-15) *
潘广贞; 秦帆; 张文斌: "Research on a dynamic adaptive rapidly-exploring tree path planning algorithm" (动态自适应快速扩展树航迹规划算法研究), 微电子学与计算机 (Microelectronics & Computer), no. 01, 5 January 2013 (2013-01-05) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117970931A (en) * 2024-03-29 2024-05-03 青岛科技大学 Robot dynamic path planning method, equipment and medium
CN117970931B (en) * 2024-03-29 2024-07-05 青岛科技大学 Robot dynamic path planning method, equipment and medium

Also Published As

Publication number Publication date
CN114564039B (en) 2024-08-02

Similar Documents

Publication Publication Date Title
CN110083165B (en) Path planning method of robot in complex narrow environment
CN110347151B (en) Robot path planning method fused with Bezier optimization genetic algorithm
CN104155974B (en) Path planning method and apparatus for robot fast collision avoidance
CN110297490B (en) Self-reconstruction planning method of heterogeneous modular robot based on reinforcement learning algorithm
CN116242383B (en) Unmanned vehicle path planning method based on reinforced Harris eagle algorithm
CN110883776A (en) Robot path planning algorithm for improving DQN under quick search mechanism
CN108413963A (en) Bar-type machine people's paths planning method based on self study ant group algorithm
CN114489052B (en) Path planning method for improving RRT algorithm reconnection strategy
CN111159489B (en) Searching method
CN111191785A (en) Structure searching method based on expanded search space
CN114564039A (en) Flight path planning method based on deep Q network and fast search random tree algorithm
Chen et al. Intelligent warehouse robot path planning based on improved ant colony algorithm
CN115493597A (en) AUV path planning control method based on SAC algorithm
Tusi et al. Using ABC and RRT algorithms to improve mobile robot path planning with danger degree
CN117520956A (en) Two-stage automatic feature engineering method based on reinforcement learning and meta learning
Yu et al. AGV multi-objective path planning method based on improved cuckoo algorithm
Cui et al. Improved multi-objective artificial bee colony algorithm-based path planning for mobile robots
Huang et al. An Improved Q-Learning Algorithm for Path Planning
Qiu et al. Obstacle avoidance planning combining reinforcement learning and RRT* applied to underwater operations
Chen et al. Optimization of robot path planning based on improved BP algorithm
Ji et al. Research on Path Planning of Mobile Robot Based on Reinforcement Learning
Li et al. Mobile Robot Path Planning Algorithm Based on Improved RRT* FN
Zhang et al. Robot path planning based on shuffled frog leaping algorithm combined with genetic algorithm
CN117388643B (en) Method, system, equipment and storage medium for positioning fault section of active power distribution network
CN116718198B (en) Unmanned aerial vehicle cluster path planning method and system based on time sequence knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant