CN114564039A - Flight path planning method based on deep Q network and fast search random tree algorithm - Google Patents
Flight path planning method based on deep Q network and fast search random tree algorithm
- Publication number
- CN114564039A CN202210089643.2A
- Authority
- CN
- China
- Prior art keywords
- space
- network
- tree
- algorithm
- state
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/10—Simultaneous control of position or course in three dimensions
- G05D1/101—Simultaneous control of position or course in three dimensions specially adapted for aircraft
- G05D1/106—Change initiated in response to external conditions, e.g. avoidance of elevated terrain or of no-fly zones
Abstract
The invention discloses a flight path planning method based on a deep Q network and a fast search random tree algorithm. First, the Markov decision process in the RRT algorithm is abstracted: because RRT growth is random, each expansion step can be regarded as a Markov process, and an MDP model of the RRT can be built. Then a deep Q network is trained, and by studying the deep Q network the optimal action corresponding to each state can be obtained; improved RRT path planning is then performed after introducing the optimal action. The invention gives the RRT-GoalBias algorithm stronger obstacle avoidance capability and increases the chance of greedy expansion, thereby improving the efficiency and stability of the algorithm.
Description
Technical Field
The invention relates to the field of flight path planning, in particular to a flight path planning method combining a deep Q network and a fast search random tree algorithm.
Background
The flight path planning technology is one of the key links in the intelligentization of unmanned aerial vehicles, and it has developed rapidly, driven by computer, information and artificial intelligence technologies. The path planning algorithm is the core of flight path planning technology; its purpose is to find a path from a starting point to a target point in a model space that satisfies given constraint conditions and certain performance indexes (path length, time, energy consumption and the like) according to actual needs. The path therefore not only needs to satisfy the various platform constraints, but must also ensure that the agent does not collide with obstacles when moving along it. The Rapidly-exploring Random Tree (RRT) algorithm has the advantages of requiring no preprocessing of the state space and having a simple procedure, but it suffers from high randomness, redundant and inefficient search, and non-optimal path quality, which restrict its application. The RRT-GoalBias algorithm is a variant of the RRT algorithm that adds target guidance; it is simple, efficient and converges quickly, but it is still highly random and lacks obstacle avoidance capability, which reduces algorithm efficiency to a certain extent.
Disclosure of Invention
The invention provides an RRT-GoalBias path planning optimization algorithm based on a deep Q network (DQN-RRTGoalBias) to address the reduced efficiency and unstable operation caused by the high randomness and poor obstacle avoidance capability of the RRT-GoalBias algorithm, so as to improve algorithm efficiency and stability.
The invention relates to a flight path planning method based on a deep Q network and a fast search random tree algorithm, which comprises the following steps:
step 1: modeling the Markov decision process of the RRT algorithm.
And 2, step: a deep Q network is trained.
The purpose of using the DQN algorithm is to obtain a function that can evaluate the merit of a state-action pair (s_t, a_t); this function is called the target Q value and is denoted Q(s_t, a_t). A neural network is used as a function approximator to approximate Q(s_t, a_t), minimizing the error by gradient descent to obtain the approximating function Q(s, a; w), where w are the fitting parameters.
Step 3: planning the path according to the deep Q network.
According to the deep Q network, the optimal action a_opt corresponding to each state can be obtained as a_opt = argmax_a Q(s, a; w).
The improved RRT path planning process introducing the optimal action is as follows:
1) adding the starting point S to the random tree table X_tree;
2) sampling a tree node P_samp in the state space X; the sampling method (illustrated in the sketch after this list) is: P_samp = P_goal with probability p, and P_samp = P_rand with probability 1 − p, wherein P_goal is the end point, P_rand is a random point in the state space, and p is a constant (0 < p < 1);
3) in the random tree table X_tree, finding the tree node P_near nearest to the sampled node P_samp;
4) if P_samp ≠ P_goal, extending from the nearest tree node P_near toward the random node P_rand with a certain step length d_0 to obtain a new tree node P_new; otherwise, executing the optimal action a_opt of the current state to obtain the new tree node P_new;
5) judging whether the new tree node P_new and the new branch P_near P_new lie in the free space X_free; if so, adding P_new to the random tree table X_tree; if not, returning to step 2);
6) repeating steps 2) to 5) until the random tree extends to the end point.
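As a concrete illustration of the sampling rule in step 2), the following minimal Python sketch implements goal-biased sampling; the function name sample_node and the uniform choice over the free cells are assumptions made for illustration, not part of the claimed method.

```python
import random

def sample_node(p_goal, free_cells, p):
    """Goal-biased sampling: return P_goal with probability p,
    otherwise a uniformly random point P_rand of the free space."""
    if random.random() < p:
        return p_goal                      # greedy sample: expand toward the goal
    return random.choice(free_cells)       # P_rand: a random free-space point

# Usage sketch: free_cells is a list of (x, y) coordinates with map(x, y) == 0
# p_samp = sample_node(p_goal=(450, 450), free_cells=free_cells, p=0.3)
```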
The invention has the advantages that:
1. The flight path planning method based on the deep Q network and the fast search random tree algorithm gives the RRT-GoalBias algorithm stronger obstacle avoidance capability and increases the probability of greedy expansion, thereby improving the efficiency and stability of the algorithm.
2. The flight path planning method based on the deep Q network and the fast search random tree algorithm models the node exploration process of the RRT as a Markov Decision Process (MDP) model, and embodies the exploration preference of the RRT algorithm through the design of the reward function fed back by the environment.
3. The flight path planning method based on the deep Q network and the fast search random tree algorithm designs a new exploration mechanism based on the RRT algorithm, which improves exploration efficiency while taking the probabilistic completeness of the algorithm into account.
Drawings
FIG. 1 is a flow chart of a flight path planning method based on a deep Q network and a fast search random tree algorithm according to the present invention.
Fig. 2 is a schematic diagram of the complex-field variable step size obstacle avoidance strategy.
Fig. 3 is a structure diagram of the BP neural network.
Fig. 4 is a map a.
Fig. 5 is a map b.
Fig. 6 is a map a optimal action visualization diagram.
Fig. 7 is a map b optimal action visualization diagram.
Fig. 8 is a simulation example setup.
FIG. 9 is a comparison of algorithm performance under each calculation example.
FIG. 10 shows the algorithm running time in example 1.
FIG. 11 shows the algorithm running time in example 2.
FIG. 12 shows the algorithm running time in example 3.
Detailed Description
The present invention will be described in further detail with reference to examples.
The invention relates to a flight path planning method based on a deep Q network and a fast search random tree algorithm, which comprises the following specific steps as shown in figure 1:
step 1: markov decision process modeling for RRT algorithm
Reinforcement learning generally uses Markov Decision Process (MDP) as a basic framework, and in the simulation of MDP, an intelligent agent senses the current system state, selects and implements actions from an action space according to an optimal strategy, thereby changing the environment and the state of the intelligent agent and obtaining feedback (reward) of the environment. To introduce a reinforcement learning algorithm, the Markov Decision Process (MDP) in the RRT algorithm must first be abstracted. Due to the randomness of RRT growth, for each expansion process, the RRT can be regarded as a Markov process and MDP models of the RRT can be built. The following is a definition of each element of the MDP model in the RRT algorithm according to the present invention.
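To make this abstraction concrete, the following minimal Python sketch packages one RRT expansion step as an MDP transition (s_t, a_t, R_{t+1}, s_{t+1}); the class and field names are illustrative assumptions rather than part of the claimed method.

```python
from dataclasses import dataclass
from typing import Tuple

State = Tuple[float, float]   # a tree-node position (x, y) in the planning space

@dataclass
class Transition:
    """One RRT expansion step viewed as an MDP transition."""
    state: State        # s_t: the tree node being expanded
    action: complex     # a_t: a complex growth step d = d_0 * e^(j*theta)
    reward: float       # R_{t+1}: feedback computed by the reward function
    next_state: State   # s_{t+1}: the newly grown tree node

# During training, every expansion of the random tree yields one such Transition,
# and these transitions are what the deep Q network of step 2 learns from.
```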
① State space
The area specified by the path planning task is called the "planning space", which can be described as:
X = {(x, y) | x_min ≤ x ≤ x_max, y_min ≤ y ≤ y_max}
where X is the planning space; x and y are the two-dimensional coordinates of the agent's position; x_min and y_min are the minimum values of the two-dimensional coordinates; x_max and y_max are the maximum values of the two-dimensional coordinates.
The planning space can be divided into "free space" and "obstacle space"; the free space represents the area the agent can pass through, and the obstacle space represents the area the agent cannot pass through:
X = X_free + X_obs
where X_free is the free space and X_obs is the obstacle space.
For convenience of calculation, a binary grid map is often used in the path planning process to discretely represent the planning space model: a grid cell with value 0 represents a free-space node and a grid cell with value 1 represents an obstacle-space node, i.e.
map(x, y) = 0 if (x, y) ∈ X_free, and map(x, y) = 1 if (x, y) ∈ X_obs
where map(x, y) is the binary grid map.
On the binary grid map, the free space can be defined as:
X_free = {(x, y) | map(x, y) = 0}
Accordingly, the obstacle space can be defined as:
X_obs = {(x, y) | map(x, y) = 1}
In the present invention, the two-dimensional space composed of the coordinates of all points in the free space X_free is called the state space, represented as:
S = {(x, y) | map(x, y) = 0}
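The grid-map representation of the state space above can be illustrated with a short Python sketch; the use of a NumPy array and the 500 × 500 grid size (matching the simulation maps described later) are assumptions made only for illustration.

```python
import numpy as np

# Binary grid map: value 0 marks a free-space node, value 1 an obstacle-space node
grid = np.zeros((500, 500), dtype=np.uint8)
grid[100:200, 150:300] = 1                 # mark a rectangular obstacle region

def in_free_space(point, grid):
    """Return True if the point lies in X_free = {(x, y) | map(x, y) = 0}."""
    x, y = point
    return grid[int(x), int(y)] == 0

# The state space S is the set of coordinates of all free cells:
free_cells = list(zip(*np.nonzero(grid == 0)))
```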
② Action space
In order to enable the RRT to avoid obstacles autonomously by adjusting its growth direction, the action in the random tree exploration process is designed as the growth angle of a branch, and a complex-field variable step size obstacle avoidance strategy is introduced, in which the complex step size is defined as:
d = d_0 e^{jθ}
where d_0 is the branch length, a positive real constant; j is the imaginary unit; and θ is the rotation angle of the new branch relative to the direction toward the target point P_goal, with value range (−π, π). As shown in FIG. 2, the action space designed by the present invention is a set of 5 complex step sizes, that is, the growth direction of each new branch is chosen from five candidate rotation angles.
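A minimal Python sketch of this action space and of applying one complex-step action is given below; the five concrete rotation angles used here are an assumption, since the text fixes only the number of actions (five), not their values.

```python
import cmath
import math

D0 = 10.0   # branch length d_0 (a positive real constant; value assumed)

# Assumed set of five rotation angles theta, measured relative to the direction toward P_goal
THETAS = [0.0, math.pi / 6, -math.pi / 6, math.pi / 3, -math.pi / 3]
ACTIONS = [D0 * cmath.exp(1j * theta) for theta in THETAS]   # complex steps d = d_0 * e^(j*theta)

def grow(p_near, p_target, action):
    """Grow a new branch from p_near: rotate the unit vector toward p_target by arg(action)
    and scale it by |action|, i.e. apply the complex step d_0 * e^(j*theta)."""
    z_near, z_target = complex(*p_near), complex(*p_target)
    direction = (z_target - z_near) / abs(z_target - z_near)  # unit vector toward the target
    z_new = z_near + direction * action
    return (z_new.real, z_new.imag)
```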
③ Reward function
In the RRT algorithm, in general, the fewer the encounters with obstacles and the faster the approach to the target point, the higher the efficiency of path planning. The reward function is therefore designed to measure whether the current state-action pair causes the random tree to encounter an obstacle and whether it brings the tree closer to the target point. Meanwhile, the obstacle avoidance of the random tree needs a certain degree of foresight, that is, the obstacle avoidance action should already be taken while P_near is still some distance away from the obstacle. The reward function is designed accordingly, with c_1, c_2, c_3 and k positive constants and |arg(a)| the argument of action a. The condition P is expressed as: from the current state s_t, taking the action a = d_0 e^{j0} expands the random tree to the next state s_{t+1}, and, according to the result of the reinforcement learning algorithm training, the optimal action corresponding to state s_{t+1} is one of two designated actions.
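As an illustration only, the following Python sketch shows a reward with the ingredients described above — a collision penalty, a reward for approaching the target, a penalty on the turn angle |arg(a)|, and a bonus k when condition P holds; the functional form and all constants are assumptions, not the patented reward function.

```python
import cmath

def reward(collided, dist_before, dist_after, action, condition_p,
           c1=100.0, c2=1.0, c3=0.1, k=5.0):
    """Illustrative reward with the ingredients described in the text (assumed form)."""
    if collided:                               # the new branch enters the obstacle space
        return -c1
    r = c2 * (dist_before - dist_after)        # reward getting closer to P_goal
    r -= c3 * abs(cmath.phase(action))         # penalize a large rotation angle |arg(a)|
    if condition_p:                            # bonus when condition P is satisfied
        r += k
    return r
```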
Step 2: Training the deep Q network
The purpose of using the DQN algorithm is to obtain a function that can evaluate the merit of a "state-action pair" (s_t, a_t); this function is called the target Q value and is denoted Q(s_t, a_t). The invention uses a neural network as a function approximator to approximate Q(s_t, a_t), minimizing the error by gradient descent to obtain the approximating function Q(s, a; w), where w are the fitting parameters.
The invention designs the neural network with the back-propagation algorithm; the input is the current state s_t and the output is the Q value of each of the five actions of the action space in that state, so a 2-input, 5-output feedforward neural network is used, whose structure is shown in FIG. 3.
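A minimal sketch of such a 2-input, 5-output Q network, written in Python with PyTorch, is given below; the hidden-layer width and the use of PyTorch in place of the BP network of FIG. 3 are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Feedforward network: input the state (x, y), output the Q values of the five actions."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden),   # 2 inputs: the coordinates of state s_t
            nn.ReLU(),
            nn.Linear(hidden, 5),   # 5 outputs: Q(s_t, a) for each action of the action space
        )

    def forward(self, state):
        return self.net(state)

# q_values = QNetwork()(torch.tensor([[120.0, 340.0]]))   # tensor of shape (1, 5)
```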
The main process of training is as follows (a code sketch of this loop is given after these steps):
(1) Initialize the replay buffer D, the prediction network Q(s, a; w) and the target network Q(s, a; w^-), where w^- = w are random weights;
(2) Initialize the state s_t;
(3) Use the prediction network to return the Q values corresponding to the five actions in this state, and select the action a_t corresponding to the maximum Q value to perform;
(4) After performing the action, compute the reward R_{t+1} with the reward function, transition to the new state s_{t+1}, and update the corresponding Q value according to:
Q(s_t, a_t) = R_{t+1} + λ max_a Q(s_{t+1}, a; w^-)
where λ is the discount factor;
(5) Store the transition ⟨s_t, a_t, R_{t+1}, s_{t+1}⟩ in the replay buffer D;
(6) Randomly sample a batch of transitions from the replay buffer D, compute the loss function
L(w) = [R_{t+1} + λ max_a Q(s_{t+1}, a; w^-) − Q(s_t, a_t; w)]^2
and update w by the gradient descent method;
(7) Repeat steps (3) to (6) C times, then set w^- = w;
(8) Repeat steps (3) to (7) until the final state is reached (that is, until the random tree extends to the end point).
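The following self-contained Python/PyTorch sketch mirrors training steps (1) to (8); the environment object env (with reset and step methods) and all hyperparameter values (batch size, learning rate, C, λ) are illustrative assumptions.

```python
import random
from collections import deque
import torch
import torch.nn as nn

def train_dqn(env, q_net, target_net, episodes=500, gamma=0.9,
              batch_size=32, sync_every=100, lr=1e-3):
    """DQN training loop following steps (1)-(8): replay buffer, target network, TD loss."""
    target_net.load_state_dict(q_net.state_dict())        # (1) initialize w^- = w
    buffer = deque(maxlen=10_000)                         #     replay buffer D
    optimizer = torch.optim.SGD(q_net.parameters(), lr=lr)
    step = 0
    for _ in range(episodes):
        state = env.reset()                               # (2) initial state s_t
        done = False
        while not done:
            with torch.no_grad():
                q = q_net(torch.tensor([state], dtype=torch.float32))
            action = int(q.argmax())                      # (3) action a_t with maximum Q value
            next_state, r, done = env.step(action)        # (4) reward R_{t+1}, new state s_{t+1}
            buffer.append((state, action, r, next_state, done))   # (5) store the transition
            state = next_state
            if len(buffer) >= batch_size:                 # (6) sample a batch, TD loss, update w
                s, a, rew, s2, d = zip(*random.sample(buffer, batch_size))
                s = torch.tensor(s, dtype=torch.float32)
                a = torch.tensor(a).unsqueeze(1)
                rew = torch.tensor(rew, dtype=torch.float32)
                d = torch.tensor(d, dtype=torch.float32)
                s2 = torch.tensor(s2, dtype=torch.float32)
                with torch.no_grad():
                    target = rew + gamma * target_net(s2).max(dim=1).values * (1 - d)
                pred = q_net(s).gather(1, a).squeeze(1)
                loss = nn.functional.mse_loss(pred, target)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            step += 1
            if step % sync_every == 0:                    # (7) every C steps, set w^- = w
                target_net.load_state_dict(q_net.state_dict())
    # (8) the outer loops repeat until the final state (the tree reaches the end point)
```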
Step 3: Path planning according to the deep Q network
According to the deep Q network, the optimal action a_opt corresponding to each state can be obtained as a_opt = argmax_a Q(s, a; w).
The improved RRT path planning process introducing the optimal action is as follows (a code sketch of this loop is given after the list):
1) Add the starting point S to the random tree table X_tree;
2) Sample a tree node P_samp in the state space X. The sampling method is: P_samp = P_goal with probability p, and P_samp = P_rand with probability 1 − p, where P_goal is the end point, P_rand is a random point in the state space, and p is a constant (0 < p < 1);
3) In the random tree table X_tree, find the tree node P_near nearest to the sampled node P_samp;
4) If P_samp ≠ P_goal, extend from the nearest tree node P_near toward the random node P_rand with a certain step length d_0 to obtain a new tree node P_new; otherwise, execute the optimal action a_opt of the current state to obtain the new tree node P_new;
5) Judge whether the new tree node P_new and the new branch P_near P_new lie in the free space X_free. If so, add P_new to the random tree table X_tree; if not, return to step 2);
6) Repeat steps 2) to 5) until the random tree extends to the end point.
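Putting the pieces together, the following Python sketch shows one way steps 1) to 6) could be implemented on top of the earlier sketches (sample_node, grow, in_free_space and a trained QNetwork); the linear nearest-neighbour search, the point-only collision check and the goal tolerance tol are simplified assumptions.

```python
import math
import torch

def dqn_rrt_goalbias(start, goal, grid, free_cells, q_net, actions,
                     p=0.3, d0=10.0, tol=10.0):
    """Improved RRT-GoalBias: greedy (goal) samples execute the DQN's optimal action a_opt."""
    tree = [start]                                              # 1) X_tree starts with S
    while True:
        p_samp = sample_node(goal, free_cells, p)               # 2) goal-biased sampling
        p_near = min(tree, key=lambda q: math.dist(q, p_samp))  # 3) nearest tree node P_near
        if p_samp != goal:                                      # 4) ordinary extension, step d_0
            p_new = grow(p_near, p_samp, d0)
        else:                                                   #    greedy case: optimal action a_opt
            q_vals = q_net(torch.tensor([p_near], dtype=torch.float32))
            p_new = grow(p_near, goal, actions[int(q_vals.argmax())])
        if in_free_space(p_new, grid):                          # 5) keep P_new only if collision-free
            tree.append(p_new)
            if math.dist(p_new, goal) < tol:                    # 6) stop once the tree reaches the goal
                return tree
```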
To verify the effectiveness of the algorithm, two 500 × 500 maps are selected, denoted map a and map b, as shown in FIG. 4 and FIG. 5. The network is trained and the optimal action table is extracted on the MATLAB platform, as shown in FIG. 6 and FIG. 7, and path planning simulation experiments are performed according to the optimal action table. Three examples are set up in the simulation, as shown in FIG. 8; 1000 experiments are run for each example and the planning time of each experiment is recorded, with the results shown in FIG. 9 to FIG. 12. The simulation results show that the DQN-RRTGoalBias algorithm improves efficiency and the stability of time performance, and achieves the expected effect under the different example conditions.
The RRT-GoalBias path planning optimization algorithm based on the deep Q network combines the relatively mechanical RRT-GoalBias algorithm, which has poor obstacle avoidance capability, with the deep Q network algorithm through the design of the complex-field variable step size obstacle avoidance strategy, so that the resulting algorithm can flexibly avoid obstacles according to learned experience. Simulation shows that the optimized algorithm improves the efficiency and the stability of the time performance of the algorithm.
Claims (4)
1. A flight path planning method based on a deep Q network and a fast search random tree algorithm is characterized in that:
step 1: modeling a Markov decision process of the RRT algorithm;
step 2: training a deep Q network;
the purpose of using the DQN algorithm is to obtain a function that can evaluate the merit of a state-action pair (s_t, a_t), called the target Q value and denoted Q(s_t, a_t); a neural network is used as a function approximator to approximate Q(s_t, a_t) and minimize the error by gradient descent, yielding the approximating function Q(s, a; w), where w is a fitting parameter;
and step 3: planning a path according to the deep Q network;
according to the deep Q network, the optimal action a_opt corresponding to each state can be obtained;
the improved RRT path planning process introducing the optimal action is as follows:
1) adding the starting point S to the random tree table X_tree;
2) sampling a tree node P_samp in the state space X; the sampling method is: P_samp = P_goal with probability p, and P_samp = P_rand with probability 1 − p, wherein P_goal is the end point, P_rand is a random point in the state space, and p is a constant (0 < p < 1);
3) in the random tree table X_tree, finding the tree node P_near nearest to the sampled node P_samp;
4) if P_samp ≠ P_goal, extending from the nearest tree node P_near toward the random node P_rand with a certain step length d_0 to obtain a new tree node P_new; otherwise, executing the optimal action a_opt of the current state to obtain the new tree node P_new;
5) judging whether the new tree node P_new and the new branch P_near P_new lie in the free space X_free; if so, adding P_new to the random tree table X_tree; if not, returning to step 2);
6) repeating steps 2) to 5) until the random tree extends to the end point.
2. The flight path planning method based on the deep Q network and the fast search random tree algorithm as claimed in claim 1, characterized in that:
the elements of the Markov decision process model are defined as follows:
① state space:
the area specified by the path planning task is called the planning space, which can be described as:
X = {(x, y) | x_min ≤ x ≤ x_max, y_min ≤ y ≤ y_max}
wherein X is the planning space; x and y are the two-dimensional coordinates of the agent's position; x_min and y_min are the minimum values of the two-dimensional coordinates; x_max and y_max are the maximum values of the two-dimensional coordinates;
the planning space can be divided into free space and obstacle space, the free space representing the region the agent can pass through and the obstacle space representing the region the agent cannot pass through:
X = X_free + X_obs
wherein X_free is the free space and X_obs is the obstacle space;
for convenience of calculation, a binary grid map is often used in the path planning process to discretely represent the planning space model, a grid cell with value 0 representing a free-space node and a grid cell with value 1 representing an obstacle-space node, i.e.
map(x, y) = 0 if (x, y) ∈ X_free, and map(x, y) = 1 if (x, y) ∈ X_obs
wherein map(x, y) is the binary grid map;
on the binary grid map, the free space can be defined as:
X_free = {(x, y) | map(x, y) = 0}
accordingly, the obstacle space can be defined as:
X_obs = {(x, y) | map(x, y) = 1}
the two-dimensional space composed of the coordinates of all points in the free space X_free is called the state space, represented as:
S = {(x, y) | map(x, y) = 0}
② action space:
in order to enable the RRT to avoid obstacles autonomously by adjusting its growth direction, the action in the random tree exploration process is designed as the growth angle of a branch, and a complex-field variable step size obstacle avoidance strategy is introduced, wherein the complex step size is defined as:
d = d_0 e^{jθ}
wherein d_0 is the branch length, a positive real constant; θ is the rotation angle of the new branch relative to the direction toward the target point P_goal, with value range (−π, π);
③ reward function:
the reward function is designed accordingly, wherein c_1, c_2, c_3 and k are positive constants and |arg(a)| denotes the argument of action a; the condition P is expressed as: from the current state s_t, taking the action a = d_0 e^{j0} expands the random tree to the next state s_{t+1}, and, according to the result of the reinforcement learning algorithm training, the optimal action corresponding to state s_{t+1} is one of two designated actions.
4. The flight path planning method based on a deep Q network and a fast search random tree algorithm as claimed in claim 1, characterized in that: in step 2, the neural network is designed with the back-propagation algorithm, the input being the current state s_t and the output being the Q value of each action of the action space in that state, so a 2-input, 5-output feedforward neural network is used; the deep Q network training method is as follows:
(1) initializing the replay buffer D, the prediction network Q(s, a; w) and the target network Q(s, a; w^-), wherein w^- = w are random weights;
(2) initializing the state s_t;
(3) using the prediction network to return the Q values corresponding to all actions in this state, and selecting the action a_t corresponding to the maximum Q value to perform;
(4) after performing the action, computing the reward R_{t+1} with the reward function, transitioning to the new state s_{t+1}, and updating the corresponding Q value according to:
Q(s_t, a_t) = R_{t+1} + λ max_a Q(s_{t+1}, a; w^-)
wherein λ is the discount factor;
(5) storing the transition ⟨s_t, a_t, R_{t+1}, s_{t+1}⟩ in the replay buffer D;
(6) randomly sampling a batch of transitions from the replay buffer D, computing the loss function
L(w) = [R_{t+1} + λ max_a Q(s_{t+1}, a; w^-) − Q(s_t, a_t; w)]^2
and updating w by the gradient descent method;
(7) repeating steps (3) to (6) C times, then setting w^- = w;
(8) repeating steps (3) to (7) until the final state is reached.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210089643.2A CN114564039B (en) | 2022-01-25 | 2022-01-25 | Flight path planning method based on deep Q network and rapid search random tree algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210089643.2A CN114564039B (en) | 2022-01-25 | 2022-01-25 | Flight path planning method based on deep Q network and rapid search random tree algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114564039A true CN114564039A (en) | 2022-05-31 |
CN114564039B CN114564039B (en) | 2024-08-02 |
Family
ID=81713754
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210089643.2A Active CN114564039B (en) | 2022-01-25 | 2022-01-25 | Flight path planning method based on deep Q network and rapid search random tree algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114564039B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117970931A (en) * | 2024-03-29 | 2024-05-03 | 青岛科技大学 | Robot dynamic path planning method, equipment and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107085437A (en) * | 2017-03-20 | 2017-08-22 | 浙江工业大学 | A kind of unmanned aerial vehicle flight path planing method based on EB RRT |
CN111487992A (en) * | 2020-04-22 | 2020-08-04 | 北京航空航天大学 | Unmanned aerial vehicle sensing and obstacle avoidance integrated method and device based on deep reinforcement learning |
CN111752306A (en) * | 2020-08-14 | 2020-10-09 | 西北工业大学 | Unmanned aerial vehicle route planning method based on fast-expanding random tree |
CN112799420A (en) * | 2021-01-08 | 2021-05-14 | 南京邮电大学 | Real-time track planning method based on multi-sensor unmanned aerial vehicle |
US20210165405A1 (en) * | 2019-12-03 | 2021-06-03 | University-Industry Cooperation Group Of Kyung Hee University | Multiple unmanned aerial vehicles navigation optimization method and multiple unmanned aerial vehicles system using the same |
CN113110592A (en) * | 2021-04-23 | 2021-07-13 | 南京大学 | Unmanned aerial vehicle obstacle avoidance and path planning method |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107085437A (en) * | 2017-03-20 | 2017-08-22 | 浙江工业大学 | A kind of unmanned aerial vehicle flight path planing method based on EB RRT |
US20210165405A1 (en) * | 2019-12-03 | 2021-06-03 | University-Industry Cooperation Group Of Kyung Hee University | Multiple unmanned aerial vehicles navigation optimization method and multiple unmanned aerial vehicles system using the same |
CN111487992A (en) * | 2020-04-22 | 2020-08-04 | 北京航空航天大学 | Unmanned aerial vehicle sensing and obstacle avoidance integrated method and device based on deep reinforcement learning |
CN111752306A (en) * | 2020-08-14 | 2020-10-09 | 西北工业大学 | Unmanned aerial vehicle route planning method based on fast-expanding random tree |
CN112799420A (en) * | 2021-01-08 | 2021-05-14 | 南京邮电大学 | Real-time track planning method based on multi-sensor unmanned aerial vehicle |
CN113110592A (en) * | 2021-04-23 | 2021-07-13 | 南京大学 | Unmanned aerial vehicle obstacle avoidance and path planning method |
Non-Patent Citations (2)
Title |
---|
WU Jianfa; WANG Honglun; LIU Yiheng; YAO Peng: "A survey of obstacle avoidance route planning methods for unmanned aerial vehicles", Unmanned Systems Technology, no. 01, 15 January 2020 (2020-01-15) *
PAN Guangzhen; QIN Fan; ZHANG Wenbin: "Research on a dynamic adaptive rapidly-expanding tree flight path planning algorithm", Microelectronics & Computer, no. 01, 5 January 2013 (2013-01-05) *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117970931A (en) * | 2024-03-29 | 2024-05-03 | 青岛科技大学 | Robot dynamic path planning method, equipment and medium |
CN117970931B (en) * | 2024-03-29 | 2024-07-05 | 青岛科技大学 | Robot dynamic path planning method, equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN114564039B (en) | 2024-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110083165B (en) | Path planning method of robot in complex narrow environment | |
CN110347151B (en) | Robot path planning method fused with Bezier optimization genetic algorithm | |
CN104155974B (en) | Path planning method and apparatus for robot fast collision avoidance | |
CN110297490B (en) | Self-reconstruction planning method of heterogeneous modular robot based on reinforcement learning algorithm | |
CN116242383B (en) | Unmanned vehicle path planning method based on reinforced Harris eagle algorithm | |
CN110883776A (en) | Robot path planning algorithm for improving DQN under quick search mechanism | |
CN108413963A (en) | Bar-type machine people's paths planning method based on self study ant group algorithm | |
CN114489052B (en) | Path planning method for improving RRT algorithm reconnection strategy | |
CN111159489B (en) | Searching method | |
CN111191785A (en) | Structure searching method based on expanded search space | |
CN114564039A (en) | Flight path planning method based on deep Q network and fast search random tree algorithm | |
Chen et al. | Intelligent warehouse robot path planning based on improved ant colony algorithm | |
CN115493597A (en) | AUV path planning control method based on SAC algorithm | |
Tusi et al. | Using ABC and RRT algorithms to improve mobile robot path planning with danger degree | |
CN117520956A (en) | Two-stage automatic feature engineering method based on reinforcement learning and meta learning | |
Yu et al. | AGV multi-objective path planning method based on improved cuckoo algorithm | |
Cui et al. | Improved multi-objective artificial bee colony algorithm-based path planning for mobile robots | |
Huang et al. | An Improved Q-Learning Algorithm for Path Planning | |
Qiu et al. | Obstacle avoidance planning combining reinforcement learning and RRT* applied to underwater operations | |
Chen et al. | Optimization of robot path planning based on improved BP algorithm | |
Ji et al. | Research on Path Planning of Mobile Robot Based on Reinforcement Learning | |
Li et al. | Mobile Robot Path Planning Algorithm Based on Improved RRT* FN | |
Zhang et al. | Robot path planning based on shuffled frog leaping algorithm combined with genetic algorithm | |
CN117388643B (en) | Method, system, equipment and storage medium for positioning fault section of active power distribution network | |
CN116718198B (en) | Unmanned aerial vehicle cluster path planning method and system based on time sequence knowledge graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |