CN115494866A - Multi-unmanned aerial vehicle global and local path intelligent planning method and system - Google Patents

Publication number
CN115494866A
Authority
CN
China
Prior art keywords
unmanned aerial
aerial vehicle
obstacle
global
path planning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211160137.4A
Other languages
Chinese (zh)
Inventor
谢拥军
贾培
谭川
王学慧
刘莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Anqing Technology Co ltd
Original Assignee
Zhuhai Anqing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Anqing Technology Co ltd
Priority to CN202211160137.4A
Publication of CN115494866A

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05D: SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00: Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/10: Simultaneous control of position or course in three dimensions
    • G05D1/101: Simultaneous control of position or course in three dimensions specially adapted for aircraft
    • G05D1/104: Simultaneous control of position or course in three dimensions specially adapted for aircraft involving a plurality of aircrafts, e.g. formation flying

Abstract

The invention discloses a method and a system for intelligently planning global and local paths of multiple unmanned aerial vehicles. The method comprises: constructing a grid map, obtaining a global path planning model by using an improved bidirectional search A* algorithm, and planning the global path; acquiring obstacle motion state information in the environment through a millimeter wave radar, and acquiring flight state information of the unmanned aerial vehicle; judging whether a flight conflict exists, designing a reward function, and generating a corresponding obstacle avoidance strategy; acquiring a state space and an action space, performing obstacle avoidance training with a deep deterministic policy gradient (DDPG) algorithm, controlling the action of the unmanned aerial vehicle in a continuous action space to avoid obstacles, obtaining a local path planning model, and tracking the global navigation path; and having each unmanned aerial vehicle plan its path using the global path planning model and the local path planning model at the same time. The invention can improve the track planning and obstacle avoidance efficiency of a multi-unmanned aerial vehicle system, realize control of the continuous action output of multiple unmanned aerial vehicles in a dynamic, unknown and complex environment, and meet all-weather working conditions.

Description

Multi-unmanned aerial vehicle global and local path intelligent planning method and system
Technical Field
The invention relates to the technical field of unmanned aerial vehicles, in particular to a method and a system for intelligently planning global and local paths of multiple unmanned aerial vehicles.
Background
With the continuous progress of science and technology, unmanned aerial vehicles are widely used in military, agricultural, transportation, public management and other fields by virtue of their strong maneuverability, low cost and convenient operation. These applications generally involve multiple unmanned aerial vehicles completing tasks jointly to improve efficiency. Because static and dynamic obstacles exist in the air, the unmanned aerial vehicles need strong global path planning and local dynamic obstacle avoidance capabilities to complete their designated tasks, and research on global path planning and local dynamic obstacle avoidance for multiple unmanned aerial vehicles is one of the important technologies in the field.
Traditional global path planning algorithms mainly include the A* search algorithm, the particle swarm algorithm, the rapidly-exploring random tree (RRT) algorithm and the genetic algorithm. Local dynamic obstacle avoidance methods mainly include obstacle avoidance algorithms based on conductivity, on the velocity obstacle method and on the artificial potential field. However, these traditional obstacle avoidance methods have large limitations and are not suitable for complex, dynamic and unknown environments. One direction that has attracted attention in recent years is intelligent obstacle avoidance combining deep learning and reinforcement learning, but this approach suffers from difficult convergence, poor real-time performance and similar problems.
Therefore, how to provide an intelligent planning method and system for global and local paths of multiple unmanned aerial vehicles is a problem that needs to be solved urgently by those skilled in the art.
Disclosure of Invention
In view of this, the invention provides a method and a system for intelligently planning global and local paths of multiple unmanned aerial vehicles, which can improve the track planning and obstacle avoidance efficiency of a multiple unmanned aerial vehicle system, realize the control of continuous action output of the multiple unmanned aerial vehicles in a dynamic unknown complex environment, and meet all-weather working conditions.
In order to achieve the purpose, the invention adopts the following technical scheme:
A multi-unmanned aerial vehicle global and local path intelligent planning method comprises the following steps:
S1, constructing a grid map according to the starting point and the target point of each unmanned aerial vehicle navigation task, obtaining a global path planning model by using an improved bidirectional search A* algorithm, and performing global path planning to obtain a reference global navigation path for each unmanned aerial vehicle;
S2, acquiring obstacle motion state information in the surrounding environment of the unmanned aerial vehicle through a millimeter wave radar, and acquiring flight state information of the unmanned aerial vehicle;
S3, judging whether to plan a local obstacle avoidance path according to the relative position relation between the unmanned aerial vehicle and the obstacle;
S4, judging whether the unmanned aerial vehicle and the obstacle generate a flight conflict according to the obstacle motion state information and the unmanned aerial vehicle flight state information, designing a reward function according to the flight conflict, and generating a corresponding local obstacle avoidance strategy;
S5, acquiring a state space and an action space, performing obstacle avoidance training with a deep deterministic policy gradient (DDPG) algorithm, controlling the action of the unmanned aerial vehicle in a continuous action space to avoid obstacles, obtaining a local path planning model, and tracking the global navigation path;
S6, using the global path planning model and the local path planning model at the same time to plan paths for each unmanned aerial vehicle in the unmanned aerial vehicle cluster.
Preferably, the improved bidirectional search A* algorithm in S1 includes:
S11, optimizing the path planning;
specifically: searching for the shortest route to the destination by adjusting the heuristic function of the A* algorithm;
S12, smoothing the flight path;
specifically: smoothing the navigation path with a quasi-uniform B-spline curve, wherein the quasi-uniform B-spline curve uses piecewise continuous polynomials.
Preferably, the obstacle motion state information includes a distance between the obstacle and the unmanned aerial vehicle, and position information and motion speed information of the obstacle relative to the unmanned aerial vehicle; the flight state information of the unmanned aerial vehicle comprises unmanned aerial vehicle position information and flight speed information.
Preferably, when the speed direction of the unmanned aerial vehicle relative to the obstacle and the position direction of the obstacle relative to the unmanned aerial vehicle are detected to be in the same quadrant, local path planning is carried out;
Preferably, the specific method for judging whether the unmanned aerial vehicle and the obstacle generate a flight conflict in S4 is as follows:
simplifying the unmanned aerial vehicle into a particle $A$ and regarding the obstacle as an obstacle circle $O$ with a safe radius, wherein the distance between the two points $A$ and $O$ is $d_i$, the line segment $OA$ lies on the straight line $l$, and the relative velocity vector of the unmanned aerial vehicle and the obstacle is $v_{u_i o_i}$:
$v_{u_i o_i} = v_{u_i} - v_{o_i}$
wherein $v_{u_i}$ is the velocity vector of the unmanned aerial vehicle and $v_{o_i}$ is the velocity vector of the obstacle;
obtaining the tangent lines $l_1$ and $l_2$ that pass through the point $A$ and are tangent to the obstacle circle, wherein the included angle between the straight line $l$ and the tangent line $l_1$ or $l_2$ is $\alpha_i$, the included angle between the relative velocity vector $v_{u_i o_i}$ and the straight line $l$ is $\beta_i$, and the radius of the obstacle circle is $R_i$; when $\beta_i \le \alpha_i$, a flight conflict exists between the unmanned aerial vehicle and the obstacle; otherwise, when $\beta_i > \alpha_i$, there is no flight conflict.
Preferably, the state space after the interaction between the unmanned aerial vehicle and the environment in S5 is:
[equation image: state space of the i-th unmanned aerial vehicle]
wherein $u_i$ is the real-time position of the i-th unmanned aerial vehicle; $v_{u_i}$ is the real-time velocity vector of the i-th unmanned aerial vehicle; [symbol image: heading angle] is the real-time heading angle of the i-th unmanned aerial vehicle; $\gamma_i$ is the real-time pitch angle of the unmanned aerial vehicle; $o$ is the real-time position vector of the obstacle detected by the millimeter wave radar; $v_o$ is the real-time velocity vector of the obstacle detected by the millimeter wave radar; $d$ is the real-time distance vector between the obstacle and the millimeter wave radar; $p$ is the position of the real-time sub-target point on the global path; and $g$ is the position of the global path end point.
Preferably, the action space after the interaction between the unmanned aerial vehicle and the environment in S5 is:
[equation image: action space of the i-th unmanned aerial vehicle]
wherein $v_{u_i}$ is the real-time velocity vector of the unmanned aerial vehicle, [symbol image: heading angle] is the real-time heading angle of the unmanned aerial vehicle, and $\gamma_i$ is the real-time pitch angle of the unmanned aerial vehicle;
the variation ranges of $v_{u_i}$, the heading angle and $\gamma_i$ are:
$v_{u_i} \in [v_{\min}, v_{\max}] \cap [v - v_a \Delta t,\ v + v_b \Delta t]$
[equation image: variation range of the heading angle]
$\gamma_i \in [0, \pi] \cap [\gamma - \gamma_a \Delta t,\ \gamma + \gamma_b \Delta t]$
wherein $v_{\min}$ and $v_{\max}$ are the minimum and maximum speed of the unmanned aerial vehicle; $v_a$ and $v_b$ are the maximum deceleration and the maximum acceleration of the unmanned aerial vehicle in the direction of travel; [symbol images: heading-angle limits] are the maximum deceleration and the maximum acceleration of the unmanned aerial vehicle at the heading angle; and $\gamma_a$ and $\gamma_b$ are the maximum deceleration and the maximum acceleration of the unmanned aerial vehicle at the pitch angle.
Preferably, the reward function is:
$R(s, a) = R_1(t) + R_2(t) + R_3(t) + R_4(t)$
wherein $R_1(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the obstacle: within the safe distance, the closer the unmanned aerial vehicle is to the obstacle, the smaller the reward; outside the safe distance, the reward is a positive constant;
$R_2(t)$ is the reward generated by the speed change when the unmanned aerial vehicle detects that the obstacle threatens its flight: when a flight conflict exists between the unmanned aerial vehicle and the obstacle, a negative reward is obtained; otherwise, when no flight conflict exists, a positive reward is obtained;
$R_3(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the navigation task target point: the closer the distance, the greater the reward;
$R_4(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the temporary sub-target point: the closer the distance, the greater the reward.
Preferably, the obstacle avoidance training with the deep deterministic policy gradient (DDPG) algorithm in S5 specifically includes:
the algorithm comprises a policy-based policy network, a target policy network, a value-based value network and a target value network;
the unmanned aerial vehicle obtains the action $a_t$ for the current state $s_t$ through the policy network and interacts with the environment to obtain the next state $s_{t+1}$, wherein the values of the state and the action satisfy the constraints of the state space and the action space, and the current reward $r_t$ and the reward $r_{t+1}$ in the next state are calculated through the reward function;
the four-tuple $(s_t, a_t, r_{t+1}, s_{t+1})$ is saved in the experience pool, and when the number of samples meets the condition for starting training, $N$ samples are randomly selected from the experience pool to train the networks and update the parameters.
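The experience pool step can be illustrated with a minimal sketch. The class name `ReplayBuffer`, the `capacity` and `min_size` parameters and the use of a bounded queue are assumptions made for illustration only; the source states only that four-tuples are stored and that N samples are drawn at random once a start-of-training condition is met.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical experience pool holding (s_t, a_t, r_{t+1}, s_{t+1}) tuples."""

    def __init__(self, capacity, min_size):
        # Assumption: the oldest experience is discarded once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        # Assumption: training starts once at least min_size tuples are stored.
        self.min_size = min_size

    def store(self, state, action, next_reward, next_state):
        self.buffer.append((state, action, next_reward, next_state))

    def ready(self):
        return len(self.buffer) >= self.min_size

    def sample(self, n):
        # Uniform random selection of n stored tuples, without replacement.
        return random.sample(list(self.buffer), n)


pool = ReplayBuffer(capacity=1000, min_size=8)
for step in range(10):
    pool.store(state=step, action=0.1 * step, next_reward=-1.0, next_state=step + 1)
if pool.ready():
    batch = pool.sample(4)  # these tuples would feed one network update
```

The network update itself (policy, value and target networks) is left out here, since it depends on a deep learning framework that the source does not name.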
A multi-unmanned aerial vehicle global and local path intelligent planning system comprises a global path planning module, an acquisition module, a local path planning module and a training module;
the global path planning module is used for constructing a grid map according to the starting point and the target point of each unmanned aerial vehicle navigation task, obtaining a global path planning model by using an improved bidirectional search A* algorithm, planning a global path, obtaining a reference global navigation path of each unmanned aerial vehicle, and uniformly selecting a plurality of sub-target points on the reference global navigation path;
the acquisition module is used for acquiring barrier motion state information in the surrounding environment of the unmanned aerial vehicle through a millimeter wave radar and acquiring flight state information of the unmanned aerial vehicle, wherein the barrier motion state information comprises the distance between a barrier and the unmanned aerial vehicle, and position information and motion speed information of the barrier relative to the unmanned aerial vehicle; the flight state information of the unmanned aerial vehicle comprises unmanned aerial vehicle position information and flight speed information;
the local path planning module is used for judging whether to plan a local obstacle avoidance path according to the relative position relationship between the unmanned aerial vehicle and an obstacle, judging whether the unmanned aerial vehicle and the obstacle generate flight conflict according to the obstacle motion state information and the unmanned aerial vehicle flight state information, realizing control over the action of the unmanned aerial vehicle on a continuous action space by using an obstacle avoidance strategy of the local path planning model, avoiding the obstacle, and tracking the unmanned aerial vehicle to a global navigation path;
the training module is used for designing a reward function, acquiring a state space and an action space, performing obstacle avoidance training with a deep deterministic policy gradient (DDPG) algorithm, realizing control of the action of the unmanned aerial vehicle in a continuous action space, and obtaining a local path planning model;
the global path planning model and the local path planning model are used for planning paths of all unmanned planes in the unmanned plane cluster.
Compared with the prior art, the method and the system for intelligently planning the global and local paths of the multiple unmanned aerial vehicles have the advantages that:
(1) An intelligent global and local path planning method is provided for multiple unmanned aerial vehicles in environments with multiple types of obstacles; compared with global-only or local-only path planning it is more flexible, and each unmanned aerial vehicle in the unmanned aerial vehicle cluster has both global path planning and local dynamic obstacle avoidance capabilities;
(2) The invention controls the continuous action output of the multi-unmanned aerial vehicle system through the deep deterministic policy gradient (DDPG) algorithm, and is applicable to dynamic, unknown and complex environments, with strong real-time performance and high efficiency;
(3) The invention considers both the flight state information of the unmanned aerial vehicle and the motion state information of the obstacle, judges whether to take obstacle avoidance measures according to the threat degree of the obstacle, and designs the reward function accordingly, thereby improving the path planning and obstacle avoidance efficiency of the multi-unmanned aerial vehicle system;
(4) The invention uses millimeter wave radar to collect environmental data around each unmanned aerial vehicle and obtain obstacle information; compared with lidar, its detection range is wider and its ability to penetrate fog, smoke, dust and the like is strong, so it can work in all weather conditions, adapts well to the environment and has a wider range of application.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart of the intelligent planning method for global and local paths of multiple unmanned aerial vehicles according to the present invention;
Fig. 2 is a schematic diagram of the intelligent planning method for global and local paths of multiple unmanned aerial vehicles according to the present invention;
Fig. 3 is a schematic diagram of the division between the flight conflict range and the non-flight conflict range of the unmanned aerial vehicle relative to an obstacle according to the present invention;
Fig. 4 is a flowchart of obstacle avoidance training with the deep deterministic policy gradient (DDPG) algorithm according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention discloses a multi-unmanned aerial vehicle global and local path intelligent planning method which, as shown in Figs. 1 and 2, comprises the following steps:
S1, constructing a grid map according to the starting point and the target point of each unmanned aerial vehicle navigation task, obtaining a global path planning model by using an improved bidirectional search A* algorithm, and performing global path planning to obtain a reference global navigation path for each unmanned aerial vehicle;
S2, acquiring obstacle motion state information in the surrounding environment of the unmanned aerial vehicle through a millimeter wave radar, and acquiring flight state information of the unmanned aerial vehicle;
S3, judging whether to plan a local obstacle avoidance path according to the relative position relation between the unmanned aerial vehicle and the obstacle;
S4, judging whether the unmanned aerial vehicle and the obstacle generate a flight conflict according to the obstacle motion state information and the unmanned aerial vehicle flight state information, designing a reward function according to the flight conflict, and generating a corresponding local obstacle avoidance strategy;
S5, acquiring a state space and an action space, performing obstacle avoidance training with a deep deterministic policy gradient (DDPG) algorithm, controlling the action of the unmanned aerial vehicle in a continuous action space to avoid obstacles, obtaining a local path planning model, and tracking the global navigation path;
S6, using the global path planning model and the local path planning model at the same time to plan paths for each unmanned aerial vehicle in the unmanned aerial vehicle cluster.
In order to further implement the above technical solution, the improved bidirectional search A* algorithm in S1 includes:
S11, optimizing the path planning;
specifically: by adjusting the heuristic function of the A* algorithm, the search favors paths that reach the destination quickly at the start of the search and favors the shortest path to the destination toward the end of the search;
in this embodiment, the improved heuristic function is:
$F(n) = G(n) + \omega \cdot H(n)$
wherein $n$ is the current node, $F(n)$ is the estimated value from the starting point to the target point, $G(n)$ is the estimated value from the starting point to the current node, $H(n)$ is the estimated value from the current node to the target point, and $\omega$ is a dynamic coefficient whose value decreases as the diagonal distance from the current node to the target point decreases;
S12, smoothing the flight path;
specifically: smoothing the navigation path with a quasi-uniform B-spline curve, wherein the quasi-uniform B-spline curve uses piecewise continuous polynomials;
the whole curve is described by a single expression but is defined segment by segment internally, so that at the same order it gives a better optimization effect than a Bezier curve and makes local modification of the planned path easier.
In one embodiment, each curve segment is determined by four consecutive track control points, and the navigation path is smoothed accordingly.
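A minimal sketch of quasi-uniform B-spline smoothing is given here for illustration, assuming a cubic curve (degree 3, so that each segment depends on four consecutive control points) and a clamped knot vector on [0, 1]. The function names `clamped_knots`, `basis` and `smooth_path` are hypothetical, and the basis functions are evaluated with the standard Cox-de Boor recursion rather than any formulation taken from the source.

```python
def clamped_knots(n_ctrl, degree=3):
    """Quasi-uniform knot vector: end knots repeated degree+1 times, interior knots uniform."""
    n_inner = n_ctrl - degree - 1
    inner = [i / (n_inner + 1) for i in range(1, n_inner + 1)]
    return [0.0] * (degree + 1) + inner + [1.0] * (degree + 1)


def basis(i, k, t, knots):
    """Cox-de Boor recursion for the i-th B-spline basis function of degree k."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    span_left = knots[i + k] - knots[i]
    if span_left > 0:
        left = (t - knots[i]) / span_left * basis(i, k - 1, t, knots)
    span_right = knots[i + k + 1] - knots[i + 1]
    if span_right > 0:
        right = (knots[i + k + 1] - t) / span_right * basis(i + 1, k - 1, t, knots)
    return left + right


def smooth_path(ctrl, samples=50, degree=3):
    """Sample a clamped B-spline through the control polygon `ctrl` (list of tuples)."""
    n = len(ctrl)
    if n < degree + 1:
        raise ValueError("need at least degree + 1 control points")
    knots = clamped_knots(n, degree)
    dims = len(ctrl[0])
    points = []
    for s in range(samples):
        t = s / samples  # parameter in [0, 1)
        weights = [basis(i, degree, t, knots) for i in range(n)]
        points.append(tuple(sum(w * p[d] for w, p in zip(weights, ctrl)) for d in range(dims)))
    points.append(tuple(ctrl[-1]))  # a clamped curve ends at the last control point
    return points


waypoints = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
smoothed = smooth_path(waypoints, samples=20)
```

Because each basis function is non-zero over only a few knot spans, moving one control point changes the curve only locally, which is the property the text relies on when it says local modification of the path is easier.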
In this embodiment, the improved bidirectional search A* algorithm can greatly shorten the path search time and improve the search efficiency.
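The dynamically weighted evaluation F(n) = G(n) + ω·H(n) can be sketched on a small grid as follows. This is a simplified illustration rather than the patented procedure: the search is single-direction instead of bidirectional, the grid is two-dimensional and 8-connected, diagonal moves may cut obstacle corners, and the weight schedule in `dynamic_weight` (ω falling from 2 toward 1 as the diagonal distance shrinks) is an assumption, since the source only says that ω decreases with the diagonal distance to the target.

```python
import heapq
import math


def diagonal_distance(a, b):
    """Diagonal (octile) distance between two grid cells."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return (dx + dy) + (math.sqrt(2) - 2) * min(dx, dy)


def dynamic_weight(h, h_start):
    """Hypothetical schedule: omega shrinks toward 1 as the node nears the target."""
    return 1.0 + (h / h_start if h_start > 0 else 0.0)


def weighted_astar(grid, start, goal):
    """Search a 0/1 occupancy grid (1 = obstacle); returns a list of cells or None."""
    rows, cols = len(grid), len(grid[0])
    h0 = diagonal_distance(start, goal)
    g_cost = {start: 0.0}
    parent = {start: None}
    open_heap = [(dynamic_weight(h0, h0) * h0, start)]
    closed = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if node in closed:
            continue
        closed.add(node)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nxt = (node[0] + dx, node[1] + dy)
                if (dx == 0 and dy == 0) or nxt in closed:
                    continue
                if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                    continue
                if grid[nxt[0]][nxt[1]]:
                    continue
                cost = g_cost[node] + math.hypot(dx, dy)
                if cost < g_cost.get(nxt, math.inf):
                    g_cost[nxt] = cost
                    parent[nxt] = node
                    h = diagonal_distance(nxt, goal)
                    # F(n) = G(n) + omega * H(n)
                    heapq.heappush(open_heap, (cost + dynamic_weight(h, h0) * h, nxt))
    return None


grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
route = weighted_astar(grid, (0, 0), (4, 4))
```

With ω greater than 1 the heuristic is inflated, so the returned path is not guaranteed to be the shortest; the trade-off is fewer expanded nodes early in the search.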
In order to further implement the technical scheme, the obstacle motion state information comprises the distance between the obstacle and the unmanned aerial vehicle, and position information and motion speed information of the obstacle relative to the unmanned aerial vehicle; the flight state information of the unmanned aerial vehicle comprises unmanned aerial vehicle position information and flight speed information.
In this embodiment, millimeter wave radars are arranged on the left, right, front and rear sides of the unmanned aerial vehicle to collect information on the obstacles around it. The millimeter wave radar penetrates mist, smoke, dust and the like well and has higher ranging accuracy and a longer ranging range, which provides a strong guarantee for the safe flight of the multi-unmanned aerial vehicle system under complex climate and environmental conditions. The flight state information of the unmanned aerial vehicle is obtained through monitoring by the ground station.
In order to further implement the technical scheme, when the speed direction of the unmanned aerial vehicle relative to the obstacle and the position direction of the obstacle relative to the unmanned aerial vehicle are detected to be in the same quadrant, local path planning is carried out.
Specifically:
1. When the obstacle is at the right front of the unmanned aerial vehicle, in the track coordinate system, the unmanned aerial vehicle immediately avoids the obstacle when the following three conditions are met: a. the projection onto the xoy plane of the velocity direction of the unmanned aerial vehicle relative to the obstacle lies in the first quadrant; b. the obstacle is close to the flying height of the unmanned aerial vehicle; c. within a preset time period, the obstacle will threaten the flight of the unmanned aerial vehicle. Otherwise, the unmanned aerial vehicle continues along its original flight path.
2. When the obstacle is at the left front of the unmanned aerial vehicle, in the track coordinate system, the unmanned aerial vehicle immediately avoids the obstacle when the following three conditions are met: a. the projection onto the xoy plane of the velocity direction of the unmanned aerial vehicle relative to the obstacle lies in the second quadrant; b. the obstacle is close to the flying height of the unmanned aerial vehicle; c. within a preset time period, the obstacle will threaten the flight of the unmanned aerial vehicle. Otherwise, the unmanned aerial vehicle continues along its original flight path.
3. When the obstacle is at the left rear of the unmanned aerial vehicle, in the track coordinate system, the unmanned aerial vehicle immediately avoids the obstacle when the following three conditions are met: a. the projection onto the xoy plane of the velocity direction of the unmanned aerial vehicle relative to the obstacle lies in the third quadrant; b. the obstacle is close to the flying height of the unmanned aerial vehicle; c. within a preset time period, the obstacle will threaten the flight of the unmanned aerial vehicle. Otherwise, the unmanned aerial vehicle continues along its original flight path.
4. When the obstacle is at the right rear of the unmanned aerial vehicle, in the track coordinate system, the unmanned aerial vehicle immediately avoids the obstacle when the following three conditions are met: a. the projection onto the xoy plane of the velocity direction of the unmanned aerial vehicle relative to the obstacle lies in the fourth quadrant; b. the obstacle is close to the flying height of the unmanned aerial vehicle; c. within a preset time period, the obstacle will threaten the flight of the unmanned aerial vehicle. Otherwise, the unmanned aerial vehicle continues along its original flight path.
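The four cases above share one pattern, which can be sketched as a single check. Everything quantitative here is an assumption for illustration: condition b is modelled as an altitude gap no larger than a hypothetical `altitude_margin`, and condition c as the predicted separation falling below a hypothetical `safe_dist` within the `horizon` time window under constant velocities. The source does not specify how either condition is computed.

```python
import math


def same_quadrant(rel_vel_xy, rel_pos_xy):
    """Condition a: relative velocity and relative position share a quadrant in the xoy plane."""
    return all(v * p > 0 for v, p in zip(rel_vel_xy, rel_pos_xy))


def threatens_within(rel_pos, rel_vel, horizon, safe_dist):
    """Condition c (assumed form): closest predicted separation within `horizon` is below `safe_dist`."""
    speed_sq = sum(v * v for v in rel_vel)
    t_star = 0.0 if speed_sq == 0 else sum(p * v for p, v in zip(rel_pos, rel_vel)) / speed_sq
    t = min(max(t_star, 0.0), horizon)  # clamp the closest-approach time to the window
    gap = math.sqrt(sum((p - v * t) ** 2 for p, v in zip(rel_pos, rel_vel)))
    return gap < safe_dist


def needs_local_avoidance(uav_pos, uav_vel, obs_pos, obs_vel,
                          altitude_margin, horizon, safe_dist):
    # Position of the obstacle relative to the UAV, velocity of the UAV relative to the obstacle.
    rel_pos = tuple(o - u for o, u in zip(obs_pos, uav_pos))
    rel_vel = tuple(u - o for u, o in zip(uav_vel, obs_vel))
    cond_a = same_quadrant(rel_vel[:2], rel_pos[:2])
    cond_b = abs(rel_pos[2]) <= altitude_margin  # assumed reading of "close to the flying height"
    cond_c = threatens_within(rel_pos, rel_vel, horizon, safe_dist)
    return cond_a and cond_b and cond_c


trigger = needs_local_avoidance(
    uav_pos=(0.0, 0.0, 10.0), uav_vel=(5.0, 5.0, 0.0),
    obs_pos=(20.0, 20.0, 10.0), obs_vel=(0.0, 0.0, 0.0),
    altitude_margin=2.0, horizon=5.0, safe_dist=3.0,
)
```

When the check returns false, the unmanned aerial vehicle would keep following its reference global path, as the text describes.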
In order to further implement the above technical solution, as shown in Fig. 3, the specific method for judging whether the unmanned aerial vehicle and the obstacle generate a flight conflict in S4 is as follows:
simplifying the unmanned aerial vehicle into a particle $A$ and regarding the obstacle as an obstacle circle $\odot O$ with a safe radius, wherein the distance between the two points $A$ and $O$ is $d_i$, the line segment $OA$ lies on the straight line $l$, and the relative velocity vector of the unmanned aerial vehicle and the obstacle is $v_{u_i o_i}$:
$v_{u_i o_i} = v_{u_i} - v_{o_i}$
wherein $v_{u_i}$ is the velocity vector of the unmanned aerial vehicle and $v_{o_i}$ is the velocity vector of the obstacle;
obtaining the tangent lines $l_1$ and $l_2$ that pass through the point $A$ and are tangent to the obstacle circle, wherein the included angle between the straight line $l$ and the tangent line $l_1$ or $l_2$ is $\alpha_i$, the included angle between the relative velocity vector $v_{u_i o_i}$ and the straight line $l$ is $\beta_i$, and the radius of the obstacle circle is $R_i$; when $\beta_i \le \alpha_i$, a flight conflict exists between the unmanned aerial vehicle and the obstacle; otherwise, when $\beta_i > \alpha_i$, there is no flight conflict.
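The angle comparison can be written out as a short sketch. It relies on the geometric fact that the half-angle between line AO and a tangent from A to a circle of radius R at distance d is arcsin(R/d). Two points are assumptions rather than statements from the source: β is measured against the directed vector from A to O (so an unmanned aerial vehicle moving away from the obstacle gives β greater than 90 degrees), and a position already inside the obstacle circle is treated as a conflict. The function name `has_flight_conflict` is hypothetical.

```python
import math


def has_flight_conflict(uav_pos, uav_vel, obs_pos, obs_vel, radius):
    """Return True when beta <= alpha, i.e. the relative velocity points into the tangent cone."""
    rel_pos = [o - u for o, u in zip(obs_pos, uav_pos)]   # vector from A to O
    rel_vel = [u - o for u, o in zip(uav_vel, obs_vel)]   # v_ui - v_oi
    d = math.sqrt(sum(c * c for c in rel_pos))
    speed = math.sqrt(sum(c * c for c in rel_vel))
    if d <= radius:
        return True   # assumption: already inside the safe radius counts as a conflict
    if speed == 0.0:
        return False  # no relative motion, so the separation does not change
    alpha = math.asin(radius / d)  # half-angle between l and the tangents l1, l2
    cos_beta = sum(p * v for p, v in zip(rel_pos, rel_vel)) / (d * speed)
    beta = math.acos(max(-1.0, min(1.0, cos_beta)))  # clamp guards against rounding
    return beta <= alpha


head_on = has_flight_conflict((0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (0.0, 0.0), 1.0)
crossing = has_flight_conflict((0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (0.0, 0.0), 1.0)
```

The same function works for two- or three-dimensional vectors, since only dot products and norms are used.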
Preferably, the state space after the interaction between the unmanned aerial vehicle and the environment in S5 is:
[equation image: state space of the i-th unmanned aerial vehicle]
wherein $u_i$ is the real-time position of the i-th unmanned aerial vehicle; $v_{u_i}$ is the real-time velocity vector of the i-th unmanned aerial vehicle; [symbol image: heading angle] is the real-time heading angle of the i-th unmanned aerial vehicle; $\gamma_i$ is the real-time pitch angle of the unmanned aerial vehicle; $o$ is the real-time position vector of the obstacle detected by the millimeter wave radar; $v_o$ is the real-time velocity vector of the obstacle detected by the millimeter wave radar; $d$ is the real-time distance vector between the obstacle and the millimeter wave radar; $p$ is the position of the real-time sub-target point on the global path; and $g$ is the position of the global path end point.
Preferably, the action space after the interaction between the unmanned aerial vehicle and the environment in S5 is:
[equation image: action space of the i-th unmanned aerial vehicle]
wherein $v_{u_i}$ is the real-time velocity vector of the unmanned aerial vehicle, [symbol image: heading angle] is the real-time heading angle of the unmanned aerial vehicle, and $\gamma_i$ is the real-time pitch angle of the unmanned aerial vehicle;
the variation ranges of $v_{u_i}$, the heading angle and $\gamma_i$ are:
$v_{u_i} \in [v_{\min}, v_{\max}] \cap [v - v_a \Delta t,\ v + v_b \Delta t]$
[equation image: variation range of the heading angle]
$\gamma_i \in [0, \pi] \cap [\gamma - \gamma_a \Delta t,\ \gamma + \gamma_b \Delta t]$
wherein $v_{\min}$ and $v_{\max}$ are the minimum and maximum speed of the unmanned aerial vehicle; $v_a$ and $v_b$ are the maximum deceleration and the maximum acceleration of the unmanned aerial vehicle in the direction of travel; [symbol images: heading-angle limits] are the maximum deceleration and the maximum acceleration of the unmanned aerial vehicle at the heading angle; and $\gamma_a$ and $\gamma_b$ are the maximum deceleration and the maximum acceleration of the unmanned aerial vehicle at the pitch angle.
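Each variation range above is the intersection of a fixed interval with a rate-limited interval around the current value. A minimal sketch of how a commanded action could be restricted to that intersection follows; the function names `feasible_interval` and `clip_command` and all numeric values are hypothetical, and clipping is only one possible way to enforce the constraint.

```python
import math


def feasible_interval(lo, hi, current, max_dec, max_inc, dt):
    """Intersect [lo, hi] with [current - max_dec*dt, current + max_inc*dt]."""
    lower = max(lo, current - max_dec * dt)
    upper = min(hi, current + max_inc * dt)
    return lower, upper


def clip_command(cmd, lo, hi, current, max_dec, max_inc, dt):
    """Project a commanded value onto the feasible interval."""
    lower, upper = feasible_interval(lo, hi, current, max_dec, max_inc, dt)
    if lower > upper:
        raise ValueError("empty feasible interval: current value violates the fixed limits")
    return min(max(cmd, lower), upper)


# Speed: v in [v_min, v_max] intersected with [v - v_a*dt, v + v_b*dt]
speed_cmd = clip_command(30.0, lo=5.0, hi=20.0, current=10.0, max_dec=2.0, max_inc=3.0, dt=1.0)

# Pitch angle: gamma in [0, pi] intersected with [gamma - gamma_a*dt, gamma + gamma_b*dt]
pitch_cmd = clip_command(3.5, lo=0.0, hi=math.pi, current=3.0, max_dec=0.2, max_inc=0.2, dt=1.0)
```

In a learning setting the same bounds could instead be used to rescale the policy network output, which is a design choice the source leaves open.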
In order to further implement the above technical solution, the reward function is:
$R(s, a) = R_1(t) + R_2(t) + R_3(t) + R_4(t)$
wherein $R_1(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the obstacle: within the safe distance, the closer the unmanned aerial vehicle is to the obstacle, the smaller the reward; outside the safe distance, the reward is a positive constant;
$R_2(t)$ is the reward generated by the speed change when the unmanned aerial vehicle detects that the obstacle threatens its flight: when a flight conflict exists between the unmanned aerial vehicle and the obstacle, namely $\beta_i \le \alpha_i$, a negative reward is obtained; otherwise, when no flight conflict exists, a positive reward is obtained;
$R_3(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the navigation task target point: the closer the distance, the greater the reward;
$R_4(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the temporary sub-target point: the closer the distance, the greater the reward.
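The qualitative behaviour of the four terms can be illustrated with a small sketch. The functional forms and every constant below (`r_out`, `r_conflict`, `r_free`, `k_goal`, `k_sub`) are placeholders chosen only to reproduce the stated monotonic trends; they are not the expressions of the embodiment, whose formulas are given as equation images.

```python
def reward(d_obs, d_safe, conflict, d_goal, d_sub,
           r_out=1.0, r_conflict=-5.0, r_free=1.0, k_goal=0.01, k_sub=0.05):
    """Sum of four illustrative terms R1 + R2 + R3 + R4 (all shapes are assumptions)."""
    # R1: positive constant outside the safe distance, shrinking as the obstacle gets closer inside it.
    r1 = r_out if d_obs >= d_safe else -(d_safe - d_obs) / d_safe
    # R2: negative when a flight conflict exists (beta <= alpha), positive otherwise.
    r2 = r_conflict if conflict else r_free
    # R3: larger (less negative) the closer the UAV is to the navigation task target point.
    r3 = -k_goal * d_goal
    # R4: larger (less negative) the closer the UAV is to the temporary sub-target point.
    r4 = -k_sub * d_sub
    return r1 + r2 + r3 + r4


safe_step = reward(d_obs=10.0, d_safe=5.0, conflict=False, d_goal=100.0, d_sub=10.0)
risky_step = reward(d_obs=2.0, d_safe=5.0, conflict=True, d_goal=100.0, d_sub=10.0)
```

In this sketch the conflict flag would come from the angle test on $\alpha_i$ and $\beta_i$ described for S4.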
In this embodiment, preferably:
Figure BDA0003859419790000121
where $R_1(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the obstacle, the greater the distance, the greater the reward, whose purpose is to make the unmanned aerial vehicle avoid the obstacle; $r_1$ is a negative constant representing the penalty incurred when the unmanned aerial vehicle enters the obstacle threat zone; $d_o$ is the distance at each moment between the unmanned aerial vehicle and the edge of the un-inflated obstacle; $D_o$ is the width by which the obstacle is inflated; $D_u$ is the safe distance between the unmanned aerial vehicle and the obstacle; $k_1$ is a negative reward-penalty coefficient; and $v_r$ is the navigation speed of the unmanned aerial vehicle relative to the obstacle;
Figure BDA0003859419790000122
where $R_2(t)$ is the reward generated by the change in speed when the unmanned aerial vehicle detects that an obstacle threatens its flight, whose purpose is to keep the unmanned aerial vehicle navigating within the non-flight-conflict range; $r_2$ is a negative constant representing the penalty incurred when the unmanned aerial vehicle navigates within the flight conflict range; and $r_3$ is a positive constant representing the reward obtained when the unmanned aerial vehicle navigates within the non-flight-conflict range;
Figure BDA0003859419790000123
where $R_3(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the navigation task target point, the closer the distance, the greater the reward, whose purpose is to ensure that the unmanned aerial vehicle deviates from the navigation task target point as little as possible during obstacle avoidance; $r_4$ is a positive constant representing the reward obtained on arriving at the navigation task target point; $k_2$ is a positive reward-penalty coefficient; $d_g$ is the distance at each moment between the unmanned aerial vehicle and the navigation task target point, and $d_g < d_{m1}$ indicates that the unmanned aerial vehicle has reached the navigation task target point; and $v_u$ is the navigation speed of the unmanned aerial vehicle;
Figure BDA0003859419790000124
where $R_4(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the temporary sub-target point, the closer the distance, the greater the reward, whose purpose is to ensure that, after obstacle avoidance is finished, the unmanned aerial vehicle returns to the global path planned by the improved bidirectional search A* algorithm by tracking the temporary sub-target points on that path; $r_5$ is a positive constant representing the reward obtained on arriving at a temporary sub-target point; $k_3$ is a positive reward-penalty coefficient; $d_p$ is the distance at each moment between the unmanned aerial vehicle and the temporary sub-target point, and $d_p < d_{m2}$ indicates that the unmanned aerial vehicle has reached the temporary sub-target point; and $v_u$ is the navigation speed of the unmanned aerial vehicle.
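The four reward terms can be combined as sketched below. The exact piecewise expressions are given only in the figures and are not reproduced here, so every shaping formula in this sketch is an assumption chosen to match the verbal description (signs of the constants, monotonic behaviour, arrival thresholds); the default parameter values and the constant `r_safe` are likewise hypothetical. Only the rule for $R_2$ (negative when $\beta_i \le \alpha_i$, positive otherwise) follows the text directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardParams:
    # All default values are illustrative assumptions.
    r1: float = -10.0    # penalty inside the obstacle threat zone (negative)
    r2: float = -1.0     # penalty while a flight conflict exists (negative)
    r3: float = 1.0      # reward while no flight conflict exists (positive)
    r4: float = 20.0     # reward for reaching the navigation task target point
    r5: float = 5.0      # reward for reaching a temporary sub-target point
    k1: float = -0.5     # negative reward-penalty coefficient
    k2: float = 0.1      # positive reward-penalty coefficient
    k3: float = 0.1      # positive reward-penalty coefficient
    D_o: float = 1.0     # obstacle inflation width
    D_u: float = 3.0     # safe distance
    d_m1: float = 0.5    # arrival threshold for the target point
    d_m2: float = 0.5    # arrival threshold for a sub-target point
    r_safe: float = 0.5  # assumed positive constant outside the safe distance


def reward_obstacle(d_o, v_r, p):
    """R1: assumed shaping, more negative when closer and closing faster."""
    if d_o <= p.D_o:
        return p.r1
    if d_o <= p.D_o + p.D_u:
        return p.k1 * v_r * (p.D_o + p.D_u - d_o) / p.D_u
    return p.r_safe


def reward_conflict(beta, alpha, p):
    """R2: negative constant if beta <= alpha (conflict), else positive constant."""
    return p.r2 if beta <= alpha else p.r3


def reward_goal(d_g, v_u, p):
    """R3: assumed shaping, grows as the target point gets closer."""
    return p.r4 if d_g < p.d_m1 else p.k2 * v_u / d_g


def reward_subgoal(d_p, v_u, p):
    """R4: assumed shaping, grows as the temporary sub-target point gets closer."""
    return p.r5 if d_p < p.d_m2 else p.k3 * v_u / d_p


def total_reward(d_o, v_r, beta, alpha, d_g, d_p, v_u, p=RewardParams()):
    """R(s, a) = R1 + R2 + R3 + R4."""
    return (reward_obstacle(d_o, v_r, p) + reward_conflict(beta, alpha, p)
            + reward_goal(d_g, v_u, p) + reward_subgoal(d_p, v_u, p))
```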
To further implement the above technical solution, the obstacle avoidance training with the deep deterministic policy gradient (DDPG) algorithm in S5 specifically comprises:
the algorithm comprises a policy-based policy network, a target policy network, a value-based value network and a target value network;
the unmanned aerial vehicle obtains, through the policy network, the action $a_t$ for the current state $s_t$ and interacts with the environment to obtain the next state $s_{t+1}$, where the values of the state and the action satisfy the constraints of the state space and the action space; the current reward $r_t$ and the reward $r_{t+1}$ in the next state are calculated through the reward function;
the tuple $(s_t, a_t, r_{t+1}, s_{t+1})$ is saved in an experience pool; once the stored samples satisfy the condition for starting training, N samples are randomly selected from the experience pool to train the networks and update their parameters; parameter updating stops when the maximum number of training rounds is reached or the network convergence condition is met, yielding the planned local obstacle avoidance path.
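The experience pool described above can be sketched as a fixed-size buffer with uniform random sampling. This is an illustrative sketch only: the class name `ExperiencePool`, the choice to discard the oldest samples when the pool is full, and the reading of the "condition for starting training" as "at least N stored samples" are assumptions not stated in the source.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity store of (s_t, a_t, r_{t+1}, s_{t+1}) tuples."""

    def __init__(self, capacity, seed=None):
        # deque(maxlen=...) silently drops the oldest item once full (assumed policy).
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, s_t, a_t, r_next, s_next):
        self._buffer.append((s_t, a_t, r_next, s_next))

    def ready(self, batch_size):
        """Assumed start-of-training condition: enough samples for one batch."""
        return len(self._buffer) >= batch_size

    def sample(self, batch_size):
        """Draw batch_size distinct transitions uniformly at random."""
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self):
        return len(self._buffer)
```

A training loop would call `store` after every environment step and call `sample(N)` only once `ready(N)` returns true.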
In one embodiment, as shown in Fig. 4, obstacle avoidance training with the deep deterministic policy gradient algorithm proceeds as follows:
S51, initializing the parameters $\theta$ and $\eta$ of the policy network $\pi(s;\theta)$ and the value network $q(s,a;\eta)$; initializing the parameters $\theta'$ and $\eta'$ of the target policy network $\pi'(s;\theta')$ and the target value network $q'(s,a;\eta')$; initializing the target network learning rate $\varepsilon$, the batch size N of the samples learned each time, and the experience pool size Z;
S52, according to the state space and the action space, outputting through the policy network the action $a_t = \pi(s_t)$ for the current state $s_t$, where the selection of states and actions satisfies the constraints of the state space and the action space;
S53, executing the action, and obtaining from the environment the reward $r_{t+1}$ given by the reward function and the state $s_{t+1}$ at the next moment;
S54, saving $(s_t, a_t, r_{t+1}, s_{t+1})$ in the experience pool and, once the samples satisfy the condition for starting training, randomly selecting N samples from the experience pool,
Figure BDA0003859419790000131
S55, updating the policy network and the value network:
$y_i = r_{i+1} + \gamma\, q(s_{i+1}, \pi(s_{i+1}; \theta'); \eta')$
Figure BDA0003859419790000141
Figure BDA0003859419790000142
S56, updating the target value network and the target policy network once at intervals:
$\eta' \leftarrow \varepsilon\eta + (1-\varepsilon)\eta'$
$\theta' \leftarrow \varepsilon\theta + (1-\varepsilon)\theta'$
S57, returning to S52 and looping until the maximum number of rounds is reached or the network convergence condition is met.
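Two of the steps above are fully specified by their formulas: the target value $y_i$ in S55 and the target-network update in S56. The sketch below implements just those two formulas on plain Python lists; the function names and the toy linear "networks" are assumptions for illustration, and the gradient steps shown only in the figures are not reproduced. Note that `gamma` here is the discount factor of S55, not the pitch angle $\gamma_i$.

```python
def td_targets(rewards, next_states, target_policy, target_value, gamma):
    """y_i = r_{i+1} + gamma * q(s_{i+1}, pi(s_{i+1}; theta'); eta')."""
    return [r + gamma * target_value(s, target_policy(s))
            for r, s in zip(rewards, next_states)]


def soft_update(target_params, online_params, eps):
    """theta' <- eps * theta + (1 - eps) * theta' (same rule for eta')."""
    return [eps * w + (1.0 - eps) * w_t
            for w, w_t in zip(online_params, target_params)]


# Toy stand-ins for the target networks (hypothetical, scalar states and actions).
toy_target_policy = lambda s: 0.5 * s
toy_target_value = lambda s, a: s + a

y = td_targets([1.0, 0.0], [2.0, 4.0], toy_target_policy, toy_target_value, gamma=0.9)
new_target = soft_update(target_params=[0.0, 1.0], online_params=[1.0, 3.0], eps=0.1)
```

With a small `eps`, the target parameters track the online parameters slowly, which is the usual reason for keeping separate target networks.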
A multi-unmanned aerial vehicle global and local path intelligent planning system comprises a global path planning module, an acquisition module, a local path planning module and a training module;
the global path planning module is used for constructing a grid map according to the starting point and the target point of each unmanned aerial vehicle navigation task, obtaining a global path planning model by using an improved bidirectional search A* algorithm, performing global path planning to obtain a reference global navigation path for each unmanned aerial vehicle, and uniformly selecting a plurality of sub-target points on the reference global navigation path;
the acquisition module is used for acquiring obstacle motion state information in the surrounding environment of the unmanned aerial vehicle through a millimeter wave radar and acquiring flight state information of the unmanned aerial vehicle, wherein the obstacle motion state information comprises the distance between the obstacle and the unmanned aerial vehicle and the position information and motion speed information of the obstacle relative to the unmanned aerial vehicle, and the flight state information of the unmanned aerial vehicle comprises unmanned aerial vehicle position information and flight speed information;
the local path planning module is used for dividing a flight conflict range and a non-flight-conflict range of the unmanned aerial vehicle relative to the obstacle according to the obstacle motion state information and the unmanned aerial vehicle flight state information, generating a corresponding obstacle avoidance strategy, uniformly selecting a plurality of sub-target points on the reference global navigation path and, after obstacle avoidance by the unmanned aerial vehicle is finished, tracking the nearest sub-target point on the reference global navigation path as the temporary sub-target point;
the training module is used for designing a reward function, acquiring a state space and an action space, performing obstacle avoidance training with the deep deterministic policy gradient algorithm, controlling the action of the unmanned aerial vehicle over a continuous action space, and obtaining a local path planning model;
the global path planning model and the local path planning model are used for planning the paths of all unmanned aerial vehicles in the unmanned aerial vehicle cluster.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A multi-unmanned aerial vehicle global and local path intelligent planning method is characterized by comprising the following steps:
S1, constructing a grid map according to the starting point and the target point of each unmanned aerial vehicle navigation task, obtaining a global path planning model by using an improved bidirectional search A* algorithm, and performing global path planning to obtain a reference global navigation path for each unmanned aerial vehicle;
S2, acquiring obstacle motion state information in the surrounding environment of the unmanned aerial vehicle through a millimeter wave radar, and acquiring flight state information of the unmanned aerial vehicle;
S3, judging whether to plan a local obstacle avoidance path according to the relative position relationship between the unmanned aerial vehicle and the obstacle;
S4, judging whether a flight conflict arises between the unmanned aerial vehicle and the obstacle according to the obstacle motion state information and the unmanned aerial vehicle flight state information, designing a reward function, and generating a corresponding local obstacle avoidance strategy;
S5, acquiring a state space and an action space, performing obstacle avoidance training with the deep deterministic policy gradient algorithm, controlling the action of the unmanned aerial vehicle over a continuous action space, avoiding obstacles, obtaining a local path planning model and tracking the global navigation path;
S6, using the global path planning model and the local path planning model together to plan paths for each unmanned aerial vehicle in the unmanned aerial vehicle cluster.
2. The method according to claim 1, wherein the improved bidirectional search A* algorithm in S1 comprises:
S11, optimizing the path planning, specifically: searching for the shortest route to the destination by adjusting the heuristic function of the A* algorithm;
S12, smoothing the flight path, specifically: smoothing the navigation path with a quasi-uniform B-spline curve, wherein the quasi-uniform B-spline curve is a piecewise continuous polynomial.
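As an illustration of the smoothing in S12, the sketch below evaluates a quasi-uniform (clamped) B-spline over a list of waypoints using the Cox-de Boor recursion. It is a minimal sketch under stated assumptions, not the claimed implementation: the cubic degree, the sample count and all function names are hypothetical, and the waypoints are used directly as control points, so the curve passes through only the first and last of them.

```python
def quasi_uniform_knots(n_ctrl, degree):
    """Clamped knot vector: degree+1 repeated end knots, uniform interior knots."""
    if n_ctrl <= degree:
        raise ValueError("need more control points than the degree")
    n_inner = n_ctrl - degree - 1
    inner = [i / (n_inner + 1) for i in range(1, n_inner + 1)]
    return [0.0] * (degree + 1) + inner + [1.0] * (degree + 1)


def basis(i, k, t, knots):
    """Cox-de Boor recursion for the i-th basis function of degree k."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    value = 0.0
    span_left = knots[i + k] - knots[i]
    if span_left > 0:
        value += (t - knots[i]) / span_left * basis(i, k - 1, t, knots)
    span_right = knots[i + k + 1] - knots[i + 1]
    if span_right > 0:
        value += (knots[i + k + 1] - t) / span_right * basis(i + 1, k - 1, t, knots)
    return value


def smooth_path(ctrl, degree=3, samples=20):
    """Sample the B-spline defined by control points ctrl at `samples` parameters."""
    knots = quasi_uniform_knots(len(ctrl), degree)
    dim = len(ctrl[0])
    path = []
    for j in range(samples):
        t = j / (samples - 1)
        if t >= 1.0:
            # The half-open basis intervals exclude t = 1; the clamped curve ends here.
            path.append(tuple(ctrl[-1]))
            continue
        weights = [basis(i, degree, t, knots) for i in range(len(ctrl))]
        path.append(tuple(sum(w * p[d] for w, p in zip(weights, ctrl))
                          for d in range(dim)))
    return path


# Hypothetical grid-path waypoints in 3D.
waypoints = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 2.0),
             (2.0, 1.0, 2.0), (2.0, 2.0, 3.0)]
smoothed = smooth_path(waypoints, degree=3, samples=11)
```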
3. The method according to claim 1, wherein the obstacle motion state information includes a distance between an obstacle and the drone, position information and motion speed information of the obstacle relative to the drone; the flight state information of the unmanned aerial vehicle comprises unmanned aerial vehicle position information and flight speed information.
4. The method according to claim 3, wherein the local path planning is performed when the speed direction of the UAV relative to the obstacle and the position direction of the obstacle relative to the UAV are detected to be in the same quadrant.
5. The method for intelligently planning the global and local paths of multiple unmanned aerial vehicles according to claim 1, wherein the specific method for judging in S4 whether a flight conflict arises between the unmanned aerial vehicle and the obstacle is as follows:
simplifying the unmanned aerial vehicle to a particle A and regarding the obstacle as an obstacle circle O with a safe radius, wherein the distance between the two points A and O is $d_i$, the line segment OA lies on the straight line $l$, and the relative velocity vector of the unmanned aerial vehicle and the obstacle is $v_{uioi}$:
$v_{uioi} = v_{ui} - v_{oi}$
wherein $v_{ui}$ is the velocity vector of the unmanned aerial vehicle and $v_{oi}$ is the velocity vector of the obstacle;
obtaining the tangent lines $l_1$ and $l_2$ that pass through the point A and are tangent to the obstacle circle, wherein the included angle between the straight line $l$ and the tangent line $l_1$ or $l_2$ is $\alpha_i$, the included angle between the relative velocity vector $v_{uioi}$ and the straight line $l$ is $\beta_i$, and the radius of the obstacle circle is $R_i$; when $\beta_i \le \alpha_i$, a flight conflict exists between the unmanned aerial vehicle and the obstacle; otherwise, when $\beta_i > \alpha_i$, no flight conflict exists.
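The conflict test above can be sketched numerically. For a tangent drawn from A to a circle of radius $R_i$ whose centre is at distance $d_i$, the half-angle satisfies $\sin\alpha_i = R_i / d_i$. The sketch below assumes that $\beta_i$ is measured between the relative velocity and the direction from A towards O, and it treats a zero relative velocity as no conflict and a position already inside the circle as a conflict; these conventions and the function name are assumptions, not statements from the claim.

```python
import math


def has_flight_conflict(uav_pos, uav_vel, obs_pos, obs_vel, obs_radius):
    """Return True when beta_i <= alpha_i for the given positions and velocities."""
    rel_vel = [vu - vo for vu, vo in zip(uav_vel, obs_vel)]   # v_uioi = v_ui - v_oi
    to_obs = [o - a for a, o in zip(uav_pos, obs_pos)]        # direction A -> O
    d = math.sqrt(sum(c * c for c in to_obs))
    if d <= obs_radius:
        return True          # assumed: already inside the obstacle circle
    speed = math.sqrt(sum(c * c for c in rel_vel))
    if speed == 0.0:
        return False         # assumed: no relative motion, no conflict
    alpha = math.asin(obs_radius / d)
    cos_beta = sum(a * b for a, b in zip(rel_vel, to_obs)) / (speed * d)
    beta = math.acos(max(-1.0, min(1.0, cos_beta)))  # clamp against rounding
    return beta <= alpha
```

For example, an unmanned aerial vehicle at the origin flying along the x axis towards a static obstacle of radius 1 centred at (10, 0, 0) gives $\beta_i = 0$ and is reported as a conflict, while the same vehicle flying along the y axis is not.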
6. The method according to claim 1, wherein the state space of the unmanned aerial vehicle after interaction with the environment in S5 is:
Figure FDA0003859419780000021
wherein $u_i$ is the real-time position of the i-th unmanned aerial vehicle; $v_{ui}$ is the real-time velocity vector of the i-th unmanned aerial vehicle; the symbol shown in Figure FDA0003859419780000022 is the real-time heading angle of the i-th unmanned aerial vehicle; $\gamma_i$ is the real-time pitch angle of the unmanned aerial vehicle; $o$ is the real-time position vector of the obstacle detected by the millimeter wave radar; $v_o$ is the real-time velocity vector of the obstacle detected by the millimeter wave radar; $d$ is the real-time distance vector between the obstacle and the millimeter wave radar; $p$ is the position of the real-time sub-target point on the global path; and $g$ is the position of the global path end point.
7. The method according to claim 1, wherein an action space after interaction between the unmanned aerial vehicle and the environment in S5 is:
Figure FDA0003859419780000031
wherein $v_{ui}$ is the real-time velocity vector of the unmanned aerial vehicle, the symbol shown in Figure FDA0003859419780000032 is the real-time heading angle of the unmanned aerial vehicle, and $\gamma_i$ is the real-time pitch angle of the unmanned aerial vehicle;
the variation ranges of $v_{ui}$, the heading angle (Figure FDA0003859419780000033) and $\gamma_i$ are:
$v_{ui} \in [v_{\min}, v_{\max}] \cap [v - v_a \Delta t,\ v + v_b \Delta t]$
Figure FDA0003859419780000034
$\gamma_i \in [0, \pi] \cap [\gamma - \gamma_a \Delta t,\ \gamma + \gamma_b \Delta t]$
wherein $v_{\min}$ and $v_{\max}$ are the minimum and maximum speeds of the unmanned aerial vehicle; $v_a$ and $v_b$ are its maximum deceleration and maximum acceleration in the forward direction; the symbols shown in Figure FDA0003859419780000035 and Figure FDA0003859419780000036 are its maximum deceleration and maximum acceleration in heading angle; and $\gamma_a$ and $\gamma_b$ are its maximum deceleration and maximum acceleration in pitch angle.
8. The method of claim 1, wherein the reward function is:
$R(s,a) = R_1(t) + R_2(t) + R_3(t) + R_4(t)$
wherein $R_1(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the obstacle: within the safe distance, the closer the unmanned aerial vehicle is to the obstacle, the smaller the reward; outside the safe distance, the reward is a positive constant;
$R_2(t)$ is the reward generated by the change in speed when the unmanned aerial vehicle detects that an obstacle threatens its flight: when a flight conflict exists between the unmanned aerial vehicle and the obstacle, a negative reward is obtained; otherwise no flight conflict exists and a positive reward is obtained;
$R_3(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the navigation task target point, wherein the closer the distance, the larger the reward;
$R_4(t)$ is the reward generated by the change in distance between the unmanned aerial vehicle and the temporary sub-target point, wherein the closer the distance, the larger the reward.
9. The method for intelligently planning the global and local paths of multiple unmanned aerial vehicles according to claim 1, wherein the obstacle avoidance training with the deep deterministic policy gradient algorithm in S5 specifically comprises:
the algorithm comprises a policy-based policy network, a target policy network, a value-based value network and a target value network;
the unmanned aerial vehicle obtains, through the policy network, the action $a_t$ for the current state $s_t$ and interacts with the environment to obtain the next state $s_{t+1}$, wherein the values of the state and the action satisfy the constraints of the state space and the action space, and the current reward $r_t$ and the reward $r_{t+1}$ in the next state are calculated through the reward function;
the tuple $(s_t, a_t, r_{t+1}, s_{t+1})$ is saved in an experience pool, and once the samples satisfy the condition for starting training, N samples are randomly selected from the experience pool to train the networks and complete the parameter update.
10. A multi-unmanned aerial vehicle global and local path intelligent planning system based on the multi-unmanned aerial vehicle global and local path intelligent planning method of any one of claims 1-9, characterized by comprising a global path planning module, an acquisition module, a local path planning module and a training module;
the global path planning module is used for constructing a grid map according to the starting point and the target point of each unmanned aerial vehicle navigation task, obtaining a global path planning model by using an improved bidirectional search A* algorithm, and performing global path planning to obtain a reference global navigation path for each unmanned aerial vehicle;
the acquisition module is used for acquiring obstacle motion state information in the surrounding environment of the unmanned aerial vehicle through a millimeter wave radar and acquiring flight state information of the unmanned aerial vehicle;
the local path planning module is used for judging whether to plan a local obstacle avoidance path according to the relative position relationship between the unmanned aerial vehicle and the obstacle, judging whether a flight conflict arises between the unmanned aerial vehicle and the obstacle according to the obstacle motion state information and the unmanned aerial vehicle flight state information, controlling the action of the unmanned aerial vehicle over a continuous action space by using the obstacle avoidance strategy of the local path planning model, avoiding the obstacle, and making the unmanned aerial vehicle track the global navigation path;
the training module is used for designing a reward function, acquiring a state space and an action space, performing obstacle avoidance training with the deep deterministic policy gradient algorithm, controlling the action of the unmanned aerial vehicle over a continuous action space, and obtaining a local path planning model;
the global path planning model and the local path planning model are used for planning the paths of all unmanned aerial vehicles in the unmanned aerial vehicle cluster.
CN202211160137.4A 2022-09-22 2022-09-22 Multi-unmanned aerial vehicle global and local path intelligent planning method and system Pending CN115494866A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211160137.4A CN115494866A (en) 2022-09-22 2022-09-22 Multi-unmanned aerial vehicle global and local path intelligent planning method and system

Publications (1)

Publication Number Publication Date
CN115494866A true CN115494866A (en) 2022-12-20

Family

ID=84470036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211160137.4A Pending CN115494866A (en) 2022-09-22 2022-09-22 Multi-unmanned aerial vehicle global and local path intelligent planning method and system

Country Status (1)

Country Link
CN (1) CN115494866A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117631689A (en) * 2024-01-25 2024-03-01 成都航空职业技术学院 Unmanned aerial vehicle obstacle avoidance flight method
CN117631689B (en) * 2024-01-25 2024-04-16 成都航空职业技术学院 Unmanned aerial vehicle obstacle avoidance flight method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination