CN115220473A - Multi-unmanned aerial vehicle swarm cooperative task dynamic allocation method - Google Patents
- Publication number
- CN115220473A (CN202210822637.3A)
- Authority
- CN
- China
- Prior art keywords
- task
- strategy
- agent
- allocation
- circle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/10—Simultaneous control of position or course in three dimensions
- G05D1/101—Simultaneous control of position or course in three dimensions specially adapted for aircraft
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Aviation & Aerospace Engineering (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a multi-unmanned aerial vehicle swarm cooperative task dynamic allocation method, comprising: S1, establishing a cooperative task allocation model; S2, optimizing the task allocation strategy of the cooperative task allocation model of S1 based on a selection optimization method. A circle of a certain radius is drawn with each task as its center, and strategy optimization is carried out by adjusting which Agent within the circle each task selects. Each iteration moves the allocation strategy of the Multi-Agent system in a better direction, and the Multi-Agent system obtains the optimal allocation strategy after a finite number of iterations. The selection optimization method takes the unmanned aerial vehicles selectable within a certain range as the optimization objects, which reduces the possible solution size and increases the optimization speed. The optimal selection algorithm ensures that the allocation strategy keeps approaching the optimal allocation strategy as the iteration proceeds. The method meets the requirement of fast dynamic task allocation, quickly obtains a suboptimal strategy under specific conditions, and achieves cooperative task allocation.
Description
Technical Field
The invention relates to the technical field of unmanned aerial vehicles, in particular to a dynamic allocation method for cooperative tasks of a multi-unmanned aerial vehicle swarm.
Background
Multi-swarm cooperative task allocation is demanding and must take many factors into account. In an actual battlefield environment, tasks enter the system independently and at random, and their time, position, type and so on are difficult to predict. This raises a series of problems for task allocation:
1. Real-time performance: in the dynamic task allocation problem, changes in factors such as the environment, the members and the tasks require an effective task allocation method that can make decisions quickly. Otherwise, an allocation algorithm that takes too long will cause combat opportunities to be missed. In actual task allocation the battlefield situation changes from moment to moment and the appearance of new tasks is unpredictable, so responding to the situation in real time places high demands on the task allocation algorithm.
2. Scale limitation: unmanned aerial vehicle swarms will be applied on a large scale in future warfare. However, as the numbers of unmanned aerial vehicles and targets increase, the size of the task allocation problem grows exponentially, which limits the scale of dynamic task allocation and poses a great challenge to finding a satisfactory allocation strategy.
3. Cooperation requirement: the purpose of a large-scale unmanned aerial vehicle swarm is to achieve multi-vehicle cooperation. Multi-vehicle cooperation brings out the advantages of unmanned aerial vehicles in executing tasks, makes better use of their performance, improves the task success rate and reduces cost. At present there is no suitable evaluation index for cooperation, which is achieved only through aspects such as formation grouping, task execution timing and grouping by carried weapons.
4. Heterogeneity: the task allocation system should be able to manage different types of members, which may differ in software structure, hardware composition and so on, and whose tasks may also differ. This requires the task allocation system to be an open, extensible system.
The factors to be considered in the multi-unmanned aerial vehicle dynamic task allocation model are as follows:
1. Threats: the threats present while an unmanned aerial vehicle executes a task include known threats and unknown threats. Known threats are ground anti-aircraft fire, radar and air threats that have been identified before the task is performed. Unknown threats are threats that newly appear, or cannot be foreseen, while the unmanned aerial vehicle executes the task. The unmanned aerial vehicle task system needs to take corresponding measures for different threats, including assessing the threat, reporting it to other unmanned aerial vehicles and the ground command center, and avoiding or attacking it.
2. Obstacles: mainly terrain obstacles, which are identified before the task is performed and must be avoided during task allocation. They pose a new flight-safety problem for low-altitude penetration: after entering the enemy defense area, flying too high increases the probability of being threatened, while flying too low increases the safety risk.
3. Friendly forces: dynamic task allocation must take friendly forces such as helicopters into account, so that all combat forces cooperate.
In addition, in the swarm cooperative task allocation problem, the strategy set that maximizes the total profit of the system is defined as the optimal strategy. Under the optimal strategy, all unmanned aerial vehicles execute tasks in the optimal allocation, achieving cooperation among them. In special problems, the optimal strategy can be solved by exploiting the monotonicity of the state space, using linear or nonlinear programming and similar methods. In multi-unmanned aerial vehicle cooperative task allocation, however, many constraints are coupled with one another in complex ways and many variables show no monotonicity, so the optimal strategy is difficult to obtain with such methods.
Because the battlefield situation changes during task allocation, there is no fixed allocation strategy to follow, and the decision made at each moment affects the total profit of the system. The optimal strategy is generally obtained with an intelligent optimization algorithm, and optimizing the system structure also helps to obtain it.
The system has an optimal strategy. Because the total system time is finite, the state space is discrete and finite, and the numbers of unmanned aerial vehicles and tasks are finite, the set of strategies available to the system is finite. Therefore some strategy set in the system performs no worse than every other possible solution, and that solution is the optimal solution.
Although the optimal strategy certainly exists, the possible solution size is huge and the optimal strategy is extremely difficult to find. Obtaining the optimal solution by exhaustively comparing all possible solutions in the solution space cannot meet the requirement of fast dynamic task allocation. Moreover, under specific conditions the globally optimal strategy is not necessarily required, and quickly obtaining a suboptimal strategy is the key to achieving cooperative task allocation.
The present invention is proposed to overcome the above defects of the prior art.
Disclosure of Invention
In view of the above problems, the present invention provides a method for dynamically allocating cooperative tasks of a multi-drone swarm.
To achieve the purpose of the invention, the technical solution provided is a multi-unmanned aerial vehicle swarm cooperative task dynamic allocation method comprising the following steps:
S1, establishing a cooperative task allocation model;
based on a Multi-Agent system, each unmanned aerial vehicle is regarded as one Agent, and tasks are allocated and executed by giving the Agents autonomous capability;
the cooperative task allocation model is described as a five-tuple:
{Time, Task, Agent, Policy_Set, Objective_Function} (equation 1)
where Time is the time, Task is the task, Agent is the unmanned aerial vehicle, Policy_Set is the strategy set, and Objective_Function is the evaluation function;
S2, optimizing the task allocation strategy of the cooperative task allocation model of S1 based on a selection optimization method;
a circle of a certain radius is drawn with each task as its center, and strategy optimization is carried out by adjusting which Agent within the circle each task selects. The selection mechanism ensures that each iteration moves the allocation strategy of the Multi-Agent system in a better direction, and the Multi-Agent system obtains the optimal allocation strategy after a finite number of iterations.
Wherein,
the task allocation strategy optimization of the selection optimization method in step S2 specifically comprises the following steps:
let there be M Agents and N tasks in the Multi-Agent system;
Step S21: draw a circle of radius $r_k$ centered on the position of task $T_k$; the radius is chosen such that each circle contains at least $n_{min}$ Agents and the radius of each circle is larger than $r_{min}$;
Step S22: generate an initial allocation strategy;
randomly select one Agent for each task in the Multi-Agent system, denote the decision variable at this point as $D_t(0)$, and calculate its profit value;
Step S23: randomly select a task $T_k$ and change the Agent assigned to it from $a_k$ to another Agent $a'_k$ within its circle, where $a_k \neq a'_k$; denote the allocation strategy at this point as $D_t(1)$;
Step S24: check the rationality of the strategy: determine whether $D_t(1)$ assigns the same Agent to two tasks $T_i$ and $T_j$, $i, j \in N$; if so, adjust the Agent assigned to the conflicting task so that no Agent in $D_t(1)$ is assigned to more than one task; denote the allocation strategy at this point as $D_t(2)$ and its profit value as $v_2$;
Step S25: for each remaining task in $D_t(2)$ other than tasks $k$ and $i$, select in turn an Agent within its circle, and denote the newly adjusted strategy as $D_t(3)$, $i \in N$, such that its profit value $v_3$ satisfies $v_3 \geq v_2$;
Step S26: update the optimal strategy of the Multi-Agent system. Denote the profit value of the optimal strategy as $v_{max}$. If $v_{max} < v_3$, then $v_{max} = v_3$;
Step S27: if $v_{max}$ meets the requirement of the Multi-Agent system, end the optimization process; otherwise return to step S23.
The beneficial effects of the invention include:
The selection optimization method takes the unmanned aerial vehicles selectable within a certain range as the optimization objects, which reduces the possible solution size and increases the optimization speed. The optimal selection algorithm ensures that the allocation strategy keeps approaching the optimal allocation strategy as the iteration proceeds. The method meets the requirement of fast dynamic task allocation, quickly obtains a suboptimal strategy under specific conditions, and achieves cooperative task allocation.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram illustrating the calculation of the time and position at which an unmanned aerial vehicle meets a task;
FIG. 2 is a schematic diagram illustrating the effect of the number of drones and the number of tasks on the possible solution size;
FIG. 3 is a schematic diagram illustrating the effect of the number of drones and the number of targets on the possible solution size;
FIG. 4 is a schematic diagram of the effect of task type on the possible solution size;
FIG. 5 is a flowchart of a policy optimization step of the selection optimization method.
Detailed Description
The technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
The invention discloses a multi-unmanned aerial vehicle swarm cooperative task allocation method, comprising the following steps:
S1, establishing a cooperative task allocation model;
The multi-unmanned aerial vehicle cooperative task allocation model is based on a Multi-Agent system: each unmanned aerial vehicle is regarded as one Agent, and tasks are allocated and executed by giving the Agents autonomous capability. Two types of Agents exist in the system: decision Agents and member Agents. A decision Agent has task allocation decision capability; it acquires system state information and makes task allocation decisions for the member Agents. The member Agents cooperatively execute the allocated tasks and feed back their task status.
Each Agent in the cooperative task allocation model has a different initial state. The Multi-Agent system allocates tasks according to the task attributes and the Agent states, and state information exchange and decision instruction delivery are carried out over a data link.
The cooperative task allocation model is described as follows:
Based on Multi-Agent Markov decision process theory and following the Markov modeling approach, the model is built from the following five components:
{Time, Task, Agent, Policy_Set, Objective_Function} (equation 1)
The parts are described as follows:
1. Time (Time)
The Multi-Agent system time is finite and is represented by the finite set of discrete time points $t_0, t_1, t_2, \ldots, t_e$, where $t_0$ and $t_e$ are the start time and the end time, respectively.
2. Task (Task)
In the cooperative task allocation model, an unmanned aerial vehicle executing a certain task, attacking a certain target and so on are collectively called tasks. The task type has a significant impact on the size of the task allocation problem: the more task types there are, the more the task execution order must be considered during allocation, which sharply increases the possible solution size.
An increase in task types therefore severely reduces the efficiency of dynamic task allocation. To simplify problem modeling, a single task type is used, and the tasks assigned to each Agent are treated as tasks of the same type. When a target enters the Multi-Agent system, the task evaluation system decouples its tasks into separate, independent tasks and sets a different initial state for each. Decoupling the tasks and controlling the time and order in which they enter the task queue avoids having to handle multiple task types and execution orders, which effectively reduces the possible solution size and improves efficiency.
The number of tasks newly appearing in the Multi-Agent system at time $t$ is $n(t)$, which obeys the distribution $\Theta(n(t))$, $n(t) \in \Theta$, and satisfies the following conditions:
where $P(\cdot)$ denotes probability and $D(\cdot)$ is the correlation function. All tasks in the Multi-Agent system at time $t$ are expressed as:
$[T_1, T_2, \ldots, T_N]$ (formula 3)
The state of $T_n$ at time $t$ is:
where the state comprises the position and the velocity of $T_n$ at time $t$, both two-dimensional vectors. $\zeta_n$ is the consumption value of an Agent performing task $T_n$, including threat cost, weapon consumption and so on, and is determined when $T_n$ enters the Multi-Agent system; $\zeta_n$ is the same when different Agents execute the same task. To simplify the modeling process, each task is assumed to enter the Multi-Agent system with a randomly determined speed and direction that remain unchanged while the task is executed. $w_n$ is the reward value for an Agent performing task $T_n$, with $w_n > \zeta_n > 0$. The magnitude of the reward value reflects the importance of each task: a large reward value indicates that the task is important and that the Multi-Agent system gains more profit by performing it.
The task queue Φ refers to the list of tasks that need to be executed currently:
$\Phi = \{T_1, T_2, \ldots, T_k\}$, $k = 1, 2, 3, \ldots$ (formula 5)
3. Unmanned aerial vehicle (Agent)
Each unmanned aerial vehicle capable of executing tasks and making autonomous decisions is regarded as one Agent, expressed as:
$a_m$, $m = 1, 2, \ldots, M$ (formula 6)
The Agents are modeled under the following assumptions: the Agent is a rigid body, the ground coordinate system is an inertial coordinate system, the ground is regarded as a plane, and the gravitational acceleration does not change with altitude. The state of each Agent changes over time while it executes tasks, and the state information comprises:
where the state of $a_m$ at time $t$ consists of its position and velocity, its attack capability at time $t$, and its task allocation state at time $t$. The task allocation state indicates either that the Agent is not currently assigned a task or that the Agent is currently assigned task $T_n$.
The following settings apply:
(1) All Agents fly at the same constant speed when executing tasks. An Agent with no assigned task loiters at its current position at that constant speed, and its position is regarded as unchanged;
(2) All Agents have the same consumption per unit time;
(3) Each Agent can execute only one task at a time;
(4) Different Agents completing the same task obtain the same profit value but incur different consumption values;
(5) The Agent is regarded as a point mass and its turning radius is not considered; the communication capability and communication range of an Agent are limited;
(6) Collision avoidance between Agents is not considered;
(7) Task allocation uses a single-step planning mechanism;
(8) The route cost of an Agent executing a task is proportional to the Agent's flight distance.
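As an illustration only, the task and Agent states described above could be held in simple records such as the following Python sketch. The class and field names (`Task`, `Agent`, `position_at` and so on) are hypothetical and do not appear in the original. The sketch assumes only what the text states: two-dimensional position and velocity, a consumption value $\zeta_n$ and a reward value $w_n$ with $w_n > \zeta_n > 0$ for tasks, and an attack capability plus a task allocation state for Agents.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec2 = Tuple[float, float]  # two-dimensional vector (x, y)


@dataclass
class Task:
    position: Vec2   # position when the task enters the system
    velocity: Vec2   # constant after entry, as the text assumes
    cost: float      # zeta_n: consumption value of executing the task
    reward: float    # w_n: reward value of executing the task

    def __post_init__(self) -> None:
        # The text requires w_n > zeta_n > 0.
        if not (self.reward > self.cost > 0):
            raise ValueError("expected reward > cost > 0")

    def position_at(self, dt: float) -> Vec2:
        """Position after dt time units of constant-velocity motion."""
        return (self.position[0] + self.velocity[0] * dt,
                self.position[1] + self.velocity[1] * dt)


@dataclass
class Agent:
    position: Vec2
    velocity: Vec2
    attack_capability: float
    # Task allocation state: None means no task, n means assigned T_n.
    assigned_task: Optional[int] = None
```

Using `None` for the unassigned state is a design choice of this sketch; the original only states that the allocation state distinguishes an unassigned Agent from one assigned to $T_n$.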
4. Policy Set (Policy _ Set)
The numbers of Agents and tasks in the Multi-Agent system at time $t$ are denoted $m$ and $n$, respectively, and an available task allocation strategy can be expressed as:
$D(t) = \{T'_1, T'_2, \ldots, T'_m\}^{T}$ (formula 8)
where $D(t)$ is an $m \times 1$ matrix and $T'_i \in (T_1, T_2, \ldots, T_n)$. Taking the numbers of tasks and unmanned aerial vehicles into account, the possible solution size is:
where:
(equation 9) is subject to the following constraints:
(1) Each Agent can execute at most one task at any time, and may execute no task;
(2) Each task can be executed by at most one Agent at any time, and may be left unexecuted.
As can be seen from (equation 9), the possible solution size of the Multi-Agent system increases sharply as $m$ and $n$ increase.
A single-step planning mechanism is used: in each task allocation, only one task is allocated to each Agent, and the task of each Agent is adjusted dynamically as the battlefield situation changes. The set of strategies over the whole task allocation process of the Multi-Agent system is defined as the strategy set, expressed as:
$\Omega = \{D(t_0), D(t_1), D(t_2), \ldots, D(t_e)\}$ (formula 11)
The strategy set reflects how each Agent executes tasks over the whole task execution process and is the basis for evaluating cooperation among the Agents.
5. Evaluation Function (Objective _ Function)
The evaluation function is the objective function of task allocation. It needs to take into account factors such as the flight distance of the unmanned aerial vehicle, the profit value of attacking a target, the weapon consumption value and the threat cost. For example, when the unmanned aerial vehicle is required to complete a task in the shortest time, the objective is the shortest flight distance; when the unmanned aerial vehicle is required to consume as little as possible in executing a task, the objective is the minimum consumption value. These factors are coupled and must be traded off against one another; in general, the task allocation evaluation function is a combination of these factors, whose relative importance is expressed by weights.
The objective function is taken to be the profit value obtained by all Agents executing tasks over the whole task allocation process of the Multi-Agent system, and task allocation aims to maximize it. The flight distance and the task consumption value are introduced into the objective function through a time discount factor and a task consumption discount factor.
The profit value of the Multi-Agent system at time $t$ under decision $D(t)$ and state $S(t)$ is defined as:
where $\beta_t$ is the time discount factor, $0 \leq \beta_t \leq 1$. The smaller $\beta_t$ is, the faster the profit value obtained by the Multi-Agent system decreases over time. $\beta_t = 1$ means that the influence of elapsed time on the profit value is not considered, and $\beta_t = 0$ means that the profit value is not considered. $\Delta_m t$ is the time consumed by the task allocation decision plus the time for the Agent to move to the target position after the decision is made. The decision time is generally short and can be neglected; the flight time is proportional to the distance, so $\Delta_m t$ can be expressed as $\Delta_m t = d_t(m,n)/V_m$, where $d_t(m,n)$ is the distance that $a_m$ flies from time $t$ until it meets $T_n$ and $V_m$ is the speed of $a_m$. $d_t(m,n)$ is determined jointly by the positions, speeds and directions of the Agent and the task. $\delta_t$ is the task execution consumption, which includes the Agent attack capability consumption $\zeta_n$ for handling the $(N - N_L)$ tasks, the communication cost and the decision cost. The communication cost is determined at task allocation and is independent of the execution process; the decision cost depends on the possible solution size and the allocation algorithm used. $\beta_\delta$ is the task consumption discount factor, $\beta_\delta \geq 0$; $\beta_\delta = 0$ means that task consumption is not considered. Only the communication needed to acquire the Agent states for task allocation counts toward the communication cost; communication between Agents during task execution is not counted.
$\xi$ is a penalty function that applies when $n_L$ tasks in the Multi-Agent system are not assigned:
in which the summation term represents the sum of the reward values of the unassigned tasks, and $\eta$ is the penalty factor, $\eta \geq 0$. $\eta = 0$ means that the influence of the penalty function on the profit value is not considered; the larger $\eta$ is, the heavier the penalty for unassigned tasks, which drives the Multi-Agent system to complete as many tasks as possible.
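A minimal Python sketch of one way these terms might combine is given below. The exact formula is an assumption: each executed task contributes its reward $w_n$ discounted as $\beta_t^{\Delta_m t}$, the consumption is scaled by $\beta_\delta$, and the penalty is $\eta$ times the summed rewards of the unassigned tasks. This choice matches the limiting cases stated above ($\beta_t = 1$, $\beta_\delta = 0$, $\eta = 0$) but is not taken from the original equations, and `profit_value` and its parameters are hypothetical names.

```python
from typing import Iterable, Tuple


def profit_value(executed: Iterable[Tuple[float, float, float]],
                 unassigned_rewards: Iterable[float],
                 beta_t: float = 0.9,
                 beta_delta: float = 1.0,
                 eta: float = 1.0) -> float:
    """Assumed single-step profit.

    executed: one (reward w_n, flight time dt, consumption zeta_n) tuple
              per task that has an Agent assigned.
    unassigned_rewards: reward values of the tasks left without an Agent.
    """
    if not 0.0 <= beta_t <= 1.0 or beta_delta < 0.0 or eta < 0.0:
        raise ValueError("discount or penalty factor out of range")
    executed = list(executed)
    # Time-discounted reward: beta_t = 1 ignores elapsed time.
    gain = sum(w * beta_t ** dt for w, dt, _ in executed)
    # Discounted consumption: beta_delta = 0 ignores consumption.
    consumption = beta_delta * sum(zeta for _, _, zeta in executed)
    # Penalty for unassigned tasks: eta = 0 switches it off.
    penalty = eta * sum(unassigned_rewards)
    return gain - consumption - penalty
```

For instance, one task with reward 10, flight time 2 and consumption 1, plus one unassigned task with reward 4, gives $10 \times 0.5^2 - 1 - 0.5 \times 4 = -0.5$ under $\beta_t = 0.5$, $\beta_\delta = 1$, $\eta = 0.5$.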
In the cooperative task allocation model, both the Agent and the task are in motion, so the flight distance until the Agent meets the task is not the straight-line distance between their current positions. The calculation of $d_t(m,n)$ is illustrated in FIG. 1:
Let the Agent's position at the initial time be $(x_a, y_a)$, its speed be $v_a$ and its heading be $\theta$, and let the task's position be $(x_T, y_T)$, with the task moving at its own constant velocity. The Agent meets the task at time $t$, and the meeting point satisfies the equations:
Eliminating the variable $\theta$ yields the equation:
The condition for (equation 16) to have a solution is:
For the solution to be meaningful, it must also satisfy $t \geq 0$.
This gives:
$t = \max(t_1, t_2)$. When $t \geq 0$, $t$ is a valid solution; when $t < 0$, the equation has no valid solution.
The heading and the required flight time of the Agent executing the task are obtained from (equation 18); the path length then follows from the flight time and the Agent's speed, since $\Delta_m t = d_t(m,n)/V_m$.
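The meeting-point calculation can be sketched in Python as follows. The sketch assumes a constant task velocity and a constant Agent speed, as stated above, and uses the quadratic $(|v_T|^2 - v_a^2)t^2 + 2(\mathbf{d} \cdot \mathbf{v}_T)t + |\mathbf{d}|^2 = 0$, where $\mathbf{d}$ is the vector from the Agent to the task. This quadratic follows from requiring both to reach the same point at time $t$; its notation is an assumption and may differ from the original equations. The function name `intercept` and its parameters are hypothetical. Following the text, the larger root is taken and rejected if it is negative.

```python
import math
from typing import Optional, Tuple

Vec2 = Tuple[float, float]


def intercept(agent_pos: Vec2, agent_speed: float,
              task_pos: Vec2, task_vel: Vec2
              ) -> Optional[Tuple[float, float, float]]:
    """Return (meeting time, Agent heading in radians, path length),
    or None when no valid meeting time exists."""
    dx = task_pos[0] - agent_pos[0]
    dy = task_pos[1] - agent_pos[1]
    # Coefficients of a*t^2 + b*t + c = 0.
    a = task_vel[0] ** 2 + task_vel[1] ** 2 - agent_speed ** 2
    b = 2.0 * (dx * task_vel[0] + dy * task_vel[1])
    c = dx * dx + dy * dy
    if abs(a) < 1e-12:
        # Equal speeds: the equation degenerates to b*t + c = 0.
        if abs(b) < 1e-12:
            return None
        t = -c / b
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return None  # no real solution
        root = math.sqrt(disc)
        # The text takes t = max(t1, t2).
        t = max((-b + root) / (2.0 * a), (-b - root) / (2.0 * a))
    if t < 0.0:
        return None  # a negative time is not a valid solution
    meet_x = task_pos[0] + task_vel[0] * t
    meet_y = task_pos[1] + task_vel[1] * t
    heading = math.atan2(meet_y - agent_pos[1], meet_x - agent_pos[0])
    return t, heading, agent_speed * t
```

With an Agent at the origin flying at speed 5 and a task starting at (0, 4) with velocity (3, 0), the two meet at (3, 4) after one time unit, so the path length is 5.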
In summary, the profit of the Multi-Agent system at time $t$ is:
The total profit $\Gamma$ of the Multi-Agent system over the whole task allocation process is:
The complexity of the task allocation strategy grows as the number of tasks and the number of unmanned aerial vehicles increase. The possible solution size of multi-unmanned aerial vehicle cooperative task allocation is given by (equation 9). Let the number of Agents in the Multi-Agent system be $m$ and the number of tasks be $n$; the influence of increasing numbers of Agents and tasks on the possible solution size is shown in FIG. 2:
Calculation shows that when $m = n = 20$, the possible solution size is $Num = 1.73 \times 10^{21}$. Finding the optimal strategy in such a large solution space is extremely difficult. Grouping the Agents and tasks in the Multi-Agent system is an effective way to reduce the possible solution size. For example, with 20 tasks, if the 20 Agents are divided into 4 groups before the tasks are allocated, the possible solution size becomes $Num = 4 \times (3.13 \times 10^{19}) = 1.25 \times 10^{20}$, which is only 7.2% of the size before grouping; the more groups there are, the greater the reduction in the possible solution size. Grouping Agents using a distributed model is therefore an effective way to reduce the complexity of the problem.
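The value quoted for $m = n = 20$ can be reproduced under the assumption that (equation 9) counts every allocation in which $k$ Agents are matched one-to-one with $k$ tasks, summed over $k = 0, \ldots, \min(m, n)$. That reading is an assumption consistent with constraints (1) and (2), not a transcription of the original equation, and `solution_size` is a hypothetical helper name.

```python
from math import comb, factorial


def solution_size(m: int, n: int) -> int:
    """Number of ways to match k of m Agents one-to-one with k of n tasks,
    summed over every feasible k (including k = 0, nothing assigned)."""
    return sum(comb(m, k) * comb(n, k) * factorial(k)
               for k in range(min(m, n) + 1))


if __name__ == "__main__":
    # Prints the count for 20 Agents and 20 tasks in scientific notation.
    print(f"{solution_size(20, 20):.2e}")
```

Under this assumption the count for 20 Agents and 20 tasks is about $1.73 \times 10^{21}$, in agreement with the figure given above.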
To simplify the model, a single-task-type model is used and each task is allocated only once, which reduces the possible solution size. To discuss the effect of the number of task types on the possible solution size, assume that the number of Agents is $m$, the number of tasks is $n$ and the number of task types is $N_m$; the minimum possible solution size is then:
Each task is divided into $N_m$ tasks, so the number of tasks to be allocated is $nN_m$ and each of them has $m$ possible allocations. The possible solution size of the Multi-Agent system is determined by equation (9) and is larger than the value calculated by equation (22). The influence of the numbers of Agents and tasks on the possible solution size, analyzed with equation (22), is shown in FIG. 3, where $N_m = 1$.
The effect of the number of task types $N_m$ on the possible solution size is shown in FIG. 4:
As can be seen from FIG. 4, an increase in $N_m$, like an increase in the numbers of tasks and Agents, causes the possible solution space to grow exponentially.
S2, optimizing the task allocation strategy of the cooperative task allocation model of S1 based on a selection optimization method;
The possible solution size depends on the number of candidate Agents for each task, so reducing the number of candidate Agents per task reduces the possible solution size. Analysis of optimal task allocation strategies shows that, under the optimal strategy, each task is generally executed by an Agent within a certain range around it. On the one hand, a shorter route between the task and the Agent reduces cost; on the other hand, a shorter route also lowers the task execution delay and the threat cost.
When the selection optimization method is used to optimize the task allocation strategy, a circle of a certain radius is drawn with each task as its center, and strategy optimization is carried out by adjusting which Agent within the circle each task selects. The selection mechanism ensures that each iteration moves the allocation strategy of the Multi-Agent system in a better direction. After a finite number of iterations, the Multi-Agent system obtains the optimal allocation strategy.
As shown in fig. 5, the strategy optimization steps of the selection optimization method of the present invention are as follows:
M Agents and N tasks exist in the Multi Agent system.
Step S21: Draw a circle centered at the position of task T_k with radius r_k, where the radius is selected to satisfy the following conditions:
1. each circle contains at least n_min Agents;
2. the radius of each circle is larger than r_min.
Step S22: Generate an initial allocation strategy: randomly select one Agent for each task in the Multi Agent system, denote the decision variable at this moment as D_t(0), and calculate the profit value:
where k ∈ N and satisfies: when M < N, there will be (N − M) tasks that are not allocated for execution.
Step S23: Randomly select a task T_k and adjust its assigned Agent from a_k to a′_k, such that a_k ≠ a′_k. Denote the allocation strategy at this time as D_t(1).
Step S24: Check the rationality of the strategy: determine whether D_t(1) contains an assignment, with i, j ∈ N and k ≠ i, that violates the allocation constraint; if so, adjust that assignment so that D_t(1) satisfies the constraint. Denote the allocation strategy at this time as D_t(2), with strategy profit value v_2.
Step S25: For each task in D_t(2) other than k and i, select in turn an Agent within that task's circle, and denote the newly adjusted strategy as D_t(3), i ∈ N, such that its profit value satisfies: v_3 ≥ v_2.
Step S26: Update the optimal strategy of the Multi Agent system. Define the optimal strategy value as v_max. If v_max < v_3, then set v_max = v_3.
Step S27: If v_max meets the requirement of the Multi Agent system, end the optimization process; otherwise, return to step S23.
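Steps S22 to S27 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function name `selection_optimize`, the `candidates` list (the Agents inside each task's circle), the caller-supplied `profit` function, the `target` and `max_iters` stopping parameters, and the rule that each Agent serves at most one task are all assumptions made for illustration. The conflict repair in S24 (reassigning the displaced task to a random free Agent in its circle, or leaving it unallocated) is likewise an assumed rule.

```python
import random

def selection_optimize(candidates, profit, target=None, max_iters=1000, seed=0):
    """candidates[k]: Agent ids inside the circle of task k (assumes >= 1 task).
    profit(assign): hypothetical evaluation function; assign[k] is an Agent id
    or None when task k is unallocated."""
    rng = random.Random(seed)
    n_tasks = len(candidates)

    def free_agents(assign, k):
        # Agents in the circle of task k that no other task currently uses.
        used = {a for j, a in enumerate(assign) if a is not None and j != k}
        return [a for a in candidates[k] if a not in used]

    # S22: random initial strategy D_t(0).
    assign = [None] * n_tasks
    for k in rng.sample(range(n_tasks), n_tasks):
        free = free_agents(assign, k)
        assign[k] = rng.choice(free) if free else None
    best, v_max = list(assign), profit(assign)

    for _ in range(max_iters):
        if target is not None and v_max >= target:  # S27: requirement met
            break
        # S23: pick a random task and switch it to a different Agent.
        k = rng.randrange(n_tasks)
        options = [a for a in candidates[k] if a != assign[k]]
        if not options:
            continue
        new_agent = rng.choice(options)
        # S24: repair a conflict if another task i already uses that Agent.
        i = next((j for j, a in enumerate(assign)
                  if j != k and a == new_agent), None)
        assign[k] = new_agent
        if i is not None:
            assign[i] = None
            free = free_agents(assign, i)
            assign[i] = rng.choice(free) if free else None
        v = profit(assign)  # v_2
        # S25: for every other task, keep a switch only if it raises profit,
        # so the resulting v_3 is never below v_2.
        for j in range(n_tasks):
            if j in (k, i):
                continue
            for a in free_agents(assign, j):
                trial = list(assign)
                trial[j] = a
                v_trial = profit(trial)
                if v_trial > v:
                    assign, v = trial, v_trial
        # S26: update the best strategy found so far.
        if v > v_max:
            best, v_max = list(assign), v
    return best, v_max
```

The sketch continues each iteration from D_t(3) while tracking the best strategy separately; whether the method restarts from D_t(3) or from the best strategy is not stated in the text, so this choice is also an assumption.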
The selection optimization method takes only the selectable Agents within a certain range as optimization candidates, which reduces the possible solution size and increases the optimization speed. The best-choice selection rule ensures that the allocation strategy keeps approaching the optimal allocation strategy as the iterations proceed.
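The restriction to Agents within a certain range around each task (step S21) can be sketched as below. This is an illustrative sketch only: the function names `select_radius` and `agents_in_circle`, the use of Euclidean distance, and the choice of the smallest radius that meets both conditions are assumptions, since the text only requires at least n_min Agents in each circle and a radius larger than r_min.

```python
import math

def select_radius(task_pos, agent_positions, n_min, r_min):
    """Smallest radius r_k that covers at least n_min Agents
    and is strictly larger than r_min (assumed selection rule)."""
    dists = sorted(math.dist(task_pos, p) for p in agent_positions)
    if n_min > len(dists):
        raise ValueError("fewer than n_min Agents exist in the system")
    # Distance to the n_min-th nearest Agent (0 if no Agent is required).
    r_cover = dists[n_min - 1] if n_min > 0 else 0.0
    # nextafter gives the smallest float strictly greater than r_min.
    return max(r_cover, math.nextafter(r_min, math.inf))

def agents_in_circle(task_pos, agent_positions, radius):
    """Indices of the Agents lying inside or on the circle around the task."""
    return [i for i, p in enumerate(agent_positions)
            if math.dist(task_pos, p) <= radius]
```

For example, with a task at (0, 0) and Agents at (1, 0), (3, 0) and (10, 0), requiring n_min = 2 and r_min = 2 yields a radius of 3 and the candidate set of the first two Agents.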
With the selection optimization method and a finite number of iterations, the Multi Agent system can obtain an optimal allocation strategy. At each moment, the numbers of tasks and Agents in the Multi Agent system are finite, so the number of selectable allocation strategies is also finite. When r_k is large enough that the circle centered on each task covers all the Agents in the Multi Agent system, the in-turn selection algorithm requires that the best strategy be chosen in each selection process, which ensures that the Multi Agent system obtains the optimal allocation strategy after a finite number of iterations.
The selection optimization algorithm can reduce the possible solution size. Suppose the number of Agents in the Multi Agent system is m, with Agents a_i, i = 1, 2, …, m, and the number of tasks is T_n, with T_n ≤ N. The number of Agents distributed around each task is x, with 1 ≤ x ≤ m. The possible solution size for the selection optimization method is:
In the selection optimization method, the choice of radius has an important influence on the optimal strategy obtained. When the selection radius is infinite, the selectable Agents of each task are all the Agents in the Multi Agent system, so the possible solution size of the selection optimization method is the same as that of strategy optimization based on a genetic algorithm, and the selection optimization method can obtain the globally optimal solution of the Multi Agent system. When the selection radius is reduced so as to reduce the possible solution size, the optimal Agents of some tasks may fall outside their selection circles, in which case the Multi Agent system cannot obtain the optimal allocation strategy. Therefore, in implementation, the choice of the selection radius must balance the possible solution size against strategy performance.
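The trade-off can be illustrated with a rough count. The sketch below assumes, purely for illustration, that each task independently chooses one of its x candidate Agents, which gives an upper bound of x raised to the number of tasks; this simplified count and the function name `solution_count` are assumptions and do not reproduce the equations of the present application.

```python
def solution_count(candidates_per_task, n_tasks):
    """Upper bound on the number of allocation strategies when every task
    independently picks one of its candidate Agents (assumed model)."""
    return candidates_per_task ** n_tasks

m, n = 20, 10  # hypothetical numbers of Agents and tasks
full = solution_count(m, n)  # radius covers all Agents: x = m
for x in (3, 5, 10, 20):
    reduced = solution_count(x, n)
    print(f"x={x:2d}  strategies={reduced}  fraction of full space={reduced / full:.3e}")
```

Under this assumed model, shrinking the circle so that x falls from m to a handful of Agents cuts the search space by many orders of magnitude, at the risk of excluding the best Agent for some task.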
The described embodiments are only some embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Claims (3)
1. A multi-unmanned aerial vehicle swarm cooperative task dynamic allocation method, characterized by comprising the following steps:
S1, establishing a collaborative task allocation model;
based on a Multi-Agent system, regarding each unmanned aerial vehicle as one Agent, and allocating and executing tasks by endowing the Agents with autonomous capability;
the collaborative task allocation model is described as a five-tuple:
{Time, Task, Agent, Policy_Set, Objective_Function} (equation 1)
where Time is the time, Task is the task, Agent is the unmanned aerial vehicle, Policy_Set is the strategy set, and Objective_Function is the evaluation function;
S2, optimizing the task allocation strategy of the collaborative task allocation model of S1 based on a selection optimization method;
drawing a circle of a certain radius centered on each task, and adjusting each task's selection among the Agents within its circle to achieve strategy optimization, wherein the selection mechanism ensures that each iteration moves the allocation strategy of the Multi Agent system in a better direction, and the Multi Agent system obtains the optimal allocation strategy after a finite number of iterations.
2. The multi-unmanned aerial vehicle swarm cooperative task dynamic allocation method of claim 1, characterized in that:
the task allocation strategy optimization of the selection optimization method in step S2 specifically comprises the following steps:
setting M Agents and N tasks in the Multi Agent system;
step S21: drawing a circle centered at the position of task T_k with radius r_k;
step S22: generating an initial allocation strategy;
randomly selecting one Agent for each task in the Multi Agent system, denoting the decision variable at this moment as D_t(0), and calculating the profit value:
step S23: randomly selecting a task T_k and adjusting its assigned Agent from a_k to a′_k, such that a_k ≠ a′_k, the allocation strategy at this time being denoted as D_t(1);
step S24: checking the rationality of the strategy: determining whether D_t(1) contains an assignment, with i, j ∈ N and k ≠ i, that violates the allocation constraint; if so, adjusting that assignment so that D_t(1) satisfies the constraint, the allocation strategy at this time being denoted as D_t(2), with strategy profit value v_2;
step S25: for each task in D_t(2) other than k and i, selecting in turn an Agent within that task's circle, and denoting the newly adjusted strategy as D_t(3), i ∈ N, such that its profit value satisfies: v_3 ≥ v_2;
step S26: updating the optimal strategy of the Multi Agent system; defining the optimal strategy value as v_max; if v_max < v_3, then setting v_max = v_3;
step S27: if v_max meets the requirement of the Multi Agent system, ending the optimization process; otherwise, returning to step S23.
3. The multi-unmanned aerial vehicle swarm cooperative task dynamic allocation method of claim 2, characterized in that:
in step S21, the radius of the circle is selected to satisfy: each circle contains at least n_min Agents, and the radius of each circle is larger than r_min.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210822637.3A CN115220473A (en) | 2022-07-12 | 2022-07-12 | Multi-unmanned aerial vehicle swarm cooperative task dynamic allocation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115220473A true CN115220473A (en) | 2022-10-21 |
Family
ID=83612666
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210822637.3A Pending CN115220473A (en) | 2022-07-12 | 2022-07-12 | Multi-unmanned aerial vehicle swarm cooperative task dynamic allocation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115220473A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115903885A (en) * | 2022-10-26 | 2023-04-04 | 中国人民解放军陆军炮兵防空兵学院 | Unmanned aerial vehicle flight control method based on task traction bee colony Agent model |
CN115903885B (en) * | 2022-10-26 | 2023-09-29 | 中国人民解放军陆军炮兵防空兵学院 | Unmanned aerial vehicle flight control method of swarm Agent model based on task traction |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111880563B (en) | Multi-unmanned aerial vehicle task decision method based on MADDPG | |
Zhen et al. | Improved contract network protocol algorithm based cooperative target allocation of heterogeneous UAV swarm | |
CN108680063B (en) | A kind of decision-making technique for extensive unmanned plane cluster dynamic confrontation | |
CN113009934A (en) | Multi-unmanned aerial vehicle task dynamic allocation method based on improved particle swarm optimization | |
Rasmussen et al. | Tree search algorithm for assigning cooperating UAVs to multiple tasks | |
Wang et al. | Improving maneuver strategy in air combat by alternate freeze games with a deep reinforcement learning algorithm | |
CN110928329A (en) | Multi-aircraft track planning method based on deep Q learning algorithm | |
Li et al. | Autonomous maneuver decision-making for a UCAV in short-range aerial combat based on an MS-DDQN algorithm | |
CN108153328A (en) | A kind of more guided missiles based on segmentation Bezier cooperate with path planning method | |
Zhang et al. | Dynamic mission planning algorithm for UAV formation in battlefield environment | |
CN111091273A (en) | Multi-missile cooperative task planning method based on capability prediction | |
CN111859541A (en) | PMADDPG multi-unmanned aerial vehicle task decision method based on transfer learning improvement | |
CN114330115B (en) | Neural network air combat maneuver decision-making method based on particle swarm search | |
CN114063644B (en) | Unmanned fighter plane air combat autonomous decision-making method based on pigeon flock reverse countermeasure learning | |
CN113608546B (en) | Unmanned aerial vehicle group task distribution method based on quantum sea lion mechanism | |
Wu et al. | Heterogeneous mission planning for multiple uav formations via metaheuristic algorithms | |
Yan et al. | Multi-UAV objective assignment using Hungarian fusion genetic algorithm | |
CN111773722B (en) | Method for generating maneuver strategy set for avoiding fighter plane in simulation environment | |
CN115220473A (en) | Multi-unmanned aerial vehicle swarm cooperative task dynamic allocation method | |
CN113887919A (en) | Hybrid-discrete particle swarm algorithm-based multi-unmanned aerial vehicle cooperative task allocation method and system | |
CN115755963A (en) | Unmanned aerial vehicle group cooperative task planning method considering carrier delivery mode | |
CN113671825A (en) | Maneuvering intelligent decision missile avoidance method based on reinforcement learning | |
Rasmussen et al. | Branch and bound tree search for assigning cooperating UAVs to multiple tasks | |
Liu et al. | Discrete pigeon-inspired optimization-simulated annealing algorithm and optimal reciprocal collision avoidance scheme for fixed-wing UAV formation assembly | |
CN116088586B (en) | Method for planning on-line tasks in unmanned aerial vehicle combat process |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||