CN112070383B - Dynamic task-oriented multi-agent distributed task allocation method

Info

Publication number: CN112070383B
Application number: CN202010898123.7A
Authority: CN
Inventors: 辛斌; 丁玉隆; 陈杰; 方浩; 杜鑫; 张昊
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2020-08-31
Filing date: 2020-08-31
Publication date: 2022-04-12
Anticipated expiration: 2040-08-31
Also published as: CN112070383A

Abstract

The invention provides a multi-agent distributed task allocation method for dynamic tasks. The method takes the dynamic evolution of the tasks into account and uses an auction-call algorithm to form a pre-allocation scheme. It is highly practical and can be used for tasks such as cooperative multi-point assembly, cooperative multi-target reconnaissance, and cooperative multi-target capture; it keeps the decisions of the agents free of conflict, effectively ensures task completion efficiency, and achieves a task allocation rate of 100%. In addition, the behavior and communication of the agents are based on a unified framework in which agents can be added or removed, which improves the robustness of the whole system and suits scenarios in which agents may be damaged or new agents may be added at any time.

Description

Dynamic task-oriented multi-agent distributed task allocation method

Technical Field

The invention belongs to the technical field of multi-agent task allocation, and particularly relates to a dynamic task-oriented multi-agent distributed task allocation method.

Background

With the development of unmanned-system theory and improvements in agent performance, using multiple agents to carry out dynamic tasks (such as forest fire fighting) that are highly dangerous, spatially dispersed, time-sensitive, and subject to various uncertainties is becoming a trend. In a forest fire fighting mission, the fire (the intensity of the task) may change over time due to natural or man-made factors. In the initial stage of a fire, a single agent can extinguish the fire source, but as the fire grows its intensity increases until a single agent can no longer meet the demands of the task, and several agents must complete the given fire fighting task together. Cooperation among multiple agents can effectively improve rescue efficiency and reduce both the number of rescuers required and the casualties among them.

In terms of solution mechanism, multi-agent task allocation can be divided into centralized and distributed types. In centralized allocation, one robot serves as the central planner, obtains the current task and environment state through communication with the other robots, and then performs global task allocation. This approach has good global properties but relies on a single central planner. In distributed task allocation there is no central planner: each agent autonomously selects and decides on tasks based on its own perception of the environment, and the agents form a task allocation scheme through mutual negotiation. Common distributed task allocation methods include market-mechanism-based methods, vacancy-chain-based methods, response-threshold methods, and the like. Distributed decision making is suited to parallel computation, offers better scalability and robustness, and is suitable for larger-scale systems.

However, in the existing distributed task allocation methods, task modeling mostly assumes that the intensity of the task does not change, and does not consider the situation that the intensity of the task changes along with time.

Disclosure of Invention

In order to solve the problems, the invention provides a multi-agent distributed task allocation method for dynamic tasks, which can effectively and quickly allocate a plurality of target tasks and enable the dynamic tasks to be completed as soon as possible.

A multi-agent distributed task allocation method for dynamic tasks, which allocates tasks to multiple agents in a distributed manner based on an auction-call algorithm, comprises the following steps:

S1: taking each agent in turn as the current agent and performing the following operations:

obtaining the bid value of the current agent $i$ for each uncompleted task, taking the uncompleted task with the maximum bid value as the task $j^*$ that the current agent $i$ is best at, constructing the bid vector $A^{(i)} = [i, j^*, f^i(j^*)]$ of the current agent $i$, and broadcasting the bid vector $A^{(i)}$ to the other agents, where $f^i(j^*)$ is the bid value of the current agent $i$ for its best task $j^*$;

S2: each agent compares the bid values in the received bid vectors with the bid value in its own bid vector and takes the bid vector with the minimum bid value as the final bid vector $A^{PM}$; the agent that broadcast the bid vector $A^{PM}$ serves as the task manager, and the task manager takes its best task as the next task $j^{PM}$ to be executed in its own task list;

S3: each agent determines whether the task manager can complete the task $j^{PM}$ independently. If the task manager cannot complete the task $j^{PM}$ independently, each agent obtains the difference between its bid value for the task $j^{PM}$ and its bid value for the task it is best at; after receiving the differences sent by the agents, the task manager sends task alliance invitations to the $m$ agents with the largest differences, so that these $m$ agents each take the task $j^{PM}$ as the next task to be executed in their own task lists, forming a task alliance, and the method then proceeds to step S4, where $m$ is the minimum number of agents that must be added so that the sum of the capability values of the agents participating in the task $j^{PM}$ is greater than the intensity state change rate of the task $j^{PM}$. If the task manager can complete the task $j^{PM}$ independently, the task manager forms a task alliance by itself and the method proceeds directly to step S4;

S4: removing the task $j^{PM}$ from the uncompleted tasks, and re-executing steps S1 to S3 until all tasks are allocated.

Further, before the task $j^{PM}$ is removed from the uncompleted tasks, the following steps are executed:

S3a: the task manager broadcasts call information to the agents that do not have the task $j^{PM}$ as the next task to be executed in their task lists, the call information comprising the number of each agent in the task alliance and the completion time of the task $j^{PM}$;

S3b: each agent that receives the call information computes, from its own position information and the call information, the completion time $t_{cc}$ with which it and the task alliance would complete the task $j^{PM}$ together, and sends the completion time $t_{cc}$ to the task manager;

S3c: the task manager calculates a time difference ratio TS from the minimum of all the completion times $t_{cc}$;

S3d: the task manager determines whether the time difference ratio TS is greater than a preset threshold; if so, the task manager sends a task alliance invitation to the agent corresponding to the minimum value, so that this agent takes the task $j^{PM}$ as the next task to be executed in its own task list, the task alliance is updated, and the method proceeds to step S3e; if not, the method proceeds to step S4;

S3e: steps S3a to S3d are re-executed with the updated task alliance until the time difference ratio TS is no longer greater than the preset threshold.

Further, the completion time $t_{cc}$ is calculated from the following quantities: $\varepsilon$, a set threshold value; the capability value of the $s_i$-th agent in the current task alliance, with $s_i = 1, 2, \ldots, m$; the time at which the $s_i$-th agent arrives at the location of the task $j^{PM}$; the time at which the $l$-th agent arrives at the location of the task $j^{PM}$; $b_l$, the capability value of the $l$-th agent, that is, the agent that received the call information; and the intensity state change rate of the task $j^{PM}$.

Further, after any agent completes all tasks in the task list, the agent obtains the tasks currently executed by other agents, obtains the completion time for completing each currently executed task according to the position information of the agent, the position information of each currently executed task, the task strength information and the corresponding task alliance, adds the currently executed task corresponding to the maximum completion time into the task list of the agent, and starts to execute the task.

Further, the bid value of the current agent $i$ for each uncompleted task is obtained from the following quantities: $f^i(j)$, the bid value of the current agent $i$ for the $j$-th task; $\lambda_j$, the intensity state change rate of the $j$-th task; the moment at which agent $i$ reaches the location of the $j$-th task; the current time; the position coordinates of the current agent $i$; the position coordinates of the $j$-th task; and the speed of the current agent $i$.

Further, the agent is an unmanned aerial vehicle, a robot, a reconnaissance aircraft, or an intelligent strike weapon.

Advantageous effects:

1. The invention provides a multi-agent distributed task allocation method for dynamic tasks. The method takes the dynamic evolution of the tasks into account and uses an auction-call algorithm to form a pre-allocation scheme. It is highly practical and can be used for tasks such as cooperative multi-point assembly, cooperative multi-target reconnaissance, and cooperative multi-target capture; it keeps the decisions of the agents free of conflict, effectively ensures task completion efficiency, and achieves a task allocation rate of 100%.

In addition, the behavior and communication of the agents are based on a unified framework in which agents can be added or removed, which improves the robustness of the whole system and suits scenarios in which agents may be damaged or new agents may be added at any time.

2. After a task alliance for completing a given task has been obtained, the method uses the call information of the current task alliance and the information of the agents outside the alliance to find other agents that can substantially shorten the task completion time, and invites them to join the task alliance, so that the task completion time is further shortened while task allocation efficiency is still effectively ensured.

3. After each agent finishes its own pre-allocated tasks, it dynamically adjusts its own plan according to the task execution status to assist other agents in completing their tasks, which accelerates completion of the overall mission.

Drawings

FIG. 1 is a flow chart of a multi-agent distributed task assignment method provided by the present invention;

FIG. 2 is a schematic diagram of a task state change when an agent is at a task point;

FIG. 3 is a schematic diagram of a pre-allocation scheme generation process;

FIG. 4 is a state change diagram of task point 1 in an embodiment of the present invention;

FIG. 5 is a state change diagram of task point 2 in an embodiment of the present invention;

FIG. 6 is a state change diagram of task point 3 in an embodiment of the present invention.

Detailed Description

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.

The invention provides a multi-agent distributed task allocation method. An agent is an abstract concept; an agent generally has the following capabilities: it can perceive dynamic conditions and information in the environment, it can perform actions that affect the environment, and it can reason to solve problems. The entities it maps to may be fire-extinguishing unmanned aerial vehicles in a forest fire, rescue robots in a natural disaster rescue task, or reconnaissance aircraft and intelligent strike weapons in a military strike mission.

All of the above scenarios share several features:

The ability of a single agent to accomplish a task is limited and varies between agents. Many situations require multiple agents to cooperate on a task at a target task point. For example, in disaster relief the fire-extinguishing ability, search ability, and so on of a robot are limited; their values are given in advance according to expert experience and are abstracted hereinafter as the "agent capability value".

The difficulty of each task varies. For example, in a forest fire fighting task, a small fire is an easy task that a single agent can complete, whereas a fire of high intensity is a difficult task that requires several agents to extinguish. Hereinafter, the difficulty of a task is represented by the "task intensity value".

The intensity value of a task evolves dynamically over time. For example, in a forest fire fighting task the fire is relatively small at the beginning and the task intensity value is correspondingly small; as time passes the fire grows stronger and the task intensity gradually increases.

Because several agents are required to carry out a task together, one agent is needed to negotiate and coordinate the task allocation scheme. Hereinafter, the agent responsible for coordinating the task allocation scheme is called the "manager".

The invention provides a multi-agent and multi-task distributed task allocation method. Aiming at the dynamic task of the dynamic evolution of the task intensity value along with the time, the method generates a fast and reasonable task allocation scheme through multi-agent distributed negotiation. The invention takes a two-stage task planning algorithm as a solving algorithm of distributed task allocation. The algorithm has high task allocation speed and can effectively prevent task allocation conflict.

As shown in FIG. 1, the present invention provides a multi-agent distributed task allocation method, which comprises the following steps:

Step 1: Initialize the agent information, which comprises: the agent number, agent position coordinates, agent speed, agent capability value, and agent state value.

The agent capability value is a quantified value of the agent's capability; the agent state value indicates the agent's status; the initial agent state value is 0, which indicates that the agent is idle.

In the embodiment of the invention, the agent information comprises the agent number $i$, the agent position coordinates, the agent speed, the agent capability value $b_i$, and the agent state value, whose initial value is 0. When agent $i$ reaches the $j$-th task point, the growth of the task state slows down or the state decreases, and the model of the task intensity state changes; the changed model depends on the time at which agent $i$ is predicted to reach the location of task $j$.

Initialize the task information, which comprises: the task number, task position, task initial state, task intensity, intensity state change rate of the task, task change model, and task threshold $\varepsilon$. The task intensity value is the agent capability required to complete the task. The task intensity and the agent capability value are quantified in the same way; for example, for a forest fire fighting task, the agent capability value may be the agent's quantified fire-extinguishing ability.
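As a rough illustration of the information initialized in step 1, the sketch below groups the listed fields into two Python records. The class and field names (`Agent`, `Task`, `capability`, `change_rate`, and so on) are illustrative assumptions rather than names used in the patent. The sample values come from Tables 1 and 2 of the embodiment; the initial intensity and the threshold are placeholders, because the text does not state them.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Agent:
    number: int                      # agent number i
    position: Tuple[float, float]    # agent position coordinates (m)
    speed: float                     # agent speed (m/s)
    capability: float                # agent capability value b_i
    state: int = 0                   # 0 = idle, 1 = executing a task
    task_list: List[int] = field(default_factory=list)


@dataclass
class Task:
    number: int                      # task number j
    position: Tuple[float, float]    # task position coordinates (m)
    change_rate: float               # intensity state change rate
    intensity: float = 1.0           # placeholder: initial intensity is not given in the text
    threshold: float = 0.01          # placeholder: threshold epsilon is not given in the text


# Values taken from Table 1 and Table 2 of the embodiment.
agents = [Agent(1, (40, -30), 2, 0.6), Agent(2, (10, -30), 2, 0.3), Agent(3, (-35, 30), 2, 0.4)]
tasks = [Task(1, (30, 10), 0.4), Task(2, (50, 30), 0.3), Task(3, (100, 50), 0.2)]
```

Every agent starts with state 0 (idle) and an empty task list, matching the initialization described above.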

In the embodiment of the invention, the task information comprises the task number $j$, the task position coordinates $(x_j^T, y_j^T)$, and the task intensity $x_j$. The task intensity $x_j$ is the agent capability value required to complete task $j$; it is a variable that changes with time $t$, and the intensity model of task $j$ assumes exponential growth, with $\alpha_j$ the intensity state change rate of the $j$-th task point in the current period. As shown in FIG. 2, when agents are within the effective range of a task point, the intensity of the task point is determined by the following quantities: $\lambda_j$, the intensity state change rate of the $j$-th task between time $k$ and time $k+1$; $x_j(k)$, the intensity state value of the $j$-th task point at time $k$; $m_j$, the number of agents present at the $j$-th task point between time $k$ and time $k+1$; and $\varepsilon$, the threshold of the intensity value.
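The text states only that the intensity of an unattended task is assumed to grow exponentially; the closed form is not given here. The sketch below therefore assumes the common form $x_j(t) = x_j(0)\,e^{\alpha_j t}$ purely for illustration. The function name `unattended_intensity` and the initial intensity of 1.0 are hypothetical, and the update that applies once agents are present at the task point is not modeled.

```python
import math


def unattended_intensity(x0: float, alpha: float, t: float) -> float:
    """Assumed exponential growth: x(t) = x0 * exp(alpha * t). Not taken from the patent."""
    return x0 * math.exp(alpha * t)


# Task point 1 of Table 2 has a change rate of 0.4; the initial intensity 1.0 is a placeholder.
samples = [unattended_intensity(1.0, 0.4, t) for t in (0.0, 1.0, 2.0)]
```

Under this assumption the intensity grows monotonically while no agent is present, which is why later arrival makes a task harder to complete.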

Step 2: Each agent (for example, agent $i$) calculates its bid value $f^i(j)$ for every uncompleted task $j$, takes the task with the highest bid value as the task $j^*$ it is best at, and broadcasts the bid vector $A^{(i)} = [i, j^*, f^i(j^*)]$ to the other agents.

In the embodiment of the invention, the bid value of agent $i$ for task $j$ is computed from the intensity state change rate of task $j$ and the moment at which agent $i$ is estimated to arrive at the location of task $j$. This arrival moment is in turn computed from the current time, which is initialized to 0, the position coordinates of agent $i$ and of task $j$, and the speed of agent $i$.
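A minimal sketch of step 2 follows. The arrival-time expression is an assumption reconstructed from the listed inputs (current time, agent position, task position, agent speed): the current time plus straight-line distance divided by speed. The bid formula itself is not available in the text, so `best_task` simply takes precomputed bid values; the numbers passed to it are made up, and all function names are hypothetical.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def arrival_time(now: float, agent_pos: Point, task_pos: Point, speed: float) -> float:
    """Assumed form: current time plus straight-line travel time."""
    return now + math.dist(agent_pos, task_pos) / speed


def best_task(agent_id: int, bids: Dict[int, float]) -> Tuple[int, int, float]:
    """Build the bid vector [i, j*, f_i(j*)] for the task with the highest bid value."""
    j_star = max(bids, key=bids.get)
    return (agent_id, j_star, bids[j_star])


# Agent 1 (Table 1) travelling to task point 1 (Table 2), starting at time 0.
t_arrive = arrival_time(0.0, (40, -30), (30, 10), 2.0)

# Made-up bid values for three uncompleted tasks.
bid_vector = best_task(1, {1: 0.2, 2: 0.5, 3: 0.1})
```

The resulting tuple plays the role of the bid vector that the agent would broadcast to the others.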

Step 3: Each agent receives the bid vectors $A^{(i)}$ broadcast by the other agents and compares each received bid vector with its own $A^{(i)}$. If the bid value $f^i$ in a received bid vector is strictly less than the $f^i$ in its own $A^{(i)}$, the agent replaces its own $A^{(i)}$ with the received bid vector; otherwise its own $A^{(i)}$ is kept unchanged. This process continues until each agent has compared all the bid vectors it has received.

That is, after each agent receives the best tasks and corresponding bid values of the other agents, it compares them with its own to determine the task to be allocated in the current round and the task manager. Through the above process the $A^{(i)}$ of all agents become identical; at this point $A^{(i)}$ is denoted $A^{PM}$, and the agent $i^{PM}$ named in it is taken to be the task manager of the task $j^{PM}$.
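The comparison rule of step 3 can be sketched as a small simulation in which every agent applies the "strictly less" replacement to each vector it receives. The bid values below are made up and assumed to be distinct; how ties are broken is not specified in the text, so the sketch does not handle them. The names `reach_consensus` and `BidVector` are hypothetical.

```python
from typing import Dict, Tuple

BidVector = Tuple[int, int, float]  # (agent i, best task j*, bid value)


def reach_consensus(broadcast: Dict[int, BidVector]) -> Dict[int, BidVector]:
    """Each agent replaces its own vector only when a received bid is strictly lower."""
    result = {}
    for agent_id, current in broadcast.items():
        for sender_id, received in broadcast.items():
            if sender_id != agent_id and received[2] < current[2]:
                current = received
        result[agent_id] = current
    return result


# Hypothetical bid vectors; distinct bid values are assumed so that no tie occurs.
broadcast = {1: (1, 1, 0.8), 2: (2, 1, 0.5), 3: (3, 2, 0.7)}
final = reach_consensus(broadcast)
manager, task, _ = final[1]
```

Because every agent sees the same set of vectors and keeps the one with the lowest bid, all agents end up naming the same manager and task without a central planner.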

Step 4: Each agent determines, from the capability value of the task manager, whether the task manager can complete the task independently. If the capability value of the task manager is greater than or equal to the intensity state change rate of the task $j^{PM}$, the task manager can complete the task $j^{PM}$ independently, and the method jumps to step 8; otherwise step 5 is executed.

Step 5: Each agent calculates its bid value for completing the task $j^{PM}$ together with the task manager, denoted $V_1$, and at the same time calculates its bid value for completing its own best task, denoted $V_2$. It then transmits $V_1 - V_2$ to the task manager.

Step 6: After receiving the bid differences transmitted by the agents, the task manager sorts them from high to low. In that order it selects the first $m$ agents, where $m$ is the number of agents whose capability values, added to the capability value of the task manager, are sufficient to complete the task, and issues invitations to these agents to form a task alliance.
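Steps 4 to 6 can be sketched as follows, assuming that "sufficient to complete the task" means the summed capability exceeds the intensity state change rate of the task. The function name `select_alliance` and the bid differences passed to it are hypothetical; only the capability values and change rates are borrowed from Tables 1 and 2, and the call does not reproduce the allocation obtained in the embodiment.

```python
from typing import Dict, List


def select_alliance(manager_cap: float, change_rate: float,
                    diffs: Dict[int, float], caps: Dict[int, float]) -> List[int]:
    """Invite agents in descending order of V1 - V2 until capability exceeds the rate."""
    if manager_cap >= change_rate:       # step 4: the manager can work alone
        return []
    total = manager_cap
    invited: List[int] = []
    for agent_id in sorted(diffs, key=diffs.get, reverse=True):
        invited.append(agent_id)
        total += caps[agent_id]
        if total > change_rate:          # assumed stopping rule for step 6
            break
    return invited


# Hypothetical round: a manager with capability 0.3 handles a task with change rate 0.4.
invited = select_alliance(0.3, 0.4, {1: -0.1, 3: -0.3}, {1: 0.6, 3: 0.4})
```

If the candidates run out before the capability is reached, this sketch simply returns everyone it has invited; the text does not say what should happen in that case.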

Step 7: After an agent receives the invitation to form a task alliance, it adds the task $j^{PM}$ to its own task list, sets its agent state to 1, and starts to execute the task.

Step 8: To further shorten the task completion time, the task manager sends call information to the agents that are not executing the task $j^{PM}$; the information includes the agent numbers of the current task alliance and the current completion time of the task $j^{PM}$.

Step 9: An agent that receives the call information (say agent $l$) calculates, from the call information of the task manager and its own position, the task completion time $t_{cc}$ with which it would complete the task $j^{PM}$ together with the already formed task alliance, and sends $t_{cc}$ to the task manager.

Step 10: renThe method comprises the following steps that a business manager selects a proper intelligent agent according to task completion time returned by the intelligent agent, and specifically comprises the following steps: after the task manager receives the task completion time sent by the intelligent agent to be called, the intelligent agent with the lowest task completion time is selected and recorded as the intelligent agent i^*Its corresponding task completion time is recorded as

Computing

If TS is larger than the set threshold, the intelligent agent i is sent^*And sending out a task alliance composition invitation.

In the embodiment of the present invention, the threshold is set to 0.25.
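The expression for TS is not available in the text. The sketch below assumes that TS is the relative reduction in completion time, (current time minus best offered time) divided by the current time; this definition, the function names, and the sample times are all assumptions made for illustration. Only the threshold of 0.25 comes from the embodiment.

```python
from typing import Dict, Optional

THRESHOLD = 0.25  # threshold value used in the embodiment


def time_difference_ratio(t_current: float, t_with_helper: float) -> float:
    """Assumed definition of TS: relative reduction of the completion time."""
    return (t_current - t_with_helper) / t_current


def pick_helper(t_current: float, offers: Dict[int, float]) -> Optional[int]:
    """Return the agent with the lowest offered completion time if TS exceeds the threshold."""
    if not offers:
        return None
    best = min(offers, key=offers.get)
    if time_difference_ratio(t_current, offers[best]) > THRESHOLD:
        return best
    return None


# Made-up completion times (s) returned by two called agents.
helper = pick_helper(100.0, {1: 70.0, 3: 90.0})
```

With these made-up numbers agent 1 would be invited, since it would shorten the completion time by more than a quarter; an offer of 80 s would not pass the threshold.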

Step 11: After receiving the invitation to join the task alliance, the agent adds the task $j^{PM}$ to its own task list, sets its agent state to 1, and starts to execute the task.

Step 12: Steps 8 to 11 are repeated for the agents that are not executing the task $j^{PM}$ until TS is less than the threshold, so that all agents that help to shorten the completion time of the task $j^{PM}$ join the task alliance.

Step 13: Remove the task $j^{PM}$ from the uncompleted tasks and repeat steps 2 to 13 with the remaining uncompleted tasks until all tasks are allocated. At this point each agent has its pre-allocated tasks, and the agents execute tasks according to their respective pre-allocation schemes.

Step 14: After an agent finishes all of its pre-allocated tasks, it sets its agent state value to 0.

Step 15: An agent whose state is 0 dynamically adjusts its own plan to assist other agents with the tasks that are still uncompleted, which accelerates completion of the overall mission. Specifically, for every task that is uncompleted at this moment, the agent calculates the task completion time if it were to assist; it then selects the task with the longest task completion time, adds it to its own task list, starts to execute it, and sets its agent state to 1.

Step 16: steps 14-16 are executed in a loop until all tasks have been performed.

The following is a specific example:

The distribution of agents and task points is shown in FIG. 3. The parameters of the agents in this multi-agent cooperative task allocation scenario are shown in Table 1:

TABLE 1

Agent	Speed (m/s)	Position (m)	Capability value
Agent 1	2	(40,-30)	0.6
Agent 2	2	(10,-30)	0.3
Agent 3	2	(-35,30)	0.4

The parameters of the task points in this multi-agent cooperative task allocation scenario are shown in Table 2:

TABLE 2

Task point	Position (m)	Rate of change of state
Task point 1	(30,10)	0.4
Task point 2	(50,30)	0.3
Task point 3	(100,50)	0.2

The task execution sequence of each agent obtained by the task allocation in this embodiment is shown in Table 3:

TABLE 3

Agent	Order of execution of tasks
Agent 1	Task point 1 → task point 3
Agent 2	Task point 1 → task point 2 → task point 3
Agent 3	Task point 2 → task point 3

The task execution time obtained in this embodiment is 116.8 s. The state changes of the task points during the task are shown in FIGS. 4 to 6.

Thus, taking the dynamic evolution of the tasks into account, the algorithm forms a pre-allocation scheme using an auction-call algorithm, so that the decisions of the agents are free of conflict and task completion efficiency is effectively ensured. After each agent finishes its pre-allocated tasks, it dynamically adjusts its own plan according to the task execution status to assist other agents, which accelerates completion of the overall mission. The algorithm is highly practical, can be used for tasks such as cooperative multi-point assembly, cooperative multi-target reconnaissance, and cooperative multi-target capture, does not generate conflicting task allocation schemes, and achieves a task allocation rate of 100%. The behavior and communication of the agents are based on a unified framework in which agents can be added or removed, which improves the robustness of the whole system and suits scenarios in which agents may be damaged or new agents may be added at any time.

The present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof, and it will be understood by those skilled in the art that various changes and modifications may be made herein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A multi-agent distributed task allocation method for dynamic tasks, characterized in that tasks are allocated to multiple agents in a distributed manner based on an auction-call algorithm, the method specifically comprising the following steps:

obtaining the bid value of the current agent $i$ for each uncompleted task, taking the uncompleted task with the maximum bid value as the task $j^*$ that the current agent $i$ is best at, constructing the bid vector $A^{(i)} = [i, j^*, f^i(j^*)]$ of the current agent $i$, and broadcasting the bid vector $A^{(i)}$ to the other agents, where $f^i(j^*)$ is the bid value of the current agent $i$ for its best task $j^*$;

2. The dynamic-task-oriented multi-agent distributed task allocation method as claimed in claim 1, wherein before the task $j^{PM}$ is removed from the uncompleted tasks, the following steps are executed:

calculating a time difference ratio TS:

3. The dynamic-task-oriented multi-agent distributed task allocation method as claimed in claim 2, wherein the completion time $t_{cc}$ is calculated using a set threshold value $\varepsilon$ and the intensity state change rate of the task $j^{PM}$.

4. The multi-agent distributed task allocation method for dynamic tasks as claimed in claim 1, wherein when any one agent completes all tasks in its task list, the agent obtains the tasks currently executed by other agents, obtains the completion time for itself to complete each currently executed task according to its location information, the location information of each currently executed task, the task strength information, and the corresponding task alliance, adds the currently executed task corresponding to the maximum completion time value to its own task list, and starts to execute the task.

5. The dynamic-task-oriented multi-agent distributed task allocation method as claimed in claim 1, wherein the bid value of the current agent $i$ for each uncompleted task is obtained from: the moment at which agent $i$ reaches the location of the $j$-th task, the current time, the position coordinates of the current agent $i$, the position coordinates of the $j$-th task, and the speed of the current agent $i$.

6. The dynamic-task-oriented multi-agent distributed task allocation method as claimed in claim 1, wherein the agents are unmanned aerial vehicles, robots, reconnaissance aircraft, or intelligent strike weapons.