CN115809547A - Multi-agent cooperative task allocation method based on non-dominated sorting and improved particle swarm algorithm

Info

Publication number: CN115809547A
Application number: CN202211459220.1A
Authority: CN
Inventors: 高阳; 彭张弛; 钱晨; 吴潇瑞; 黄卓; 陈庆伟; 吴益飞
Original assignee: Nanjing University of Science and Technology
Current assignee: Nanjing University of Science and Technology
Priority date: 2022-11-17
Filing date: 2022-11-17
Publication date: 2023-03-17

Abstract

The invention discloses a multi-agent cooperative task allocation method based on a non-dominated sorting improved particle swarm algorithm. The method establishes a multi-agent cooperative task allocation model that simultaneously optimizes multiple indexes such as strike profit, resource consumption and damage probability. To suit the characteristics of the multi-agent cooperative task allocation problem, multi-objective particle swarm optimization is combined with a non-dominated sorting algorithm and a crossover and mutation mechanism, a nonlinear method for setting the inertia weight is designed, and a maximum distance method is provided for selecting the optimal solution from the pareto solution set produced by the algorithm, which improves the global search capability and the engineering applicability of the algorithm. Compared with traditional optimization methods, the method achieves better convergence and accuracy when solving multi-objective optimization problems in a dynamic environment.

Description

Multi-agent cooperative task allocation method based on non-dominated sorting and improved particle swarm algorithm

Technical Field

The invention belongs to the field of multi-agent cooperative control, and particularly relates to a multi-agent cooperative task allocation method based on a non-dominated sorting improved particle swarm algorithm.

Background

With the development of unmanned technology, agents can gradually replace humans in executing complex tasks in tedious, harsh and dangerous environments, such as rescue and detection, wide-area search, suppression of and strikes on air defenses, electronic attack, and intelligence reconnaissance and surveillance. A single agent cannot complete such complex tasks on its own; multiple agents usually have to cooperate to complete them. The rationality and effectiveness of the task allocation scheme is therefore critical to the whole operation.

To allocate tasks effectively among multiple agents, researchers in China and abroad have carried out a number of studies. Existing work mainly solves the problem with intelligent optimization algorithms, intelligent search and similar methods. For example, a genetic algorithm is prone to irregular encoding and premature convergence; the widely used particle swarm algorithm avoids the irregular encoding problem of the genetic algorithm, but its parameters are difficult to tune and it cannot balance the global and local search capability of the particles well. Moreover, for a specific multi-objective optimization problem, common intelligent optimization algorithms generally convert the multiple objectives into a single objective by weighting, and because the choice of weights is subjective, the optimization objectives are often poorly balanced.

Disclosure of Invention

The invention aims to solve the problems in the prior art and provides a multi-agent cooperative task allocation method based on a non-dominated sorting improved particle swarm algorithm, which comprises the following steps:

step 1, establishing a target profit model for multi-agent task allocation by combining operation environment information;

step 2, establishing a loss cost model for multi-agent task allocation by combining the operation environment information;

step 3, establishing a multi-agent cooperative task allocation model based on the multi-target function obtained in the step 1 and the step 2 and by combining constraint conditions in the task execution process of the agents;

step 4, solving the model obtained in the step 3 by using an improved particle swarm algorithm based on non-dominated sorting to obtain a pareto solution set;

step 5, obtaining a pareto optimal solution by a maximum distance method based on the pareto solution set obtained in step 4.
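As an illustration of the pareto solution set referred to in steps 4 and 5, the following minimal Python sketch splits candidate solutions into non-dominated fronts. It assumes that every objective is minimized. The function names `dominates` and `non_dominated_sort` are hypothetical, and the simple quadratic-time loop is chosen for readability; it is not taken from the patent text.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b, assuming all objectives are minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(objs: List[Sequence[float]]) -> List[List[int]]:
    """Split solution indices into fronts; front 0 is the pareto solution set."""
    remaining = list(range(len(objs)))
    fronts: List[List[int]] = []
    while remaining:
        # A solution belongs to the current front if no remaining solution dominates it.
        front = [
            i for i in remaining
            if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)
        ]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts
```

The first front returned is the set of solutions that no other solution dominates; later fronts could be used to rank the rest of the population when choosing the next generation.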

Preferably, the step 1 of establishing a target profit model when the multi-agent performs the task according to the operating environment information specifically includes:

the equipment model adopted when the ith agent executes the task on the target j is recorded as

The equipment has the suitability degree of

The hit rate to the target j is

The damage P_ij to target j when the ith agent performs a task on target j is determined by both, i.e.

The value corresponding to target j is V_j, so the target value gain of the agent formation is:

P_ij × V_j

thus, the overall revenue model for the overall multi-agent system task allocation is shown by the following equation:

where M is the number of agents, N is the number of targets,

V_j is the value of target j, and V_max represents the maximum target value; X_ij is the agent allocation scheme, represented by a task allocation decision matrix defined as follows:

Preferably, the step 2 of establishing a loss cost function for agent task allocation according to the operating environment information specifically includes:

(1) Shortest flight distance index f₂

Let

be the flight length of the ith agent when it selects path p. Since every agent may face multiple target tasks, let target k be the first target point to which the agent flies, and let

represent the other target points after k, where T_max is the maximum number of targets an agent can execute in one task. Let D_ik be the flight distance of the ith agent from its initial position to target point k, and let

be the flight distance of the ith agent from target point k to target point r; the agent formation flight distance is then expressed as:

where L_max is the maximum flight distance of a single agent when executing a task, M is the number of participating agents, N is the number of targets, and L_max·M is a normalization factor;

where D_ikmax is the maximum flight distance of the ith agent from its initial position to target point k, and

is the maximum flight distance of the ith agent from target point k to target point r; T_max - 1 is the maximum number of times the agent can perform a task;

therefore, the shortest flight distance index of the agent is:

(2) Minimum self-loss cost index f₃

The minimum self-loss cost index is formalized as shown in the following formula:

where the model of the equipment used to execute the task on target j is denoted

and

is the unit cost of equipment of that model.

(3) Maximum sub-target coverage index f₄

The maximum sub-target coverage index is formalized as shown in the following formula:

Preferably, the step 3 of adding the constraint conditions that the agent formation faces when executing tasks to the models obtained in steps 1 and 2, and establishing an overall model for multi-agent cooperative task allocation, specifically includes:

The four indexes are combined for multi-objective optimization, giving the overall evaluation function, i.e. the overall model for multi-agent cooperative task allocation:

min f = [f₁; f₂; f₃; f₄]

the constraint conditions include:

(1) Multi-agent cooperative constraint c₁: to ensure that the agents cooperate during task execution and to prevent problems such as invalid tasks, repeated tasks and a mismatch between the number of agents and the number of targets, constraints need to be added to the established model, mainly the following:

for the agent, any one target point can be executed by the agent at most once, namely:

for the target point, the number of tasks an agent executes cannot exceed the task load the agent can bear, namely:

where Z_imax is the task load that the ith agent can bear;

for the tasks, all tasks must be executed, namely:

where N_type is the number of task types executed;

(2) Multi-agent operating radius constraint c₂: when multiple agents cooperatively execute tasks, the radius within which tasks can be executed is limited, namely:

where R_i, i = 1, 2, ..., M, is the operating radius of the ith agent.
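The constraints above can be illustrated with a small feasibility check. This is only a sketch under stated assumptions: the allocation is taken to be a binary M × N matrix (agent i serves target j when the entry is 1), the radius constraint is read as "the flight distance of agent i must not exceed R_i", and the flight distances are supplied by the caller. The constraint involving N_type is left out because its formula is not reproduced in the text. The name `is_feasible` and all parameter names are hypothetical.

```python
from typing import List, Sequence


def is_feasible(
    x: List[List[int]],             # assumed binary allocation matrix, M agents x N targets
    max_load: Sequence[int],        # Z_imax for each agent
    route_length: Sequence[float],  # flight distance of each agent (computed elsewhere)
    radius: Sequence[float],        # operating radius R_i of each agent
) -> bool:
    """Check the per-agent constraints c1 (in part) and c2 for one allocation."""
    for i, row in enumerate(x):
        # Each target is executed by a given agent at most once: entries are 0 or 1.
        if any(cell not in (0, 1) for cell in row):
            return False
        # The number of tasks of agent i must not exceed its task load.
        if sum(row) > max_load[i]:
            return False
        # The flight distance of agent i must stay within its operating radius.
        if route_length[i] > radius[i]:
            return False
    return True
```

In a particle swarm setting, a check like this could be used to reject or repair particles that decode to infeasible allocations; the patent text does not say which of the two is done.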

Preferably, the step 4 is to solve the model obtained in the step 3, and the specific steps are as follows:

step 4.1, initializing a particle swarm according to the constraint condition of multi-agent task allocation input in the step 3, randomly setting the speed and the position of each particle, setting t =0, and randomly generating an initial solution;

step 4.2, computing the fitness of each particle according to the overall multi-agent cooperative task allocation model, storing the position and fitness value of each particle in its individual extreme value p_best, and storing the position and fitness value of the individual with the best fitness among all p_best in the global extreme value g_best;

step 4.3, updating the particle position and velocity:

x_{i,j}(t+1) = x_{i,j}(t) + v_{i,j}(t+1)

v_{i,j}(t+1) = w·v_{i,j}(t) + c₁r₁[p_{i,j} - x_{i,j}(t)] + c₂r₂[p_{g,j} - x_{i,j}(t)]

where x is the position of the particle, v is the velocity of the particle, c₁ and c₂ are the learning factors of the particle, r₁ and r₂ are random numbers in (0,1), p_{i,j} is the local optimum of the particle, p_{g,j} is the global optimum of the particle, and w is the inertia weight; the value of w is guided by the difference between the particle position and the current optimal position and is adjusted nonlinearly according to that difference; the difference between the ith particle and the globally optimal particle at time j,

is

where D is the dimension of the solution space; x_max and x_min are the upper and lower bounds of the particle position components, respectively;

indicates the globally optimal particle position at time j,

represents the position of the ith particle at time j;

is the inertia weight of particle i at time j; and w_start and w_end are the initial and final values of w, respectively;

step 4.4, comparing the fitness value of each particle with that of its best position p_best, and if the difference between them is within a preset range, taking the current particle as its best position; then comparing all current p_best with g_best and updating g_best;

step 4.5, introducing crossover and mutation operations: whether to apply crossover and mutation is decided according to the difference X between the particle position components and the global optimal position, so that particles can quickly escape from a local optimum; the specific steps are as follows:

(1) determining the threshold X_min of X, the crossover rate p_c and the mutation rate p_m;

(2) judging whether the difference X_i of particle i is smaller than the threshold; if so, continuing with the following steps, otherwise skipping them;

(3) drawing a random number r in [0,1] for each dimension of particle i, the number for the jth dimension being r_ij; if r_ij < p_m, performing the mutation operation:

x_ij = x_min + (x_max - x_min) r

(4) then judging whether the random number r_ij of the jth dimension of the particle is smaller than the crossover rate p_c; if so, performing the crossover operation on the jth dimension, where the crossover partner is the global optimal solution and its jth dimension is assigned to the jth dimension of the particle;

step 4.6, sorting the particles by their objective function values using the non-dominated sorting method, and selecting the next-generation population according to the sorting result;

step 4.7, stopping searching and outputting a result when the algorithm reaches a preset stop condition; otherwise go to step 4.3 to continue the search.
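The velocity and position update of step 4.3 can be sketched in Python as follows. The inertia weight `w` is passed in by the caller, because the nonlinear weight formula is not reproduced in the text. Clamping positions to [x_min, x_max], the default learning factors of 2.0 and the name `update_particle` are assumptions made for this illustration only.

```python
import random
from typing import List, Optional


def update_particle(
    x: List[float],
    v: List[float],
    p_best: List[float],
    g_best: List[float],
    w: float,
    c1: float = 2.0,   # assumed default learning factor
    c2: float = 2.0,   # assumed default learning factor
    x_min: float = 0.0,
    x_max: float = 1.0,
    rng: Optional[random.Random] = None,
) -> None:
    """Update velocity v and position x of one particle in place."""
    if rng is None:
        rng = random.Random()
    for j in range(len(x)):
        # random() returns values in [0, 1), used here for r1 and r2.
        r1, r2 = rng.random(), rng.random()
        v[j] = w * v[j] + c1 * r1 * (p_best[j] - x[j]) + c2 * r2 * (g_best[j] - x[j])
        # Assumption: keep the new position inside the allowed bounds.
        x[j] = min(max(x[j] + v[j], x_min), x_max)
```

A particle that already sits on both its own best position and the global best with zero velocity does not move, which is the stagnation case that the crossover and mutation of step 4.5 is meant to address.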

Preferably, the pareto optimal solution is obtained by a maximum distance method, which specifically includes:

over all non-dominated solutions in the pareto solution set, the maximum value of each objective function is calculated, and the worst-index vector is determined as follows:

where

represents the value of the nth objective function in the ith non-dominated solution;

the distance between each non-dominated solution and the worst-index vector is defined as follows:

This yields the distance set D = {D₁, D₂, ..., D_i}, where D_i is the distance value obtained for the ith non-dominated solution. The solution with the largest distance value is selected as the final solution; Worst_F denotes the worst-index vector,

represents the value of the nth objective function in the ith non-dominated solution, and n is the index of the objective function.
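A minimal Python sketch of the maximum distance selection follows. Because the distance formula itself is not reproduced in the text, the sketch assumes a plain Euclidean distance on unnormalized objective values; the function name `select_by_max_distance` is hypothetical.

```python
import math
from typing import List, Sequence


def select_by_max_distance(pareto: List[Sequence[float]]) -> int:
    """Return the index of the solution farthest from the worst-index vector."""
    n_objectives = len(pareto[0])
    # Worst-index vector: the maximum of each objective over the pareto set.
    worst = [max(sol[n] for sol in pareto) for n in range(n_objectives)]
    # Assumption: Euclidean distance between each solution and the worst vector.
    distances = [math.dist(sol, worst) for sol in pareto]
    return max(range(len(pareto)), key=distances.__getitem__)
```

With objectives on very different scales, one objective could dominate the distance, so normalizing the objectives first may be worth considering; the text does not state whether this is done.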

The invention also discloses a multi-agent cooperative task allocation system based on the non-dominated sorting improved particle swarm algorithm, which comprises:

the first construction module is used for establishing a task allocation target profit model in the task execution process of the intelligent agent according to the operation environment information;

the second construction module is used for establishing a loss cost model in the process of executing the task by the intelligent agent according to the operation environment information;

the third construction module is used for adding constraint conditions in the task execution process of the intelligent agent based on the models obtained by the first construction module and the second construction module, and establishing an integral model for multi-intelligent-agent cooperative task distribution;

and the solving module is used for solving the model obtained by the third building module by utilizing the improved particle swarm optimization algorithm based on the non-dominated sorting.

The invention also discloses computer equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the steps of the method are realized when the processor executes the computer program.

The invention also discloses a computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of the invention.

Beneficial effects:

The invention improves on the multi-objective particle swarm algorithm: the inertia weight is set according to the degree of difference between each particle and the current optimal particle, a non-dominated sorting algorithm and a crossover and mutation mechanism are integrated, and a maximum distance method is designed to select the optimal solution from the pareto solution set.

Drawings

FIG. 1 is a flow chart of a multi-agent cooperative task allocation method based on a non-dominated sorting improved particle swarm algorithm.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments.

In one embodiment, in combination with fig. 1, a multi-agent cooperative task allocation method based on non-dominated sorting improved particle swarm optimization is provided, which comprises the following steps:

step 4, solving the model obtained in step 3 by using an improved particle swarm algorithm based on non-dominated sorting.

Further, in one embodiment, the step 1 of establishing a target profit model in the task execution process of the multi-agent according to the operating environment information specifically includes:

The value gain of the multi-agent system attacking a single sub-target is the product of the damage degree of the sub-target and its value; when the agents execute tasks as a cluster, the damage degree of a sub-target refers to the damage the cluster inflicts on it.

The equipment has the suitability degree of

The hit rate to the target j is

The value corresponding to target j is V_j, so the target value gain of the agent formation is:

P_ij × V_j

where M is the number of agents, N is the number of targets,

V_j is the value of each sub-target, and V_max represents the maximum target value; X_ij is the agent allocation scheme, which can be represented by a task allocation decision matrix defined as follows:

Further, in one embodiment, the step 2 of establishing a loss cost model during task execution of the multi-agent system according to the operating environment information specifically includes:

(1) Shortest flight distance index f₂

Let

be the flight length of the ith agent formation when it selects path p, let target node k be the first target point on this path, and let

represent the other nodes after target node k, where T_max is the maximum number of targets an agent can execute in one task. Let D_ik be the flight distance of the ith agent from its initial position to target node k, and let

be the flight distance of the ith agent from target node k to target node r. The agent formation flight distance can be expressed as:

where L_max is the maximum flight distance of a single agent when executing a task, M is the number of participating agents, and L_max·M is a normalization factor.

Here D_ikmax is the maximum flight distance of the ith agent from its initial position to target node k, and

is the maximum flight distance of the ith agent from target node k to target node r. T_max - 1 is the maximum number of times the agent can perform a task.

Therefore, the shortest flight distance index of the agent is:

(2) Minimum self-loss cost index f₃

The minimum self-loss cost index is formalized as shown in the following formula:

where

is the unit cost of equipment of the corresponding model.

(3) Maximum sub-target coverage index f₄

Further, in one embodiment, adding the constraints that apply during task execution of the multi-agent system to the models obtained in step 1 and step 2, and establishing an overall model for multi-agent cooperative task allocation, specifically includes:

min f = [f₁; f₂; f₃; f₄]

The constraint conditions include:

for the agent, any one target point can be executed by the agent at most once, namely:

where Z_imax is the task load that the ith agent can bear.

For the tasks, all tasks must be executed, namely:

where N_type is the number of task types executed.

(2) Multi-agent operating radius constraint c₂: when multiple agents cooperatively execute tasks, factors such as the fuel each agent can carry limit the radius within which tasks can be executed, and this is also taken into account in the modeling. The flight distance of the executed tasks should stay within the agent's own operating radius, namely:

where R_i, i = 1, 2, ..., M, is the operating radius of the ith agent.

Further, in one embodiment, the solving of the model obtained in step 3 by using the improved particle swarm optimization algorithm based on the non-dominated sorting in step 4 specifically includes:

step 4.3, updating the particle position and velocity:

x_{i,j}(t+1) = x_{i,j}(t) + v_{i,j}(t+1)

where x is the position of the particle, v is the velocity of the particle, c₁ and c₂ are the learning factors of the particle, r₁ and r₂ are random numbers in (0,1), p_{i,j} is the local optimum of the particle, p_{g,j} is the global optimum of the particle, and w is the inertia weight. Because the inertia weight w is conventionally decreased or increased linearly, and the particles provide no guidance for the value of w during the iterations, the algorithm instead guides the value of w by the difference between the particle position and the current optimal position and adjusts w nonlinearly according to that difference. The difference between the ith particle and the globally optimal particle at time j,

is

indicates the globally optimal particle position at time j,

represents the position of the ith particle at time j;

step 4.4, comparing the fitness value of each particle with that of its best position p_best, and if the difference between them is within a preset range, taking the current particle as its best position; then comparing all current p_best with g_best and updating g_best;

step 4.5, introducing crossover and mutation operations: whether to apply crossover and mutation is decided according to the difference X between the particle position components and the global optimal position, so that particles can quickly escape from a local optimum. The specific steps are as follows:

(1) determining the threshold X_min of X, the crossover rate p_c and the mutation rate p_m;

x_ij = x_min + (x_max - x_min) r

(4) then judging whether the random number r_ij of the jth dimension of the particle is smaller than the crossover rate p_c; if so, performing the crossover operation on the jth dimension, where the crossover partner is the global optimal solution and its jth dimension is assigned to the jth dimension of the particle.
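The crossover and mutation of step 4.5 can be sketched in Python as follows. The difference value of the particle is passed in as `diff` because its formula is not reproduced in the text. The sketch also assumes that the r in the mutation formula is a fresh random draw rather than r_ij itself, which the text leaves ambiguous. The name `cross_mutate` and its parameter names are hypothetical.

```python
import random
from typing import List, Optional


def cross_mutate(
    x: List[float],
    g_best: List[float],
    diff: float,        # difference X_i between particle i and the global best
    threshold: float,   # threshold on X below which the operators are applied
    p_c: float,         # crossover rate
    p_m: float,         # mutation rate
    x_min: float,
    x_max: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a new position after the optional mutation and crossover."""
    if rng is None:
        rng = random.Random()
    out = list(x)
    # Only particles that are already close to the global best are perturbed.
    if diff >= threshold:
        return out
    for j in range(len(out)):
        r_ij = rng.random()
        if r_ij < p_m:
            # Mutation: re-sample dimension j uniformly inside the bounds
            # (assumption: a fresh random number is used for r).
            out[j] = x_min + (x_max - x_min) * rng.random()
        if r_ij < p_c:
            # Crossover: copy dimension j from the global best solution.
            out[j] = g_best[j]
    return out
```

Because the same r_ij is compared against both rates, a dimension that is mutated is also overwritten by the crossover whenever p_m is not larger than p_c; this follows the order of substeps (3) and (4) as written.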

Further, in one embodiment, the maximum value of each objective function is calculated over all non-dominated solutions in the resulting pareto solution set, and the worst-index vector is determined as follows:

where

represents the value of the nth objective function in the ith non-dominated solution.

The distance between each non-dominated solution and the worst-index vector is defined as follows:

This yields the distance set D = {D₁, D₂, ..., D_i}, where D_i is the distance value obtained for the ith non-dominated solution. The solution with the largest value in the set D is selected as the final solution; Worst_F denotes the worst-index vector,

The invention solves the multi-agent task allocation problem with an improved particle swarm algorithm based on non-dominated sorting. A cost function for measuring the quality of a task allocation scheme is established, and an improved particle swarm algorithm is designed on the basis of this model: a method for setting the inertia weight is designed, a non-dominated sorting and crossover and mutation mechanism is introduced, and a method for selecting the optimal solution from the pareto solution set is provided. Compared with the traditional particle swarm algorithm, both convergence accuracy and convergence speed are improved, and a better convergence result can be obtained.

The foregoing illustrates and describes the principles, general features, and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are given by way of illustration of the principles of the present invention, but that various changes and modifications may be made without departing from the spirit and scope of the invention, and such changes and modifications are within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. A multi-agent cooperative task allocation method based on a non-dominated sorting improved particle swarm algorithm, the method comprising the following steps:

step 5, obtaining a pareto optimal solution through a maximum distance method based on the pareto solution set obtained in step 4.

2. The method for distributing the multi-agent cooperative tasks based on the non-dominated sorting improved particle swarm algorithm according to claim 1, wherein the step 1 of establishing a target profit model when the multi-agent executes the tasks according to the operating environment information specifically comprises the following steps:

The equipment has the suitability degree of

The hit rate to the target j is

The value corresponding to target j is V_j, so the target value gain of the agent formation is:

P_ij × V_j

where M is the number of agents, N is the number of targets,

3. The method for distributing multi-agent cooperative tasks based on the non-dominated sorting improved particle swarm algorithm according to claim 2, wherein the step 2 of establishing the loss cost function of the agent task distribution according to the operating environment information specifically comprises:

(1) Shortest flight distance index f₂

Let

where L_max is the maximum flight distance of a single agent when executing a task, M is the number of participating agents, N is the number of targets, and L_max·M is a normalization factor;

is the maximum flight distance of the ith agent from target point k to target point r; T_max - 1 is the maximum number of tasks that an agent can perform;

therefore, the shortest flight distance index of the agent is:

(2) Minimum self-loss cost index f₃

The minimum self-loss cost index is formalized as shown in the following formula:

where

is the unit cost of equipment of the corresponding model,

(3) Maximum sub-target coverage index f₄

4. The multi-agent cooperative task allocation method based on the non-dominated sorting improved particle swarm algorithm as claimed in claim 3, wherein the step 3 of adding the constraint conditions that the agent formation faces when executing tasks to the models obtained in steps 1 and 2, and establishing an overall model of multi-agent cooperative task allocation, specifically comprises:

min f = [f₁; f₂; f₃; f₄]

the constraint conditions include:

where Z_imax is the task load that the ith agent can bear;

for the tasks, all tasks must be executed, namely:

where N_type is the number of task types executed;

where R_i, i = 1, 2, ..., M, is the operating radius of the ith agent.

5. The method for distributing multi-agent cooperative tasks based on the non-dominated sorting improved particle swarm algorithm according to claim 4, wherein the step 4 is to solve the model obtained in the step 3, and the specific steps are as follows:

step 4.2, computing the fitness of each particle according to the overall multi-agent cooperative task allocation model, storing the position and fitness value of each particle in its individual extreme value p_best, and storing the position and fitness value of the individual with the best fitness among all p_best in the global extreme value g_best;

step 4.3, updating the particle position and velocity:

x_{i,j}(t+1) = x_{i,j}(t) + v_{i,j}(t+1)

is

indicates the globally optimal particle position at time j,

represents the position of the ith particle at time j;

x_ij = x_min + (x_max - x_min) r

(4) then judging whether the random number r_ij of the jth dimension of the particle is smaller than the crossover rate p_c; if so, performing the crossover operation on the jth dimension, where the crossover partner is the global optimal solution and its jth dimension is assigned to the jth dimension of the particle;

step 4.7, stopping searching and outputting a result when the algorithm reaches a preset stop condition; otherwise go to step 4.3 to continue searching.

6. The method according to claim 5, wherein the pareto optimal solution is obtained by a maximum distance method, and the method includes:

wherein,

this yields the distance set D = {D₁, D₂, ..., D_i}, where D_i is the distance value obtained for the ith non-dominated solution; the solution with the largest distance value is selected as the final solution, and Worst_F denotes the worst-index vector,

7. A system for implementing the multi-agent cooperative task allocation method based on the non-dominated sorting improved particle swarm algorithm of any one of claims 1 to 6, characterized in that the system comprises:

and the solving module is used for solving the model obtained by the third building module by using the improved particle swarm optimization algorithm based on the non-dominated sorting.

8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 6 are implemented when the computer program is executed by the processor.

9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.