CN115616913A - Model prediction leaderless formation control method based on distributed evolutionary game - Google Patents

Model prediction leaderless formation control method based on distributed evolutionary game

Info

Publication number
CN115616913A
Authority
CN
China
Prior art keywords
agent
formation
evolutionary game
representing
distributed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211320956.0A
Other languages
Chinese (zh)
Inventor
戴荔
霍达
周小婷
蔡普申
黄腾
孙中奇
夏元清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance

Abstract

The invention provides a leaderless model predictive formation control method based on a distributed evolutionary game, which overcomes the shortcomings of leader-follower formation control algorithms. In the leaderless scheme all agents have the same role and function. A model predictive control algorithm is used to construct a global optimization problem, and formation is achieved by including a formation error function in the global model predictive cost function. Collision avoidance is achieved by constructing a safe distance set for each agent using a Voronoi diagram; the formation control problem is converted into an evolutionary game problem so that it can be solved in a distributed manner, and the invariant-set property of the evolutionary game ensures that the agents do not collide while moving. In addition, the invention is applicable to time-varying communication networks; it improves control performance and safety while reducing computational complexity and communication burden.

Description

Model prediction leaderless formation control method based on distributed evolutionary game
Technical Field
The invention belongs to the technical field of multi-agent formation control, and particularly relates to a model prediction leaderless formation control method based on a distributed evolutionary game.
Background
In recent years, with the continuing development of multi-agent systems, formation control has become an active topic in multi-agent research. Formation control means that multiple agents, such as unmanned ground vehicles and unmanned aerial vehicles, maintain desired relative positions while moving toward a target position and, at the same time, adapt to environmental constraints (such as avoiding obstacles). Because such systems can complete specific and complex tasks without human participation, they are widely applied in fields such as military, aerospace and industry, and have good development prospects. In practical applications, however, a difficulty of multi-agent formation control is that every agent must be able to avoid collisions with obstacles and with other agents, while the communication topology may vary over time as the agents move. In addition, when a formation is formed in a distributed manner, each agent needs to know the states of other agents, but when the communication topology changes, the communication link between two agents may no longer exist.
The leader-follower method is one approach to the formation control problem. Its basic principle is that one agent acts as the leader and tracks a reference trajectory, while the other agents act as followers and keep a certain distance from the leader, thereby achieving formation control. The principle is simple and the method is widely used in multi-agent formation, but it has the following two disadvantages: 1) the whole system depends too heavily on the leader, so that when the leader cannot track the reference trajectory, the entire formation deviates from it; 2) the leader does not take the followers' tracking into account, so the leader may move too fast for the followers to keep up.
Disclosure of Invention
In view of the above, the invention provides a distributed model predictive leaderless formation control method based on a distributed evolutionary game. All agents have the same role and function, and under communication constraints each agent can form the formation without collision by acquiring only local information from its neighbors.
To achieve this purpose, the distributed model predictive leaderless formation control method based on a distributed evolutionary game comprises the following steps:
step 1, establishing a multi-agent system, determining the initial positions and target positions of the agents, and constructing a dynamic model of the agents and an optimal control problem with collision-avoidance constraints, agent control constraints and state constraints among the agents; in the optimization problem, given the final target state, the state of each agent over a future period is predicted by a prediction model so that the distance between the agent's position and the target position over that period is minimized, yielding the optimal control input at the current time;
step 2, establishing a safe distance set for each agent, which ensures that the agents do not collide as long as each agent moves within its safe distance set;
step 3, proposing an evolutionary game of two populations with coupling constraints, and selecting a revision protocol to construct the evolutionary dynamics equation, so that the evolutionary dynamics of each population reach the Nash equilibrium of the game through continuous iteration and optimization and have the invariant-set property;
step 4, converting the constructed multi-agent formation problem into the evolutionary game problem of two populations with coupling constraints, and solving the multi-agent formation optimization problem using the evolutionary dynamics equation of the evolutionary game.
In step 4, the positions of the agents in formation control are converted into the population states in the evolutionary game, each agent in formation control is converted into a strategy in the evolutionary game, the cost function in the formation control problem is combined with the fitness function in the evolutionary game, and the optimal control problem of step 1 is then solved using the evolutionary dynamics equation.
Wherein, the optimization problem in the step 1 is as follows:
$\min_{u(k)} J(k)$
s.t. for $m = 0, 1, \dots, H_p - 1$:
Figure BDA0003910379230000031
Figure BDA0003910379230000032
Figure BDA0003910379230000033
Figure BDA0003910379230000034
wherein:
Figure BDA0003910379230000035
indicating the location information of the ith agent,
Figure BDA0003910379230000036
indicating the speed information of the ith agent,
Figure BDA0003910379230000037
a state variable representing the ith agent,
Figure BDA0003910379230000038
a control variable representing the ith agent,
Figure BDA0003910379230000039
representing the set of collision avoidance constraints for the ith agent,
Figure BDA00039103792300000310
indicating the range of mobility of the multi-agent,
Figure BDA00039103792300000311
indicating the allowable control output range of a single agent.
Wherein the safe distance set in the step 2 is defined as:
Figure BDA00039103792300000312
Figure BDA00039103792300000313
Figure BDA00039103792300000314
Figure BDA00039103792300000315
wherein R is the prescribed safe distance; the set
Figure BDA00039103792300000316
is a closed polyhedral set; for arbitrary
Figure BDA00039103792300000317
and
Figure BDA00039103792300000318
the condition $\|c_i(k) - c_j(k)\| \ge R$ is satisfied;
Figure BDA00039103792300000319
denotes the set of neighbor agents of agent i; and $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$ and $\omega_{ij}(k)$ are intermediate variables used in the calculation.
Step 3 adopts a distributed evolutionary game of two populations with coupling constraints, specifically: the optimization problem of the evolutionary game is solved by searching for the Nash equilibrium point; substituting the revision protocol into the mean dynamics yields the distributed Smith dynamics of the two populations with coupling constraints.
Advantageous effects:
1. The invention extends the mean dynamics of evolutionary games to the case of coupling constraints between two populations, and proves that the evolutionary dynamics reach the Nash equilibrium point of the game through continuous iteration and optimization and that the two-population evolutionary game under coupling constraints has the invariant-set property: if the constraints are satisfied by the initial condition, they remain satisfied throughout the evolution. Converting the multi-agent formation control problem into an evolutionary game problem divides the centralized optimization problem into several sub-problems, which are distributed to the individual agents. Each agent solves its sub-problem using its own information, its local model and the available neighbor information, which greatly reduces the computational load and complexity. In addition, the performance degradation that conventional distributed control suffers from insufficient information exchange is avoided, so control performance is kept at a high level while the flexibility and scalability of the system are improved. Because the invention adopts a leaderless formation control algorithm in which all agents have the same role and function, it overcomes the shortcomings of leader-follower formation control algorithms.
2. The invention uses a model predictive control algorithm to construct a global optimization problem, and achieves formation by including a formation error function in the global model predictive cost function. The invariant-set property is used to ensure that the agents do not collide while moving.
3. The invention is also applicable to time-varying communication networks. It improves control performance and safety while reducing computational complexity and communication burden, and addresses the inability of some existing formation control algorithms to handle communication constraints or time-varying communication networks.
Drawings
FIG. 1 is a diagram of the conversion between the formation control problem and the evolutionary game problem of the present invention;
FIG. 2 is a two-dimensional actual trajectory diagram of 6 agents of the present invention;
FIG. 3 is a graph of position coordinates versus time for each agent in the present invention;
FIG. 4 is a graph of safe distance versus time for each agent pair in the present invention;
FIG. 5 is a graph of control input versus time for each agent in the present invention.
Detailed Description
The invention is described in detail below by way of example with reference to the accompanying drawings.
The invention introduces the evolutionary game algorithm into multi-agent formation. As a mathematical tool, the evolutionary game can describe the behavior of decision makers when only partial information about some of the participants is known. Through continuous iteration and optimization, the local behavior of the participants can achieve a global objective. The evolutionary game is therefore well suited to distributed multi-agent formation control. The invention provides a distributed model predictive leaderless formation control method based on a distributed evolutionary game, which comprises the following parts:
In the first part, a multi-agent system is constructed, comprising the following sub-steps:
Step 11, designing the system structure.
Consider a formation of
Figure BDA0003910379230000051
agents. Let
Figure BDA0003910379230000052
denote the position of the i-th agent and
Figure BDA0003910379230000053
denote the velocity of the i-th agent. For any agent
Figure BDA0003910379230000054
the dynamic model is
Figure BDA0003910379230000055
where
Figure BDA0003910379230000056
is the state variable of the i-th agent and
Figure BDA0003910379230000057
is the control variable of the i-th agent.
Step 12, determining the communication topology and the objective of each agent. The communication range of each agent is
Figure BDA0003910379230000058
and the time-varying communication topology is
Figure BDA0003910379230000059
Here the node set
Figure BDA00039103792300000510
corresponds to the agent set
Figure BDA00039103792300000511
the edge set
Figure BDA00039103792300000512
represents the pairs of agents that can exchange information, and $A(k) = [a_{ij}(k)]_{M \times M}$ is the adjacency matrix, with $a_{ij}(k) = 1$ when agent i and agent j can exchange information and $a_{ij}(k) = 0$ otherwise. Let
Figure BDA00039103792300000513
denote the desired state of agent i. For any agent i and agent j, the following must be satisfied:
(1) Control objective:
Figure BDA0003910379230000061
(2) Collision-avoidance constraint: $d_{ij}(k) = \|c_i(k) - c_j(k)\| \ge R$, where the minimum safe distance
Figure BDA0003910379230000062
(3) Position constraint:
Figure BDA0003910379230000063
where
Figure BDA0003910379230000064
is the region that the agents are allowed to reach;
(4) Input constraint:
Figure BDA0003910379230000065
where
Figure BDA0003910379230000066
is the admissible range of the control input;
(5) Desired-state requirement:
Figure BDA0003910379230000067
for all
Figure BDA0003910379230000068
i.e., the distance between the desired target positions of different agents is greater than the safe distance;
Figure BDA0003910379230000069
denotes the set of neighbor agents of agent i.
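As an illustration of the time-varying adjacency matrix $A(k)$ described in step 12, the following Python sketch builds it from the agent positions. It assumes a disk communication model in which two distinct agents can exchange information when their distance is at most the communication range; the function name `adjacency_from_range` and the non-strict inequality are assumptions made for this example and are not taken from the patent. The positions and the range 2.3 are the values listed in the simulation part of this description.

```python
import numpy as np


def adjacency_from_range(positions, comm_range):
    """Return A = [a_ij] with a_ij = 1 if agents i != j are within comm_range."""
    pts = np.asarray(positions, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]      # pairwise position differences
    dist = np.linalg.norm(diff, axis=-1)          # pairwise distances d_ij
    adj = (dist <= comm_range).astype(int)        # assumed disk model
    np.fill_diagonal(adj, 0)                      # no self-loops
    return adj


# Initial positions c_1(0) ... c_6(0) from the simulation part.
initial_positions = [[3, 3], [1, 4], [2, 0], [4, 1], [0, 2], [3, 5]]
A0 = adjacency_from_range(initial_positions, comm_range=2.3)
```

Recomputing the matrix at every sampling instant from the current positions gives the time-varying topology; the matrix is symmetric because the distance is symmetric.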
Step 13, designing a safe distance set for each agent. At each time k, the position $c_i(k)$ of agent i and the positions $c_j(k)$ of all its neighbor agents j are obtained, and the constraint set is reconstructed using a Voronoi diagram as
Figure BDA00039103792300000610
where
Figure BDA00039103792300000611
Figure BDA00039103792300000612
Figure BDA00039103792300000613
Here $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$ and $\omega_{ij}(k)$ are intermediate variables used in the calculation. The set
Figure BDA00039103792300000614
is a closed polyhedral set, i.e., a collision-free set, and for arbitrary
Figure BDA00039103792300000615
and
Figure BDA00039103792300000616
the condition $\|c_i(k) - c_j(k)\| \ge R$ is satisfied.
Step 14, constructing the model predictive optimization problem. To achieve the control objective, let
Figure BDA00039103792300000617
Figure BDA00039103792300000618
denote the position deviation of agent i. The cost function is defined as:
Figure BDA00039103792300000619
where
Figure BDA00039103792300000620
Figure BDA0003910379230000071
and
Figure BDA0003910379230000072
are all symmetric positive definite matrices and $H_p$ is the prediction horizon. The optimal control problem for drone formation is described as:
$\min_{u(k)} J(k)$ (8a)
s.t. for $m = 0, 1, \dots, H_p - 1$: (8b)
Figure BDA0003910379230000073
Figure BDA0003910379230000074
Figure BDA0003910379230000075
Figure BDA0003910379230000076
When the optimization problem (8) has a feasible solution, the optimal control inputs over a future period are obtained. Because of model mismatch and disturbances in practical applications, the optimal control sequence is not applied to the system in full; only its first element is applied to the actual system. At the next time k+1, the current state of the system is sampled again, the optimization problem (8) is reconstructed and solved, and the procedure is repeated. The optimization problem constructed so far is still a centralized one; in the following, it is solved in a distributed manner by the distributed evolutionary game method.
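The receding-horizon principle described above can be sketched for a single agent without constraints. This is only an illustration under stated assumptions, not the patent's solver: the double-integrator model is inferred from relations (19) and (20) of this description, the constraints of problem (8) are dropped so that the finite-horizon problem can be solved by a backward Riccati recursion, the input weight is taken as a 2×2 identity to match the input dimension, and the target state and all function names are hypothetical.

```python
import numpy as np

# Assumed double-integrator model consistent with relations (19) and (20):
#   c(k+1) = c(k) + v(k) + u(k),   v(k+1) = v(k) + u(k),   state z = [c; v].
I2, Z2 = np.eye(2), np.zeros((2, 2))
A = np.block([[I2, I2], [Z2, I2]])
B = np.vstack([I2, I2])


def first_step_gain(A, B, Q, R, P_f, horizon):
    """Backward Riccati recursion; returns the feedback gain of stage 0."""
    P, K = P_f, None
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


def run_receding_horizon(z0, z_des, steps, horizon=20):
    """Apply only the first optimal input at every step, then re-plan."""
    Q, R, P_f = np.eye(4), np.eye(2), np.eye(4)   # assumed weights
    z = np.asarray(z0, dtype=float)
    z_des = np.asarray(z_des, dtype=float)        # zero desired velocity, so A z_des = z_des
    for _ in range(steps):
        K = first_step_gain(A, B, Q, R, P_f, horizon)  # re-solve at time k
        u = -K @ (z - z_des)                           # first element of the sequence
        z = A @ z + B @ u                              # plant update, then resample
    return z


# Hypothetical target; the start position is c_1(0) from the simulation part.
z_final = run_receding_horizon([3.0, 3.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0], steps=200)
```

Because this toy problem is time-invariant and unconstrained, re-solving at each step returns the same gain; the loop is kept explicit only to mirror the resample-and-resolve procedure. With the constraints of problem (8) in place, a closed-form gain is no longer available, which is where the evolutionary game solver of this description comes in.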
Because the collision-avoidance constraint is non-convex, it leads to a non-convex optimization problem. To address this computational difficulty, Voronoi diagrams are used to create a safe distance set for each agent, which guarantees that the agents do not collide as long as each agent moves within its specified safe distance set.
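One common way to turn a Voronoi partition into a convex, collision-free set is the buffered Voronoi cell: each separating hyperplane between agent i and a neighbor j is shifted toward agent i by R/2. The sketch below uses that standard construction as a stand-in; the patent's own formulas for $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$ and $\omega_{ij}(k)$ are contained in equation images that are not reproduced here, so this should be read as an assumption, and the function names are hypothetical.

```python
import numpy as np


def buffered_voronoi_halfspaces(c_i, neighbor_positions, safe_dist):
    """Return (N, b) such that the safe set of agent i is {p : N @ p <= b}."""
    c_i = np.asarray(c_i, dtype=float)
    normals, offsets = [], []
    for c_j in neighbor_positions:
        c_j = np.asarray(c_j, dtype=float)
        d = c_j - c_i
        n = d / np.linalg.norm(d)                 # unit normal pointing toward agent j
        midpoint = 0.5 * (c_i + c_j)
        normals.append(n)
        offsets.append(n @ midpoint - 0.5 * safe_dist)  # bisector shifted by R/2
    return np.array(normals).reshape(-1, c_i.size), np.array(offsets)


def in_safe_set(p, N, b, tol=1e-12):
    """Check whether point p satisfies all half-space constraints."""
    return bool(np.all(N @ np.asarray(p, dtype=float) <= b + tol))


# Two agents 2 units apart with safe distance R = 0.5.
N1, b1 = buffered_voronoi_halfspaces([0.0, 0.0], [[2.0, 0.0]], 0.5)
N2, b2 = buffered_voronoi_halfspaces([2.0, 0.0], [[0.0, 0.0]], 0.5)
```

For any point p in the cell of agent i and any point q in the cell of agent j, the projections onto the unit normal differ by at least R, so $\|p - q\| \ge R$. Each cell is an intersection of half-spaces, hence a closed polyhedral set, which matches the property stated for the safe distance set.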
In the second part, an evolutionary game of two populations with coupling constraints is constructed. Two populations $p \in \{1, 2\}$ are constructed, each with a large but finite number of participants, and both populations share the same strategy set S. Let $s_i \in S$ denote the i-th strategy, where the strategy set S contains n strategies, and let $m_{p,i}$ denote the number of individuals in population p that adopt strategy i, with
Figure BDA0003910379230000077
The proportion of individuals in population p adopting strategy i is $\rho_{p,i} = m_{p,i}/m_p \ge 0$, which gives the population state $p_p = [\rho_{p,1}, \rho_{p,2}, \dots, \rho_{p,n}]^T$ and $\pi_p = \sum_{i \in S} \rho_{p,i} = 1$. The fitness function of population p is $F_p(p_p) = [f_{p,1}(p_p), f_{p,2}(p_p), \dots, f_{p,n}(p_p)]^T$. For brevity, define $x_i := \rho_{1,i}$, $y_i := \rho_{2,i}$, $x := p_1$, $y := p_2$, $f_i^x := f_{1,i}(p_1)$, $f_i^y := f_{2,i}(p_2)$,
Figure BDA0003910379230000081
and
Figure BDA0003910379230000082
Step 21, setting the communication topology graph in the evolutionary game. To maintain a balance between the two populations (x, y), the state must lie in the set $\Xi = \{(x, y) \mid Ax + By \le C\}$, where $A = \mathrm{diag}\{a_1, a_2, \dots, a_n\}$, $B = \mathrm{diag}\{b_1, b_2, \dots, b_n\}$ and $C = [c_1\ c_2\ \dots\ c_n]^T$. During the evolution, the set $\Lambda := \{(x, y) \mid \sum_{i \in S} x_i = \pi_1,\ \sum_{i \in S} y_i = \pi_2,\ x_i \ge 0,\ y_i \ge 0\}$ contains all possible states of the populations. For the first population, the strategy interactions between individuals are represented by an undirected graph
Figure BDA0003910379230000083
whose node set
Figure BDA0003910379230000084
represents the set of all strategies and whose edge set
Figure BDA0003910379230000085
represents the pairs of strategies between which individuals in population x can switch; $A(k) = [a_{ij}(k)]_{M \times M}$ is the adjacency matrix, with $a_{ij}(k) = 1$ when an individual adopting strategy i can also adopt strategy j and $a_{ij}(k) = 0$ otherwise. Similarly, for the second population, the strategy interactions between individuals are represented by an undirected graph
Figure BDA0003910379230000086
The optimization problem of the evolutionary game is solved by finding the Nash equilibrium point and can be described as:
$\max_{x, y} W(x, y)$ (9a)
s.t. $Ax + By \le C$ (9b)
Figure BDA0003910379230000087
Figure BDA0003910379230000088
$x_i \ge 0$ (9e)
$y_i \ge 0$ (9f)
where the objective function $W(x, y)$ is a continuously differentiable, strictly concave function and $(x_i, y_i)$ is the population state.
The evolution of the proportions of population x and population y adopting strategy i can be described by distributed evolutionary dynamics, expressed as:
Figure BDA0003910379230000089
Figure BDA0003910379230000091
These dynamics are also referred to as the mean dynamics. The revision protocol $\phi_{ij}$ takes the current payoffs and the aggregate behavior as input and outputs a switch rate, i.e., the rate at which individuals adopting strategy j switch to strategy i given the current population state and payoffs.
Step 22, setting the revision protocol. For any given x and y, let
Figure BDA0003910379230000092
denote a triple satisfying, for any $q \in S$,
Figure BDA0003910379230000093
Then
Figure BDA0003910379230000094
is the coefficient corresponding to the minimum element of the vector $C - (Ax + By)$. The revision protocol for population p can therefore be designed as:
Figure BDA0003910379230000095
Substituting (12) into (10) and (11) gives
Figure BDA0003910379230000096
Figure BDA0003910379230000097
These are the distributed Smith dynamics of two populations with coupling constraints (DSD2PC), and an evolutionary game with these dynamics is referred to as the distributed evolutionary game of two populations with coupling constraints (DEG2PC).
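To make the idea of Smith dynamics on a graph concrete, the sketch below simulates the standard single-population distributed Smith dynamics, $\dot{x}_i = \sum_j a_{ij}\,(x_j [f_i - f_j]_+ - x_i [f_j - f_i]_+)$. This is a simplification: it has one population and no coupling constraint, so it is not the DSD2PC of equations (13) and (14), whose exact form is in equation images not reproduced here. The quadratic potential, the three-strategy path graph, the step size and the function names are all assumptions chosen for illustration.

```python
import numpy as np


def smith_step(x, fitness, adj, dt):
    """One Euler step of Smith dynamics restricted to the edges of adj."""
    f = fitness(x)
    gain = np.maximum(f[:, None] - f[None, :], 0.0)      # gain[i, j] = [f_i - f_j]_+
    inflow = (adj * gain * x[None, :]).sum(axis=1)       # mass switching j -> i
    outflow = (adj * gain.T * x[:, None]).sum(axis=1)    # mass switching i -> j
    return x + dt * (inflow - outflow)


def simulate(x0, fitness, adj, dt=0.05, steps=4000):
    """Iterate the Euler step from the initial population state x0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = smith_step(x, fitness, adj, dt)
    return x


# Hypothetical potential W(x) = -0.5 * ||x - target||^2, so fitness = grad W = target - x.
target = np.array([0.5, 0.3, 0.2])
path_graph = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])  # strategies 1-2-3 in a line
x_final = simulate([0.1, 0.1, 0.8], lambda x: target - x, path_graph)
```

With a symmetric adjacency matrix, every unit of mass leaving one strategy arrives at a neighboring one, so the total mass is conserved; this is the mechanism behind the invariance of the set $\Lambda$. Each strategy only needs the fitness values of its graph neighbors, which is what makes the update distributed.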
Let
Figure BDA0003910379230000098
Figure BDA0003910379230000099
Then (13) and (14) can be rewritten as:
Figure BDA00039103792300000910
Figure BDA0003910379230000101
The evolutionary dynamics can be written in compact form as
Figure BDA0003910379230000102
where
Figure BDA0003910379230000103
and
Figure BDA0003910379230000104
are the Laplacian matrices of the graphs
Figure BDA0003910379230000105
and
Figure BDA0003910379230000106
Figure BDA0003910379230000107
respectively.
S10, proving that the evolutionary game of two populations with coupling constraints has the invariant-set property. Given (x, y) ∈ N ^ n
Figure BDA0003910379230000108
one obtains
Figure BDA0003910379230000109
and letting
Figure BDA00039103792300001010
gives
Figure BDA00039103792300001011
Figure BDA00039103792300001012
Thus, $r_x(i, j) = r_x(j, i) \ge 0$. The adjacency matrix
Figure BDA00039103792300001013
can be expressed as:
Figure BDA00039103792300001014
From the Laplacian matrix relation
Figure BDA00039103792300001015
it follows that
Figure BDA00039103792300001016
By the non-negativity of $r_x(i, j)$, the Laplacian matrix
Figure BDA00039103792300001017
is positive semidefinite. Similarly, it can be shown that
Figure BDA00039103792300001018
is positive semidefinite and
Figure BDA00039103792300001019
S11, according to Lemma 1,
Figure BDA00039103792300001020
and
Figure BDA00039103792300001021
give
Figure BDA00039103792300001022
and
Figure BDA00039103792300001023
that is,
Figure BDA00039103792300001024
and
Figure BDA00039103792300001025
are constants. In addition, when $x_i = 0$ or $y_i = 0$, (13) and (14) give
Figure BDA0003910379230000111
Figure BDA0003910379230000112
Thus $x_i \ge 0$ and $y_i \ge 0$ are maintained, and $(x(t), y(t)) \in \Lambda$.
When $(x(0), y(0)) \in \Xi$, once the trajectory $(x(t), y(t))$ reaches the boundary of the set $\Xi$, $a_i x_i + b_i y_i = c_i$ holds for $i \in S$. According to Lemma 1,
Figure BDA0003910379230000113
and
Figure BDA0003910379230000114
Substituting $a_i$ and $b_i$ into (13) and (14) gives
Figure BDA0003910379230000115
Figure BDA0003910379230000116
The following four cases are discussed for
Figure BDA0003910379230000117
and
Figure BDA0003910379230000118
Figure BDA0003910379230000119
if $a_i > 0$, $b_i > 0$;
Figure BDA00039103792300001110
if $a_i > 0$, $b_i \le 0$;
Figure BDA00039103792300001111
if $a_i \le 0$, $b_i > 0$;
Figure BDA00039103792300001112
if $a_i \le 0$, $b_i \le 0$.
In every case, non-negativity
Figure BDA00039103792300001113
and non-growth, $a_i x_i + b_i y_i \le c_i$, are satisfied. By the continuity of the trajectory $(x, y)$, $(x(t), y(t)) \in \Xi \cap \Lambda$ at all subsequent times. Thus, the set $\Xi \cap \Lambda$ is an invariant set.
S12, selecting $E(x, y) := W(x^*, y^*) - W(x, y)$ as a Lyapunov function, with $E(x, y) \ge 0$, its derivative can be expressed as
Figure BDA0003910379230000121
Thus, when the initial value $(x(0), y(0)) \in \Xi$ evolves along (13) and (14), the DEG2PC approaches the Nash equilibrium point, and the Nash equilibrium point is locally asymptotically stable.
In the third part, a distributed model predictive control algorithm based on the DEG2PC theory is given:
Step 31, the conversion between the formation control problem and the evolutionary game problem is shown in FIG. 1. Using the evolutionary game method, the population state $(x_i, y_i)$ in the DEG2PC theory and the position component in the optimal control problem
Figure BDA0003910379230000122
are related by
Figure BDA0003910379230000123
Figure BDA0003910379230000124
According to the dynamic model in (6), $u_i(k+m|k)$ and $v_i(k+m+1|k)$ can be re-expressed as:
$u_i(k+m|k) = c_i(k+m+1|k) - 2c_i(k+m|k) + c_i(k+m-1|k)$ (19)
$v_i(k+m+1|k) = c_i(k+m+1|k) - c_i(k+m|k)$ (20)
Relations (19) and (20) are then substituted into the optimization problem (8). Since problem (8) minimizes the cost function $J(k)$ while problem (9) maximizes the concave function $W(x, y)$, the fitness functions for each strategy can be described as $f^x = -\nabla_x J$ and $f^y = -\nabla_y J$. Further, constraints (8d), (8e) and (8f) in problem (8) can be converted to
Figure BDA0003910379230000125
which corresponds to constraint (9b) in problem (9).
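Relations (19) and (20) are first and second finite differences of the predicted position sequence, which is what allows problem (8) to be rewritten in terms of positions only. The short Python sketch below recovers the velocities and inputs from a position trajectory; the function name `recover_velocity_and_input` and the sample trajectory are hypothetical.

```python
import numpy as np


def recover_velocity_and_input(positions):
    """positions: array of shape (T, 2) holding c(0), ..., c(T-1).

    Returns v_next with rows v(k+1) for k = 0..T-2 (relation (20)) and
    u with rows u(k) for k = 1..T-2 (relation (19)).
    """
    c = np.asarray(positions, dtype=float)
    v_next = c[1:] - c[:-1]                  # v(k+1) = c(k+1) - c(k)
    u = c[2:] - 2.0 * c[1:-1] + c[:-2]       # u(k) = c(k+1) - 2 c(k) + c(k-1)
    return v_next, u


trajectory = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [6.0, 0.0]]
v_next, u = recover_velocity_and_input(trajectory)
```

The two relations are consistent with each other: the recovered input equals the change in velocity between consecutive steps, `u = v_next[1:] - v_next[:-1]`.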
Step 32, for populations x and y, the revision protocol in (12) is selected; under the evolutionary dynamics (15) and (16), the population state tends to the Nash equilibrium point. The optimal position trajectory $(x^*(k), y^*(k))$ and the optimal control input sequence $u^*(k)$ at time k are then obtained. Thus, the formation control problem is solved in a distributed manner by the DEG2PC (9).
In summary, the distributed model predictive leaderless formation control method based on the distributed evolutionary game can be described as follows. Inputs: desired position
Figure BDA0003910379230000126
prediction horizon $H_p$, safe distance R, communication range θ, and weight matrices $Q_i$, $P_i$ and $R_i$. Outputs: $(x^*(k), y^*(k))$ and $u_i^*(k|k)$.
(1) At time k, obtain the sample $z_i(k)$ and the communication topology
Figure BDA0003910379230000131
(2) Construct the formation control problem (8); select
Figure BDA0003910379230000132
and design the revision protocol (12);
(3) Obtain the fitness functions $f^x$ and $f^y$ for each strategy;
(4) Solve for the optimal position trajectory $(x^*(k), y^*(k))$ and the optimal control input sequence $u^*(k)$ through (13) and (14);
(5) Apply $u_i^*(k|k)$ to each agent, and repeat the above operations.
In the fourth part, numerical simulation. A multi-agent system with six agents is selected. For each agent
Figure BDA0003910379230000133
the system model is
Figure BDA0003910379230000134
The input constraint of each agent is
Figure BDA0003910379230000135
The communication range is θ = 2.3, the safe distance is R = 0.5, the prediction horizon is $H_p = 20$, and the weight matrices are $Q_i = R_i = P_i = I_{4 \times 4}$. The initial and desired velocities of each agent are set to 0, and the initial positions are
$c_1(0) = [3\ 3]^T$, $c_2(0) = [1\ 4]^T$, $c_3(0) = [2\ 0]^T$,
$c_4(0) = [4\ 1]^T$, $c_5(0) = [0\ 2]^T$, $c_6(0) = [3\ 5]^T$.
To form the formation, the desired positions of the agents are:
Figure BDA0003910379230000136
Figure BDA0003910379230000137
Simulation experiments were performed in MATLAB using the ICLOCS and PDToolbox solvers, and the results are shown in the accompanying drawings: FIG. 2 is the two-dimensional actual trajectory diagram of the 6 agents, FIG. 3 shows the position coordinates of each agent over time, FIG. 4 shows the distance between each pair of agents over time, and FIG. 5 shows the control input of each agent over time. The simulation results in FIG. 2 show that, under the control of the algorithm, each agent finally reaches its specified target point. FIG. 3 shows the position of each agent during the movement. Each subplot in FIG. 4 shows the distance between a pair of agents; the distance between agents is always greater than the safe distance 0.5, i.e., collision avoidance is achieved. The results in FIG. 5 show that the agents satisfy the input constraints during the movement.
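The collision-avoidance check reported for FIG. 4 amounts to verifying that the smallest pairwise distance stays above the safe distance at every time step. The sketch below performs that check for a single time instant, using the initial positions and the safe distance 0.5 listed above; the function name `min_pairwise_distance` is hypothetical, and the full simulated trajectories of the patent are not available here.

```python
import itertools

import numpy as np


def min_pairwise_distance(positions):
    """Smallest Euclidean distance over all pairs of agents at one time instant."""
    pts = np.asarray(positions, dtype=float)
    return min(
        np.linalg.norm(pts[i] - pts[j])
        for i, j in itertools.combinations(range(len(pts)), 2)
    )


initial_positions = [[3, 3], [1, 4], [2, 0], [4, 1], [0, 2], [3, 5]]
safe_distance = 0.5
initially_safe = min_pairwise_distance(initial_positions) >= safe_distance
```

Applying the same function to the positions at each sampling instant of a simulated run gives the quantity that must remain at or above R.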
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (5)

1. A distributed model prediction leaderless formation control method based on a distributed evolutionary game, characterized by comprising the following steps:
step 1, establishing a multi-agent system, determining an initial position and a target position of each agent, and constructing a dynamic model of the agent and an optimal control problem subject to inter-agent collision avoidance constraints, agent control constraints and state constraints; the optimal control problem is that, given a known final target state, the state of the agent over a future period of time is predicted through a prediction model, so that the distance between the position of the agent and the target position over that period is minimized and the optimal control input at the current moment is obtained;
step 2, establishing a safe distance set for each agent, so that the agents are guaranteed not to collide as long as each agent moves within its specified safe distance set;
step 3, formulating a two-population evolutionary game with coupling constraints, and selecting a revision protocol to construct an evolutionary dynamics equation, so that the evolutionary dynamics equation of each population reaches a Nash equilibrium solution of the game through continuous iteration and optimization and has the invariant-set property;
step 4, converting the constructed multi-agent formation problem into a two-population evolutionary game problem with coupling constraints, and solving the multi-agent formation optimization problem by using the evolutionary dynamics equation of the evolutionary game.
2. The method according to claim 1, wherein in step 4, the positions of the agents in the formation control are converted into the population states in the evolutionary game, each agent in the formation control is converted into a strategy in the evolutionary game, the cost function in the formation control problem is combined with the payoff function in the evolutionary game, and the optimal control problem in step 1 is then solved by using the evolutionary dynamics equation.
3. The method according to claim 1 or 2, characterized in that the optimization problem in step 1 is:
$\min_{u(k)} J(k)$
s.t. for $m = 0, 1, \dots, H_p - 1$:
Figure FDA0003910379220000021
Figure FDA0003910379220000022
Figure FDA0003910379220000023
Figure FDA0003910379220000024
wherein:
Figure FDA0003910379220000025
indicating the location information of the ith agent,
Figure FDA0003910379220000026
indicating the speed information of the ith agent,
Figure FDA0003910379220000027
a state variable representing the ith agent,
Figure FDA0003910379220000028
a control variable representing the ith agent,
Figure FDA0003910379220000029
representing the set of collision avoidance constraints for the ith agent,
Figure FDA00039103792200000210
representing the range of mobility of the multi-agent,
Figure FDA00039103792200000211
representing the allowable control output range of a single agent.
4. The method according to claim 3, wherein the safe distance set in step 2 is defined as:
Figure FDA00039103792200000212
Figure FDA00039103792200000213
Figure FDA00039103792200000214
Figure FDA00039103792200000215
wherein R is a prescribed safe distance, the set
Figure FDA00039103792200000216
is a closed polyhedral set, and arbitrary
Figure FDA00039103792200000217
and
Figure FDA00039103792200000218
satisfy $\|c_i(k) - c_j(k)\| \ge R$;
Figure FDA00039103792200000219
represents the set of neighbor agents of agent i, and $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$ and $\omega_{ij}(k)$ represent intermediate variables used in the calculation.
5. The method according to claim 1, 2 or 4, wherein step 2 adopts a distributed evolutionary game with two populations having coupling constraints, comprising the following specific steps: solving the optimization problem of the evolutionary game by searching for Nash equilibrium points; and substituting the optimization problem solved by searching for the Nash equilibrium points into the mean dynamics to obtain the distributed Smith dynamics equation of the two populations with coupling constraints.
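To illustrate how Smith dynamics drive a population state toward a Nash equilibrium, the following Python sketch integrates the standard single-population Smith dynamics with a forward-Euler step. It is a simplified illustration under stated assumptions and not the claimed method: it uses one population instead of two, omits the coupling constraints and the distributed (neighbor-only) information structure, and the quadratic payoff, the target vector, the step size and the function names are hypothetical.

```python
import numpy as np

def smith_step(x, payoff, dt):
    """One forward-Euler step of single-population Smith dynamics.

    dx_i/dt = sum_j x_j [F_i - F_j]_+  -  x_i sum_j [F_j - F_i]_+
    where F = payoff(x) and [.]_+ = max(., 0) is the pairwise-comparison
    revision protocol. The total mass sum(x) is conserved by construction.
    """
    F = payoff(x)
    gain = np.maximum(F[:, None] - F[None, :], 0.0)  # gain[i, j] = [F_i - F_j]_+
    inflow = gain @ x                                # mass switching to strategy i
    outflow = x * gain.sum(axis=0)                   # mass leaving strategy i
    return x + dt * (inflow - outflow)

def run_smith(x0, payoff, dt=0.01, steps=5000):
    """Iterate smith_step from x0 and return the final population state."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = smith_step(x, payoff, dt)
    return x

# Hypothetical potential game: payoff is the gradient of -||x - target||^2,
# so the Nash equilibrium on the simplex is x = target.
target = np.array([0.5, 0.3, 0.2])

def quadratic_payoff(x):
    return -2.0 * (x - target)

if __name__ == "__main__":
    x_final = run_smith([1.0, 0.0, 0.0], quadratic_payoff)
    print("final state:", x_final, "mass:", x_final.sum())
```

In this sketch the population state plays the role that claim 2 assigns to agent positions, and the payoff is derived from a cost function; because mass is conserved at every step, the state stays on the simplex, which is the simplest instance of the invariant-set property mentioned in step 3.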
CN202211320956.0A 2022-07-05 2022-10-26 Model prediction leaderless formation control method based on distributed evolutionary game Pending CN115616913A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210794170 2022-07-05
CN2022107941706 2022-07-05

Publications (1)

Publication Number Publication Date
CN115616913A true CN115616913A (en) 2023-01-17

Family

ID=84863865

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211320956.0A Pending CN115616913A (en) 2022-07-05 2022-10-26 Model prediction leaderless formation control method based on distributed evolutionary game

Country Status (1)

Country Link
CN (1) CN115616913A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117891259A (en) * 2024-03-14 2024-04-16 中国科学院数学与系统科学研究院 Multi-agent formation control method with multi-graph configuration and related product
CN118092151A (en) * 2023-12-26 2024-05-28 四川大学 Multi-missile cooperative guidance method based on distributed model predictive control

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464991A (en) * 2020-11-04 2021-03-09 西北工业大学 Multi-sensor evidence evolution game fusion recognition method based on multi-population dynamics
CN112558471A (en) * 2020-11-24 2021-03-26 西北工业大学 Spacecraft formation discrete distributed non-cooperative game method based on dynamic event triggering
CN113359437A (en) * 2021-05-14 2021-09-07 北京理工大学 Hierarchical model prediction control method for multi-agent formation based on evolutionary game
CN114047758A (en) * 2021-11-08 2022-02-15 南京云智控产业技术研究院有限公司 Q-learning-based multi-mobile-robot formation method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
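关志华, 寇纪淞, 李敏强 [Guan Zhihua, Kou Jisong, Li Minqiang]: "基于ε-约束方法的增广Lagrangian多目标协同进化算法" [Augmented Lagrangian multi-objective co-evolutionary algorithm based on the ε-constraint method], 系统工程与电子技术 [Systems Engineering and Electronics], no. 09, 20 September 2002 (2002-09-20), pages 1 - 5 *
谢能刚; 潘创业; 李锐; 王璐 [Xie Nenggang; Pan Chuangye; Li Rui; Wang Lu]: "基于多种群进化算法的多目标并行博弈设计" [Multi-objective parallel game design based on a multi-population evolutionary algorithm], 数值计算与计算机应用 [Journal on Numerical Methods and Computer Applications], no. 02, 14 June 2010 (2010-06-14), pages 1 - 5 *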

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118092151A (en) * 2023-12-26 2024-05-28 四川大学 Multi-missile cooperative guidance method based on distributed model predictive control
CN117891259A (en) * 2024-03-14 2024-04-16 中国科学院数学与系统科学研究院 Multi-agent formation control method with multi-graph configuration and related product
CN117891259B (en) * 2024-03-14 2024-05-14 中国科学院数学与系统科学研究院 Multi-agent formation control method with multi-graph configuration and related product

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination