CN115616913A

CN115616913A - A Model Predictive Leaderless Formation Control Method Based on Distributed Evolutionary Game

Info

Publication number: CN115616913A
Application number: CN202211320956.0A
Authority: CN
Inventors: 戴荔; 霍达; 周小婷; 蔡普申; 黄腾; 孙中奇; 夏元清
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2022-07-05
Filing date: 2022-10-26
Publication date: 2023-01-17

Abstract

The invention provides a model predictive leaderless formation control method based on a distributed evolutionary game, which overcomes the shortcomings of leader-follower formation control algorithms. The method is leaderless, that is, all agents have the same role and function. A model predictive control algorithm is used to construct a global optimization problem, and formation is achieved by including a formation error function in the global model predictive cost function. Collision avoidance is achieved by constructing a safe distance set for each agent from a Voronoi diagram; the formation control problem is converted into an evolutionary game problem so that it can be solved in a distributed manner, and the invariant-set property of the evolutionary game ensures that the agents do not collide while moving. The invention also applies to time-varying communication networks; it improves control performance and safety while reducing computational complexity and communication burden.

Description

A Model Predictive Leaderless Formation Control Method Based on Distributed Evolutionary Game

Technical Field

The invention belongs to the technical field of multi-agent formation control, and in particular relates to a model predictive leaderless formation control method based on a distributed evolutionary game.

Background Art

In recent years, with the continuing development of multi-agent systems, formation control has become an active topic in multi-agent research. Formation control means that multiple agents, such as unmanned ground vehicles and unmanned aerial vehicles, maintain desired relative positions while moving toward their target positions and at the same time adapt to environmental constraints (for example, avoiding obstacles). It allows specific, complex tasks to be completed without human involvement, and has therefore been widely applied in military, aerospace, industrial, and other fields. In practice, however, one difficulty of multi-agent formation control is that every agent must be able to avoid collisions with obstacles and with other agents, and the communication topology may vary with time as the agents move. In addition, when a formation is formed in a distributed manner, each agent needs to know the states of other agents, but when the communication topology changes, the communication link between those agents may no longer exist.

The leader-follower control method is one approach to the formation control problem. Its basic principle is that one agent acts as the leader and tracks a reference trajectory, while the other agents act as followers and keep a certain distance from the leader, thereby achieving formation control. Because the principle is simple, it is widely used in multi-agent formations, but it has two shortcomings: 1) the whole system depends too heavily on the leader, so when the leader cannot track the reference trajectory, the entire formation deviates from it; 2) the leader does not take the followers' tracking performance into account, so the leader may move too fast for the followers to keep up.

Summary of the Invention

In view of this, the present invention provides a distributed model predictive leaderless formation control method based on a distributed evolutionary game. All agents have the same role and function, and under communication constraints each agent needs only local information from its neighbors to form the formation without collision.

To achieve the above object, the distributed model predictive leaderless formation control method based on a distributed evolutionary game of the present invention comprises the following steps:

Step 1: establish a multi-agent system, specify the initial position and target position of each agent, and construct an optimal control problem comprising the dynamic model of the agents, the collision-avoidance constraints among the agents, the control constraints of the agents, and the state constraints. With the final target state known, the optimization problem uses a prediction model to predict the agents' states over a future horizon, minimizes the distance between the agents' positions and the target positions over that horizon, and yields the optimal control input at the current time.

Step 2: create a safe distance set for each agent, which guarantees that no collision occurs as long as each agent moves within its prescribed safe distance set.

Step 3: formulate a two-population evolutionary game subject to coupling constraints, and select a revision protocol to construct the evolutionary dynamics equations, such that the evolutionary dynamics of each population reach the Nash equilibrium of the game through continued iteration and optimization and possess the invariant-set property.

Step 4: transform the constructed multi-agent formation problem into a two-population evolutionary game problem subject to coupling constraints, and solve the multi-agent formation optimization problem using the evolutionary dynamics equations of the evolutionary game.

In step 4, the agent positions in formation control are mapped to population states in the evolutionary game, the individual agents are mapped to strategies, and the cost function of the formation control problem is combined with the payoff function of the evolutionary game; the optimal control problem of step 1 is then solved using the evolutionary dynamics equations.

The optimization problem in step 1 is:

$\min_{u(k)} J(k)$

s.t. for $m = 0, 1, \dots, H_p - 1$

where:

$c_i$ denotes the position of the $i$-th agent,

$v_i$ denotes the velocity of the $i$-th agent,

$z_i$ denotes the state variable of the $i$-th agent,

$u_i$ denotes the control variable of the $i$-th agent,

and the three constraint sets denote, respectively, the collision-avoidance constraint set of the $i$-th agent, the region in which the agents are allowed to move, and the admissible control range of a single agent.

The safe distance set in step 2 is defined as:

where $R$ is the prescribed safe distance; the set is a closed polyhedral set, and any position of agent $i$ in its set and any position of a neighbor $j$ in its set satisfy $\|c_i(k) - c_j(k)\| \ge R$; the neighbor set is the set of neighbor agents of agent $i$; and $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$, and $\omega_{ij}(k)$ are intermediate variables used in the computation.

In step 2, a distributed evolutionary game of two populations with coupling constraints is adopted. Specifically, the optimization problem of the evolutionary game is solved by finding the Nash equilibrium; substituting this optimization problem into the mean dynamics yields the distributed Smith dynamics equations of two populations with coupling constraints.

Beneficial effects:

1. The invention extends the mean dynamics of evolutionary games to the case of coupling constraints between two populations. It is proved that these evolutionary dynamics eventually reach the Nash equilibrium of the game through continued iteration and optimization, and that the two-population evolutionary game subject to coupling constraints has the invariant-set property: if the constraints are satisfied initially, they remain satisfied throughout the evolution of the game. Converting the multi-agent formation control problem into an evolutionary game problem splits the centralized optimization problem into several subproblems, each of which is assigned to one agent. Each agent solves its subproblem using its own information, its local model, and the neighbor information available to it, which greatly reduces the computational load and complexity. This also compensates for the performance degradation that traditional decentralized control suffers because of insufficient information exchange, keeps control performance at a high level, and improves the flexibility and scalability of the system. Because the invention uses a leaderless formation control algorithm in which all agents have the same role and function, it avoids the shortcomings of leader-follower formation control algorithms.

2. The invention uses a model predictive control algorithm to construct a global optimization problem, and achieves formation by including a formation error function in the global model predictive cost function. The invariant-set property is used to guarantee that the agents do not collide while moving.

3. The invention also applies to time-varying communication networks. It improves control performance and safety while reducing computational complexity and communication burden, and addresses the problem that some existing formation control algorithms cannot handle communication constraints or time-varying communication networks.

Brief Description of the Drawings

Fig. 1 is a diagram of the conversion between the formation control problem and the evolutionary game problem in the present invention;

Fig. 2 shows the actual two-dimensional trajectories of the six agents in the present invention;

Fig. 3 shows the position coordinates of each agent versus time;

Fig. 4 shows the safe-distance versus time curve for each pair of agents;

Fig. 5 shows the control input of each agent versus time.

Detailed Description

The present invention is described in detail below with reference to the accompanying drawings and embodiments.

The invention introduces an evolutionary game algorithm into multi-agent formation. As a mathematical tool, an evolutionary game can describe the behavior of decision makers when only partial information about some participants is known. Through continued iteration and optimization, the local behavior of the participants can achieve a global objective, so evolutionary games are well suited to distributed multi-agent formation control. The distributed model predictive leaderless formation control method based on a distributed evolutionary game provided by the invention comprises the following steps:

Part 1: constructing the multi-agent system, comprising the following sub-steps:

Step 11: design of the system architecture.

Consider a formation of $M$ agents. Let $c_i$ denote the position of the $i$-th agent and $v_i$ its velocity. For any agent $i$, the dynamic model is expressed as

where $z_i$ denotes the state variable of the $i$-th agent and $u_i$ denotes its control variable.

Step 12: determine the communication topology and objective of each agent. Each agent has a communication range $\theta$, and the time-varying communication topology is a graph whose node set corresponds to the set of agents and whose edge set represents the pairs of agents that can exchange information. $A(k) = [a_{ij}(k)]_{M \times M}$ denotes the adjacency matrix, where $a_{ij}(k) = 1$ when agent $i$ and agent $j$ can exchange information and $a_{ij}(k) = 0$ otherwise. Given the desired state of each agent $i$, the following must be satisfied for any agent $i$ and agent $j$:

(1) Control objective: each agent reaches its desired state;

(2) Collision-avoidance constraint: $d_{ij}(k) = \|c_i(k) - c_j(k)\| \ge R$, where $R$ is the minimum safe distance;

(3) Position constraint: the position of each agent lies in the region that the agent is allowed to reach;

(4) Input constraint: the control input of each agent lies in the admissible control range;

(5) Desired-state requirement: for all pairs of distinct agents, the distance between their desired target positions is greater than the safe distance.

Here the neighbor set denotes the set of neighbor agents of agent $i$.
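The adjacency matrix $A(k)$ of Step 12 can be illustrated with a short sketch. It assumes that two agents can exchange information exactly when their Euclidean distance is at most the communication range $\theta$ (the source does not state the rule explicitly), and the function names `adjacency_matrix` and `neighbors` are hypothetical. The positions and $\theta = 2.3$ are taken from the simulation section of this document.

```python
import numpy as np

def adjacency_matrix(positions, theta):
    """Return A with a_ij = 1 when agents i != j are within distance theta.

    Assumption: "can exchange information" means ||c_i - c_j|| <= theta.
    """
    positions = np.asarray(positions, dtype=float)
    # Pairwise Euclidean distances d_ij = ||c_i - c_j||.
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    A = (dist <= theta).astype(int)
    np.fill_diagonal(A, 0)  # an agent is not its own neighbor
    return A

def neighbors(A, i):
    """0-based indices j with a_ij = 1."""
    return [int(j) for j in np.flatnonzero(A[i])]

# Initial positions of agents 1..6 (rows 0..5) from the simulation section.
C0 = [[3, 3], [1, 4], [2, 0], [4, 1], [0, 2], [3, 5]]
A0 = adjacency_matrix(C0, theta=2.3)
```

Recomputing `A0` at every sampling instant from the current positions gives the time-varying topology; each agent then only needs the rows of its own neighbors.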

Step 13: design a safe distance set for each agent. At each time $k$, the positions $c_i(k)$ of agent $i$ and $c_j(k)$ of all its neighbors $j$ are obtained, and the constraint set is reconstructed using a Voronoi diagram, where $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$, and $\omega_{ij}(k)$ are intermediate variables used in the computation. The resulting set is a closed polyhedral set, that is, a collision-free set: any position of agent $i$ in its set and any position of a neighbor $j$ in its set satisfy $\|c_i(k) - c_j(k)\| \ge R$.
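One way to obtain a polyhedral set with the separation property of Step 13 is the standard buffered Voronoi cell, sketched below. This is an assumption made for illustration: the exact definitions of $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$, and $\omega_{ij}(k)$ are not reproduced in the text, so the sketch is not necessarily the construction used in the patent. The names `safe_halfspaces` and `in_safe_set` are hypothetical.

```python
import numpy as np

def safe_halfspaces(c_i, neighbor_positions, R):
    """Half-spaces {c : n @ c <= b} bounding agent i's buffered Voronoi cell.

    Each perpendicular bisector between i and a neighbor j is pulled back
    toward i by R/2, so points in the cells of i and j are at least R apart.
    """
    c_i = np.asarray(c_i, dtype=float)
    normals, offsets = [], []
    for c_j in neighbor_positions:
        c_j = np.asarray(c_j, dtype=float)
        n = (c_j - c_i) / np.linalg.norm(c_j - c_i)  # unit normal pointing at j
        mid = 0.5 * (c_i + c_j)                      # point on the bisector
        normals.append(n)
        offsets.append(n @ mid - 0.5 * R)
    return np.array(normals).reshape(-1, c_i.size), np.array(offsets)

def in_safe_set(c, normals, offsets, tol=1e-9):
    """True when c satisfies every half-space constraint."""
    return bool(np.all(normals @ np.asarray(c, dtype=float) <= offsets + tol))
```

Because every constraint is linear in the position, replacing the non-convex condition $\|c_i - c_j\| \ge R$ by membership in this cell keeps each agent's subproblem convex, at the price of some conservatism.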

Step 14: construct the model predictive optimization problem. To achieve the control objective, a position deviation is introduced for each agent $i$, and the cost function is defined as:

where the weighting matrices $Q_i$, $R_i$, and $P_i$ are symmetric positive definite and $H_p$ is the prediction horizon. The optimal control problem of the UAV formation is described as:

$\min_{u(k)} J(k)$ (8a)

s.t. for $m = 0, 1, \dots, H_p - 1$ (8b)

When optimization problem (8) is feasible, solving it yields the optimal control inputs over the prediction horizon. Because of model mismatch, disturbances, and similar effects in practice, the whole optimal control sequence is not applied to the system; only its first element is applied. At the next time $k+1$, the current state of the system is sampled again, optimization problem (8) is reconstructed and solved, and the procedure repeats. The optimization problem constructed at this point is still centralized; in the following steps it is solved in a distributed manner by means of a distributed evolutionary game.
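The receding-horizon principle described above (solve over the horizon, apply only the first input, resample, repeat) can be sketched for a single agent as follows. The sketch rests on several assumptions that are not taken from the source: a discrete double-integrator model with sample time `DT`, a quadratic tracking cost with identity state weight and input weight `r_weight`, and no constraints, so that each problem reduces to a linear solve. It illustrates the loop structure only, not problem (8) itself.

```python
import numpy as np

DT = 0.1  # sample time (assumed, not from the source)
# Assumed double integrator: state z = [position (2), velocity (2)].
A = np.block([[np.eye(2), DT * np.eye(2)], [np.zeros((2, 2)), np.eye(2)]])
B = np.vstack([0.5 * DT**2 * np.eye(2), DT * np.eye(2)])

def plan(z, z_des, horizon, r_weight=0.1):
    """Minimise sum_m ||z(k+m|k) - z_des||^2 + r_weight * ||u(k+m-1|k)||^2."""
    n, m = B.shape
    # Stacked prediction Z = Phi z + Gamma U over the horizon.
    Phi = np.vstack([np.linalg.matrix_power(A, t + 1) for t in range(horizon)])
    Gamma = np.zeros((horizon * n, horizon * m))
    for t in range(horizon):
        for s in range(t + 1):
            Gamma[t*n:(t+1)*n, s*m:(s+1)*m] = np.linalg.matrix_power(A, t - s) @ B
    target = np.tile(z_des, horizon) - Phi @ z
    H = Gamma.T @ Gamma + r_weight * np.eye(horizon * m)
    U = np.linalg.solve(H, Gamma.T @ target)  # unconstrained optimum
    return U.reshape(horizon, m)

def run(z0, z_des, steps=200, horizon=20):
    """Closed loop: re-plan at every step and apply only the first input."""
    z = np.asarray(z0, dtype=float)
    z_des = np.asarray(z_des, dtype=float)
    for _ in range(steps):
        u_first = plan(z, z_des, horizon)[0]
        z = A @ z + B @ u_first  # stands in for sampling the real system
    return z
```

In the method of this document the inner `plan` call would be replaced by the constrained problem (8), solved in a distributed way through the evolutionary game; the outer loop stays the same.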

Because the collision-avoidance constraint is inherently non-convex, it can make the optimization problem non-convex. To address this computational difficulty, the idea of the Voronoi diagram is used to create a safe distance set for each agent, which guarantees that no collision occurs as long as each agent moves within its prescribed safe distance set.

Part 2: the two-population evolutionary game subject to coupling constraints. Suppose there are two populations $p \in \{1, 2\}$, each with a large but finite number of participants, and both with the same strategy set $S$. Let $s_i \in S$ denote the $i$-th strategy, where the strategy set contains $n$ strategies, and let $m_{p,i}$ denote the number of individuals in population $p$ that adopt strategy $i$, with $m_p$ the total number of individuals in population $p$. The proportion of population $p$ adopting strategy $i$ is $\rho_{p,i} = m_{p,i}/m_p \ge 0$, which gives $p_p = [\rho_{p,1}, \rho_{p,2}, \dots, \rho_{p,n}]^T$ and $\pi_p = \sum_{i \in S} \rho_{p,i} = 1$. Let the fitness function of population $p$ be $F_p(p_p) = [f_{p,1}(p_p), f_{p,2}(p_p), \dots, f_{p,n}(p_p)]^T$. For brevity, define $x_i := \rho_{1,i}$, $y_i := \rho_{2,i}$, $x := p_1$, $y := p_2$, $f_i^x := f_{1,i}(p_1)$, and $f_i^y := f_{2,i}(p_2)$.

Step 21: setting up the communication topology in the evolutionary game. To maintain a balance between the two populations $(x, y)$, the state must lie in the set $\Xi = \{(x, y) \mid Ax + By \le C\}$, where $A = \mathrm{diag}\{a_1, a_2, \dots, a_n\}$, $B = \mathrm{diag}\{b_1, b_2, \dots, b_n\}$, and $C = [c_1\ c_2\ \cdots\ c_n]^T$. During evolution, the set $\Lambda := \{(x, y) \mid \sum_{i \in S} x_i = \pi_1,\ \sum_{i \in S} y_i = \pi_2,\ x_i \ge 0,\ y_i \ge 0\}$ contains all possible population states. For the first population, the strategy interaction among individuals can be represented by an undirected graph whose node set represents the set of all strategies and whose edge set represents the pairs of strategies between which individuals in population $x$ can switch. $A(k) = [a_{ij}(k)]_{M \times M}$ denotes the adjacency matrix, where $a_{ij}(k) = 1$ when an individual adopting strategy $i$ can also adopt strategy $j$, and $a_{ij}(k) = 0$ otherwise. Similarly, for the second population, the strategy interaction among individuals can be represented by an undirected graph.

The optimization problem of the evolutionary game is solved by finding the Nash equilibrium, and can be described as:

$\max_{x,y} W(x, y)$ (9a)

s.t. $Ax + By \le C$ (9b)

$x_i \ge 0$ (9e)

$y_i \ge 0$ (9f)

where the objective function $W(x, y)$ is a continuously differentiable concave function and $(x_i, y_i)$ is the population state.

The evolution of the proportions of population $x$ and population $y$ adopting strategy $i$ can be described by distributed evolutionary dynamics, expressed as:

These dynamics are also called the mean dynamics. The revision protocol $\phi_{ij}$ takes the current payoffs and aggregate behavior as input and outputs a switch rate, that is, the rate at which individuals playing strategy $i$ switch to strategy $j$ given the current population state and payoffs.

Step 22: setting the protocol. For any given $x$ and $y$, a triple is introduced that, for every $q \in S$, satisfies

so that its entries are the coefficients corresponding to the smallest element of the vector $C - (Ax + By)$. The revision protocol for population $p$ can therefore be designed as:

Substituting (12) into (10) and (11) gives

These are the distributed Smith dynamics of two populations with coupling constraints (DSD2PC); an evolutionary game with these dynamics is called a distributed evolutionary game of two populations with coupling constraints (DEG2PC).
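To make the idea of Smith dynamics on a strategy graph concrete, the sketch below integrates the standard single-population Smith dynamics, in which mass moves from strategy $j$ to a neighboring strategy $i$ at rate $x_j [f_i - f_j]_+$. It is a simplified illustration under stated assumptions, not the DSD2PC of this document: there is only one population, there is no coupling constraint $Ax + By \le C$, and the fitness $f(x) = x^\ast - x$ (the gradient of a concave quadratic potential) as well as the names `smith_step`, `evolve`, `TARGET`, and `PATH` are hypothetical. Forward Euler integration with a small step is also an assumption.

```python
import numpy as np

def smith_step(x, fitness, adjacency, dt):
    """One forward-Euler step of Smith dynamics restricted to a strategy graph."""
    f = fitness(x)
    gain = np.maximum(f[:, None] - f[None, :], 0.0)  # gain[i, j] = [f_i - f_j]_+
    # Inflow to i from neighbors j with lower fitness.
    inflow = (adjacency * gain) @ x
    # Outflow from i toward neighbors j with higher fitness.
    outflow = x * (adjacency * gain.T).sum(axis=1)
    return x + dt * (inflow - outflow)

def evolve(x0, fitness, adjacency, dt=0.05, steps=4000):
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = smith_step(x, fitness, adjacency, dt)
    return x

# Three strategies on a path graph 0 - 1 - 2; potential W(x) = -0.5 * ||x - TARGET||^2.
TARGET = np.array([0.5, 0.3, 0.2])
PATH = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
x_final = evolve([1.0, 0.0, 0.0], lambda x: TARGET - x, PATH)
```

With a symmetric adjacency matrix every unit of mass leaving one strategy arrives at another, so the total mass is conserved and the state stays in the simplex, which is the single-population counterpart of the invariance of $\Lambda$ discussed in this document.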

Let

Then (13) and (14) can be rewritten as:

Expressed in compact form, the evolutionary dynamics are

where the two matrices are the Laplacian matrices of the strategy-interaction graphs of population $x$ and population $y$, respectively.

S10. Prove that the two-population evolutionary game subject to coupling constraints has the invariant-set property. Given $(x, y) \in \Xi \cap \Lambda$, from

we obtain

and, letting

we obtain

Therefore $r^x(i, j) = r^x(j, i) \ge 0$. The adjacency matrix can be expressed as:

From the defining relation of the Laplacian matrix, we obtain

By the non-negativity of $r^x(i, j)$, the Laplacian matrix of population $x$ is positive semidefinite. The same argument shows that the Laplacian matrix of population $y$ is positive semidefinite.

S11. By Lemma 1, the properties of the two Laplacian matrices imply that the population masses $\sum_{i \in S} x_i$ and $\sum_{i \in S} y_i$ are constant. In addition, when $x_i = 0$ or $y_i = 0$, (13) and (14) give

Hence $x_i \ge 0$ and $y_i \ge 0$ hold, so $(x(t), y(t)) \in \Lambda$.

When $(x(0), y(0)) \in \Xi$, once the trajectory $(x(t), y(t))$ reaches the boundary of the set $\Xi$, some $i \in S$ satisfies $a_i x_i + b_i y_i = c_i$. According to Theorem 1,

and

Substituting $a_i$ and $b_i$ into (13) and (14) gives

The following four cases are then considered:

if $a_i > 0$, $b_i > 0$;

if $a_i > 0$, $b_i \le 0$;

if $a_i \le 0$, $b_i > 0$;

if $a_i \le 0$, $b_i \le 0$.

In every case, non-negativity and the non-increase property $a_i x_i + b_i y_i \le c_i$ hold. By the continuity of the trajectory $(x, y)$, it follows that $(x(t), y(t)) \in \Lambda$ at all subsequent times. Therefore the set $\Xi \cap \Lambda$ is an invariant set.

S12. Choose $E(x, y) := W(x^*, y^*) - W(x, y)$ as the Lyapunov function, with $E(x, y) \ge 0$; its derivative can be expressed as

Therefore, when the initial value $(x(0), y(0)) \in \Xi$ evolves according to (13) and (14), the DEG2PC converges to the Nash equilibrium, and the Nash equilibrium is locally asymptotically stable.

Part 3: the distributed model predictive control algorithm based on DEG2PC theory:

Step 31: the conversion between the formation control problem and the evolutionary game problem is shown in Fig. 1. Using evolutionary game theory, the population state $(x_i, y_i)$ in the DEG2PC theory is associated with the position components in the optimal control problem through the relation

According to the dynamic model in (6), $u_i(k+m|k)$ and $v_i(k+m+1|k)$ can be re-expressed as:

and (19) and (20) are substituted into optimization problem (8). Since problem (8) minimizes the cost function $J(k)$ whereas problem (9) maximizes the concave function $W(x, y)$, the fitness functions of the strategies can be written as $f^x = -\nabla_x J$ and $f^y = -\nabla_y J$. In addition, constraints (8d), (8e), and (8f) of problem (8) can be converted into a form corresponding to constraint (9b) of problem (9).
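The sign convention $f^x = -\nabla_x J$ can be checked numerically on a small example. The sketch assumes a generic quadratic cost $J(x) = (x - x_d)^T Q (x - x_d)$ with symmetric $Q$, which stands in for the cost function of problem (8) (not reproduced in the text); the names `cost`, `fitness`, and `numerical_gradient` are hypothetical.

```python
import numpy as np

def cost(x, x_des, Q):
    """Assumed quadratic cost J(x) = (x - x_des)^T Q (x - x_des)."""
    e = x - x_des
    return float(e @ Q @ e)

def fitness(x, x_des, Q):
    """Fitness f = -grad_x J; for symmetric Q this equals -2 Q (x - x_des)."""
    return -2.0 * Q @ (x - x_des)

def numerical_gradient(fun, x, eps=1e-6):
    """Central-difference gradient, used only to check the closed form."""
    g = np.zeros_like(x)
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        g[k] = (fun(x + step) - fun(x - step)) / (2 * eps)
    return g

Q = np.diag([1.0, 2.0])
x_des = np.array([0.5, 0.5])
x = np.array([0.3, 0.7])
f = fitness(x, x_des, Q)
```

A strategy whose coordinate lies below its desired value has positive fitness and therefore attracts mass, so maximizing fitness in the game moves the state in the direction that decreases $J$.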

Step 32: for populations $x$ and $y$, the revision protocol (12) is selected and the dynamics (15) and (16) are used, so that the population states converge to the Nash equilibrium. The optimal position trajectory $(x^*(k), y^*(k))$ and the optimal control input sequence $u^*(k)$ are then obtained at time $k$. The formation control problem (8) is thus solved in a distributed manner by DEG2PC.

In summary, the distributed model predictive leaderless formation control method based on a distributed evolutionary game can be described as follows. Inputs: the desired positions, prediction horizon $H_p$, safe distance $R$, communication range $\theta$, and weighting matrices $Q_i$, $P_i$, and $R_i$. Outputs: $(x^*(k), y^*(k))$ and $u_i^*(k|k)$.

(1) At time $k$, obtain the sampled state $z_i(k)$ and the communication topology;

(2) Construct the formation control problem (8) and design the revision protocol (12);

(3) Obtain the fitness functions $f^x$ and $f^y$ for each strategy;

(4) Using (13) and (14), solve for the optimal position trajectory $(x^*(k), y^*(k))$ and the optimal control input sequence $u^*(k)$;

(5) Apply $u_i^*(k|k)$ to each agent, and repeat the above steps.

Part 4: simulation. A multi-agent system with six agents is chosen; for each agent the system model is

The input constraint of each agent is

The communication range is $\theta = 2.3$, the safe distance is $R = 0.5$, the prediction horizon is $H_p = 20$, and the weighting matrices are $Q_i = R_i = P_i = I_{4 \times 4}$. The initial and desired velocities of every agent are set to 0, and the initial positions are

$c_1(0) = [3\ 3]^T$, $c_2(0) = [1\ 4]^T$, $c_3(0) = [2\ 0]^T$,

$c_4(0) = [4\ 1]^T$, $c_5(0) = [0\ 2]^T$, $c_6(0) = [3\ 5]^T$.

To form the formation, the desired position of each agent is:

The simulation was carried out in MATLAB using the ICLOCS and PDToolbox solvers. Fig. 2 shows the actual two-dimensional trajectories of the six agents, Fig. 3 the position coordinates of each agent versus time, Fig. 4 the safe-distance versus time curve for each pair of agents, and Fig. 5 the control input of each agent versus time. The results in Fig. 2 show that under this algorithm every agent eventually reaches its designated target point. Fig. 3 shows the position of each agent during the motion. The subplots of Fig. 4 show the distances between agents; these distances always remain greater than the safe distance of 0.5, so collisions are avoided. The results in Fig. 5 show that the input constraints are satisfied throughout the motion.

In summary, the above are merely preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A distributed model predictive leaderless formation control method based on a distributed evolutionary game, characterized by comprising the following steps:

step 1, establishing a multi-agent system, specifying the initial position and target position of each agent, and constructing an optimal control problem comprising the dynamic model of the agents, the collision-avoidance constraints among the agents, the control constraints of the agents, and the state constraints; wherein, with the final target state known, the optimization problem predicts the states of the agents over a future horizon by means of a prediction model, minimizes the distance between the positions of the agents and the target positions over that horizon, and yields the optimal control input at the current time;

step 2, creating a safe distance set for each agent, which guarantees that no collision occurs as long as each agent moves within its prescribed safe distance set;

step 3, formulating a two-population evolutionary game subject to coupling constraints, and selecting a revision protocol to construct evolutionary dynamics equations, such that the evolutionary dynamics of each population reach the Nash equilibrium of the game through continued iteration and optimization and have the invariant-set property;

step 4, transforming the constructed multi-agent formation problem into a two-population evolutionary game problem subject to coupling constraints, and solving the multi-agent formation optimization problem using the evolutionary dynamics equations of the evolutionary game.

2. The method according to claim 1, wherein in step 4 the positions of the agents in formation control are mapped to population states in the evolutionary game, the individual agents in formation control are mapped to strategies in the evolutionary game, the cost function of the formation control problem is combined with the payoff function of the evolutionary game, and the optimal control problem of step 1 is then solved using the evolutionary dynamics equations.

3. The method according to claim 1 or 2, characterized in that the optimization problem in step 1 is:

$\min_{u(k)} J(k)$

s.t. for $m = 0, 1, \dots, H_p - 1$

wherein:

$c_i$ denotes the position of the $i$-th agent,

$v_i$ denotes the velocity of the $i$-th agent,

$z_i$ denotes the state variable of the $i$-th agent,

$u_i$ denotes the control variable of the $i$-th agent,

and the three constraint sets denote, respectively, the collision-avoidance constraint set of the $i$-th agent, the region in which the agents are allowed to move, and the admissible control range of a single agent.

4. The method according to claim 3, wherein the safe distance set in step 2 is defined as:

wherein $R$ is the prescribed safe distance; the set is a closed polyhedral set, and any position of agent $i$ in its set and any position of a neighbor $j$ in its set satisfy $\|c_i(k) - c_j(k)\| \ge R$; the neighbor set is the set of neighbor agents of agent $i$; and $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$, and $\omega_{ij}(k)$ are intermediate variables used in the computation.

5. The method according to claim 1, 2 or 4, wherein step 2 adopts a distributed evolutionary game of two populations with coupling constraints, specifically: the optimization problem of the evolutionary game is solved by finding the Nash equilibrium; and substituting this optimization problem into the mean dynamics yields the distributed Smith dynamics equations of two populations with coupling constraints.