CN115616913A - A Model Predictive Leaderless Formation Control Method Based on Distributed Evolutionary Game - Google Patents
A Model Predictive Leaderless Formation Control Method Based on Distributed Evolutionary Game
- Publication number
- CN115616913A (application CN202211320956.0A)
- Authority
- CN
- China
- Prior art keywords
- agent
- formation
- evolutionary game
- evolutionary
- formation control
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
Abstract
Description
Technical Field
The invention belongs to the technical field of multi-agent formation control, and in particular relates to a model predictive leaderless formation control method based on a distributed evolutionary game.
Background
In recent years, with the continued development of multi-agent systems, formation control has become an active topic in multi-agent research. Formation control means that multiple agents, such as unmanned ground vehicles and unmanned aerial vehicles, maintain desired relative positions while moving toward target positions and at the same time respect environmental constraints (for example, avoiding obstacles). Because it allows specific complex tasks to be completed without human participation, it has been widely applied in military, aerospace, industrial, and other fields, and has good development prospects. In practice, however, one difficulty of multi-agent formation control is that every agent must be able to avoid collisions with obstacles and with other agents, while the communication topology may vary with time as the agents move. In addition, when a formation is formed in a distributed manner, each agent needs to know the states of other agents, but when the communication topology changes, the communication link between those agents may no longer exist.
The leader-follower control method is one approach to the formation control problem. Its basic principle is to designate one agent as the leader, which tracks a reference trajectory, while the other agents act as followers and keep a certain distance from the leader, thereby achieving formation control. Because the principle is simple, it is widely used in multi-agent formations, but the leader-follower approach has two drawbacks: 1) the whole system depends too heavily on the leader, so when the leader cannot track the reference trajectory, the entire formation deviates from it; 2) the leader does not take into account how well the followers are keeping formation, so the leader may move too fast for the followers to keep up.
Summary of the Invention
In view of this, the present invention provides a distributed model predictive leaderless formation control method based on a distributed evolutionary game. All agents have the same role and function, and under communication constraints each agent needs only local information from its neighbors to form the formation without collision.
To achieve the above object, the distributed model predictive leaderless formation control method based on a distributed evolutionary game of the present invention comprises the following steps:
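Step 1: establish the multi-agent system, specify the initial position and target position of each agent, and construct an optimal control problem comprising the agent dynamics model, the collision-avoidance constraints between agents, the control constraints, and the state constraints of the agents. With the final target state known, the optimization problem uses a prediction model to predict the agent states over a future horizon so that the distance between the agent positions and the target positions over that horizon is minimized, yielding the optimal control input at the current time;
Step 2: create a safe-distance set for each agent, so that no collision occurs as long as every agent moves within its prescribed safe-distance set;
Step 3: formulate a two-population evolutionary game subject to coupling constraints, and select a revision protocol to construct the evolutionary dynamics equations, so that through continued iteration and optimization the evolutionary dynamics of each population reach the Nash equilibrium of the game and possess the invariant-set property;
Step 4: transform the formulated multi-agent formation problem into a two-population evolutionary game problem subject to coupling constraints, and solve the multi-agent formation optimization problem using the evolutionary dynamics equations of the evolutionary game.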
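In Step 4, the agent positions in formation control are mapped to the population states of the evolutionary game, the individual agents in formation control are mapped to the strategies of the evolutionary game, and the cost function of the formation control problem is combined with the payoff function of the evolutionary game; the evolutionary dynamics equations are then used to solve the optimal control problem of Step 1.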
The optimization problem in Step 1 is:
$\min_{u(k)} J(k)$
subject to the agent dynamics and the collision-avoidance, position, and input constraints for $m = 0, 1, \dots, H_p - 1$,
where the quantities involved are the position $c_i(k)$, velocity $v_i(k)$, state variable $z_i(k)$, and control variable $u_i(k)$ of the $i$-th agent, the collision-avoidance constraint set of the $i$-th agent, the region in which the agents are allowed to move, and the admissible control range of a single agent.
The safe-distance set in Step 2 is defined such that $R$ is the prescribed safe distance and the set is a closed polyhedral set: any positions $c_i(k)$ and $c_j(k)$ taken from the safe-distance sets of two neighboring agents satisfy $\|c_i(k) - c_j(k)\| \ge R$. The definition uses the set of neighbor agents of agent $i$, and $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$, and $\omega_{ij}(k)$ denote intermediate variables used in the calculation.
In Step 2, a distributed evolutionary game of two populations with coupling constraints is used. Specifically, the optimization problem of the evolutionary game is solved by finding the Nash equilibrium; substituting this optimization problem into the mean dynamics yields the distributed Smith dynamics of two populations with coupling constraints.
Beneficial effects:
1. The invention extends the mean dynamics of evolutionary games to the case of coupling constraints between two populations, and proves that through continued iteration and optimization these evolutionary dynamics eventually reach the Nash equilibrium of the game, and that the two-population evolutionary game with coupling constraints has the invariant-set property: if the constraints hold at the initial condition, they remain satisfied throughout the evolution of the game. Transforming the multi-agent formation control problem into an evolutionary game splits the centralized optimization problem into several subproblems, each assigned to one agent. Each agent solves its subproblem using its own information, a local model, and the neighbor information available to it, which greatly reduces the computational load and complexity. This also compensates for the performance loss of traditional decentralized control caused by insufficient information exchange, keeps control performance at a high level, and improves the flexibility and scalability of the system. Because the invention uses a leaderless formation control algorithm in which all agents have the same role and function, it overcomes the drawbacks of leader-follower formation control algorithms.
2. The invention uses model predictive control to construct a global optimization problem and achieves formation by including a formation error function in the global model predictive cost function. The invariant-set property is introduced to guarantee that the agents do not collide while moving.
3. The invention also applies to time-varying communication networks. It improves control performance and safety while reducing computational complexity and communication burden, and addresses the inability of some existing formation control algorithms to handle communication constraints or time-varying communication networks.
Brief Description of the Drawings
Fig. 1 is a diagram of the conversion between the formation control problem and the evolutionary game problem in the present invention;
Fig. 2 shows the actual two-dimensional trajectories of the six agents in the present invention;
Fig. 3 shows the position coordinates of each agent over time;
Fig. 4 shows the safe-distance-versus-time curve of each pair of agents;
Fig. 5 shows the control input of each agent over time.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings and embodiments.
The invention introduces an evolutionary game algorithm into multi-agent formation. As a mathematical tool, an evolutionary game can describe the behavior of decision makers when only partial information about some of the participants is known. Through continued iteration and optimization, the local behavior of the participants can achieve a global objective. Evolutionary games are therefore well suited to distributed multi-agent formation control. The distributed model predictive leaderless formation control method based on a distributed evolutionary game provided by the invention comprises the following steps:
Part 1: construct the multi-agent system, with the following sub-steps:
Step 11: design of the system architecture.
Consider a formation of $M$ agents. Let $c_i(k)$ denote the position and $v_i(k)$ the velocity of the $i$-th agent. Each agent $i$ is described by a dynamic model in which $z_i(k)$ denotes the state variable and $u_i(k)$ the control variable of the $i$-th agent.
Step 12: determine the communication topology and the objective of each agent. Each agent has a communication range $\theta$, and the time-varying communication topology is a graph whose node set corresponds to the set of agents, whose edge set represents the pairs of agents that can exchange information, and whose adjacency matrix is $A(k) = [a_{ij}(k)]_{M \times M}$, where $a_{ij}(k) = 1$ when agents $i$ and $j$ can exchange information and $a_{ij}(k) = 0$ otherwise. Given the desired state of each agent $i$, any agents $i$ and $j$ must satisfy:
(1) Control objective: each agent reaches its desired state;
(2) Collision-avoidance constraint: $d_{ij}(k) = \|c_i(k) - c_j(k)\| \ge R$, where $R$ is the minimum safe distance;
(3) Position constraint: the position of each agent stays within the region the agents are allowed to reach;
(4) Input constraint: the control input of each agent stays within the admissible input range;
(5) Desired-state requirement: for every agent $j$ in the set of neighbor agents of agent $i$, the distance between the desired target positions of the two agents is greater than the safe distance.
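The topology and the collision-avoidance requirement above can be checked numerically. The following is a minimal sketch, not the patent's implementation: it assumes that two agents can exchange information exactly when their distance is at most the communication range, and it uses the initial positions, $\theta = 2.3$, and $R = 0.5$ given in the simulation part of this document. The function names are hypothetical.

```python
import math
from itertools import combinations

# Initial positions, communication range, and safe distance taken from the
# simulation part of this document.
positions = {
    1: (3.0, 3.0), 2: (1.0, 4.0), 3: (2.0, 0.0),
    4: (4.0, 1.0), 5: (0.0, 2.0), 6: (3.0, 5.0),
}
THETA = 2.3
R_SAFE = 0.5


def distance(p, q):
    """Euclidean distance between two planar points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def adjacency(pos, theta):
    """Assumed rule: a_ij = 1 when agents i != j are within range theta."""
    ids = sorted(pos)
    return {
        (i, j): int(i != j and distance(pos[i], pos[j]) <= theta)
        for i in ids for j in ids
    }


def collision_free(pos, r_safe):
    """Check d_ij >= R for every pair of agents."""
    return all(distance(pos[i], pos[j]) >= r_safe
               for i, j in combinations(sorted(pos), 2))


A0 = adjacency(positions, THETA)
edges = sorted((i, j) for (i, j), a in A0.items() if a and i < j)
print(edges)                              # agent pairs that can communicate
print(collision_free(positions, R_SAFE))  # initial configuration is safe
```

Under this assumed rule the initial graph has six links, so every agent starts with at least one neighbor.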
Step 13: design the safe-distance set for each agent. At each time $k$, the position $c_i(k)$ of agent $i$ and the positions $c_j(k)$ of all its neighbor agents $j$ are obtained, and the constraint set is reconstructed using a Voronoi diagram. In this construction, $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$, and $\omega_{ij}(k)$ denote intermediate variables used in the calculation. The resulting set is a closed polyhedral set, that is, a collision-free set: any positions $c_i(k)$ and $c_j(k)$ taken from the sets of two neighboring agents satisfy $\|c_i(k) - c_j(k)\| \ge R$.
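Step 14: construct the model predictive optimization problem. To achieve the control objective, the position deviation of agent $i$ from its desired position is introduced and a cost function $J(k)$ is defined over the prediction horizon $H_p$, in which the weight matrices $Q_i$, $P_i$, and $R_i$ are symmetric positive definite. The optimal control problem of the UAV formation is stated as:
$\min_{u(k)} J(k)$ (8a)
s.t., for $m = 0, 1, \dots, H_p - 1$: (8b)
the agent dynamics together with the collision-avoidance, position, and input constraints of Step 12.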
When optimization problem (8) is feasible, its solution gives the optimal control inputs over the prediction horizon. Because practical applications involve model mismatch, disturbances, and similar effects, the whole optimal control sequence is not applied to the system; only its first element is applied. At the next time $k+1$, the current state of the system is sampled again, optimization problem (8) is rebuilt and solved, and the procedure repeats. The optimization problem constructed so far is still centralized; in the following steps it is solved in a distributed manner by the distributed evolutionary game method.
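The receding-horizon pattern described here (solve over the horizon, apply only the first input, resample, repeat) can be sketched for a single coordinate of a single agent. This is an illustration under stated assumptions, not the patent's solver: the model $c^+ = c + v + u$, $v^+ = v + u$ is the double integrator implied by relations (19) and (20) with a unit sampling period, all constraints are dropped, the weights are set to one, the target value is hypothetical, and the horizon problem is solved by a backward Riccati recursion rather than by the evolutionary game.

```python
def first_gain(horizon, q=1.0, r=1.0):
    """Backward Riccati recursion for c+ = c + v + u, v+ = v + u.

    Returns the feedback gain (k1, k2) of the first step of the horizon,
    so that the first optimal input is u = -(k1 * position_error + k2 * v).
    """
    p11, p12, p22 = q, 0.0, q            # terminal weight P = Q = q * I
    k1 = k2 = 0.0
    for _ in range(horizon):
        g1, g2 = p11 + p12, p12 + p22    # P B with B = [1, 1]^T
        h1, h2 = g1, g1 + g2             # A^T P B with A = [[1, 1], [0, 1]]
        s = r + g1 + g2                  # r + B^T P B
        k1, k2 = h1 / s, h2 / s
        a11, a12, a22 = p11, p11 + p12, p11 + 2.0 * p12 + p22   # A^T P A
        p11, p12, p22 = q + a11 - h1 * k1, a12 - h1 * k2, q + a22 - h2 * k2
    return k1, k2


def simulate(c0, c_des, steps=60, horizon=20):
    """Receding-horizon loop: re-solve at every step, apply the first input."""
    c, v = c0, 0.0
    for _ in range(steps):
        k1, k2 = first_gain(horizon)         # rebuild and solve the problem
        u = -(k1 * (c - c_des) + k2 * v)     # first element of the sequence
        c, v = c + v + u, v + u              # plant update, then resample
    return c, v


c_end, v_end = simulate(c0=3.0, c_des=1.0)   # hypothetical target position
print(c_end, v_end)
```

Because the model is time-invariant, the gain is the same at every step; it is recomputed inside the loop only to mirror the "rebuild and solve" step of the text.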
Because the collision-avoidance constraint is inherently non-convex, it can lead to a non-convex optimization problem. To address this computational difficulty, the idea of the Voronoi diagram is used to create a safe-distance set for each agent, which guarantees that no collision occurs as long as every agent moves within its prescribed safe-distance set.
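One common way to turn the non-convex distance constraint into convex half-spaces is a buffered Voronoi cell: each agent keeps to its side of the perpendicular bisector with a margin of $R/2$. The sketch below assumes this construction; the document's own formulas for $\delta_{ij}(k)$, $\varepsilon_{ij}(k)$, and $\omega_{ij}(k)$ are not reproduced in the text, so the function names and the exact form are illustrative only.

```python
import math


def safe_halfspaces(c_i, neighbours, r_safe):
    """Return one half-space (a, b), meaning a . c <= b, per neighbour.

    Assumed construction (buffered Voronoi cell): a is the unit vector from
    agent i to neighbour j and the bisector is retracted by r_safe / 2.
    """
    planes = []
    for c_j in neighbours:
        dx, dy = c_j[0] - c_i[0], c_j[1] - c_i[1]
        d = math.hypot(dx, dy)
        a = (dx / d, dy / d)
        mid = ((c_i[0] + c_j[0]) / 2.0, (c_i[1] + c_j[1]) / 2.0)
        b = a[0] * mid[0] + a[1] * mid[1] - r_safe / 2.0
        planes.append((a, b))
    return planes


def in_safe_set(c, planes, tol=1e-12):
    """True when point c satisfies every half-space of the safe set."""
    return all(a[0] * c[0] + a[1] * c[1] <= b + tol for a, b in planes)


# Two agents on the x-axis, 2.0 apart, with safe distance 0.5.
planes_1 = safe_halfspaces((0.0, 0.0), [(2.0, 0.0)], 0.5)
planes_2 = safe_halfspaces((2.0, 0.0), [(0.0, 0.0)], 0.5)
print(in_safe_set((0.75, 3.0), planes_1))   # on the boundary of agent 1's set
print(in_safe_set((1.25, -1.0), planes_2))  # on the boundary of agent 2's set
```

With this construction, any point of agent 1's set and any point of agent 2's set differ by at least $R$ along the line joining the two agents, so their distance is at least $R$. Each set is an intersection of half-spaces, which matches the closed polyhedral set described in Step 13.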
Part 2: two-population evolutionary game with coupling constraints. Assume there are two populations $p \in \{1, 2\}$, each with a large but finite number of participants, and both with the same strategy set $S$. Let $s_i \in S$ denote the $i$-th strategy, where the strategy set contains $n$ strategies. Let $m_{p,i}$ denote the number of individuals in population $p$ that adopt strategy $i$, and let $m_p$ denote the total number of individuals in population $p$. The proportion of population $p$ adopting strategy $i$ is $\rho_{p,i} = m_{p,i}/m_p \ge 0$, so that $p_p = [\rho_{p,1}, \rho_{p,2}, \dots, \rho_{p,n}]^T$ and $\pi_p = \sum_{i \in S} \rho_{p,i} = 1$. The fitness function of population $p$ is $F_p(p_p) = [f_{p,1}(p_p), f_{p,2}(p_p), \dots, f_{p,n}(p_p)]^T$. For brevity, define $x_i := \rho_{1,i}$, $y_i := \rho_{2,i}$, $x := p_1$, $y := p_2$, $f_i^x := f_{1,i}(p_1)$, and $f_i^y := f_{2,i}(p_2)$.
Step 21: set up the communication topology in the evolutionary game. To keep a certain balance between the two populations $(x, y)$, the states must lie in the set $\Xi = \{(x, y) \mid Ax + By \le C\}$, where $A = \mathrm{diag}\{a_1, a_2, \dots, a_n\}$, $B = \mathrm{diag}\{b_1, b_2, \dots, b_n\}$, and $C = [c_1\ c_2\ \cdots\ c_n]^T$. During evolution, the set $\Lambda := \{(x, y) \mid \sum_{i \in S} x_i = \pi_1,\ \sum_{i \in S} y_i = \pi_2,\ x_i \ge 0,\ y_i \ge 0\}$ contains all possible population states. For the first population, the strategy interactions between individuals can be represented by an undirected graph whose node set represents all strategies and whose edge set represents the pairs of strategies between which individuals of population $x$ can switch; its adjacency matrix is $A(k) = [a_{ij}(k)]_{M \times M}$, where $a_{ij}(k) = 1$ when an individual adopting strategy $i$ can also adopt strategy $j$, and $a_{ij}(k) = 0$ otherwise. Likewise, for the second population the strategy interactions between individuals can be represented by an undirected graph.
The optimization problem of the evolutionary game is solved by finding the Nash equilibrium, and can be stated as:
$\max_{x, y} W(x, y)$ (9a)
s.t. $Ax + By \le C$ (9b)
$\sum_{i \in S} x_i = \pi_1$ (9c)
$\sum_{i \in S} y_i = \pi_2$ (9d)
$x_i \ge 0$ (9e)
$y_i \ge 0$ (9f)
where the objective function $W(x, y)$ is a continuously differentiable concave function and $(x_i, y_i)$ is the population state.
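Membership in the feasible set $\Xi \cap \Lambda$ is easy to test because $A$ and $B$ are diagonal, so the coupling constraint reduces to $a_i x_i + b_i y_i \le c_i$ for each strategy. The following sketch shows such a check; the function name, the tolerance, and the numbers in the example are hypothetical.

```python
def feasible(x, y, a, b, c, pi1=1.0, pi2=1.0, tol=1e-9):
    """Check (x, y) against the coupling, mass, and sign constraints.

    a, b hold the diagonals of A and B; c holds the entries of C.
    """
    coupling = all(ai * xi + bi * yi <= ci + tol
                   for ai, bi, ci, xi, yi in zip(a, b, c, x, y))
    mass = abs(sum(x) - pi1) <= tol and abs(sum(y) - pi2) <= tol
    nonneg = all(value >= -tol for value in list(x) + list(y))
    return coupling and mass and nonneg


# Two strategies, unit masses, illustrative coupling data.
print(feasible([0.5, 0.5], [0.2, 0.8], a=[1.0, 1.0], b=[1.0, 1.0], c=[1.0, 1.5]))
print(feasible([0.5, 0.5], [0.2, 0.8], a=[1.0, 1.0], b=[1.0, 1.0], c=[0.6, 1.5]))
```

The first call satisfies every constraint; the second violates the coupling constraint for the first strategy because $0.5 + 0.2 > 0.6$.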
The evolution of the proportions of population $x$ and population $y$ adopting strategy $i$ can be described by distributed evolutionary dynamics, referred to below as (10) and (11). These dynamics are also called the mean dynamics. The revision protocol $\phi_{ij}$ takes the current payoffs and the aggregate behavior as input and outputs a switch rate, that is, the rate at which individuals adopting strategy $i$ switch to strategy $j$ given the current population state and payoffs.
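To make the role of the revision protocol concrete, the sketch below integrates the standard mean dynamics $\dot{x}_i = \sum_j x_j \phi_{ji} - x_i \sum_j \phi_{ij}$ for a single population with the plain Smith protocol $\phi_{ij} = \max(f_j - f_i, 0)$, restricted to neighboring strategies on a graph. This is a simplification and an assumption: the document's protocol (12) also involves a coefficient tied to the coupling constraint, which is not reproduced in the text, so no coupling is modeled here. The path graph, the quadratic potential, and the target shares are hypothetical.

```python
TARGET = [0.2, 0.3, 0.5]                     # hypothetical maximiser of W
NEIGHBOURS = {0: [1], 1: [0, 2], 2: [1]}     # strategies on a path graph


def fitness(x):
    """Gradient of the concave potential W(x) = -0.5 * sum((x_i - t_i)^2)."""
    return [t - xi for xi, t in zip(x, TARGET)]


def smith_step(x, dt):
    """One explicit Euler step of the graph-restricted Smith dynamics."""
    f = fitness(x)
    dx = [0.0] * len(x)
    for i in range(len(x)):
        for j in NEIGHBOURS[i]:
            inflow = x[j] * max(f[i] - f[j], 0.0)    # switches from j to i
            outflow = x[i] * max(f[j] - f[i], 0.0)   # switches from i to j
            dx[i] += inflow - outflow
    return [xi + dt * dxi for xi, dxi in zip(x, dx)]


def evolve(x0, dt=0.01, steps=10000):
    """Integrate the dynamics from x0 and return the final shares."""
    x = list(x0)
    for _ in range(steps):
        x = smith_step(x, dt)
    return x


x_final = evolve([0.6, 0.3, 0.1])
print([round(value, 4) for value in x_final])
```

Every unit of mass leaving strategy $i$ for strategy $j$ arrives at $j$, so the total mass is conserved, and the outflow from a strategy is proportional to its own share, so shares cannot become negative in continuous time. These are the two properties that the invariance argument for $\Lambda$ relies on.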
Step 22: set the communication protocol. For any given $x$ and $y$, a set of triples is defined for every $q \in S$, and a coefficient is associated with the smallest element of the vector $C - (Ax + By)$. The revision protocol (12) for population $p$ is designed on this basis.
Substituting (12) into (10) and (11) gives equations (13) and (14). These are the distributed Smith dynamics of two populations with coupling constraints (DSD2PC), and an evolutionary game with these dynamics is called a distributed evolutionary game of two populations with coupling constraints (DEG2PC).
With suitable shorthand, (13) and (14) are rewritten as (15) and (16), and the evolutionary dynamics are then expressed in compact form using the Laplacian matrices of the two strategy graphs.
S10. Prove that the two-population evolutionary game with coupling constraints has the invariant-set property. Given $(x, y) \in \Xi \cap \Lambda$, the weights $r_x(i, j)$ defined from the dynamics satisfy $r_x(i, j) = r_x(j, i) \ge 0$, so they define a symmetric adjacency matrix and, through the usual relation, a Laplacian matrix. Because $r_x(i, j)$ is non-negative, this Laplacian matrix is positive semidefinite. The same argument shows that the Laplacian matrix for the second population is positive semidefinite.
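S11. By Lemma 1, the sums $\sum_{i \in S} x_i$ and $\sum_{i \in S} y_i$ have zero time derivative, that is, they are constant. In addition, when $x_i = 0$ or $y_i = 0$, (13) and (14) show that the corresponding derivative is non-negative. Hence $x_i \ge 0$ and $y_i \ge 0$ are preserved, and $(x(t), y(t)) \in \Lambda$.
When $(x(0), y(0)) \in \Xi$, once the trajectory $(x(t), y(t))$ reaches the boundary of $\Xi$, there is some $i \in S$ with $a_i x_i + b_i y_i = c_i$. By Theorem 1, substituting $a_i$ and $b_i$ into (13) and (14) gives the derivatives of $x_i$ and $y_i$ on this boundary.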
These derivatives are examined in the following four cases:
if $a_i > 0$, $b_i > 0$;
if $a_i > 0$, $b_i \le 0$;
if $a_i \le 0$, $b_i > 0$;
if $a_i \le 0$, $b_i \le 0$.
In every case, non-negativity holds and $a_i x_i + b_i y_i$ does not increase, so $a_i x_i + b_i y_i \le c_i$ is maintained. By continuity of the trajectory $(x, y)$, $(x(t), y(t)) \in \Xi$ for all subsequent times. Therefore the set $\Xi \cap \Lambda$ is an invariant set.
S12. Choose $E(x, y) := W(x^*, y^*) - W(x, y)$ as the Lyapunov function, where $E(x, y) \ge 0$, and evaluate its derivative along the dynamics. It follows that when the initial value $(x(0), y(0)) \in \Xi$ evolves along (13) and (14), the DEG2PC approaches the Nash equilibrium, and the Nash equilibrium is locally asymptotically stable.
Part 3: distributed model predictive control algorithm based on the DEG2PC theory:
Step 31: the conversion between the formation control problem and the evolutionary game problem is shown in Fig. 1. Using evolutionary game theory, the population state $(x_i, y_i)$ in the DEG2PC theory is associated with the position components in the optimal control problem.
According to the dynamic model in (6), $u_i(k+m|k)$ and $v_i(k+m+1|k)$ can be rewritten as:
$u_i(k+m|k) = c_i(k+m+1|k) - 2c_i(k+m|k) + c_i(k+m-1|k)$ (19)
$v_i(k+m+1|k) = c_i(k+m+1|k) - c_i(k+m|k)$ (20)
and (19) and (20) are substituted into optimization problem (8). Because problem (8) minimizes the cost function $J(k)$ whereas problem (9) maximizes the concave function $W(x, y)$, the fitness functions of the strategies can be written as $f^x = -\nabla_x J$ and $f^y = -\nabla_y J$. In addition, constraints (8d), (8e), and (8f) of problem (8) can be converted into a form that corresponds to constraint (9b) of problem (9).
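Relations (19) and (20) express velocities and inputs purely through predicted positions, which is what allows the positions to serve as the decision variables. The sketch below applies the two finite differences to a planar position sequence. It assumes a unit sampling period, as the two relations imply, and the function name and example trajectory are hypothetical.

```python
def velocities_and_inputs(c):
    """Apply relations (20) and (19) to a list of planar positions c[0..N].

    Returns v, where v[m] = c[m+1] - c[m], and u, where
    u[m-1] = c[m+1] - 2*c[m] + c[m-1] for m = 1..N-1.
    """
    v = [(c[m + 1][0] - c[m][0], c[m + 1][1] - c[m][1])
         for m in range(len(c) - 1)]
    u = [(c[m + 1][0] - 2.0 * c[m][0] + c[m - 1][0],
          c[m + 1][1] - 2.0 * c[m][1] + c[m - 1][1])
         for m in range(1, len(c) - 1)]
    return v, u


# Positions of an agent accelerating along the x-axis.
v, u = velocities_and_inputs([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0)])
print(v)   # successive position differences
print(u)   # successive velocity differences
```

Each input equals the difference of two consecutive velocities, which is the double-integrator relation used in the hedged examples elsewhere in this text.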
Step 32: for populations $x$ and $y$, choose the revision protocol (12) and let the populations evolve according to (15) and (16); the population states then tend to the Nash equilibrium. At time $k$ this yields the optimal position trajectory $(x^*(k), y^*(k))$ and the optimal control input sequence $u^*(k)$. In this way, DEG2PC solves the formation control problem (9) in a distributed manner.
In summary, the distributed model predictive leaderless formation control method based on a distributed evolutionary game can be described as follows. Inputs: the desired positions, the prediction horizon $H_p$, the safe distance $R$, the communication range $\theta$, and the weight matrices $Q_i$, $P_i$, and $R_i$. Outputs: $(x^*(k), y^*(k))$ and $u_i^*(k|k)$.
(1) At time $k$, obtain the sample $z_i(k)$ and the communication topology;
(2) Construct the formation control problem (8) and design the revision protocol (12);
(3) Obtain the fitness functions $f^x$ and $f^y$ for each strategy;
(4) Using (13) and (14), solve for the optimal position trajectory $(x^*(k), y^*(k))$ and the optimal control input sequence $u^*(k)$;
(5) Apply $u_i^*(k|k)$ to each agent and repeat the above steps.
Part 4: simulation. Consider a multi-agent system with six agents, with a system model specified for each agent. Each agent is subject to an input constraint; the communication range is $\theta = 2.3$, the safe distance is $R = 0.5$, the prediction horizon is $H_p = 20$, and the weight matrices are $Q_i = R_i = P_i = I_{4 \times 4}$. The initial and desired velocities of every agent are set to 0, and the initial positions are
$c_1(0) = [3\ 3]^T$, $c_2(0) = [1\ 4]^T$, $c_3(0) = [2\ 0]^T$,
$c_4(0) = [4\ 1]^T$, $c_5(0) = [0\ 2]^T$, $c_6(0) = [3\ 5]^T$.
To form the formation, a desired position is specified for each agent.
Simulations were carried out in MATLAB using the ICLOCS and PDToolbox solvers, with results shown in the drawings. Fig. 2 shows the actual two-dimensional trajectories of the six agents, Fig. 3 the position coordinates of each agent over time, Fig. 4 the safe-distance-versus-time curve of each pair of agents, and Fig. 5 the control input of each agent over time. The results in Fig. 2 show that under this algorithm every agent eventually reaches its designated target point. Fig. 3 shows the position of each agent during the motion. The subplots of Fig. 4 show the relative distances between agents, which always remain greater than the safe distance of 0.5, so the agents avoid collisions. The results in Fig. 5 show that the input constraints are satisfied while the agents move.
In summary, the above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (5)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210794170 | 2022-07-05 | ||
CN2022107941706 | 2022-07-05 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115616913A true CN115616913A (en) | 2023-01-17 |
Family
ID=84863865
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211320956.0A Pending CN115616913A (en) | 2022-07-05 | 2022-10-26 | A Model Predictive Leaderless Formation Control Method Based on Distributed Evolutionary Game |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115616913A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112464991A (en) * | 2020-11-04 | 2021-03-09 | 西北工业大学 | Multi-sensor evidence evolution game fusion recognition method based on multi-population dynamics |
CN112558471A (en) * | 2020-11-24 | 2021-03-26 | 西北工业大学 | Spacecraft formation discrete distributed non-cooperative game method based on dynamic event triggering |
CN113359437A (en) * | 2021-05-14 | 2021-09-07 | 北京理工大学 | Hierarchical model prediction control method for multi-agent formation based on evolutionary game |
CN114047758A (en) * | 2021-11-08 | 2022-02-15 | 南京云智控产业技术研究院有限公司 | Multi-mobile robot formation method based on Q-learning |
-
2022
- 2022-10-26 CN CN202211320956.0A patent/CN115616913A/en active Pending
Non-Patent Citations (2)
Title |
---|
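Guan Zhihua, Kou Jisong, Li Minqiang: "Augmented Lagrangian multi-objective co-evolutionary algorithm based on the ε-constraint method", Systems Engineering and Electronics, no. 09, 20 September 2002 (2002-09-20), pages 1 - 5 * |
Xie Nenggang; Pan Chuangye; Li Rui; Wang Lu: "Multi-objective parallel game design based on multi-population evolutionary algorithms", Journal on Numerical Methods and Computer Applications, no. 02, 14 June 2010 (2010-06-14), pages 1 - 5 * |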
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118092151A (en) * | 2023-12-26 | 2024-05-28 | Sichuan University | Multi-missile cooperative guidance method based on distributed model predictive control |
CN117891259A (en) * | 2024-03-14 | 2024-04-16 | Academy of Mathematics and Systems Science, Chinese Academy of Sciences | Multi-agent formation control method and related products of multi-graphic configuration |
CN117891259B (en) * | 2024-03-14 | 2024-05-14 | Academy of Mathematics and Systems Science, Chinese Academy of Sciences | Multi-agent formation control method with multi-graph configuration and related product |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN115616913A (en) | A Model Predictive Leaderless Formation Control Method Based on Distributed Evolutionary Game | |
CN114415735A (en) | Multi-UAV distributed intelligent task assignment method for dynamic environment | |
CN104181813B (en) | Adaptive control method for Lagrange systems with connectivity preservation | |
CN113534660A (en) | Multi-agent system cooperative control method and system based on reinforcement learning algorithm | |
Zhang et al. | Global iterative learning control based on fuzzy systems for nonlinear multi-agent systems with unknown dynamics | |
CN117055605A (en) | Multi-unmanned aerial vehicle attitude control method and system | |
CN116449703A (en) | AUH formation cooperative control method under finite time frame | |
CN116976066A (en) | Observer-based heterogeneous multi-agent system fault-tolerant consistency sliding mode control algorithm | |
CN116700340A (en) | Track planning method and device and unmanned aerial vehicle cluster | |
Xia et al. | Dynamic asynchronous edge-based event-triggered consensus of multi-agent systems | |
CN116582442A (en) | A Multi-Agent Collaboration Method Based on Hierarchical Communication Mechanism | |
CN115599089A (en) | Multi-Agent Formation Control Method Based on Artificial Potential Field Method | |
CN114545777A (en) | Multi-agent consistency reinforcement learning method and system based on improved Q function | |
CN118226748B (en) | Multi-agent sliding mode fault-tolerant control method under finite time observer | |
CN118348995A (en) | Event-triggered multi-unmanned vehicle formation control method based on zero-sum game | |
Li et al. | A control strategy for unmanned surface vehicles flocking | |
CN118377304A (en) | Multi-robot hierarchical formation control method and system based on deep reinforcement learning | |
CN115356929B (en) | Proportional admissible tracking control method for actuator-attack singular multi-agent systems | |
CN115907248A (en) | Multi-robot unknown environment path planning method based on geometric neural network | |
Yuan et al. | Multi-agent cooperative area coverage: A two-stage planning approach based on reinforcement learning | |
CN115294474A (en) | Multi-agent information interaction method fusing local target characteristics and cooperation characteristics | |
CN114185273A (en) | Design method of distributed preset-time consensus controller under saturation constraints | |
CN113848701A (en) | Distributed average tracking method of uncertainty directed network | |
CN118170154B (en) | A dynamic obstacle avoidance method for drone swarm based on multi-agent reinforcement learning | |
Liu et al. | Review of formation control and cooperative guidance technology of multiple unmanned aerial vehicles | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||