NL2027701B1 - Point-to-point tracking control method for multi-agent trajectory-updating iterative learning - Google Patents

Point-to-point tracking control method for multi-agent trajectory-updating iterative learning

Info

Publication number
NL2027701B1
NL2027701B1
Authority
NL
Netherlands
Prior art keywords
point
trajectory
agent
updating
agents
Prior art date
Application number
NL2027701A
Other languages
Dutch (nl)
Other versions
NL2027701A (en)
Inventor
Liu Chenglin
Luo Yujuan
Original Assignee
Univ Jiangnan
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univ Jiangnan filed Critical Univ Jiangnan
Publication of NL2027701A
Application granted
Publication of NL2027701B1

Links

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/02Control of position or course in two dimensions
    • G05D1/021Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0212Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
    • G05D1/0221Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/0265Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/02Control of position or course in two dimensions
    • G05D1/021Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0287Control of position or course in two dimensions specially adapted to land vehicles involving a plurality of land vehicles, e.g. fleet or convoy travelling
    • G05D1/0291Fleet control
    • G05D1/0295Fleet control by at least one leading vehicle of the fleet
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00Program-control systems
    • G05B2219/30Nc systems
    • G05B2219/33Director till display
    • G05B2219/33051BBC behavior based control, stand alone module, cognitive, independent agent
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00Program-control systems
    • G05B2219/30Nc systems
    • G05B2219/39Robotics, robotics to robotics hand
    • G05B2219/39219Trajectory tracking
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00Program-control systems
    • G05B2219/30Nc systems
    • G05B2219/42Servomotor, servo controller kind till VSS
    • G05B2219/42342Path, trajectory tracking control

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Feedback Control In General (AREA)

Abstract

The present invention discloses a point-to-point tracking control method for multi-agent trajectory-updating iterative learning, belonging to the field of control technology. The method includes: first constructing a mathematical model of a discrete heterogeneous multi-agent system, where an expected position point is considered as being generated by a virtual leader, and determining a spanning tree structure with the leader as a root node according to a communication topology structure of the multi-agent system; next, designing a target-trajectory updating method according to an expected point, and updating a target trajectory to enable a new target trajectory to converge to a system output; and finally, designing a P-type iterative learning method based on target-trajectory updating for follower agents, to implement complete tracking of the expected point by the multi-agent system. By means of the foregoing method, the present invention resolves a point-to-point tracking control problem in a heterogeneous multi-agent system, and the system output tracks the updated target trajectory faster than it tracks a fixed target trajectory, so that the agents complete the tracking of an expected point.

Description

Point-to-point tracking control method for multi-agent trajectory-updating iterative learning
FIELD OF THE INVENTION
The present invention relates to an iterative learning method based on target-trajectory updating to resolve a point-to-point tracking control problem in a heterogeneous multi-agent system, belonging to the field of control technology.
DESCRIPTION OF THE RELATED ART
In recent decades, with the continuous development of artificial intelligence and industrial technologies, many large-scale control systems with complex structures have emerged, in which a plurality of subsystems need to communicate and cooperate with each other to complete macro tasks. Coordination and cooperation among agents significantly improve the intelligence of individual behavior and accomplish much work that cannot be completed by single individuals. So far, multi-agent coordination control technologies have been widely applied in fields such as sensor networks, robotics, and traffic signal control. In actual industrial production, many controlled systems perform repeated movement tasks within a limited range, for example, a servo system whose instruction signal is a periodic function, a satellite that makes a periodic movement around the earth, and a robotic arm that completes repetitive tasks such as welding and transport on an assembly line. In consideration of the wear and aging generated during equipment operation, it is generally very difficult to obtain an accurate system model for a controlled system. For this type of multi-agent system that performs repeated movement tasks within a limited range, the system output needs to achieve zero-error tracking of an expected trajectory over the entire operating range. To achieve such accurate tracking in a multi-agent system with a repeated movement property, the concept of iterative learning is introduced into the consensus tracking control problem of the multi-agent system. Research on consensus in iterative learning-based multi-agent systems usually requires that the system output achieve full trajectory tracking over the entire operating range. However, in automated coordination and control of production, the system output often only needs to track an expected position point at a specific time point. For example, when a robotic arm grabs and places a workpiece, only the outputs at the grabbing and placing time points need to be considered, and the outputs at other time points need no additional consideration. For some complex process procedures, due to equipment limitations, not all data can be detected; tracking all data points is difficult, and only some detectable position points can be tracked. Therefore, the tracking control of specific points is of significant research value.
At present, the research of point-to-point tracking control has drawn the attention of some scholars. A conventional method for implementing point-to-point tracking control designs an arbitrary trajectory passing through the expected position points, so that the point-to-point tracking control problem is converted into a full trajectory tracking control problem for a fixed target trajectory. Full trajectory tracking of a fixed target trajectory is a relatively simple way to resolve a point-to-point tracking control problem, but its tracking performance depends on the choice of the fixed target trajectory passing through the expected position points, and selecting an optimal fixed target trajectory requires particular a priori knowledge. This somewhat limits the solution of the point-to-point tracking control problem. In addition, this method cannot fully utilize the degrees of freedom at the other time points. To remedy these deficiencies of fixed-trajectory point-to-point tracking control, some scholars proposed control methods based on target-trajectory updating. Son T D, Ahn H S, and Moore K L (Iterative learning control in optimal tracking problems with specified data points. Automatica, 2013) used the tracking error between the target trajectory of the previous iteration and the system output trajectory to obtain the target trajectory of the current iteration, thereby establishing a target-trajectory update function. AN Tongjian and LIU Xiangguan (Robust iterative learning control of target-trajectory updating point-to-point. Journal of Zhejiang University, 2015) used an interpolation method to propose an iterative learning method based on target-trajectory updating for a point-to-point tracking problem with initial disturbance, and concluded that the algorithm has better tracking performance than a fixed-trajectory point-to-point tracking control algorithm. TAO Hongfeng, DONG Xiaoqi, and YANG Huizhong (Optimization and application of a point-to-point iterative learning control algorithm with reference trajectory updating. Control Theory & Applications, 2016) introduced norm optimization into a target-trajectory updating iterative learning algorithm to improve its tracking precision and speed, and analyzed the convergence and robustness of the system under no disturbance and under non-repetitive disturbance. So far, however, this research has addressed point-to-point tracking control of a single system. For a multi-agent system formed by a plurality of agents that collaborate and cooperate with each other, how to use an iterative learning method to resolve the point-to-point tracking control problem is one of the difficulties in the current control field.
SUMMARY OF THE INVENTION
An objective of the present invention is to provide an iterative learning method based on target-trajectory updating to resolve a point-to-point tracking control problem in a heterogeneous multi-agent system. The technical solution for achieving the objective of the present invention is as follows: a multi-agent trajectory-updating iterative-learning point-to-point tracking control method includes the following steps:
step 1. constructing a model of a discrete heterogeneous multi-agent system;
step 2. analyzing an information exchange relationship among agents in the discrete heterogeneous multi-agent system, and constructing a communication topology structure of the multi-agent system by using a directed graph, where only one or more follower agents are capable of acquiring leader information, and a communication topology diagram formed by a leader and followers includes one spanning tree with the leader as a root node;
step 3. giving an initial state condition of all follower agents;
step 4. designing a target-trajectory updating method according to an expected position point, solving parameters of the target-trajectory updating method, and updating a target trajectory to enable a new target trajectory to asymptotically converge to a system output; and
step 5. designing a P-type iterative learning method based on target-trajectory updating for the follower agents, and solving parameters of the P-type iterative learning method, to implement complete tracking of the expected position point within a limited time in the multi-agent system.
Compared with the prior art, the obvious advantages of the present invention lie in that a point-to-point tracking control problem in a heterogeneous multi-agent system is resolved, and the updated target trajectory is closer to the system output than a fixed target trajectory; that is, the system output converges to the new target trajectory faster than to a fixed target trajectory, so that the agents can complete the tracking of the given expected points, and the control better conforms to practical applications.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a structural diagram of a network topology according to the present invention.
FIG. 2 shows the tracking process at the 10th iteration under the communication topology of FIG. 1 according to the present invention.
FIG. 3 shows the tracking process at the 80th iteration under the communication topology of FIG. 1 according to the present invention.
FIG. 4 is an error convergence diagram under the communication topology of FIG. 1 according to the present invention.
FIG. 5 shows the tracking process at the 10th iteration under the communication topology of FIG. 1 for the iterative learning method based on a fixed target trajectory.
FIG. 6 shows the tracking process at the 100th iteration under the communication topology of FIG. 1 for the iterative learning method based on a fixed target trajectory.
FIG. 7 is an error convergence diagram under the communication topology of FIG. 1 for the iterative learning method based on a fixed target trajectory.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The solution of the present invention is further described below with reference to the accompanying drawings and specific embodiments.
The present invention provides an iterative learning method based on target-trajectory updating to resolve a point-to-point tracking problem in a multi-agent system, including the following steps: Step 1. Construct a mathematical model of a discrete heterogeneous multi-agent system.
A model of a discrete heterogeneous multi-agent system formed by $n$ different agents is:
$$x_{i,k}(t+1) = A_i x_{i,k}(t) + B_i u_{i,k}(t), \quad y_{i,k}(t) = C_i x_{i,k}(t), \quad i = 1,2,\cdots,n \quad (1)$$
where $k$ denotes the number of iterations, $i$ represents the $i$-th agent, $i = 1,2,\cdots,n$, and $t \in [0,N]$ is a sampling time point within one period; $x_{i,k}(t) \in R^{p_i}$, $u_{i,k}(t) \in R^{q_i}$, and $y_{i,k}(t) \in R^{m}$ respectively denote the state, the control input, and the system output of agent $i$; and $A_i \in R^{p_i \times p_i}$, $B_i \in R^{p_i \times q_i}$, and $C_i \in R^{m \times p_i}$ are matrices of corresponding dimensions.
It is defined that $x_k(t) = [x_{1,k}^T(t), x_{2,k}^T(t), \cdots, x_{n,k}^T(t)]^T$, $u_k(t) = [u_{1,k}^T(t), u_{2,k}^T(t), \cdots, u_{n,k}^T(t)]^T$, and $y_k(t) = [y_{1,k}^T(t), y_{2,k}^T(t), \cdots, y_{n,k}^T(t)]^T$, so that the system (1) is written in the compact matrix form:
$$x_k(t+1) = A x_k(t) + B u_k(t), \quad y_k(t) = C x_k(t) \quad (2)$$
where $A = \mathrm{diag}\{A_1, A_2, \cdots, A_n\}$, $B = \mathrm{diag}\{B_1, B_2, \cdots, B_n\}$, and $C = \mathrm{diag}\{C_1, C_2, \cdots, C_n\}$. The system (2) is converted into an input-output matrix model based on the time sequence:
$$y_k = G u_k + Q x_k(0) \quad (3)$$
where $y_k = [y_k^T(0), y_k^T(1), \cdots, y_k^T(N)]^T$, $u_k = [u_k^T(0), u_k^T(1), \cdots, u_k^T(N)]^T$,
$$G = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ CB & 0 & 0 & \cdots & 0 \\ CAB & CB & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ CA^{N-1}B & CA^{N-2}B & CA^{N-3}B & \cdots & 0 \end{bmatrix}, \quad Q = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^N \end{bmatrix}$$
For a conventional iterative-learning control algorithm, the control target is usually full trajectory tracking of a fixed trajectory $y_d(t)$: as the iterations continue, the system output keeps approaching the fixed trajectory, that is, $y_{i,k}(t) \to y_d(t)$, $t \in \{0,1,2,\cdots,N\}$. In actual engineering, however, it is in most cases only necessary to implement tracking at the time points $T = \{t_1, t_2, \cdots, t_M\}$ that require tracking.
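As an illustration of Formula (3), the following Python sketch (not part of the patent; the function name and array layout are the author's assumptions) builds the lifted matrices $G$ and $Q$ from the compact matrices $A$, $B$, $C$ and the horizon $N$:

```python
import numpy as np

def lifted_system(A, B, C, N):
    """Build G and Q of Formula (3): y_k = G u_k + Q x_k(0), t = 0..N."""
    m, p = C.shape
    q = B.shape[1]
    G = np.zeros(((N + 1) * m, (N + 1) * q))
    Q = np.zeros(((N + 1) * m, p))
    A_pow = np.eye(p)
    for t in range(N + 1):
        Q[t * m:(t + 1) * m, :] = C @ A_pow        # block row C A^t
        A_pow = A_pow @ A
    for row in range(1, N + 1):                    # y(row) depends on u(0..row-1)
        for col in range(row):
            G[row * m:(row + 1) * m, col * q:(col + 1) * q] = \
                C @ np.linalg.matrix_power(A, row - 1 - col) @ B
    return G, Q
```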
Therefore, in the present invention, an iterative-learning control algorithm based on target-trajectory updating is used to implement the tracking of the expected position points in the multi-agent system, that is, $y_{i,k}(t_s) \to y_d(t_s)$, $s = 1,2,\cdots,M$, with $0 \le t_1 < t_2 < \cdots < t_M \le N$, where $y_d(t_s)$ is the expected position point.
Based on a leader-follower communication structure, the expected position points $y_d(t_s)$, $s = 1,2,\cdots,M$, are regarded as being generated by a virtual leader, the $n$ agents in the system (1) are considered followers, and only some follower agents can directly acquire the leader information.
The main work content of the present invention is that for the multi-agent system (1) in which only some follower agents can directly acquire information of the expected position point, in a fixed communication topology, an appropriate learning method is designed to implement complete tracking of the expected position point within a limited time in the multi-agent system (1). Step 2. Analyze an information exchange relationship among agents in the multi-agent system, construct a communication topology structure of the multi-agent system by using a directed graph, and determine a directed spanning tree structure with the leader as a root node according to the communication topology structure of the multi-agent system.
The directed graph $G = (V, E, A)$ is used to denote the topology structure of the multi-agent system, where the node set $V = \{1,2,\cdots,n\}$ of the graph $G$ corresponds to the $n$ agents, the edge set $E \subseteq V \times V$ of the graph $G$ corresponds to information exchange and transfer among the agents, the weight of an edge is $a_{ij} \ge 0$ with $a_{ii} = 0$ $(i,j \in V)$, and the matrix $A = [a_{ij}] \in R^{n \times n}$ is the weighted adjacency matrix. If a node $j$ is capable of obtaining information from a node $i$ in the directed graph, the connecting edge is denoted by $e_{ij} = (i,j) \in E$; if $e_{ij} \in E$, the corresponding element of the weighted adjacency matrix is $a_{ij} > 0$, and otherwise the element is 0, with $a_{ii} = 0$ for all $i \in V$. The neighbor set of agent $i$ is $N_i = \{j \in V : (i,j) \in E\}$, and the Laplacian matrix of the graph $G$ is $L = D - A = [l_{ij}] \in R^{n \times n}$, where the matrix $D = \mathrm{diag}\{\sum_{j=1}^{n} a_{ij},\ i = 1,\cdots,n\}$ is the degree matrix of the graph $G$, so that $l_{ii} = \sum_{j \in N_i} a_{ij}$ and $l_{ij} = -a_{ij}$ for $i \ne j$. In the directed graph $G$, a directed path from a node $i_1$ to a node $i_s$ is an ordered sequence $(i_1,i_2), \cdots, (i_{s-1},i_s)$ of a series of edges.
If one node $i$ has a directed path to all other nodes in the directed graph $G$, the node $i$ serves as a root node, and if the graph $G$ has a root node, the directed graph has a spanning tree. In the present invention, a leader-follower coordination control structure is used to study the multi-agent consensus tracking problem. After a leader is added, the $n$ follower agents and the leader form the graph $\bar{G} = \{0\} \cup G$. Information transfer between agent $i$ and the leader is denoted by $s_i$: $s_i > 0$ denotes that the agent is connected to the leader, and $s_i = 0$ denotes that it is not.
In the directed graph $\bar{G}$, if there is a directed spanning tree with the leader as the root node, the leader has a directed path to all the follower agents.
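The following minimal sketch (not from the patent; the adjacency convention and names are assumptions) builds $D$ and $L = D - A$ from a weighted adjacency matrix and checks leader-to-follower reachability, that is, the spanning-tree condition:

```python
import numpy as np

def laplacian(Adj):
    """L = D - A, with D the degree matrix d_ii = sum_j a_ij."""
    return np.diag(Adj.sum(axis=1)) - Adj

def leader_spans(Adj, s):
    """True if the virtual leader reaches every follower.

    Assumes Adj[j, i] > 0 means agent j obtains information from agent i,
    and s[i] > 0 means agent i obtains information from the leader.
    """
    n = Adj.shape[0]
    reached = {i for i in range(n) if s[i] > 0}
    frontier = set(reached)
    while frontier:
        frontier = {j for i in frontier for j in range(n)
                    if Adj[j, i] > 0 and j not in reached}
        reached |= frontier
    return len(reached) == n
```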
Step 3. Give an initial state condition of all follower agents. The initial state reset condition of all the follower agents is: $x_{i,k}(0) = 0$.
Step 4. Design a target-trajectory updating method according to an expected position point, solve parameters of the target-trajectory updating method, and update a target trajectory to enable a new target trajectory to asymptotically converge to a system output.
The tracking of a fixed trajectory by an iterative-learning control algorithm usually requires that, as the number of iterations increases, the system output $y_{i,k}(t)$ asymptotically converge to the fixed trajectory $y_d(t)$, that is,
$$\|y_d - y_{k+1}\| \le \|y_d - y_k\| \quad (4)$$
The target-trajectory updating algorithm proposed in the present invention instead makes a new target trajectory $r_{i,k}(t)$ asymptotically converge to the system output $y_{i,k}(t)$, that is,
$$\|r_{k+1} - y_k\| \le \|r_k - y_k\| \quad (5)$$
First, the target-trajectory updating algorithm is defined as:
$$r_{i,k+1}(t) = y_d(t) + h(t) f_{i,k}(t) \quad (6)$$
where $r_{i,k+1}(t)$ is the target trajectory of the $i$-th agent obtained after the learning and update of the $k$-th iteration, $y_d(t)$ is any trajectory passing through the expected position points $y_d(t_s)$, $h(t) = (t - t_1)(t - t_2) \cdots (t - t_M)$, and $f_{i,k}(t)$ is any discrete function. Let $r_{k+1}(t) = [r_{1,k+1}(t), r_{2,k+1}(t), \cdots, r_{n,k+1}(t)]^T$, $f_k(t) = [f_{1,k}(t), f_{2,k}(t), \cdots, f_{n,k}(t)]^T$, $H(t) = \mathrm{diag}\{h(t), h(t), \cdots, h(t)\}$, and $Y_d(t) = [y_d(t), y_d(t), \cdots, y_d(t)]^T$. Formula (6) is converted into:
$$r_{k+1}(t) = Y_d(t) + H(t) f_k(t) \quad (7)$$
Formula (7) is then written in a time sequence-based form:
$$r_{k+1} = Y_d + H f_k \quad (8)$$
where $r_{k+1} = [r_{k+1}^T(0), r_{k+1}^T(1), \cdots, r_{k+1}^T(N)]^T$, $Y_d = [Y_d^T(0), Y_d^T(1), \cdots, Y_d^T(N)]^T$, $H = \mathrm{diag}\{H(0), H(1), \cdots, H(N)\}$, and $f_k = [f_k^T(0), f_k^T(1), \cdots, f_k^T(N)]^T$.
Because point-to-point tracking requires that the value of the target trajectory at the time points $T = \{t_1, t_2, \cdots, t_M\}$ that require tracking be kept consistent with the given expected points in each update, that is, $r_{i,k}(t_s) = y_d(t_s)$, Formula (8) may further be converted into a target trajectory at any sampling point:
$$r_{k+1} = r_k + H f_k \quad (9)$$
Let $f_k = F(r_k - y_k)$, where $F$ is a real diagonal matrix; Formula (9) is then denoted as:
$$r_{k+1} = r_k + H F (r_k - y_k) \quad (10)$$
Let $\Lambda_k = HF$. Because the matrix $H$ and the matrix $F$ are both diagonal matrices, $\Lambda_k$ is also a real diagonal matrix, with
$$\Lambda_k = \mathrm{diag}\{\Lambda_k(0), \Lambda_k(1), \cdots, \Lambda_k(N)\}, \quad \Lambda_k(t) = \mathrm{diag}\{\lambda_{1,k}(t), \lambda_{2,k}(t), \cdots, \lambda_{n,k}(t)\}$$
and the target-trajectory updating algorithm (10) turns into:
$$r_{k+1} = r_k + \Lambda_k (r_k - y_k) \quad (11)$$
As can be learned from Formula (11):
$$r_{k+1} - y_k = r_k - y_k + \Lambda_k (r_k - y_k) = (I + \Lambda_k)(r_k - y_k) \quad (12)$$
Taking norms on both sides of Formula (12):
$$\|r_{k+1} - y_k\| \le \|I + \Lambda_k\| \, \|r_k - y_k\| \quad (13)$$
Therefore, when $\|I + \Lambda_k\| < 1$, it is obtained that $\|r_{k+1} - y_k\| < \|r_k - y_k\|$. In a point-to-point tracking control problem based on target-trajectory updating, the value of the target trajectory at the time points $T = \{t_1, t_2, \cdots, t_M\}$ that require tracking is fixed and kept consistent with the expected points, that is:
$$r_{i,k}(t_s) = y_d(t_s), \quad s = 1,2,\cdots,M \quad (14)$$
Therefore, it is obtained that:
$$r_{i,k+1}(t_s) = r_{i,k}(t_s) \quad (15)$$
As can be learned from Formula (11), when $\lambda_{i,k}(t_s) = 0$ $(s = 1,2,\cdots,M)$ is satisfied at the time points $T = \{t_1, t_2, \cdots, t_M\}$ that require tracking and $r_{i,1}(t_s) = y_d(t_s)$ is satisfied, Formula (15) holds.
Therefore, if $\|I + \Lambda_k\| = 1$ is satisfied and $\lambda_{i,k}(t_s) = 0$ $(s = 1,2,\cdots,M)$, it is obtained that $\|r_{k+1} - y_k\| \le \|r_k - y_k\|$.
As can be seen from Formula (5), as the number of iterations increases, the updated target trajectory is closer to the system output than a fixed target trajectory; that is, the system output converges to the new target trajectory faster than to the fixed target trajectory. A point-to-point tracking control algorithm based on target-trajectory updating therefore enables the system to track the expected points faster and achieve a better tracking effect, remedying the deficiency of the fixed-target-trajectory point-to-point tracking control algorithm.
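A minimal sketch of the update law (11) for one stacked trajectory follows (illustrative, not from the patent); the entries of $\Lambda_k$ are assumed to lie in $(-2, 0)$ off the tracked points, so $|1 + \lambda| < 1$ there, and are forced to zero at the tracked points so the expected values never move:

```python
import numpy as np

def update_target(r, y, lam, tracked_idx):
    """r_{k+1} = r_k + Lambda_k (r_k - y_k), Formula (11).

    lam: diagonal of Lambda_k, entries in (-2, 0) away from tracked points.
    """
    lam = lam.copy()
    lam[tracked_idx] = 0.0          # lambda(t_s) = 0 keeps r(t_s) = y_d(t_s)
    return r + lam * (r - y)
```

With this choice $\|I + \Lambda_k\|_\infty = 1$ and the tracked entries of $r$ are fixed points of the update, matching Formulas (14) and (15).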
Step 5. Design a P-type iterative learning method based on target-trajectory updating for the follower agents, and solve parameters of the P-type iterative learning method, to implement complete tracking of the expected position point within a limited time in the multi-agent system.
First, the tracking error of each agent is given as:
$$e_{i,k}(t) = r_{i,k}(t) - y_{i,k}(t) \quad (16)$$
$$e_{ij,k}(t) = y_{j,k}(t) - y_{i,k}(t), \quad j \in N_i \quad (17)$$
where $e_{i,k}(t)$ represents the error between the output of agent $i$ during the $k$-th iteration and the target trajectory obtained after iterative updating, and $e_{ij,k}(t)$ denotes the error between the agent and its neighbor agents in the $k$-th iteration.
Let $\xi_{i,k}(t)$ denote the information received or measured by agent $i$ during the $k$-th iteration; it is obtained that:
$$\xi_{i,k}(t) = \sum_{j \in N_i} a_{ij} e_{ij,k}(t) + s_i e_{i,k}(t) \quad (18)$$
where $a_{ij}$ is the weight of an edge, and $s_i$ is the coupling weight between agent $i$ and the leader.
Because $e_{ij,k}(t) = e_{i,k}(t) - e_{j,k}(t)$, Formula (18) is converted into:
$$\xi_{i,k}(t) = \sum_{j \in N_i} a_{ij} \left( e_{i,k}(t) - e_{j,k}(t) \right) + s_i e_{i,k}(t) \quad (19)$$
It is defined that $e_k(t) = [e_{1,k}^T(t), e_{2,k}^T(t), \cdots, e_{n,k}^T(t)]^T$ and $\xi_k(t) = [\xi_{1,k}^T(t), \xi_{2,k}^T(t), \cdots, \xi_{n,k}^T(t)]^T$; by using graph theory, Formula (19) is written as:
$$\xi_k(t) = \left( (L + S) \otimes I_m \right) e_k(t) \quad (20)$$
where $S = \mathrm{diag}\{s_1, s_2, \cdots, s_n\}$, $L$ is the Laplacian matrix of $G$, and $I_m$ denotes the $m \times m$-dimensional identity matrix.
Formula (20) is also written in a time sequence-based form, that is:
$$\xi_k = M e_k \quad (21)$$
where $e_k = [e_k^T(0), e_k^T(1), \cdots, e_k^T(N)]^T$, $\xi_k = [\xi_k^T(0), \xi_k^T(1), \cdots, \xi_k^T(N)]^T$, and $M = \mathrm{diag}\{(L + S) \otimes I_m\}_{N \times N}$. In the present invention, the P-type iterative learning method is used for each follower agent to resolve the tracking control problem of the expected points in the multi-agent system, and the iterative learning law is as follows:
$$u_{i,k+1}(t) = u_{i,k}(t) + \Gamma_i \xi_{i,k+1}(t) \quad (22)$$
where $\Gamma_i \in R^{q_i \times m}$ is a learning gain.
Let $u_k(t) = [u_{1,k}^T(t), u_{2,k}^T(t), \cdots, u_{n,k}^T(t)]^T$ and $\xi_k(t) = [\xi_{1,k}^T(t), \xi_{2,k}^T(t), \cdots, \xi_{n,k}^T(t)]^T$; Formula (22) is converted into:
$$u_{k+1}(t) = u_k(t) + \hat{\Gamma} \xi_{k+1}(t) \quad (23)$$
where $\hat{\Gamma} = \mathrm{diag}\{\Gamma_1, \Gamma_2, \cdots, \Gamma_n\}$. Next, let $\xi_k = [\xi_k^T(0), \xi_k^T(1), \cdots, \xi_k^T(N)]^T$ and $u_k = [u_k^T(0), u_k^T(1), \cdots, u_k^T(N)]^T$; Formula (23) is converted into:
$$u_{k+1} = u_k + \Gamma \xi_{k+1} \quad (24)$$
where $\Gamma = \mathrm{diag}\{\hat{\Gamma}\}_{N \times N}$.
The iterative learning law is obtained by substituting Formula (21) into Formula (24):
$$u_{k+1} = u_k + \Gamma M e_{k+1} \quad (25)$$
An iterative learning method based on target-trajectory updating is obtained by combining Formula (11) and Formula (25):
$$\begin{cases} u_{k+1} = u_k + \Gamma M e_{k+1} \\ r_{k+1} = r_k + \Lambda_k (r_k - y_k) \end{cases} \quad (26)$$
When $\Lambda_k = 0$, Formula (26) turns into:
$$\begin{cases} u_{k+1} = u_k + \Gamma M e_{k+1} \\ r_{k+1} = r_k \end{cases} \quad (27)$$
In this case, the target trajectory is not iteratively updated.
Therefore, Formula (27) is an iterative learning method based on a fixed target trajectory.
As can be seen, Formula (27) is a special form of Formula (26).
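As an illustration of one pass of Formula (26) (a sketch under the author's naming assumptions, not the patent's implementation), the current-iteration error $e_{k+1}$ can be computed from the closed form derived below as Formula (34), after which the input and target updates are explicit:

```python
import numpy as np

def ilc_iteration(u, r, G, Q, x0, GammaM, Lam):
    """One pass of the trajectory-updating P-type law, Formula (26).

    GammaM is the product Gamma @ M; Lam is the diagonal matrix Lambda_k.
    """
    y = G @ u + Q @ x0                                       # Formula (3)
    e = r - y                                                # Formula (28)
    r_next = r + Lam @ (r - y)                               # Formula (11)
    I = np.eye(G.shape[0])
    e_next = np.linalg.solve(I + G @ GammaM, (I + Lam) @ e)  # Formula (34), below
    u_next = u + GammaM @ e_next                             # Formula (25)
    return u_next, r_next
```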
It is obtained from Formula (16) that:
$$e_k = r_k - y_k \quad (28)$$
It may then further be obtained from Formula (3) and Formula (26) that:
$$e_{k+1} = r_{k+1} - y_{k+1} = r_k + \Lambda_k (r_k - y_k) - G u_{k+1} - Q x_{k+1}(0) = r_k + \Lambda_k e_k - G u_k - G \Gamma M e_{k+1} - Q x_{k+1}(0) \quad (29)$$
Rearranging Formula (29) yields:
$$(I + G \Gamma M) e_{k+1} = r_k + \Lambda_k e_k - G u_k - Q x_{k+1}(0) \quad (30)$$
It is obtained from Formula (3) that:
$$G u_k = y_k - Q x_k(0) \quad (31)$$
Substituting Formula (31) into Formula (30) gives:
$$(I + G \Gamma M) e_{k+1} = r_k - y_k + \Lambda_k e_k + Q x_k(0) - Q x_{k+1}(0) = (I + \Lambda_k) e_k + Q x_k(0) - Q x_{k+1}(0) \quad (32)$$
Because all the follower agents satisfy $x_{i,k}(0) = 0$, it follows that $x_{k+1}(0) - x_k(0) = 0$, so Formula (32) is simplified as:
$$(I + G \Gamma M) e_{k+1} = (I + \Lambda_k) e_k \quad (33)$$
Left-multiplying both sides of Formula (33) by $(I + G \Gamma M)^{-1}$ gives:
$$e_{k+1} = (I + G \Gamma M)^{-1} (I + \Lambda_k) e_k \quad (34)$$
Taking norms on both sides of Formula (34), it is obtained that:
$$\|e_{k+1}\| \le \left\| (I + G \Gamma M)^{-1} \right\| \, \|I + \Lambda_k\| \, \|e_k\| \quad (35)$$
Because it has been proved that $\|I + \Lambda_k\| = 1$, it is obtained that:
$$\|e_{k+1}\| \le \left\| (I + G \Gamma M)^{-1} \right\| \, \|e_k\| \quad (36)$$
As can be learned from Formula (36), when $\left\| (I + G \Gamma M)^{-1} \right\| < 1$, it is obtained that $\|e_k\| \to 0$ as $k \to \infty$.
Therefore, for $t \in [0,N]$, $e_k(t) \to 0$ as $k \to \infty$. For all $t_s \in T \subseteq [0,N]$, as $k \to \infty$, it can be seen from Formula (14) and Formula (16) that:
$$y_{i,k+1}(t_s) \to r_{i,k+1}(t_s) = y_d(t_s) \quad (37)$$
In summary, for a discrete heterogeneous multi-agent system under the action of the iterative learning method based on target-trajectory updating, if the matrix $\Gamma$ makes the inequality $\left\| (I + G \Gamma M)^{-1} \right\| < 1$ hold, then as the iterations continue, the output trajectory of each follower converges to the expected points, that is, $y_{i,k}(t_s) \to y_d(t_s)$ as $k \to \infty$.
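The convergence condition of Formula (36) can be checked numerically; the following sketch (an illustration under the author's assumptions, not part of the patent) returns the contraction factor $\|(I + G\Gamma M)^{-1}\|$ for a candidate gain:

```python
import numpy as np

def contraction_factor(G, Gamma, M):
    """Induced 2-norm of (I + G Gamma M)^{-1}; below one implies Formula (36) contracts."""
    I = np.eye(G.shape[0])
    return np.linalg.norm(np.linalg.inv(I + G @ Gamma @ M), 2)

# Usage sketch: accept a learning gain Gamma only if contraction_factor(G, Gamma, M) < 1.
```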
Embodiment
A discrete heterogeneous multi-agent system formed by six different follower agents and one leader agent is considered. The communication topology of the system is shown in FIG. 1: the sequence number 0 represents the leader agent, and the sequence numbers 1 to 6 represent the follower agents. The follower agents have heterogeneous discrete state-space dynamics of the form of Formula (1): agents 1 and 2 are second-order, agents 3 and 4 are third-order, and agents 5 and 6 are fourth-order, with output matrices $C_1 = [0 \;\; 0.1]$, $C_2 = [0.2 \;\; 1]$, $C_3 = C_4 = [0.1 \;\; 0.2 \;\; 0.4]$, and $C_5 = C_6 = [0 \;\; 0 \;\; 0 \;\; 0.2]$ (the full state matrices $A_i$ and $B_i$ are given in the original filing). The system simulation time is $t \in [0,2]$ s and the sampling time is 0.01 s. Five points are selected as expected position points for the tracking control study: the tracked sampling instants are $T = \{20, 60, 100, 140, 180\}$, and the expected output is $y_d(T) = \{5, 3, -3, -5, 1.5\}$.
The expected position points $y_d(T) = \{5, 3, -3, -5, 1.5\}$ are considered as being generated by a virtual leader with sequence number 0. The foregoing six agents are considered followers, and only some follower agents can directly acquire the leader information. As can be learned from the communication topology in FIG. 1, only agent 1 and agent 4 can directly obtain information from the leader 0. Therefore, $S = \mathrm{diag}\{1.5, 0, 0, 2, 0, 0\}$, and the Laplacian matrix $L$ among the followers is determined by the edge weights of FIG. 1. In the simulation, the initial states of the agents are set to $x_{1,k}(0) = [0 \;\; 1]^T$, $x_{2,k}(0) = [0 \;\; 1]^T$, $x_{3,k}(0) = [2 \;\; 2 \;\; 1]^T$, $x_{4,k}(0) = [2 \;\; 2 \;\; 1]^T$, $x_{5,k}(0) = [0 \;\; 0 \;\; 0 \;\; 5]^T$, and $x_{6,k}(0) = [0 \;\; 0 \;\; 0 \;\; 5]^T$, and the control input signal of each agent during the first iteration is set to 0.
For the iterative learning method (27) based on a fixed target trajectory, the trajectory through the foregoing expected position points $y_d(T) = \{5, 3, -3, -5, 1.5\}$ is chosen as $y_d(t) = -6.5 t^4 + 41.7 t^3 - 72.4 t^2 + 33.3 t + 1$.
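As a quick sanity check (not in the patent text), evaluating this polynomial at the tracked instants $t = 0.2, 0.6, 1.0, 1.4, 1.8$ s (sample indices 20, 60, 100, 140, 180 at a 0.01 s sampling time) shows it passes close to the five expected values:

```python
import numpy as np

t = np.array([0.2, 0.6, 1.0, 1.4, 1.8])          # tracked instants in seconds
yd = -6.5 * t**4 + 41.7 * t**3 - 72.4 * t**2 + 33.3 * t + 1
print(np.round(yd, 2))   # [ 5.09  3.08 -2.9  -4.83  1.32], close to {5, 3, -3, -5, 1.5}
```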
For the iterative learning method (26) based on target-trajectory updating, the initial target trajectory is $r_1(t) = y_d(t)$, and $\Lambda_k$ is selected with $\lambda_{i,k}(t_s) = 0$ at the tracked instants, which yields the convergence condition $\left\| \left( I + G \Gamma \left( (L + S) \otimes I_m \right) \right)^{-1} \right\| = 0.73 < 1$ for the multi-agent system. Under the action of the iterative learning method (26) based on target-trajectory updating, FIG. 2 and FIG. 3 respectively show the tracking process of the sixth agent at the 10th and the 80th iteration. It can be very clearly seen that as the iteration process continues, the agents track the expected position points. FIG. 4 shows the error convergence of the six follower agents under the iterative learning method based on target-trajectory updating, where $\max_{t \in T} \|e_k(t)\|$ falling below a set tolerance is taken as the error precision requirement.
As can be seen, after the iterative learning has run 80 times, the six follower agents completely track the expected position points.
To compare the tracking performance of the iterative learning method (26) based on target-trajectory updating with that of the iterative learning method (27) based on a fixed target trajectory, $r_{k+1} = r_k = y_d(t)$ is selected.
In this case, the algorithm (26) is converted into an iterative-learning control algorithm of the fixed target trajectory.
Under the action of the iterative-learning algorithm with a fixed target trajectory, FIG. 5 and FIG. 6 respectively show the tracking processes of the sixth agent at the 10th and the 100th iteration, and it is very clearly seen that as the iterative process continues, the agents track the fixed target trajectory $y_d(t)$. Because the fixed target trajectory $y_d(t)$ passes through the expected position points $y_d(T)$, the algorithm (27) can complete the tracking of the expected position points.
It can be seen from FIG. 7 that the follower agents using the iterative-learning control algorithm of the fixed target trajectory completely track the expected trajectory only after the 100th iteration, and the convergence speed is slower than that of the target-trajectory updating iterative-learning algorithm.
In summary, it is found that an updated target trajectory can implement point-to-point tracking in the multi-agent system more rapidly than the fixed target trajectory.

Claims (6)

1. A point-to-point tracking control method for multi-agent trajectory-updating iterative learning, comprising the following steps:
step 1, constructing a model of a discrete heterogeneous multi-agent system;
step 2, analyzing an information exchange relationship among agents in the discrete heterogeneous multi-agent system, and constructing a communication topology structure of the multi-agent system by using a directed graph, wherein only one or more follower agents are capable of acquiring leader information, and a communication topology diagram formed by a leader and followers comprises one spanning tree with the leader as a root node;
step 3, giving an initial state condition of all follower agents;
step 4, designing a target-trajectory updating method according to expected position points, solving parameters of the target-trajectory updating method, and updating a target trajectory to enable a new target trajectory to asymptotically converge to a system output; and
step 5, designing a P-type iterative learning method based on target-trajectory updating for the follower agents, and solving parameters of the P-type iterative learning method, to implement complete tracking of the expected position points within a limited time in the multi-agent system.

2. The method according to claim 1, wherein in step 1 the model of the discrete heterogeneous multi-agent system formed by $n$ different agents is:
$$x_{i,k}(t+1) = A_i x_{i,k}(t) + B_i u_{i,k}(t), \quad y_{i,k}(t) = C_i x_{i,k}(t) \quad (1)$$
wherein $k$ denotes the number of iterations, $i$ represents the $i$-th agent, $i = 1,2,\cdots,n$, and $t \in [0,N]$ is a sampling time point within one period; $x_{i,k}(t) \in R^{p_i}$, $u_{i,k}(t) \in R^{q_i}$, and $y_{i,k}(t) \in R^{m}$ respectively denote a state, a control input, and a system output of agent $i$; and $A_i \in R^{p_i \times p_i}$, $B_i \in R^{p_i \times q_i}$, and $C_i \in R^{m \times p_i}$ are matrices of corresponding dimensions; it is defined that $x_k(t) = [x_{1,k}^T(t), \cdots, x_{n,k}^T(t)]^T$, $u_k(t) = [u_{1,k}^T(t), \cdots, u_{n,k}^T(t)]^T$, and $y_k(t) = [y_{1,k}^T(t), \cdots, y_{n,k}^T(t)]^T$, such that the system (1) is written in a compact matrix form as:
$$x_k(t+1) = A x_k(t) + B u_k(t), \quad y_k(t) = C x_k(t) \quad (2)$$
wherein $A = \mathrm{diag}\{A_1, \cdots, A_n\}$, $B = \mathrm{diag}\{B_1, \cdots, B_n\}$, and $C = \mathrm{diag}\{C_1, \cdots, C_n\}$; the system (2) is converted into an input-output matrix model based on a time sequence:
$$y_k = G u_k + Q x_k(0) \quad (3)$$
wherein $y_k = [y_k^T(0), \cdots, y_k^T(N)]^T$, $u_k = [u_k^T(0), \cdots, u_k^T(N)]^T$,
$$G = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ CB & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ CA^{N-1}B & \cdots & CB & 0 \end{bmatrix}, \quad Q = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^N \end{bmatrix};$$
time points $T = \{t_1, t_2, \cdots, t_M\}$ to be tracked are given, wherein a control method is used to implement the tracking of the expected position points in the multi-agent system, that is, $y_{i,k}(t_s) \to y_d(t_s)$, $s = 1,2,\cdots,M$, and $0 \le t_1 < t_2 < \cdots < t_M \le N$, wherein $y_d(t_s)$ is the expected position point; the expected position point $y_d(t_s)$, $s = 1,2,\cdots,M$, is considered as generated by a virtual leader; and the $n$ agents in the system are considered as followers, and only some follower agents can directly acquire the leader information.

3. The method according to claim 1 or 2, wherein in step 2 the directed graph $G = (V, E, A)$ is used to denote the topology structure of the multi-agent system, wherein a node set $V = \{1,2,\cdots,n\}$ of the graph $G$ corresponds to the $n$ agents, an edge set $E \subseteq V \times V$ of the graph $G$ corresponds to information exchange and transfer among the agents, the weight of an edge is $a_{ij} \ge 0$ with $a_{ii} = 0$ $(i,j \in V)$, and a matrix $A = [a_{ij}] \in R^{n \times n}$ is a weighted adjacency matrix; if a node $j$ is capable of obtaining information from a node $i$ in the directed graph, a node connecting edge is denoted by $e_{ij} = (i,j) \in E$; if $e_{ij} \in E$, the corresponding element of the weighted adjacency matrix is $a_{ij} > 0$, or otherwise the element is 0, and $a_{ii} = 0$ for all $i \in V$; a neighbor set of agent $i$ is $N_i = \{j \in V : (i,j) \in E\}$; a Laplacian matrix of the graph $G$ is $L = D - A = [l_{ij}] \in R^{n \times n}$, and a matrix $D = \mathrm{diag}\{\sum_{j=1}^{n} a_{ij},\ i = 1,\cdots,n\}$ is a degree matrix of the graph $G$; in the directed graph $G$, a directed path from a node $i_1$ to a node $i_s$ is an ordered sequence $(i_1,i_2), \cdots, (i_{s-1},i_s)$ of a series of edges; if one node $i$ has a directed path to all other nodes in the directed graph $G$, the node $i$ is a root node, and if the graph $G$ has a root node, the directed graph has one spanning tree; after a leader is added, the $n$ follower agents and the leader form a graph $\bar{G} = \{0\} \cup G$; information transfer between agent $i$ and the leader is denoted by $s_i$, wherein $s_i > 0$ denotes that the agent has a relation to the leader and $s_i = 0$ denotes that the agent does not; and in the directed graph $\bar{G}$, if there is one directed spanning tree with the leader as a root node, the leader has one directed path to all the follower agents.

4. The method according to any one of claims 1-3, wherein in step 3 an initial state reset condition of all the follower agents is: $x_{i,k}(0) = 0$ (4).

5. The method according to any one of claims 1-4, wherein in step 4 the target-trajectory updating method is as follows:
$$r_{i,k+1}(t) = y_d(t) + h(t) f_{i,k}(t) \quad (5)$$
wherein $r_{i,k+1}(t)$ is a target trajectory of the $i$-th agent obtained after learning and update of a $k$-th iteration, $y_d(t)$ is a trajectory passing through the expected position points $y_d(t_s)$, $h(t) = (t - t_1)(t - t_2) \cdots (t - t_M)$, and $f_{i,k}(t)$ is a discrete function; let $r_{k+1}(t) = [r_{1,k+1}(t), \cdots, r_{n,k+1}(t)]^T$, $f_k(t) = [f_{1,k}(t), \cdots, f_{n,k}(t)]^T$, $H(t) = \mathrm{diag}\{h(t), \cdots, h(t)\}$, and $Y_d(t) = [y_d(t), \cdots, y_d(t)]^T$; Formula (5) is converted into:
$$r_{k+1}(t) = Y_d(t) + H(t) f_k(t) \quad (6)$$
Formula (6) is rewritten into a time sequence-based form:
$$r_{k+1} = Y_d + H f_k \quad (7)$$
wherein $r_{k+1} = [r_{k+1}^T(0), \cdots, r_{k+1}^T(N)]^T$, $Y_d = [Y_d^T(0), \cdots, Y_d^T(N)]^T$, $H = \mathrm{diag}\{H(0), H(1), \cdots, H(N)\}$, and $f_k = [f_k^T(0), \cdots, f_k^T(N)]^T$; because point-to-point tracking requires that the value of the target trajectory at the time points $T = \{t_1, t_2, \cdots, t_M\}$ requiring tracking be kept consistent with the given expected points in each update, that is, $r_{i,k}(t_s) = y_d(t_s)$, Formula (7) is further converted into a target trajectory at any sampling point:
$$r_{k+1} = r_k + H f_k \quad (8)$$
let $f_k = F(r_k - y_k)$, wherein $F$ is a real diagonal matrix; Formula (8) is denoted as:
$$r_{k+1} = r_k + H F (r_k - y_k) \quad (9)$$
let $\Lambda_k = HF$; because the matrix $H$ and the matrix $F$ are both diagonal matrices, $\Lambda_k$ is also a real diagonal matrix, with $\Lambda_k = \mathrm{diag}\{\Lambda_k(0), \Lambda_k(1), \cdots, \Lambda_k(N)\}$ and $\Lambda_k(t) = \mathrm{diag}\{\lambda_{1,k}(t), \cdots, \lambda_{n,k}(t)\}$; the target-trajectory updating method (9) is turned into:
$$r_{k+1} = r_k + \Lambda_k (r_k - y_k) \quad (10)$$
tracking a fixed trajectory by an iterative learning algorithm requires that, as the number of iterations increases, the system output $y_{i,k}(t)$ asymptotically converge to a fixed trajectory $y_d(t)$, that is:
$$\|y_d - y_{k+1}\| \le \|y_d - y_k\| \quad (11)$$
the present target-trajectory update algorithm instead makes a new target trajectory $r_{i,k}(t)$ asymptotically converge to the system output $y_{i,k}(t)$, that is:
$$\|r_{k+1} - y_k\| \le \|r_k - y_k\| \quad (12)$$
and for the point-to-point tracking control problem, the target-trajectory update algorithm $r_{k+1} = r_k + \Lambda_k(r_k - y_k)$ is used, and if $\|I + \Lambda_k\| = 1$ is satisfied and $\Lambda_k$ satisfies $-2 < \lambda_{i,k}(t) < 0$ for $t \in [0,N] \setminus T$ and $\lambda_{i,k}(t) = 0$ for $t \in T$, then $\|r_{k+1} - y_k\| \le \|r_k - y_k\|$ is obtained, wherein $T$ denotes the time points $T = \{t_1, t_2, \cdots, t_M\}$ to be tracked.

6. The method according to any one of claims 1-5, wherein in step 5 the P-type iterative learning method based on target-trajectory updating is as follows: first, the tracking error of each agent is given as:
$$e_{i,k}(t) = r_{i,k}(t) - y_{i,k}(t) \quad (13)$$
$$e_{ij,k}(t) = y_{j,k}(t) - y_{i,k}(t), \quad j \in N_i \quad (14)$$
wherein $e_{i,k}(t)$ represents an error between the output of agent $i$ during the $k$-th iteration and the target trajectory obtained after iterative updating, and $e_{ij,k}(t)$ denotes an error between the agent and its neighbor agents during the $k$-th iteration; let $\xi_{i,k}(t)$ denote information received or measured by agent $i$ during the $k$-th iteration; it is obtained that:
$$\xi_{i,k}(t) = \sum_{j \in N_i} a_{ij} e_{ij,k}(t) + s_i e_{i,k}(t) \quad (15)$$
wherein $a_{ij}$ is the weight of an edge, and $s_i$ is a coupling weight between agent $i$ and the leader; because $e_{ij,k}(t) = e_{i,k}(t) - e_{j,k}(t)$, Formula (15) is converted into:
$$\xi_{i,k}(t) = \sum_{j \in N_i} a_{ij} \left( e_{i,k}(t) - e_{j,k}(t) \right) + s_i e_{i,k}(t) \quad (16)$$
it is defined that $e_k(t) = [e_{1,k}^T(t), \cdots, e_{n,k}^T(t)]^T$ and $\xi_k(t) = [\xi_{1,k}^T(t), \cdots, \xi_{n,k}^T(t)]^T$, and by using graph theory, Formula (16) is written as:
$$\xi_k(t) = \left( (L + S) \otimes I_m \right) e_k(t) \quad (17)$$
wherein $S = \mathrm{diag}\{s_1, s_2, \cdots, s_n\}$, $L$ is the Laplacian matrix of $G$, and $I_m$ denotes an $m \times m$-dimensional identity matrix; Formula (17) is also written into a time sequence-based form, that is:
$$\xi_k = M e_k \quad (18)$$
wherein $e_k = [e_k^T(0), \cdots, e_k^T(N)]^T$, $\xi_k = [\xi_k^T(0), \cdots, \xi_k^T(N)]^T$, and $M = \mathrm{diag}\{(L + S) \otimes I_m\}_{N \times N}$; the P-type iterative learning method is used for each follower agent to resolve the tracking control problem of the expected points in the multi-agent system, shown as follows:
$$u_{i,k+1}(t) = u_{i,k}(t) + \Gamma_i \xi_{i,k+1}(t) \quad (19)$$
wherein $\Gamma_i \in R^{q_i \times m}$ is a learning gain; let $u_k(t) = [u_{1,k}^T(t), \cdots, u_{n,k}^T(t)]^T$ and $\xi_k(t) = [\xi_{1,k}^T(t), \cdots, \xi_{n,k}^T(t)]^T$; Formula (19) is converted into:
$$u_{k+1}(t) = u_k(t) + \hat{\Gamma} \xi_{k+1}(t) \quad (20)$$
wherein $\hat{\Gamma} = \mathrm{diag}\{\Gamma_1, \Gamma_2, \cdots, \Gamma_n\}$; next, let $\xi_k = [\xi_k^T(0), \cdots, \xi_k^T(N)]^T$ and $u_k = [u_k^T(0), \cdots, u_k^T(N)]^T$; Formula (20) is converted into:
$$u_{k+1} = u_k + \Gamma \xi_{k+1} \quad (21)$$
wherein $\Gamma = \mathrm{diag}\{\hat{\Gamma}\}_{N \times N}$; Formula (18) is substituted into Formula (21) to obtain the iterative-learning control method:
$$u_{k+1} = u_k + \Gamma M e_{k+1} \quad (22)$$
an iterative learning method based on target-trajectory updating obtained from Formula (10) and Formula (22) is:
$$\begin{cases} u_{k+1} = u_k + \Gamma M e_{k+1} \\ r_{k+1} = r_k + \Lambda_k (r_k - y_k) \end{cases} \quad (23)$$
and for the discrete heterogeneous multi-agent system (1), under the action of the iterative learning method (23) based on target-trajectory updating, if the inequality $\left\| (I + G \Gamma M)^{-1} \right\| < 1$ holds, then as the iterations continue, the output trajectory of each follower converges to the expected points, that is, $y_{i,k+1}(t_s) \to y_d(t_s)$ as $k \to \infty$.
NL2027701A 2020-06-19 2021-03-03 Point-to-point tracking control method for multi-agent trajectory-updating iterative learning NL2027701B1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010565612.0A CN111722628B (en) 2020-06-19 2020-06-19 Point-to-point tracking control method for multi-agent track updating iterative learning

Publications (2)

Publication Number Publication Date
NL2027701A NL2027701A (en) 2022-01-28
NL2027701B1 true NL2027701B1 (en) 2022-03-15

Family

ID=72567744

Family Applications (1)

Application Number Title Priority Date Filing Date
NL2027701A NL2027701B1 (en) 2020-06-19 2021-03-03 Point-to-point tracking control method for multi-agent trajectory-updating iterative learning

Country Status (2)

Country Link
CN (1) CN111722628B (en)
NL (1) NL2027701B1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112526886A (en) * 2020-12-08 2021-03-19 北京航空航天大学 Iterative learning formation control method for discrete multi-agent system under random test length
CN113342002B (en) * 2021-07-05 2022-05-20 湖南大学 Multi-mobile-robot scheduling method and system based on topological map
CN113791611B (en) * 2021-08-16 2024-03-05 北京航空航天大学 Real-time tracking iterative learning control system and method for vehicle under interference

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108803349B (en) * 2018-08-13 2020-06-26 中国地质大学(武汉) Optimal consistency control method and system for nonlinear multi-agent system
CN110815225B (en) * 2019-11-15 2020-12-25 江南大学 Point-to-point iterative learning optimization control method of motor-driven single mechanical arm system
CN110948504B (en) * 2020-02-20 2020-06-19 中科新松有限公司 Normal constant force tracking method and device for robot machining operation

Also Published As

Publication number Publication date
CN111722628B (en) 2021-07-09
NL2027701A (en) 2022-01-28
CN111722628A (en) 2020-09-29

Similar Documents

Publication Publication Date Title
NL2027701B1 (en) Point-to-point tracking control method for multi-agent trajectory-updating iterative learning
Wang et al. Containment control for general second-order multiagent systems with switched dynamics
CN109409524B (en) A kind of quantum program operating method and device, storage medium and electronic device
Saravanakumar et al. Reliable memory sampled-data consensus of multi-agent systems with nonlinear actuator faults
Xia et al. Dynamic leader-following consensus for asynchronous sampled-data multi-agent systems under switching topology
Ai et al. A zero-gradient-sum algorithm for distributed cooperative learning using a feedforward neural network with random weights
Cai et al. Distributed consensus control for second-order nonlinear multi-agent systems with unknown control directions and position constraints
Partovi et al. Structural controllability of high order dynamic multi-agent systems
Ai et al. Distributed stochastic configuration networks with cooperative learning paradigm
Chen et al. Distributed fault-tolerant consensus protocol for fuzzy multi-agent systems
Ruano et al. An overview of nonlinear identification and control with neural networks
Liu et al. Fault-tolerant consensus control with control allocation in a leader-following multi-agent system
Zhao et al. Distributed adaptive fuzzy fault-tolerant control for multi-agent systems with node faults and denial-of-service attacks
Sebastián et al. LEMURS: Learning distributed multi-robot interactions
Bouteraa et al. Adaptive backstepping synchronization for networked Lagrangian systems
Gao et al. Effects of adding arcs on the consensus convergence rate of leader-follower multi-agent systems
Elimelech et al. Fast action elimination for efficient decision making and belief space planning using bounded approximations
Ge et al. A novel method for distributed optimization with globally coupled constraints based on multi-agent systems
Selvam et al. Domination in join of fuzzy graphs using strong arcs
CN114637278A (en) Multi-agent fault-tolerant formation tracking control method under multi-leader and switching topology
Ma Neural-network-based containment control of nonlinear multi-agent systems under communication constraints
Zhang et al. Consensus of Second-Order Heterogeneous Hybrid Multiagent Systems via Event-Triggered Protocols
Basterrech et al. A more powerful random neural network model in supervised learning applications
Alaya et al. A CPS-Agent self-adaptive quality control platform for industry 4.0
Liang et al. Distributed data-driven iterative learning point-to-point consensus tracking control for unknown nonlinear multi-agent systems