CN113778619B

CN113778619B - Multi-agent state control method, device and terminal for multi-cluster game

Info

Publication number: CN113778619B
Application number: CN202110923586.9A
Authority: CN
Inventors: 尉越; 崔金强; 丁玉隆; 商成思; 宋伟伟; 孙涛
Original assignee: Peng Cheng Laboratory
Current assignee: Peng Cheng Laboratory
Priority date: 2021-08-12
Filing date: 2021-08-12
Publication date: 2024-05-14
Anticipated expiration: 2041-08-12
Also published as: CN113778619A

Abstract

The invention discloses a multi-agent state control method, device and terminal for multi-cluster games. First communication parameters from each agent in each cluster to its neighbor agents are determined according to the first communication relationship among the agents in each cluster of the agent system. Second communication parameters from each cluster to its neighbor clusters are determined according to the second communication relationship among the leading agents of the clusters. An agent state control function is then constructed from a preset inequality constraint, a preset cost function, the first communication parameters and the second communication parameters to control the state of each agent, so that the whole agent system reaches Nash equilibrium.

Description

Multi-agent state control method, device and terminal for multi-cluster game

Technical Field

The invention relates to the technical field of multi-agent system control, in particular to a multi-agent state control method, device and terminal for multi-cluster game.

Background

A multi-agent system can be regarded as a group of autonomous intelligent unmanned platforms that completes group tasks through the exchange of information and actions. Cooperative behavior among the agents greatly raises the intelligence of the whole group and allows it to handle complex tasks that a single agent cannot complete. Multi-agent systems are widely applied in fields such as sensor group deployment, formation control of multiple unmanned aerial vehicles, and cooperative transportation by multiple mechanical arms.

However, the prior art provides no method for making the state of an agent system composed of agent clusters converge to the Nash equilibrium point of the game problem when a game exists among the agent clusters.

Accordingly, there is a need for improvement and advancement in the art.

Disclosure of Invention

In view of the above defects in the prior art, the invention provides a multi-agent state control method, device and terminal for multi-cluster games, aiming to solve the problem that the prior art lacks a method for making the state of an agent system composed of agent clusters converge to the Nash equilibrium point of the game problem when a game exists among the agent clusters.

In order to solve the technical problems, the technical scheme adopted by the invention is as follows:

In a first aspect of the present invention, a multi-agent state control method for multi-cluster gaming is provided, the method comprising:

acquiring a first communication relationship among the agents in each cluster of the agent system and a second communication relationship among the leading agents of the clusters;

determining first communication parameters from each agent to its neighbor agents in each cluster according to the first communication relationship, and determining second communication parameters from each cluster to its neighbor clusters according to the second communication relationship;

constructing an agent state control function according to a preset inequality constraint, a preset cost function, each first communication parameter and each second communication parameter;

and controlling the state of each agent at each moment according to the agent state control function, so that the agent system reaches Nash equilibrium.

The multi-agent state control method for multi-cluster game, wherein determining the first communication parameters from each agent to the neighbor agents in each cluster according to the first communication relation comprises the following steps:

Neighbor agents within each cluster communicate over undirected links. Each agent i within cluster j has a neighbor agent set; the first communication parameter from agent i to agent k is denoted $a_{ik}$, and the first communication parameter from agent k to agent i is denoted $a_{ki}$.

The multi-agent state control method for multi-cluster game, wherein the determining the second communication parameter from each cluster to the neighbor cluster according to the second communication relation comprises the following steps:

The leading agent of a cluster communicates with the leading agents of its neighbor clusters over directed links. Each cluster j has a neighbor cluster set; the second communication parameter from cluster j to cluster l is denoted $a_{jl}$, and $a_{jl} > 0$ when the leading agent of cluster j can receive information from the leading agent of cluster l.

The multi-agent state control method for the multi-cluster game comprises the following steps:

where the local cost function of agent i in cluster j is composed of two terms indexed by k = 1, 2, each of which is a convex function;

The state decision variable of agent i in cluster j is a vector, where $j \in \{1, \dots, m\}$ and $i \in \{1, \dots, n_j\}$; m is the number of clusters in the agent system, $n_j$ is the number of agents in cluster j, and n is the number of agents in the agent system. The state decision variables of all agents in cluster j are stacked into $x_j$. The state decision variables of all participating agents other than $x_j$ are denoted $x_{-j}$, and the part of $x_{-j}$ that agent i in cluster j can obtain is defined as the information of $x_{-j}$ available to that agent.

The multi-agent state control method for multi-cluster game, wherein, for all $j \in \{1, \dots, m\}$ and $i \in \{1, \dots, n_j\}$ and for given $x_{-j}$, the first term of the local cost function is twice continuously differentiable and strongly convex with respect to the state decision variable of agent i, and the second term is a lower semicontinuous closed proper convex function of that variable.

According to the multi-agent state control method for the multi-cluster game, inequality constraint of a cluster j is g _j.

The multi-agent state control method for multi-cluster game, wherein, for given $x_{-j}$, $g_j$ is a lower semicontinuous closed proper convex function and is $K_{2,j}$-Lipschitz continuous, where $K_{2,j} > 0$ is a constant.

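where the state decision vector of the leading agent of cluster j enters the control function together with the auxiliary variables $y_j$, $\nu_j$, $\eta_j$, $\omega_j$ and $r_j$; a separate case applies when $i \neq 1$, that is, for agents other than the leading agent; and $\alpha_1$, $\alpha_2$, $\alpha_3$ and $\alpha_4$ are constants.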

The multi-agent state control method for multi-cluster game, wherein the controlling the state of each agent according to the agent state control function comprises the following steps:

determining a state decision vector of each agent at each moment according to the agent state control function;

and determining the target state of each agent at each moment according to the state decision vector of that agent at that moment, and controlling each agent to reach, at each moment, the target state corresponding to that moment.

In a second aspect of the present invention, there is provided a multi-agent state control device for multi-cluster gaming, the device comprising:

a communication relationship acquisition module, used for acquiring a first communication relationship among the agents in each cluster of the agent system and a second communication relationship among the leading agents of the clusters;

a communication parameter determining module, used for determining first communication parameters from each agent to its neighbor agents in each cluster according to the first communication relationship, and determining second communication parameters from each cluster to its neighbor clusters according to the second communication relationship;

a function construction module, used for constructing an agent state control function according to a preset inequality constraint, a preset cost function, each first communication parameter and each second communication parameter;

and a state control module, used for controlling the state of each agent according to the agent state control function, so that the agent system reaches Nash equilibrium.

The multi-agent state control device for the multi-cluster game comprises the following intelligent agent state control functions:

wherein the state decision variable of agent i in cluster j is a vector, with $j \in \{1, \dots, m\}$ and $i \in \{1, \dots, n_j\}$; m is the number of clusters in the agent system, $n_j$ is the number of agents in cluster j, and n is the number of agents in the agent system; the state decision variables of all agents in cluster j are stacked into $x_j$; the state decision variables of all participating agents other than $x_j$ are denoted $x_{-j}$, and the part of $x_{-j}$ receivable at agent i of cluster j is defined as the information of $x_{-j}$ available to that agent; the inequality constraint of cluster j is $g_j$; the state decision vector of the leading agent of cluster j enters the control function together with the auxiliary variables $y_j$, $\nu_j$, $\eta_j$ and $\omega_j$, with a separate case when $i \neq 1$; $\alpha_1$, $\alpha_2$, $\alpha_3$ and $\alpha_4$ are constants; each agent i within cluster j has a neighbor agent set, and the first communication parameter from agent i to agent k is denoted $a_{ik}$; each cluster j has a neighbor cluster set, and the second communication parameter from cluster j to cluster l is denoted $a_{jl}$; and each agent i in cluster j has a local cost function.

In a third aspect of the present invention, there is provided a terminal comprising a processor, a storage medium in communication with the processor, the storage medium adapted to store a plurality of instructions, the processor adapted to invoke the instructions in the storage medium to perform the steps of implementing the multi-agent state control method of multi-cluster gaming as described in any of the preceding claims.

In a fourth aspect of the present invention, there is provided a storage medium storing one or more programs executable by one or more processors to implement the steps of the multi-agent state control method for multi-cluster gaming described in any of the above.

Compared with the prior art, the invention provides a multi-agent state control method, terminal and storage medium for multi-cluster games. In the method, first communication parameters from each agent in a cluster to its neighbor agents are determined according to the first communication relationship among the agents in each cluster of the agent system; second communication parameters from each cluster to its neighbor clusters are determined according to the second communication relationship among the leading agents of the clusters; and an agent state control function is constructed from the preset inequality constraint and cost function, the first communication parameters and the second communication parameters to control the state of each agent, so that the whole agent system reaches Nash equilibrium.

Drawings

FIG. 1 is a flowchart of an embodiment of a multi-agent state control method for multi-cluster gaming provided by the present invention;

FIG. 2 is a schematic diagram of a communication relationship of a multi-agent system in an embodiment of a multi-agent state control method for multi-cluster gaming provided by the present invention;

FIG. 3 is a first schematic diagram of the validity verification results of the multi-agent state control method for multi-cluster gaming provided by the present invention;

FIG. 4 is a second schematic diagram of the validity verification results of the multi-agent state control method for multi-cluster gaming provided by the present invention;

FIG. 5 is a third schematic diagram of the validity verification results of the multi-agent state control method for multi-cluster gaming provided by the present invention;

FIG. 6 is a fourth schematic diagram of the validity verification results of the multi-agent state control method for multi-cluster gaming provided by the present invention;

FIG. 7 is a fifth schematic diagram of the validity verification results of the multi-agent state control method for multi-cluster gaming provided by the present invention;

FIG. 8 is a schematic diagram of a multi-agent state control device for multi-cluster gaming according to the present invention;

Fig. 9 is a schematic diagram of an embodiment of a terminal provided by the present invention.

Detailed Description

In order to make the objects, technical solutions and effects of the present invention clearer and more specific, the present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the particular embodiments presented herein are illustrative of the invention and are not intended to limit the invention.

Example 1

The multi-agent state control method for multi-cluster game provided by the invention can be applied to a terminal, and the terminal can control the state of each agent in the multi-agent system through the multi-agent state control method for multi-cluster game provided by the invention, so that the multi-agent system achieves Nash equilibrium.

In this specification, $\|\cdot\|_1$ denotes the $l_1$ norm. The symbol $\mathbb{R}$ denotes the set of real numbers and $\mathbb{R}_{+}$ the set of positive real numbers. $\mathrm{diag}\{b_1, \dots, b_n\}$ denotes a diagonal matrix whose i-th diagonal element is $b_i$. The vector $0_n$ is the all-zero vector, and the matrix $O_n$ is the n-dimensional zero matrix. The symbol $(\cdot)^T$ denotes matrix transposition, and $\dot{x}$ denotes the derivative of x with respect to time.

As shown in fig. 1, in one embodiment of the multi-agent state control method for multi-cluster gaming, the method includes the steps of:

S100, acquiring a first communication relation among the intelligent agents in each cluster of the intelligent agent system and a second communication relation among the leading intelligent agents of each cluster.

Specifically, the agent system comprises m clusters, each comprising a plurality of agents. The agents within a cluster need to reach agreement on the decision of that cluster. Each cluster has one leading agent, which can obtain the corresponding local resource allocation constraint and inequality constraint information. The m leading agents play a game over the inter-cluster topology network and determine local generalized Nash equilibrium seeking strategies.

S200, determining first communication parameters from each intelligent agent to neighbor intelligent agents in each cluster according to the first communication relation, and determining second communication parameters from each cluster to neighbor clusters according to the second communication relation.

Specifically, the determining, according to the first communication relationship, a first communication parameter from each agent to a neighboring agent in each cluster includes:

A neighbor agent of an agent in a cluster is an agent that can exchange information with it. As shown in FIG. 2, neighbor agents within each cluster communicate over undirected links; that is, an agent in a cluster can both receive information from and send information to its neighbor agents. Cluster j can be represented by an undirected graph $G_j$ with Laplacian matrix $L_j = D_j - A_j$. Here $D_j$ is the degree matrix, a diagonal matrix whose i-th diagonal element is the degree of agent i in the cluster, $i \in \{1, 2, \dots, n_j\}$, and $n_j$ is the number of agents in the j-th cluster. Agent k is connected to agent i by an information interaction edge $e_{ik}$ in the edge set of $G_j$, which indicates that agent i and agent k can exchange information in both directions. If such an edge exists, $a_{ik} = a_{ki} > 0$; otherwise $a_{ik} = 0$. $A_j$ is the weighted adjacency matrix formed by these parameters. Thus, from the communication relationship among the agents in a cluster, the first communication parameters from each agent to its neighbor agents can be determined, and with them the Laplacian matrix of the undirected graph corresponding to the cluster.
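As a minimal sketch of the relation $L_j = D_j - A_j$, the following Python fragment assembles the three matrices for one cluster from a list of weighted edges. The function name `undirected_laplacian`, the 0-based indexing and the three-agent example are illustrative assumptions, and the degree of an agent is taken to be the sum of its first communication parameters, which is the usual definition rather than a formula quoted from this text.

```python
def undirected_laplacian(n, weighted_edges):
    """Build D, A and L = D - A for one cluster with n agents.

    weighted_edges holds (i, k, weight) tuples with 0-based agent indices;
    each tuple is one bidirectional information interaction edge.
    """
    A = [[0.0] * n for _ in range(n)]
    for i, k, weight in weighted_edges:
        if i == k or weight <= 0:
            raise ValueError("an edge needs two distinct agents and a positive weight")
        A[i][k] = weight  # a_ik
        A[k][i] = weight  # a_ki = a_ik because communication is undirected
    # Assumed (standard) degree definition: sum of the agent's edge weights.
    degrees = [sum(row) for row in A]
    D = [[degrees[i] if i == k else 0.0 for k in range(n)] for i in range(n)]
    L = [[D[i][k] - A[i][k] for k in range(n)] for i in range(n)]
    return D, A, L


# Hypothetical cluster of three agents connected as a path 0 - 1 - 2.
D, A, L = undirected_laplacian(3, [(0, 1, 1.0), (1, 2, 1.0)])
for row in L:
    print(row)
```

Every row of the resulting matrix sums to zero, which is the property that lets a Laplacian term express agreement among the agents of a cluster.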

The determining, according to the second communication relationship, a second communication parameter from each cluster to a neighbor cluster, including:

When the leading agent of cluster A can receive information from, or send information to, the leading agent of cluster B, cluster B is called a neighbor cluster of cluster A. As shown in FIG. 2, the leading agent of a cluster may communicate with the leading agent of a neighbor cluster over a directed link; that is, a leading agent may be able only to receive information from the leading agent of a neighbor cluster and not to send information to it. The communication relationship among the clusters can be represented by a directed graph $G_0$ with Laplacian matrix $L_0 = D^{in} - A_0$. Here $D^{in}$ is the in-degree matrix, a diagonal matrix whose j-th diagonal element is the in-degree of cluster j in the multi-agent system, $j \in \{1, 2, \dots, m\}$, and m is the number of clusters in the multi-agent system. Cluster l is connected to cluster j by an information interaction edge $e_{jl}$ in the edge set of $G_0$, which indicates that the leading agent of cluster j can receive information from the leading agent of cluster l. If such an edge exists, $a_{jl} > 0$; otherwise $a_{jl} = 0$. $A_0$ is the weighted adjacency matrix formed by these parameters. Thus, from the communication relationship among the leading agents of the clusters, the second communication parameters from each cluster to its neighbor clusters can be determined, and with them the Laplacian matrix of the directed graph corresponding to the multi-agent system.
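The directed case differs from the undirected one only in that an entry is set in one direction and the in-degree is used on the diagonal. The sketch below illustrates $L_0 = D^{in} - A_0$ under stated assumptions: the function name `directed_laplacian`, the 0-based indexing and the four-cluster ring are hypothetical (the actual topology of FIG. 2 is not reproduced here), and the in-degree of cluster j is taken to be the sum of the parameters $a_{jl}$ over l.

```python
def directed_laplacian(m, arcs):
    """Build the in-degree list, A_0 and L_0 = D_in - A_0 for m clusters.

    arcs holds (receiver, sender, weight) tuples with 0-based cluster indices:
    the leading agent of `receiver` can receive from the leading agent of `sender`.
    """
    A0 = [[0.0] * m for _ in range(m)]
    for receiver, sender, weight in arcs:
        if receiver == sender or weight <= 0:
            raise ValueError("an arc needs two distinct clusters and a positive weight")
        A0[receiver][sender] = weight  # a_jl > 0, set in one direction only
    # Assumed (standard) in-degree definition: row sum of A_0.
    in_degrees = [sum(row) for row in A0]
    L0 = [[(in_degrees[j] if j == l else 0.0) - A0[j][l] for l in range(m)]
          for j in range(m)]
    return in_degrees, A0, L0


# Hypothetical directed ring: cluster 0 hears 3, 1 hears 0, 2 hears 1, 3 hears 2.
ring = [(0, 3, 1.0), (1, 0, 1.0), (2, 1, 1.0), (3, 2, 1.0)]
in_degrees, A0, L0 = directed_laplacian(4, ring)
for row in L0:
    print(row)
```

Unlike the intra-cluster matrix, $A_0$ is in general not symmetric, so $L_0$ is not symmetric either.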

Referring to fig. 1 again, the multi-agent state control method for multi-cluster game provided in this embodiment further includes the steps of:

S300, constructing an intelligent agent state control function according to preset inequality constraint, preset cost function, each first communication parameter and each second communication parameter.

Specifically, the decision variable of agent i in cluster j is a q-dimensional vector, where $j \in \{1, \dots, m\}$ and $i \in \{1, \dots, n_j\}$. The decision variables of all agents in cluster j are stacked into $x_j$. Without loss of generality, the leading agent of a cluster is labeled as the first agent of that cluster (i = 1), and its decision variable is the state decision vector of the leading agent of cluster j. The m leading agents play the game and decide on local generalized Nash equilibrium seeking strategies over the inter-cluster topology network, so the communication topology among them can be modeled as the balanced directed graph $G_0$ introduced above. In addition, because the agents within a cluster must cooperate, the internal connection topology of each cluster $j \in \{1, \dots, m\}$ is represented by the undirected graph $G_j$ introduced above. The decision variables of all participating agents other than $x_j$ are denoted $x_{-j}$. The agents in cluster j can receive information of $x_{-j}$ only from the neighbor clusters of cluster j in $G_0$. Furthermore, the part of $x_{-j}$ that agent i in cluster j can obtain is defined as the information of $x_{-j}$ available to that agent. The cost function of cluster j is:

where the local cost function of agent i in cluster j is composed of two terms indexed by k = 1, 2, each of which is a convex function.

For each agent i, the local cost function contains two convex functions, the first smooth and the second non-smooth on their domains. The local inequality constraint $g_j$ is non-smooth and is known only to the leading agent of cluster j. For a given $x_{-j}$, the goal of the agents in cluster j participating in the game is to minimize the cost function value of cluster j while satisfying the constraints, i.e., to solve the following minimization problem:

$$\min\; F_j(x_j, x_{-j}) \quad \text{s.t.} \quad (x_j, x_{-j}) \in \Omega \qquad (1)$$

where $\Omega$ is the feasible set of the decision variables and the matrix $L_j$ is the Laplacian matrix of the graph $G_j$ corresponding to cluster j.

The generalized Nash equilibrium of the multi-cluster game problem (1) is defined as follows:

A decision variable $x^*$ is a generalized Nash equilibrium of the multi-cluster game problem (1) if, for the agents in all clusters $j \in \{1, \dots, m\}$:

Lemma 1: Define D ∈ D($x^*$). A solution of the generalized variational inequality

is a generalized Nash equilibrium of the multi-cluster game problem (1), where $j \in \{1, \dots, m\}$.

For the multi-cluster game problem (1) of the multi-agent system to be stated precisely, several preconditions are given:

1. For all $j \in \{1, \dots, m\}$ and $i \in \{1, \dots, n_j\}$ and for given $x_{-j}$, the smooth term of the cost function is twice continuously differentiable and strongly convex with respect to the decision variable of agent i. That is, there is a constant $K_1 > 0$ such that the corresponding strong-convexity inequality holds for agent i for any auxiliary variables y and z with $y \neq z$.

2. For all $j \in \{1, \dots, m\}$ and $i \in \{1, \dots, n_j\}$ and for given $x_{-j}$, the non-smooth term of the cost function and $g_j$ are lower semicontinuous closed proper convex functions, and the quantities derived from them with respect to the decision variable of agent i can be obtained by simple calculation. In addition, $g_j$ is $K_{2,j}$-Lipschitz continuous, and the constant $K_2 = \max_{j \in \{1, \dots, m\}} K_{2,j}$ satisfies $K_2 > 0$. That is, the constant $K_{2,j}$ is determined for the inequality constraint $g_j$ of each cluster, and the maximum of all $K_{2,j}$ is taken as $K_2$.

3. The inter-cluster topology graph G ₀ is a strongly connected balanced directed graph. For all j∈ {1, …, m }, the intra-cluster topology graph G _j is an undirected connected graph.
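Precondition 3 can be checked mechanically from the weighted adjacency matrices. The sketch below is one way to do so and is not taken from the patent: the helper names are hypothetical, a graph is treated as balanced when every node's in-degree equals its out-degree, and strong connectivity is tested by a depth-first search on the graph and on its reverse.

```python
def reachable_from(adjacency, start):
    """Return the set of nodes reachable from start along positive entries."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt, weight in enumerate(adjacency[node]):
            if weight > 0 and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_strongly_connected(adjacency):
    """True if every node reaches every other node (needs at least one node)."""
    n = len(adjacency)
    transposed = [[adjacency[r][c] for r in range(n)] for c in range(n)]
    return (len(reachable_from(adjacency, 0)) == n
            and len(reachable_from(transposed, 0)) == n)


def is_balanced(adjacency, tol=1e-12):
    """True if the row sum equals the column sum at every node."""
    n = len(adjacency)
    return all(
        abs(sum(adjacency[j]) - sum(adjacency[r][j] for r in range(n))) <= tol
        for j in range(n)
    )


# Hypothetical inter-cluster ring (row j lists the clusters that j hears from).
ring = [[0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
# Hypothetical one-way chain 0 -> 1 -> 2, which violates both properties.
chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
# Hypothetical undirected path inside a cluster (symmetric matrix).
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

print(is_strongly_connected(ring), is_balanced(ring))
print(is_strongly_connected(chain), is_balanced(chain))
print(is_strongly_connected(path))
```

For a symmetric adjacency matrix the same connectivity test decides whether the undirected intra-cluster graph is connected, so one helper covers both halves of the precondition.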

4. The Slater condition of the multi-cluster game problem (1) is satisfied, i.e., problem (1) has at least one feasible point at which the inequality constraints hold strictly.

The conditions under which a solution of the generalized variational inequality holds, which is also a generalized Nash equilibrium of problem (2), are as follows:

Lemma 2: If $x^*$ is a generalized Nash equilibrium of problem (1), then there exist constants such that

where $j \in \{1, \dots, m\}$, with a separate condition when $i \neq 1$.

Based on the game problem established above, the invention designs a distributed Lipschitz-continuous generalized Nash equilibrium seeking algorithm. Specifically, the agent state control function is constructed as follows:

The compact form is as follows:

where, among the stacked quantities, $L = \mathrm{diag}(L_1, \dots, L_m)$.

Lemma 3: If $(x^*, \mu^*, y^*, v^*, \eta^*, \omega^*)$ is an equilibrium point of algorithm (6), then $x^*$ is a generalized Nash equilibrium of the multi-cluster game (2).

Lemma 4: In particular, if $(x^*, \mu^*, y^*, v^*, \eta^*, \omega^*)$ is an equilibrium point of algorithm (6), then

where

The Lyapunov candidate function is designed as $V(x, \mu, y, v, \eta, \omega) = V_1(x, y) + V_2(x, \mu) + V_3(v, \eta, \omega)$,

where

And is also provided withAnd/>Wherein/>And there is/>, for i+.j, in cluster j ε {1, …, m }

The inventors have found that if the constants satisfy a set of conditions, including $0 < \alpha_4 < 1 - K_2\alpha_2$, where $\lambda_{\min}(\cdot)$ and $\lambda_{\max}(\cdot)$ denote the minimum and maximum eigenvalues of the matrix in brackets, respectively, then the trajectory $(x, \mu, y, v, \eta, \omega)$ driven by (6) is bounded and x converges to a generalized Nash equilibrium of the multi-cluster game problem (1) as $t \to \infty$. This conclusion is proved below:

(1) It is easily obtained that $V(x^*, \mu^*, y^*, v^*, \eta^*, \omega^*) = 0$. It is shown below that $V(x, \mu, y, v, \eta, \omega) > 0$ when $(x, \mu, y, v, \eta, \omega) \neq (x^*, \mu^*, y^*, v^*, \eta^*, \omega^*)$. Combining this with the condition on $\alpha_3$ gives

Since $F^1(x)$ is strongly convex, a corresponding strong-convexity relation holds, and it is easy to show that $V_3(v, \eta, \omega) \ge 0$. Combining the above with $\alpha_4 > 0$ gives

where $\kappa_1 = 1 - \alpha_3\lambda_{\max}(L) > 0$ and $\kappa_2 = \alpha_4/\kappa_1 > 0$. Thus $V(x, \mu, y, v, \eta, \omega)$ is positive definite and radially unbounded: $V(x, \mu, y, v, \eta, \omega) \ge 0$, and it is zero when $(x, \mu, y, v, \eta, \omega) = (x^*, \mu^*, y^*, v^*, \eta^*, \omega^*)$.

(2) Next, the relation that must hold for all $t \ge 0$ is proved. From (6),

This implies

Owing to the convexity of $F^2(x)$, its subdifferential is monotone, and thus

Wherein the method comprises the steps ofThereby having the following characteristics

Combining Lemma 4 and formula (12) gives

From formulas (12) to (14) above, it can be seen that the derivative of the Lyapunov candidate function $V(x, \mu, y, v, \eta, \omega)$ along the trajectory of (6) satisfies

Wherein the method comprises the steps of

Definition v=v ^a+v^b, where v ^a is an all 1 vector, due to the fact thatTherefore there is/>And is also provided withBecause/>So there is/>, again AndThus, from the assumption of alpha _k, k ε {1, …,5}, it can be deduced from equation (16)

where $\varepsilon_1 = 1 - \alpha_4 - \alpha_2 K_2$, $\varepsilon_2 = 2\alpha_1 K_1 - 2\alpha_2 K_2 - \alpha_4 - 2$, and $\varepsilon_4 = 2 - 2\alpha_2 K_2$. Since $V(x, \mu, y, v, \eta, \omega)$ is positive definite, radially unbounded and bounded below, the equilibrium point $(x^*, \mu^*, y^*, v^*, \eta^*, \omega^*)$ is Lyapunov stable and the trajectory $(x, \mu, y, v, \eta, \omega)$ is bounded.
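The constants that appear explicitly in the proof can be evaluated for a candidate set of gains before any simulation is run. The sketch below does this under stated assumptions: the function name and the numeric values are hypothetical, $\lambda_{\max}(L)$ is supplied by the caller rather than computed, and only the expressions written out in the text ($\kappa_1$, $\kappa_2$, $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_4$ and the bound on $\alpha_4$) are evaluated, so passing this check does not establish the full set of conditions required by the proof.

```python
def convergence_margins(alpha1, alpha2, alpha3, alpha4, K1, K2, lambda_max_L):
    """Evaluate the constants that the proof states in closed form."""
    kappa1 = 1.0 - alpha3 * lambda_max_L
    margins = {
        "kappa1": kappa1,
        # kappa2 is only meaningful when kappa1 is positive.
        "kappa2": alpha4 / kappa1 if kappa1 > 0 else float("nan"),
        "epsilon1": 1.0 - alpha4 - alpha2 * K2,
        "epsilon2": 2.0 * alpha1 * K1 - 2.0 * alpha2 * K2 - alpha4 - 2.0,
        "epsilon4": 2.0 - 2.0 * alpha2 * K2,
    }
    # Conditions stated explicitly: kappa1 > 0 and 0 < alpha4 < 1 - K2 * alpha2.
    stated_conditions_hold = kappa1 > 0 and 0 < alpha4 < 1.0 - K2 * alpha2
    return margins, stated_conditions_hold


# Hypothetical gains and constants, chosen only to exercise the formulas.
margins, ok = convergence_margins(
    alpha1=3.0, alpha2=0.1, alpha3=0.2, alpha4=0.5,
    K1=1.0, K2=2.0, lambda_max_L=3.0,
)
for name, value in margins.items():
    print(name, round(value, 6))
print("stated conditions hold:", ok)
```

Note that $\varepsilon_1 > 0$ is equivalent to the upper bound on $\alpha_4$, so the two checks agree by construction.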

Defining a setNamely there isAssuming that D is the maximum invariant set of T, it is known that the trajectory (x, μ, y, v, η, ω) converges to D at t→infinity under the driving of (6) according to the invariance principle. If (x, μ, y, v, η, ω) is the trace starting from (x ₀,μ₀,y₀,v₀,η₀,ω₀) ε D under the drive of (6), then there is/>, for all t.gtoreq.0Thus/>This means/>If/>Then there is/>This conflicts with the invariance set principle. Furthermore, it is possible to deduce/>, using x≡x ^*、η≡η^* and v ^b≡0_mq />From the demonstration of quotation 1, it can be seen that/>The trajectory (x, μ, y, v, η, ω) then converges to the equilibrium point of the system driven by (6) according to lemma 3. Finally, as can be seen from the lemma 2, the system state x will converge to generalized Nash equilibrium for the multi-cluster game problem (4) at t→infinity.

S400, controlling the state of each agent at each moment according to the agent state control function so that the agent system achieves Nash equilibrium.

The controlling the state of each agent according to the agent state control function includes:

According to the proof above, the state x of the agent system driven by the agent state control function eventually reaches Nash equilibrium over time. The state decision vector of each agent at a target moment can therefore be determined from the agent state control function, and the target state of the agent at that moment can be determined from this state decision vector, so that each agent is controlled at each moment to reach the target state corresponding to that moment.
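For agents modeled as first-order integrators, obtaining the state decision vector at each moment amounts to integrating the control function over time. The sketch below shows one possible forward-Euler loop using the step size of 0.1 s and the 3000 iterations reported for the simulation in this embodiment. The control law used here is a stand-in, a plain consensus law $\dot{x} = -Lx$ on a hypothetical three-agent path with scalar states, and is not the control function (6) of the invention; all function names are assumptions.

```python
def euler_rollout(x0, control, step=0.1, n_steps=3000):
    """Integrate x_dot = control(x) with the forward-Euler rule."""
    x = list(x0)
    for _ in range(n_steps):
        velocity = control(x)
        # First-order integrator: the next state is the state plus step * velocity.
        x = [xi + step * vi for xi, vi in zip(x, velocity)]
    return x


# Laplacian of a hypothetical 3-agent path graph with unit weights.
PATH_LAPLACIAN = [[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]]


def consensus_control(x):
    """Stand-in law x_dot = -L x (NOT the patent's control function (6))."""
    return [-sum(l_ik * x_k for l_ik, x_k in zip(row, x)) for row in PATH_LAPLACIAN]


final_state = euler_rollout([0.0, 3.0, 6.0], consensus_control)
print([round(value, 6) for value in final_state])
```

Replacing `consensus_control` with an implementation of the actual control function, and the scalar states with q-dimensional ones, leaves the integration loop unchanged.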

The effectiveness of the method provided by the invention is verified by simulating, under the distributed smooth search function (6), a multi-cluster game problem of a multi-agent system with non-smooth cost functions and non-smooth inequality constraints. As shown in FIG. 2, the system is assumed to consist of sixteen agents modeled as first-order integrators and divided into four clusters. The leading agents of the clusters are numbered 1, 5, 8 and 13, respectively; the directed communication topology among the clusters is drawn as orange solid lines, and the undirected communication topology inside the clusters as blue dashed lines. The specific form of the distributed multi-cluster game problem of the non-smooth multi-agent system addressed by the simulation is:

where, in the non-smooth inequality constraint, $D_j = [16.1, 15.5, 16.1, 14.5]^T$ and $R_j = [-3, 1, 3, 3, 2, -1, -2, -4]^T$. The local cost function $f_i(x_i)$ of agent i consists of the following functions:

where the two functions denote a smooth cost function and the indicator function of a local constraint set, respectively. The initial positions of the agents, in meters, are $x_1(0) = [-5.6, 3.9, -8, 10, -11.6, 7.5, -10, 4]^T$, $x_2(0) = [1.8, 3.2, 4.1, 8.9, 5.8, 4.9]^T$, $x_3(0) = [3.5, -1.9, 5.2, -6.1, 1.3, -7.2, 7.9, -4.1, 6.1, -8.9]^T$ and $x_4(0) = [-3.9, -1, -7.9, -3.2, -6.1, -7.5, -9.8, -6.1]^T$. The initial values of the auxiliary variables are all zero. The simulation step size is $t_p = 0.1$ s, the number of iterations is n = 3000, and the running time is t = 361.8 s. The trajectories of the system states are shown in FIGS. 4 and 5. FIG. 6 shows the trajectory of the global benefit function F(x). FIG. 7 shows how the non-smooth inequality constraints on the individual agents change over time. As can be seen from FIGS. 3-7, the state of the multi-agent system eventually satisfies the non-smooth inequality constraints and the resource allocation constraint, and reaches a generalized Nash equilibrium of the multi-cluster game problem.
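The initial conditions above can be laid out per agent to make the cluster structure explicit. The sketch below assumes that each agent has a two-dimensional state (q = 2); this value is inferred from the vector lengths 8, 6, 10 and 8, which then give 4 + 3 + 5 + 4 = 16 agents, and is not stated in the text. Under that assumption, numbering the agents consecutively reproduces the leading agent numbers 1, 5, 8 and 13. All variable and function names are illustrative.

```python
Q = 2  # assumed state dimension per agent, inferred from the vector lengths

# Stacked initial states per cluster, in meters, as listed in the simulation.
INITIAL_STATES = {
    1: [-5.6, 3.9, -8.0, 10.0, -11.6, 7.5, -10.0, 4.0],
    2: [1.8, 3.2, 4.1, 8.9, 5.8, 4.9],
    3: [3.5, -1.9, 5.2, -6.1, 1.3, -7.2, 7.9, -4.1, 6.1, -8.9],
    4: [-3.9, -1.0, -7.9, -3.2, -6.1, -7.5, -9.8, -6.1],
}


def split_agents(stacked, q):
    """Cut a stacked cluster vector into one q-dimensional tuple per agent."""
    if len(stacked) % q != 0:
        raise ValueError("stacked vector length must be a multiple of q")
    return [tuple(stacked[s:s + q]) for s in range(0, len(stacked), q)]


agents = {}            # global agent number -> (cluster, initial state)
first_in_cluster = []  # global number of the first agent of each cluster
next_id = 1
for cluster in sorted(INITIAL_STATES):
    first_in_cluster.append(next_id)
    for state in split_agents(INITIAL_STATES[cluster], Q):
        agents[next_id] = (cluster, state)
        next_id += 1

print(len(agents), first_in_cluster)
```

That the first agent of each cluster coincides with the reported leading agent numbers is consistent with the convention that the leading agent is labeled as the first agent of its cluster.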

In summary, the present embodiment provides a multi-agent state control method for multi-cluster game, which determines a first communication parameter from each agent in a cluster to a neighbor agent according to a first communication relationship between each agent in each cluster of an agent system, determines a second communication parameter from each cluster to a neighbor cluster according to a second communication relationship between leading agents of each cluster, and constructs an agent state control function according to a preset inequality constraint and cost function, the first communication parameter and the second communication parameter to control states between each agent, so that the whole agent system achieves Nash equilibrium.

It should be understood that, although the steps in the flowcharts shown in the drawings of the present specification are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders. Moreover, at least a portion of the steps in the flowcharts may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order in which the sub-steps or stages are performed is not necessarily sequential, and may be performed in turn or alternately with at least a portion of the sub-steps or stages of other steps or other steps.

Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous link (SYNCHLINK) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.

Embodiment two

Based on the above embodiment, the present invention further provides a multi-agent state control device for a multi-cluster game. As shown in Fig. 8, the device includes:

a communication relation acquiring module, configured to acquire a first communication relation among the agents within each cluster of the agent system and a second communication relation among the leading agents of the clusters, as described in embodiment one;

a communication parameter determining module, configured to determine a first communication parameter from each agent in each cluster to its neighbor agents according to the first communication relation, and to determine a second communication parameter from each cluster to its neighbor clusters according to the second communication relation, as described in embodiment one;

a function construction module, configured to construct an agent state control function according to a preset inequality constraint, a preset cost function, each first communication parameter and each second communication parameter, as described in embodiment one;

and a state control module, configured to control the state of each agent according to the agent state control function so that the agent system reaches Nash equilibrium, as described in embodiment one.
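The target condition of the state control module, Nash equilibrium, can be illustrated numerically. The sketch below assumes a toy game in which each of two clusters is reduced to a single scalar decision and is given a hypothetical quadratic cost that is strongly convex in its own decision. It uses a plain best-response iteration, which is not the agent state control function of the invention and ignores the inequality constraints; it only shows what the equilibrium condition means: no cluster can lower its own cost by changing its decision alone.

```python
def cost_1(x1, x2):
    """Hypothetical cost of cluster 1, strongly convex in its own decision x1."""
    return (x1 - 1.0) ** 2 + 0.5 * x1 * x2


def cost_2(x1, x2):
    """Hypothetical cost of cluster 2, strongly convex in its own decision x2."""
    return (x2 + 2.0) ** 2 + 0.5 * x1 * x2


def best_response_1(x2):
    # Setting d(cost_1)/d(x1) = 2*(x1 - 1) + 0.5*x2 to zero gives x1.
    return 1.0 - 0.25 * x2


def best_response_2(x1):
    # Setting d(cost_2)/d(x2) = 2*(x2 + 2) + 0.5*x1 to zero gives x2.
    return -2.0 - 0.25 * x1


def find_equilibrium(iterations=100):
    """Iterate simultaneous best responses from the origin."""
    x1, x2 = 0.0, 0.0
    for _ in range(iterations):
        x1, x2 = best_response_1(x2), best_response_2(x1)
    return x1, x2


def is_nash(x1, x2, step=1e-3, tol=1e-12):
    """Check that small unilateral deviations do not lower a cluster's own cost."""
    base_1, base_2 = cost_1(x1, x2), cost_2(x1, x2)
    for d in (-step, step):
        if cost_1(x1 + d, x2) < base_1 - tol:
            return False
        if cost_2(x1, x2 + d) < base_2 - tol:
            return False
    return True


if __name__ == "__main__":
    x1, x2 = find_equilibrium()
    print("equilibrium:", x1, x2, "is Nash:", is_nash(x1, x2))
```

Because each cost here is convex in the cluster's own decision, checking small deviations is enough to confirm that the point is an equilibrium of this toy game.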

Embodiment three

Based on the above embodiment, the present invention also provides a corresponding terminal. As shown in Fig. 9, the terminal includes a processor 10 and a memory 20. Fig. 9 shows only some of the components of the terminal; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.

In some embodiments, the memory 20 may be an internal storage unit of the terminal, such as a hard disk or internal memory of the terminal. In other embodiments, the memory 20 may be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the terminal. Further, the memory 20 may include both an internal storage unit and an external storage device of the terminal. The memory 20 is used to store application software installed on the terminal and various kinds of data. The memory 20 may also be used to temporarily store data that has been output or is to be output. In one embodiment, the memory 20 stores a multi-agent state control program 30 for the multi-cluster game, and the program 30 is executed by the processor 10 to implement the multi-agent state control method for the multi-cluster game of the present application.

In some embodiments, the processor 10 may be a central processing unit (CPU), a microprocessor or another chip, and is used to run the program code or process the data stored in the memory 20, for example to execute the multi-agent state control method for the multi-cluster game.

Embodiment four

The present invention also provides a storage medium storing one or more programs executable by one or more processors to implement the steps of the multi-agent state control method for multi-cluster gaming as described above.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A multi-agent state control method for multi-cluster gaming, the method comprising:

Determining a first communication parameter from each agent to a neighbor agent in each cluster according to the first communication relation, including:

the cost function is:

wherein […] is the local cost function of agent i in cluster j, and […] is a convex function;

the state decision variable of agent i in cluster j is expressed as […], where $j \in \{1, \ldots, m\}$, $i \in \{1, \ldots, n_j\}$, […], $m$ represents the number of clusters in the agent system, $n_j$ represents the number of agents in cluster j, and $n$ represents the number of agents in the agent system; the stacked form of the state decision variables of all agents in cluster j is […]; the state decision variables of all participating agents except the state decision variable $x_j$ are represented as […]; […], as part of $x_{-j}$, is defined as the information of $x_{-j}$ that is available to agent i in cluster j;

for all $j \in \{1, \ldots, m\}$ and $i \in \{1, \ldots, n_j\}$, given $x_{-j}$, the cost function […] is twice continuously differentiable and strongly convex with respect to […], and […] is a lower semi-continuous closed convex function with respect to […];

2. The multi-agent state control method of multi-cluster gaming of claim 1, wherein the inequality constraint of cluster j is $g_j$.

3. The multi-agent state control method of multi-cluster gaming of claim 2, wherein, given $x_{-j}$, $g_j$ is a lower semi-continuous closed convex function with respect to […], and $g_j$ is $K_{2,j}$-Lipschitz continuous, where $K_{2,j} > 0$ is a constant.

4. The multi-agent state control method of multi-cluster gaming of claim 3, wherein the agent state control function is:

wherein […] represents the state decision vector of the leading agent of cluster j; […] $y_j$, $\nu_j$, $\eta_j$, $\omega_j$ and $r_j$ are auxiliary variables; […] and when $i \neq 1$, […]; and $\alpha_1$, $\alpha_2$, $\alpha_3$, $\alpha_4$ are constants.

5. The multi-agent state control method of multi-cluster gaming of claim 4, wherein said controlling the state of each agent according to said agent state control function comprises:

6. A multi-agent state control device for multi-cluster gaming, characterized in that the multi-agent state control device for multi-cluster gaming comprises:

the cost function is:

7. The multi-agent state control device of multi-cluster gaming of claim 6, wherein the agent state control function is:

wherein the state decision variable of agent i in cluster j is expressed as […], where $j \in \{1, \ldots, m\}$, $i \in \{1, \ldots, n_j\}$, […], $m$ represents the number of clusters in the agent system, $n_j$ represents the number of agents in cluster j, and $n$ represents the number of agents in the agent system; the stacked form of the state decision variables of all agents in cluster j is […]; the state decision variables of all participating agents except the state decision variable $x_j$ are represented as […]; […], as part of $x_{-j}$, is defined as the information of $x_{-j}$ that can be received by agent i of cluster j; the inequality constraint of cluster j is $g_j$; […] represents the state decision vector of the leading agent of cluster j; […] $y_j$, $\nu_j$, $\eta_j$ and $\omega_j$ are auxiliary variables; […] and when $i \neq 1$, […]; $\alpha_1$, $\alpha_2$, $\alpha_3$, $\alpha_4$ are constants; the neighbor set of agent i within cluster j is denoted […]; the first communication parameter from agent i to agent k is denoted […]; the neighbor cluster set of cluster j is denoted […]; the second communication parameter from cluster j to cluster l is denoted […]; and […] is the local cost function of agent i in cluster j.

8. A terminal, the terminal comprising: a processor, a storage medium communicatively coupled to the processor, the storage medium adapted to store a plurality of instructions, the processor adapted to invoke the instructions in the storage medium to perform the steps of implementing the multi-agent state control method of the multi-cluster gaming of any of the preceding claims 1-5.

9. A storage medium storing one or more programs executable by one or more processors to implement the steps of the multi-agent state control method of multi-cluster gaming of any of claims 1-5.