CN114488802A

CN114488802A - Nash equilibrium designated time searching method for multi-group game with consistent decision in group

Info

Publication number: CN114488802A
Application number: CN202210056868.8A
Authority: CN
Inventors: 周佳玲; 栾萌; 吕跃祖; 温广辉
Original assignee: Individual
Current assignee: Individual
Priority date: 2022-01-18
Filing date: 2022-01-18
Publication date: 2022-05-13
Anticipated expiration: 2042-01-18
Also published as: CN114488802B

Abstract

The invention provides a specified-time convergent Nash equilibrium search method for a multi-group game under intra-group decision consistency constraints, comprising the following steps: building a multi-cluster game model for a multi-agent system; building a communication topology that satisfies the required conditions; introducing a time planning method and designing a continuous-time distributed Nash equilibrium search method for each agent; and designing the parameters under which the search method achieves specified-time convergence. The method achieves Nash equilibrium search with convergence at a specified time for the multi-group game under intra-group decision consistency constraints, and provides a basis for decision making when multiple unmanned cluster systems cooperate within each cluster and compete in a game between clusters.

Description

Nash equilibrium designated time searching method for multi-group game with consistent decision in group

Technical Field

The invention belongs to the field of communication technology, relates to multi-agent game decision technology, and in particular relates to a specified-time convergent Nash equilibrium search method for multi-group games under intra-group decision consistency constraints.

Background

With the development of artificial intelligence, unmanned and intelligent technologies have become a powerful driving force for a new round of military transformation following mechanization and informatization, with a disruptive and even subversive effect on the forms of combat. In future military operations, confrontation between large-scale unmanned clusters will be a typical scenario. However, because of communication latency, the communication topology hierarchy of an unmanned cluster system cannot be extended indefinitely, which limits the cluster size. An available solution is to study multi-unmanned-cluster systems, in which each small-scale unmanned cluster system is treated as one cluster and several such cluster systems are deployed jointly in combat. Because the diversity of tasks and the number of clusters make the dynamics of the overall system complex, task conflicts often arise in practice. Since there is then no global unified commander, cooperative and competitive relationships exist at the task level between the unmanned clusters, and also between the individuals within a cluster. It is therefore urgent to model the multi-task decision problem of such complex multi-cluster systems. Analyzing the evolution mechanism of the dynamics provides, on the one hand, a theoretical explanation for the multi-task decision results of real multi-unmanned-cluster systems and, on the other hand, guidance for designing and optimizing the cluster architecture of such systems.

Among existing continuous-time Nash equilibrium search algorithms for multi-cluster games, the consistency-constrained multi-cluster game problem is addressed in the literature (X. Zeng, S. Liang, and Y. Hong. Distributed variational equilibrium seeking of multi-coalition game via variational inequality approach. IFAC-PapersOnLine, 20th IFAC World Congress, 50(1): 940-). The limitation of this scheme is that the designed equilibrium search algorithm requires the clusters to have the same number of agents and the same topology. On this basis, the literature (X. Zeng, J. Chen, S. Liang, and Y. Hong. Generalized Nash equilibrium seeking strategy for distributed nonsmooth multi-cluster game. Automatica, 2019, 103: 20-26.) further generalizes the method to clusters whose graphs have different topologies, proposes a distributed non-smooth algorithm using projected differential inclusions, and analyzes the convergence of the algorithm. However, the implementation of this scheme relies on an undirected topology, and the convergence rate of the algorithm is not specifically analyzed.

The literature (X. Nian, F. Niu and Z. Yang. Distributed Nash equilibrium seeking for multicluster game under switching communication topologies. IEEE Transactions on Systems, Man, and Cybernetics: Systems, doi: 10.1109/tsmc.2021.30515.) proposes, under jointly strongly connected directed switching communication topologies, a new Nash equilibrium search algorithm based on a consensus protocol and the gradient play rule. It further estimates the actions of all agents in the clusters with a leader-follower consensus protocol, thereby designing a more general multi-cluster game Nash equilibrium search algorithm suitable for agents that know only part of the decision information, and gives local and non-local convergence results for the two algorithms respectively. However, this approach does not take into account decision consistency constraints inside the clusters.

Disclosure of Invention

To solve the above problems, the invention provides a continuous-time Nash equilibrium search method for multi-group games under intra-group decision consistency constraints, and achieves specified-time convergence by means of a time planning method. The method addresses the complex scenario of multi-task joint operations by several unmanned cluster systems, taking into account both the cooperative tasks inside each unmanned cluster system and the operational tasks of each unmanned cluster system. By establishing cluster game models between and within the unmanned cluster systems, it provides a fast, specified-time method for solving the equilibrium of the multi-task cluster game and offers, from the perspective of cluster games, an approach to the multi-task decision and control problems of unmanned cluster systems.

To achieve the above purpose, the invention provides the following technical scheme:

A specified-time convergent Nash equilibrium search method for a multi-group game under intra-group decision consistency constraints comprises the following steps:

Step 1: for the situation in which a multi-unmanned-cluster system cooperates within each cluster and competes in a game between clusters, a multi-cluster game model subject to a consistency constraint set is constructed for the multi-agent system;

Step 2: a communication topology is constructed for the multi-agent system;

Step 3: based on a time planning method, a fast and accurate equilibrium search method with specified-time convergence for the multi-task cluster game is designed for each agent;

Step 4: the parameter conditions under which the Nash equilibrium search method achieves specified-time convergence are designed.

Further, step 1 specifically comprises the following sub-step:

Step 1-1: for the situation in which the cooperative tasks inside each cluster conflict with the tasks of the individual unmanned cluster systems in a multi-unmanned-cluster system, a multi-cluster game model subject to a consistency constraint set is constructed as follows:

where N is the number of clusters participating in the game and cluster i contains n_i agents; x_i is the state of cluster i; the index ij denotes the j-th agent in cluster i; x_ij is the state of agent ij; and x represents the joint state of all clusters. The state of each cluster is subject to a consistency constraint set, under which all agents in the cluster must hold the same decision.

The twice continuously differentiable convex function f_ij(x) is the cost function of agent j in cluster i, and f_ij(x) has a Lipschitz continuous gradient, i.e. for any x and y it satisfies

||∇f_ij(x) − ∇f_ij(y)|| ≤ l_ij ||x − y||,

where l_ij > 0 is the Lipschitz constant. The function f_i(x) is the cost function of cluster i:

Further, step 2 specifically comprises the following sub-steps:

Step 2-1: the communication topology of the multi-agent system is described as follows.

The communication topology among all agents is modeled as a directed graph whose node set contains every agent ij and whose edge set is ε, where N is the number of clusters participating in the game. Directed communication can take place along the edges both inside a cluster and between different clusters. Specifically, cluster i contains n_i agents, and the communication topology inside cluster i is represented by the subgraph induced by the agent set of cluster i, with edge set ε_i. The index ij denotes the j-th agent in cluster i. For agent ij, three neighbor sets are defined: its set of in-neighbors in the whole network, its set of in-neighbors within its own cluster, and its set of out-neighbors within its own cluster.

The adjacency matrix of the directed graph is denoted A. The element of A in the row associated with agent ij and the column associated with agent pq is nonzero if (pq, ij) ∈ ε and pq ≠ ij, and is zero otherwise.

The adjacency matrix of the induced subgraph of cluster i is denoted A_i. The element of A_i in row j and column l is nonzero if (il, ij) ∈ ε_i and j ≠ l, and is zero otherwise. Clearly, A₁, ..., A_N are the diagonal blocks of the matrix A.

The Laplacian matrix of the directed graph is denoted L. The element of L in the row associated with agent ij and the column associated with agent pq equals the sum of the corresponding row of A if ij = pq, and equals the negative of the corresponding element of A otherwise.

Step 2-2: the communication topology of the multi-agent system must satisfy the following requirement: the communication graph and the communication subgraph of every cluster i (i = 1, 2, ..., N) are all strongly connected.
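The graph quantities of step 2 can be illustrated numerically. The following is a minimal Python sketch, not the patented implementation: it assumes unit edge weights and a hypothetical four-agent network with two clusters, and the helper names `adjacency`, `laplacian` and `strongly_connected` are introduced only for this illustration.

```python
import numpy as np

def adjacency(nodes, edges):
    """Entry [r, c] is 1 when node c sends to node r, i.e. (c, r) is an edge."""
    index = {v: k for k, v in enumerate(nodes)}
    A = np.zeros((len(nodes), len(nodes)))
    for src, dst in edges:
        if src != dst:                      # self-loops are excluded
            A[index[dst], index[src]] = 1.0
    return A

def laplacian(A):
    """Diagonal holds the in-degrees, off-diagonal entries are -A."""
    return np.diag(A.sum(axis=1)) - A

def strongly_connected(A):
    """(I + A)^(n-1) is entrywise positive exactly for strongly connected graphs."""
    n = len(A)
    reach = np.linalg.matrix_power(np.eye(n) + A, n - 1)
    return bool((reach > 0).all())

# Hypothetical network: agents (cluster, index), listed cluster by cluster.
nodes = [(1, 1), (1, 2), (2, 1), (2, 2)]
edges = [((1, 1), (1, 2)), ((1, 2), (1, 1)),   # inside cluster 1
         ((2, 1), (2, 2)), ((2, 2), (2, 1)),   # inside cluster 2
         ((1, 2), (2, 1)), ((2, 2), (1, 1))]   # between clusters

A = adjacency(nodes, edges)
L = laplacian(A)
A1, A2 = A[:2, :2], A[2:, 2:]                  # diagonal blocks: cluster subgraphs

print(strongly_connected(A), strongly_connected(A1), strongly_connected(A2))
```

Because the nodes are ordered cluster by cluster, the cluster adjacency matrices appear as the diagonal blocks of A, which is the property the text states for A₁, ..., A_N.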

Further, step 3 specifically comprises the following sub-steps:

Step 3-1: combining the time planning method with the leader-following consensus idea, each agent estimates the global state information according to the following law:

In this law, each agent ij maintains an estimate of the global state x, the constant in the law must satisfy a design condition, and d_ij denotes the in-degree of agent ij. T_k = t_(k+1) − t_k is the sampling interval. The sequence of sampling intervals is designed so that the intervals form a convergent infinite series, i.e. their sum is finite.
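One simple way to obtain sampling intervals whose sum is finite is a geometric sequence. The sketch below is an illustration under that assumption only: the geometric rule, the ratio 0.5 and the function name `sampling_instants` are not taken from the invention, whose own interval design may differ.

```python
import numpy as np

def sampling_instants(t0, T, ratio=0.5, steps=30):
    """Return intervals T_k = T * (1 - ratio) * ratio**k and the instants t_k.

    The full infinite series sums to T, so every instant t_k stays below
    t0 + T and approaches it as k grows.
    """
    k = np.arange(steps)
    intervals = T * (1.0 - ratio) * ratio ** k
    instants = t0 + np.concatenate(([0.0], np.cumsum(intervals)))
    return intervals, instants

intervals, instants = sampling_instants(t0=0.0, T=1.0)
print(instants[:4])          # first sampling instants
print(1.0 - instants[-1])    # remaining time before the specified instant
```

With such a sequence the sampling instants accumulate at t0 + T, which is how a prescribed settling time can be fixed in advance independently of the initial states.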

Definition of

Step 3-2: the state update law of each agent and the update law of the auxiliary variable used for gradient estimation are designed as follows:

where x_ij(t) represents the state of agent ij at time t, the auxiliary variable of agent ij is used to estimate the gradient information and is given a prescribed initial value, and α is a positive constant to be designed. The matrix R_i is row-stochastic and the matrix C_i is column-stochastic, each being specified by its element in row j and column m.

Each agent ij selects two sets of positive parameters that satisfy the following conditions:

These two sets of parameters serve, respectively, as the weights on information received from in-neighbors within the cluster and on information sent to out-neighbors within the cluster.

The left eigenvector of the matrix R_i corresponding to eigenvalue 1 and the right eigenvector v_i of the matrix C_i corresponding to eigenvalue 1 are defined accordingly.

A simpler selection method is:

The auxiliary matrices defined from these quantities are easily shown to be Schur matrices.
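The role of the row-stochastic matrix R_i, the column-stochastic matrix C_i and their eigenvectors for eigenvalue 1 can be checked on a small case. The following Python sketch rests on several assumptions that are not stated in the source: uniform weights obtained by normalizing I + A_i, a hypothetical three-agent cluster, eigenvectors scaled to sum to 1, and rank-one-deflated matrices standing in for the Schur matrices the text mentions.

```python
import numpy as np

def row_stochastic(A):
    """Weights on received information: every row of the result sums to 1."""
    M = np.eye(len(A)) + A               # self-weight plus in-cluster in-neighbors
    return M / M.sum(axis=1, keepdims=True)

def column_stochastic(A):
    """Weights on sent information: every column of the result sums to 1."""
    M = np.eye(len(A)) + A
    return M / M.sum(axis=0, keepdims=True)

def eigvec_for_one(M):
    """Right eigenvector of M for eigenvalue 1, scaled so its entries sum to 1."""
    w, V = np.linalg.eig(M)
    v = np.real(V[:, np.argmin(np.abs(w - 1.0))])
    return v / v.sum()

# Hypothetical strongly connected cluster; A_i[r, c] = 1 means r receives from c.
A_i = np.array([[0.0, 0.0, 1.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])

R_i = row_stochastic(A_i)
C_i = column_stochastic(A_i)
left_R = eigvec_for_one(R_i.T)           # left eigenvector of R_i
v_i = eigvec_for_one(C_i)                # right eigenvector of C_i

ones = np.ones(len(A_i))
rho_R = max(abs(np.linalg.eigvals(R_i - np.outer(ones, left_R))))
rho_C = max(abs(np.linalg.eigvals(C_i - np.outer(v_i, ones))))
print(rho_R, rho_C)                      # both spectral radii are below 1
```

Removing the eigenvalue-1 direction leaves matrices whose spectral radius is strictly less than 1, which is the Schur property in the discrete-time sense.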

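Further, step 4 specifically comprises the following sub-steps:

Step 4-1: the pseudo-gradient is required to be strongly monotone, i.e. there exists a constant l > 0 such that, for any two points, the inner product of the difference of the pseudo-gradients with the difference of the points is at least l times the squared distance between the points:

where the function appearing in the pseudo-gradient can be regarded as the objective function of cluster i, and y = [y₁, y₂, ..., y_N]^T is the state of the N virtual participants.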
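For intuition, strong monotonicity is easy to verify when the pseudo-gradient is affine. The sketch below assumes a hypothetical pseudo-gradient F(y) = M y + q for N = 3 virtual participants (the matrix M and vector q are invented for illustration and are not the example of this document). In that case the constant l can be taken as the smallest eigenvalue of the symmetric part of M.

```python
import numpy as np

# Hypothetical affine pseudo-gradient F(y) = M y + q.
M = np.array([[2.0, 0.5, 0.0],
              [0.3, 3.0, 0.4],
              [0.0, 0.2, 1.5]])
q = np.array([-1.0, 0.5, 2.0])

def pseudo_gradient(y):
    return M @ y + q

def strong_monotonicity_constant(M):
    """Smallest eigenvalue of the symmetric part (M + M^T) / 2."""
    return float(np.linalg.eigvalsh(0.5 * (M + M.T)).min())

l = strong_monotonicity_constant(M)

# Spot-check (F(y) - F(z))^T (y - z) >= l * ||y - z||^2 on random pairs.
rng = np.random.default_rng(0)
holds = True
for _ in range(200):
    y, z = rng.normal(size=3), rng.normal(size=3)
    lhs = (pseudo_gradient(y) - pseudo_gradient(z)) @ (y - z)
    holds = holds and bool(lhs >= l * np.dot(y - z, y - z) - 1e-9)

print(l > 0, holds)
```

A positive l obtained this way certifies the condition of step 4-1 for the affine case; for general cost functions the same inequality has to be established analytically.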

Step 4-2: the step-size parameter requirements under which the Nash equilibrium search method for the multi-group game under intra-group decision consistency constraints achieves specified-time convergence are designed as follows:

where

σ＝max_i{b_1i}+γ₂max_i{b_2i}+γ₃max_i{b_3i},

three symmetric positive definite matrices each satisfy their own matrix condition, and the remaining symbol denotes the Laplacian matrix of the corresponding graph.

Compared with the prior art, the invention has the following advantages and beneficial effects:

1. For the situation in which task-level competition exists among the unmanned clusters while cooperation exists among the individuals within each cluster, the invention considers a Nash equilibrium search method for the multi-cluster game under intra-cluster decision consistency constraints, and provides a solution to the multi-task decision and control problems of multi-unmanned-cluster systems.

2. In the constructed fast Nash equilibrium solving method, the time-planning-based design uses a convergent infinite series to design the sampling intervals, which plays an important role in solving the Nash equilibrium within the specified time and greatly reduces the communication cost.

3. Compared with related finite-time and fixed-time results, the proposed fast Nash equilibrium solving method achieves specified-time convergence: its convergence time does not depend on the initial actions or on the method parameters, so the convergence time can conveniently be set in advance according to practical requirements.

Drawings

FIG. 1 is a schematic diagram of the steps of the specified-time convergent Nash equilibrium search method for multi-group games under intra-group decision consistency constraints of the present invention;

FIG. 2 is a detailed flowchart of the specified-time convergent Nash equilibrium search method for multi-group games under intra-group decision consistency constraints of the present invention;

FIG. 3 is a diagram of the communication topology of the multi-agent system provided by an example of the present invention;

FIG. 4 is a graph showing the evolution of the agent states as they converge to the equilibrium at the specified time of 1 second, according to an embodiment of the present invention.

Detailed Description

The technical solutions provided by the present invention are described in detail below with reference to specific examples. It should be understood that the following specific embodiments only illustrate the present invention and are not intended to limit its scope.

The invention provides a specified-time convergent Nash equilibrium search method for a multi-group game under intra-group decision consistency constraints. Its steps and flow are shown in FIGS. 1 and 2 and are as follows:

Step 1: for the situation in which a multi-unmanned-cluster system cooperates within each cluster and competes in a game between clusters, a multi-cluster game model subject to a consistency constraint set is constructed for the multi-agent system. This step specifically comprises the following sub-step:

Step 1-1: for the situation in which a multi-unmanned-cluster system cooperates within each cluster and competes in a game between clusters, the following multi-cluster game model subject to a consistency constraint set is constructed:

where x_i is the state of cluster i, the index ij denotes the j-th agent in cluster i, x_ij is the state of agent ij, and x represents the joint state of all clusters. The state of each cluster is subject to a consistency constraint set, under which all agents in the cluster must hold the same decision. The cost function f_ij(x) of agent j in cluster i has a Lipschitz continuous gradient, i.e. for any x and y it satisfies

||∇f_ij(x) − ∇f_ij(y)|| ≤ l_ij ||x − y||.

Step 2: on the basis of step 1, Nash equilibrium search can be achieved only if the communication topology of the multi-agent system satisfies certain conditions. A communication topology of the multi-agent system that satisfies these conditions is constructed through the following sub-steps:

Step 2-1: the communication topology among all agents is modeled as a directed graph whose node set contains every agent ij and whose edge set is ε. The communication topology inside cluster i (i = 1, 2, ..., N) is represented by the subgraph induced by the agent set of cluster i, with edge set ε_i. The index ij denotes the j-th agent in cluster i. For agent ij, three neighbor sets are defined: its set of in-neighbors in the whole network, its set of in-neighbors within its own cluster, and its set of out-neighbors within its own cluster.

The adjacency matrix of the directed graph is denoted A. The element of A in the row associated with agent ij and the column associated with agent pq is nonzero if (pq, ij) ∈ ε and pq ≠ ij, and is zero otherwise.

The adjacency matrix of the induced subgraph of cluster i is denoted A_i. The element of A_i in row j and column l is nonzero if (il, ij) ∈ ε_i and j ≠ l, and is zero otherwise. Clearly, A₁, ..., A_N are the diagonal blocks of the matrix A.

The Laplacian matrix of the directed graph is denoted L. The element of L in the row associated with agent ij and the column associated with agent pq equals the sum of the corresponding row of A if ij = pq, and equals the negative of the corresponding element of A otherwise.

Step 2-2: the communication topology of the multi-agent system must satisfy the following requirement: the communication graph and the communication subgraph of every cluster i (i = 1, 2, ..., N) are all strongly connected.

Step 3: combining a time planning method, a fast and accurate equilibrium search method with specified-time convergence for the multi-task cluster game is designed for each agent.

Step 3-1: a time planning method is introduced, and each agent estimates the global state information, based on the leader-following consensus idea, according to the following law:

In this law, each agent ij maintains an estimate of the global state x, the constant in the law must satisfy a design condition, and d_ij denotes the in-degree of agent ij. T_k = t_(k+1) − t_k is the sampling interval. The sequence of sampling intervals is designed so that the intervals form a convergent infinite series, i.e. their sum is finite.

Step 3-2: the state update law of each agent and the update law of the auxiliary variable used for gradient estimation are designed as follows:

where x_ij(t) represents the state of agent ij at time t, the auxiliary variable of agent ij is used to estimate the gradient information and is given a prescribed initial value, and α is a positive constant to be designed. The matrix R_i is row-stochastic and the matrix C_i is column-stochastic, each being specified by its element in row j and column m.

Each agent ij selects two sets of positive parameters that satisfy the following conditions:

These two sets of parameters serve, respectively, as the weights on information that agent ij receives from in-neighbors within the cluster and on information that it sends to out-neighbors within the cluster.

The left eigenvector of the matrix R_i corresponding to eigenvalue 1 and the right eigenvector v_i of the matrix C_i corresponding to eigenvalue 1 are defined accordingly.

A simpler selection method is:

The auxiliary matrices defined from these quantities are easily shown to be Schur matrices.

Step 4: on the basis of step 3, the designed Nash equilibrium search method converges at the specified time when the design parameters satisfy certain conditions. The parameter conditions under which the Nash equilibrium solving method achieves specified-time convergence are provided through the following sub-steps:

Step 4-1: the pseudo-gradient is required to be strongly monotone, i.e. there exists a constant l > 0 such that, for any two points, the inner product of the difference of the pseudo-gradients with the difference of the points is at least l times the squared distance between the points:

where the function appearing in the pseudo-gradient can be regarded as the objective function of cluster i, and y = [y₁, y₂, ..., y_N]^T is the state of the N virtual participants.

Step 4-2: the step-size parameter requirements for specified-time convergence are designed as follows:

where

σ＝max_i{b_1i}+γ₂max_i{b_2i}+γ₃max_i{b_3i},

three symmetric positive definite matrices each satisfy their own matrix condition, and the remaining symbol denotes the Laplacian matrix of the corresponding graph.

Example 1

Step 1: consider a game with 3 clusters containing n₁＝3, n₂＝4 and n₃＝3 agents, respectively. The cost function of agent ij is

The associated coefficients are set as follows: m₁₁＝3,m₁₂＝11,m₁₃＝22,m₂₁＝m₂₂＝2,m₂₃＝64,m₂₄＝8,m₃₁＝60,m₃₂＝m₃₃＝4,s₁₁＝s₁₂＝s₁₃＝10,s₂₁＝s₂₂＝s₂₃＝50,s₃₁＝s₃₂＝s₃₃＝20,h₁₁＝0.35,h₁₂＝0.25,h₁₃＝0.15,h₂₁＝0.2,h₂₂＝0.1,h₂₃＝0.05,h₂₄＝0.25,h₃₁＝0.02,h₃₂＝0.08,h₃₃＝0.2.

Step 2: the communication topology of the multi-agent system is shown in fig. 3.

Step 3: for the designed specified-time convergent Nash equilibrium search method for the multi-group game subject to consistency constraints, the initial state of the nodes is selected as x(t₀)＝[0,10,20,0,10,20,30,0,10,20].

Step 4: the parameter of the method is designed as α = 0.02. The states of the agents are set to converge to the equilibrium at the specified time of 1 second, and the state trajectories are shown in FIG. 4. The simulation results show that the state of each agent converges to the Nash equilibrium at 1 second, and

x*＝[6.837,6.837,6.837,26.026,26.026,26.026,26.026,10.412,10.412,10.412]^T，

y*＝[6.837,26.026,10.412]^T.
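The reported equilibrium can be checked for agreement with the intra-cluster consistency constraint. The short Python sketch below uses only the numbers stated in this example (cluster sizes 3, 4 and 3, and the vectors x* and y*). The variable names are introduced for illustration.

```python
import numpy as np

cluster_sizes = [3, 4, 3]                       # n1, n2, n3 from step 1
x_star = np.array([6.837, 6.837, 6.837,
                   26.026, 26.026, 26.026, 26.026,
                   10.412, 10.412, 10.412])
y_star = np.array([6.837, 26.026, 10.412])

# Under the consistency constraint every agent of cluster i holds the value y_i,
# so repeating y_i n_i times must reproduce x*.
expanded = np.repeat(y_star, cluster_sizes)
print(np.array_equal(expanded, x_star))
```

The check confirms that all agents of a cluster share one decision at equilibrium, so the ten-dimensional x* is fully described by the three values in y*.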

It should be noted that implementations not shown in the drawings or not described in the specification are all forms known to a person of ordinary skill in the art and are not described in detail. Further, the above definitions of the elements and methods are not limited to the specific structures, shapes or arrangements mentioned in the examples, which a person of ordinary skill in the art may easily modify or substitute.

The technical means disclosed in the scheme of the invention are not limited to those disclosed in the above embodiments, but also include technical schemes formed by any combination of the above technical features. It should be noted that a person skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also considered to fall within the scope of protection of the present invention.

Claims

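1. A Nash equilibrium designated time search method for a multi-group game with consistent intra-group decisions, characterized by comprising the following steps:

step 1: for the situation in which a multi-unmanned-cluster system cooperates within each cluster and competes in a game between clusters, constructing a multi-cluster game model subject to a consistency constraint set for the multi-agent system;

step 2: constructing a communication topology for the multi-agent system;

step 3: combining a time planning method, designing for each agent a fast and accurate equilibrium search method with specified-time convergence for the multi-task cluster game;

step 4: providing the parameter conditions under which the Nash equilibrium search method achieves specified-time convergence.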

2. The Nash equilibrium designated time search method for multi-group game with consistent decision in group as claimed in claim 1, wherein step 1 specifically comprises the following sub-step:

step 1-1: for the situation in which a multi-unmanned-cluster system cooperates within each cluster and competes in a game between clusters, the following multi-cluster game model subject to a consistency constraint set is constructed:

where x_i is the state of cluster i, the index ij denotes the j-th agent in cluster i, and x_ij is the state of agent ij; the cost function f_ij(x) of agent j in cluster i has a Lipschitz continuous gradient, i.e. for any x and y it satisfies

||∇f_ij(x) − ∇f_ij(y)|| ≤ l_ij ||x − y||,

where l_ij > 0 is the Lipschitz constant, and the function f_i(x) is the cost function of cluster i:

3. The Nash equilibrium designated time search method for multi-group game with consistent decision in group as claimed in claim 1, wherein step 2 specifically comprises the following sub-steps:

step 2-1: the communication topology among all agents is modeled as a directed graph whose node set contains every agent ij and whose edge set is ε, where N is the number of clusters participating in the game; directed communication can take place along the edges both inside a cluster and between different clusters; specifically, cluster i contains n_i agents, and the communication topology inside cluster i is represented by the subgraph induced by the agent set of cluster i, with edge set ε_i; the index ij denotes the j-th agent in cluster i; for agent ij, its set of in-neighbors in the whole network, its set of in-neighbors within its own cluster, and its set of out-neighbors within its own cluster are defined;

the adjacency matrix of the directed graph is denoted A, where the element of A in the row associated with agent ij and the column associated with agent pq is nonzero if (pq, ij) ∈ ε and pq ≠ ij, and is zero otherwise;

the adjacency matrix of the induced subgraph of cluster i is denoted A_i, where the element of A_i in row j and column l is nonzero if (il, ij) ∈ ε_i and j ≠ l, and is zero otherwise; clearly, A₁, ..., A_N are the diagonal blocks of the matrix A;

the Laplacian matrix of the directed graph is denoted L, where the element of L in the row associated with agent ij and the column associated with agent pq equals the sum of the corresponding row of A if ij = pq, and equals the negative of the corresponding element of A otherwise;

step 2-2: the communication graph and the communication subgraph of every cluster i, i = 1, 2, ..., N, are all strongly connected.

4. The Nash equilibrium designated time search method for multi-group game with consistent decision in group as claimed in claim 1, wherein step 3 specifically comprises the following sub-steps:

step 3-1: a time planning method is introduced, and each agent estimates the global state information, based on the leader-following consensus idea, according to the following law:

in this law, each agent ij maintains an estimate of the global state x, the constant in the law must satisfy a design condition, and d_ij denotes the in-degree of agent ij; T_k = t_(k+1) − t_k is the sampling interval, and the sequence of sampling intervals is designed so that the intervals form a convergent infinite series, i.e. their sum is finite;

step 3-2: the state update law of each agent and the update law of the auxiliary variable used for gradient estimation are designed in the following form:

where x_ij(t) represents the state of agent ij at time t, the auxiliary variable of agent ij is used to estimate the gradient information and is given a prescribed initial value, and α is a positive constant to be designed; the matrix R_i is row-stochastic and the matrix C_i is column-stochastic, each being specified by its element in row j and column m;

each agent ij selects two sets of positive parameters that satisfy the following conditions:

these two sets of parameters serve, respectively, as the weights on information received from in-neighbors within the cluster and on information sent to out-neighbors within the cluster;

a simpler selection method is:

the auxiliary matrices defined from these quantities are easily shown to be Schur matrices.

5. The Nash equilibrium designated time search method for multi-group game with consistent decision in group as claimed in claim 1, wherein step 4 comprises the following sub-steps:

step 4-1: the pseudo-gradient is required to be strongly monotone, i.e. there exists a constant l > 0 such that, for any two points, the inner product of the difference of the pseudo-gradients with the difference of the points is at least l times the squared distance between the points:

where the function appearing in the pseudo-gradient can be regarded as the objective function of cluster i, and y = [y₁, y₂, ..., y_N]^T is the state of the N virtual participants;

step 4-2: the step-size parameter requirements under which the Nash equilibrium search method for the multi-group game under intra-group decision consistency constraints achieves specified-time convergence are designed as follows: