CN114488802A - Nash equilibrium designated time searching method for multi-group game with consistent decision in group - Google Patents


Publication number
CN114488802A
Authority
CN
China
Prior art keywords
cluster
agent
group
matrix
game
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210056868.8A
Other languages
Chinese (zh)
Other versions
CN114488802B (en
Inventor
周佳玲
栾萌
吕跃祖
温广辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority to CN202210056868.8A
Publication of CN114488802A
Application granted
Publication of CN114488802B

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance

Abstract

The invention provides a specified-time convergent Nash equilibrium search method for multi-group games under an intra-group decision consistency constraint, comprising the following steps: building a multi-cluster game model for a multi-agent system; constructing a communication topology that satisfies the required conditions; introducing a time planning method; designing a continuous-time distributed Nash equilibrium search method for each agent; and designing the parameters under which the search method converges in the specified time. The method achieves Nash equilibrium search with specified-time convergence for multi-group games under the intra-group decision consistency constraint, and provides a basis for decision making in scenarios where multiple unmanned cluster systems cooperate within each cluster and compete between clusters.

Description

Nash equilibrium designated time searching method for multi-group game with consistent decision in group
Technical Field
The invention belongs to the field of communication technology and relates to multi-agent game decision technology, in particular to a specified-time convergent Nash equilibrium search method for multi-group games under an intra-group decision consistency constraint.
Background
With the development of artificial intelligence, unmanned and intelligent technologies have become a powerful driver of a new round of military transformation after mechanization and informatization, with a disruptive and even subversive impact on the form of combat. In future military operations, confrontation between large-scale unmanned clusters will be a typical scenario. However, limited by communication latency, the communication topology hierarchy of an unmanned cluster system cannot be extended indefinitely, which limits the cluster size. An available solution is to study multi-unmanned-cluster systems, in which each small-scale unmanned cluster system is treated as a cluster and several such clusters are jointly deployed for combat. Because the diversity of tasks and the number of clusters make the dynamics of the overall system complex, task conflicts often arise in practice. Since there is no global unified commander, cooperation and competition at the task level exist between the unmanned clusters, and cooperation and competition also exist between the individuals within a cluster. It is therefore urgent to model the multi-task decision problem of such complex multi-cluster systems and to analyze its evolution mechanism through its dynamics. On the one hand, this provides a theoretical explanation for the multi-task decision results of actual multi-unmanned-cluster systems; on the other hand, it guides the design and optimization of the cluster architecture in such systems.
Among the existing continuous-time Nash equilibrium search algorithms for multi-cluster games, the consistency-constrained multi-cluster game problem is addressed in the literature (X. Zeng, S. Liang, and Y. Hong. Distributed variational equilibrium seeking of multi-coalition game via variational inequality approach. IFAC-PapersOnLine, 20th IFAC World Congress, 50(1): 940-.). The limitation of this scheme is that the designed equilibrium search algorithm requires all clusters to have the same number of agents and the same topology. On this basis, the document (X. Zeng, J. Chen, S. Liang, and Y. Hong. Generalized Nash equilibrium seeking strategy for distributed nonsmooth multi-cluster game. Automatica, 2019, 103: 20-26.) generalizes the method to cluster graphs with different topologies, proposes a distributed non-smooth algorithm using projected differential inclusions, and analyzes the convergence of the algorithm. However, this scheme relies on an undirected topology, and the convergence rate of the algorithm is not specifically analyzed.
In the literature (X. Nian, F. Niu, and Z. Yang. Distributed Nash equilibrium seeking for multicluster game under switching communication topologies. IEEE Transactions on Systems, Man, and Cybernetics: Systems, doi: 10.1109/tsmc.2021.30515.), under jointly strongly connected directed switching communication topologies, a new Nash equilibrium search algorithm based on a consistency protocol and the gradient play rule is proposed. In addition, a leader-follower consistency protocol is used to estimate the actions of all agents in the clusters, which yields a more general multi-cluster game Nash equilibrium search algorithm for the case in which each agent knows only part of the decision information. Local and non-local convergence results are given for the two algorithms respectively. However, this approach does not take into account decision consistency constraints inside the cluster.
Disclosure of Invention
To solve the above problems, the invention provides a continuous-time Nash equilibrium search method for multi-group games under an intra-group decision consistency constraint, and achieves specified-time convergence by means of a time planning method. The method addresses the complex scenario of multi-task joint operations by multiple unmanned cluster systems, taking into account both the cooperative tasks within each unmanned cluster system and the operational tasks of each unmanned cluster system. By establishing cluster game models between and within the unmanned cluster systems, it provides a fast specified-time method for solving the equilibrium of the multi-task cluster game, and offers an approach to the multi-task decision and control problems of multi-unmanned-cluster systems from the perspective of cluster games.
To achieve the above purpose, the invention provides the following technical scheme:
A specified-time convergent Nash equilibrium search method for multi-group games under an intra-group decision consistency constraint comprises the following steps:
Step 1: for the scenario in which multiple unmanned cluster systems cooperate within each cluster and compete between clusters, construct a multi-cluster game model subject to a consistency constraint set for the multi-agent system;
Step 2: construct a communication topology for the multi-agent system;
Step 3: based on a time planning method, design for each agent a fast and accurate equilibrium search method for the multi-task cluster game with specified-time convergence;
Step 4: design the parameter conditions under which the Nash equilibrium search method achieves specified-time convergence.
Further, step 1 specifically includes the following sub-steps:
Step 1-1: for the situation in which the cooperative tasks inside each cluster conflict with the tasks of the individual unmanned cluster systems in a multi-unmanned-cluster system, a multi-cluster game model subject to a consistency constraint set is constructed as follows:
Figure BDA0003476633050000021
Figure BDA0003476633050000022
where N is the number of clusters participating in the game, cluster i comprises n_i agents,
Figure BDA0003476633050000023
is the state of cluster i, the index ij denotes the j-th agent in cluster i,
Figure BDA0003476633050000024
is the state of agent ij,
Figure BDA0003476633050000025
Figure BDA0003476633050000026
represents the joint state of all clusters, and the cluster states obey the consistency constraint set
Figure BDA0003476633050000027
The twice continuously differentiable convex function f_ij(x) represents the cost function of agent j in cluster i, and f_ij(x) has a Lipschitz continuous gradient, i.e., for any
Figure BDA0003476633050000031
it satisfies
Figure BDA0003476633050000032
where l_ij > 0 is the Lipschitz constant. The function f_i(x) is the cost function of cluster i:
Figure BDA0003476633050000033
Further, step 2 specifically includes the following sub-steps:
Step 2-1: the communication topology of the multi-agent system is described as follows:
The communication topology among all agents is modeled as a directed graph
Figure BDA0003476633050000034
whose node set is
Figure BDA0003476633050000035
and whose edge set is
Figure BDA0003476633050000036
where N is the number of clusters participating in the game. Directed communication can take place along topology edges both within a cluster and between different clusters. Specifically, cluster i contains n_i agents, and its agent set is denoted
Figure BDA0003476633050000037
The communication topology within cluster i is represented by the induced subgraph
Figure BDA0003476633050000038
where
Figure BDA0003476633050000039
The index ij denotes the j-th agent in cluster i. For agent
Figure BDA00034766330500000310
its in-neighbor set in the network is defined as
Figure BDA00034766330500000311
its in-neighbor set within its cluster is defined as
Figure BDA00034766330500000312
and its out-neighbor set within its cluster is defined as
Figure BDA00034766330500000313
The adjacency matrix of graph
Figure BDA00034766330500000314
is defined as
Figure BDA00034766330500000315
where
Figure BDA00034766330500000316
is the element of matrix A in row
Figure BDA00034766330500000317
and column
Figure BDA00034766330500000318
If (pq, ij) ∈ ε and pq ≠ ij, then
Figure BDA00034766330500000319
otherwise
Figure BDA00034766330500000320
The adjacency matrix of graph
Figure BDA00034766330500000321
is defined as
Figure BDA00034766330500000322
where
Figure BDA00034766330500000323
is an element of matrix A_i. If (il, ij) ∈ ε_i and j ≠ l, then
Figure BDA00034766330500000324
otherwise
Figure BDA00034766330500000325
Obviously, A_1, ..., A_N are the diagonal blocks of matrix A.
Figure BDA00034766330500000326
Define
Figure BDA00034766330500000327
as the Laplacian matrix of graph
Figure BDA00034766330500000328
where
Figure BDA00034766330500000329
is the element of matrix L in row
Figure BDA00034766330500000330
and column
Figure BDA00034766330500000331
If ij = pq, then
Figure BDA00034766330500000332
otherwise
Figure BDA00034766330500000333
Step 2-2: the communication topology of the multi-agent system is required to satisfy the following condition:
The communication graph
Figure BDA00034766330500000334
and the communication subgraphs
Figure BDA00034766330500000335
are all strongly connected.
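The strong-connectivity requirement of Step 2-2 can be checked numerically before the search method is run. The following Python sketch is illustrative only. It assumes the convention that `adj[r][c] > 0` means agent r receives information from agent c, and it uses the standard in-degree Laplacian L = D - A, which may differ in detail from the definition given in the patent's formula images. The 5-agent, 2-cluster topology and all function names are hypothetical.

```python
import numpy as np

def is_strongly_connected(adj):
    """Return True if every node can reach every other node along directed edges."""
    a = np.asarray(adj) > 0
    n = a.shape[0]

    def reaches_all(m):
        # Depth-first search from node 0 over the edges stored in m.
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for v in np.flatnonzero(m[u]):
                v = int(v)
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return len(seen) == n

    # Strongly connected iff node 0 reaches all nodes in the graph and in its reverse.
    return reaches_all(a) and reaches_all(a.T)

def in_degree_laplacian(adj):
    """Standard Laplacian L = D - A, with D the diagonal matrix of row sums (assumed form)."""
    a = np.asarray(adj, dtype=float)
    return np.diag(a.sum(axis=1)) - a

def cluster_blocks(adj, sizes):
    """Extract the diagonal blocks A_1, ..., A_N for clusters of the given sizes."""
    a = np.asarray(adj)
    blocks, start = [], 0
    for n_i in sizes:
        blocks.append(a[start:start + n_i, start:start + n_i])
        start += n_i
    return blocks

# Hypothetical topology: cluster 1 = agents {0, 1}, cluster 2 = agents {2, 3, 4}.
A = np.zeros((5, 5), dtype=int)
for receiver, sender in [(0, 1), (1, 0), (3, 2), (4, 3), (2, 4), (2, 1), (0, 4)]:
    A[receiver, sender] = 1
sizes = [2, 3]

print(is_strongly_connected(A))
print([is_strongly_connected(block) for block in cluster_blocks(A, sizes)])
print(in_degree_laplacian(A).sum(axis=1))  # each row of a Laplacian sums to zero
```

Both the whole graph and every diagonal block must pass the check, since Step 2-2 requires strong connectivity of the communication graph and of each intra-cluster subgraph.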
Further, step 3 specifically includes the following sub-steps:
Step 3-1: combining the time planning method, global state information is estimated for each agent based on the leader-following consistency idea:
Figure BDA0003476633050000041
where
Figure BDA0003476633050000042
represents agent ij's estimate of the global state x, the constant
Figure BDA0003476633050000043
satisfies
Figure BDA0003476633050000044
and d_ij represents the in-degree of agent ij:
Figure BDA0003476633050000045
T_k = t_{k+1} - t_k is the sampling interval. The sequence of sampling intervals
Figure BDA0003476633050000046
is designed as
Figure BDA0003476633050000047
where
Figure BDA0003476633050000048
is a convergent infinite series, i.e.,
Figure BDA0003476633050000049
is finite.
Define
Figure BDA00034766330500000410
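The key property used by the time planning method is that the sampling intervals T_k sum to a finite value, so infinitely many updates fit inside the specified time. The Python sketch below illustrates this with a geometric sequence. The geometric form, the `ratio` parameter, and the function names are assumptions made for illustration, and the patent's actual interval design is the one given in its formula images.

```python
def geometric_intervals(total_time, ratio=0.5, count=50):
    """First `count` intervals T_k = total_time * (1 - ratio) * ratio**k.

    The infinite series sums exactly to total_time (assumed design, for illustration).
    """
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    return [total_time * (1.0 - ratio) * ratio ** k for k in range(count)]

def sampling_instants(t0, intervals):
    """Sampling instants t_0, t_1, ... with t_{k+1} = t_k + T_k."""
    instants = [t0]
    for interval in intervals:
        instants.append(instants[-1] + interval)
    return instants

intervals = geometric_intervals(total_time=1.0, ratio=0.5, count=40)
instants = sampling_instants(0.0, intervals)
print(instants[:4])   # the first few sampling instants
print(instants[-1])   # approaches the specified time from below
```

With such a sequence the sampling instants accumulate at t_0 plus the specified time, which is why the convergence time can be fixed in advance.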
Step 3-2: the state iteration law of the intelligent agent and the auxiliary variable updating law for gradient information estimation are designed into
Figure BDA00034766330500000411
Figure BDA00034766330500000412
wherein ,xij(t) represents the state of agent ij at time t,
Figure BDA00034766330500000413
for estimating items of gradient information, initialising
Figure BDA00034766330500000414
Is composed of
Figure BDA00034766330500000415
α is a positive constant to be designed. Matrix array
Figure BDA00034766330500000416
In the case of a random row, the data is transmitted,
Figure BDA00034766330500000417
for elements in the jth row and m columns, let
Figure BDA00034766330500000418
Matrix array
Figure BDA00034766330500000419
In the case of a column that is random,
Figure BDA00034766330500000420
for the elements in row j and column m,
Figure BDA00034766330500000421
each agent ij selects two sets of positive parameter sets
Figure BDA00034766330500000422
And
Figure BDA00034766330500000423
the following conditions are satisfied:
Figure BDA00034766330500000424
Figure BDA00034766330500000425
Figure BDA00034766330500000426
these two sets of parameters serve as weights for information received from inner neighbors within the cluster and information sent to outer neighbors within the cluster, respectively.
Figure BDA00034766330500000427
Is defined as a matrix RiLeft eigenvector corresponding to eigenvalue 1, i.e. satisfy
Figure BDA00034766330500000428
viIs defined as a matrix CiThe right eigenvector corresponding to eigenvalue 1, i.e. satisfied
Figure BDA0003476633050000051
A simpler selection method is:
Figure BDA0003476633050000052
definition of
Figure BDA0003476633050000053
Is easy to obtain
Figure BDA0003476633050000054
Are Schur matrices.
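To make the roles of the row-stochastic matrix R_i and the column-stochastic matrix C_i concrete, the Python sketch below builds one admissible pair of weight matrices for a small intra-cluster graph and computes the eigenvectors associated with eigenvalue 1. The uniform weighting (equal weights over each agent's in-neighbors or out-neighbors plus itself), the 3-agent graph, and the function names are assumptions for illustration, and the patent's own "simple choice" is the one in its formula image.

```python
import numpy as np

def row_stochastic(adj):
    """Uniform weights over each agent's in-neighbors and itself; rows sum to 1."""
    support = ((np.asarray(adj) > 0) | np.eye(len(adj), dtype=bool)).astype(float)
    return support / support.sum(axis=1, keepdims=True)

def column_stochastic(adj):
    """Uniform weights over each agent's out-neighbors and itself; columns sum to 1."""
    support = ((np.asarray(adj) > 0) | np.eye(len(adj), dtype=bool)).astype(float)
    return support / support.sum(axis=0, keepdims=True)

def eigenvector_for_one(matrix):
    """Right eigenvector for the eigenvalue closest to 1, scaled to sum to 1."""
    values, vectors = np.linalg.eig(matrix)
    index = int(np.argmin(np.abs(values - 1.0)))
    vec = np.real(vectors[:, index])
    return vec / vec.sum()

# Hypothetical strongly connected cluster; adj[r][c] = 1 means r receives from c.
adj_i = np.array([[0, 0, 1],
                  [1, 0, 0],
                  [1, 1, 0]])

R_i = row_stochastic(adj_i)
C_i = column_stochastic(adj_i)
u_i = eigenvector_for_one(R_i.T)  # left eigenvector of R_i: u_i^T R_i = u_i^T
v_i = eigenvector_for_one(C_i)    # right eigenvector of C_i: C_i v_i = v_i

print(R_i.sum(axis=1), C_i.sum(axis=0))
print(u_i, v_i)
```

For a strongly connected cluster graph with self-weights, both eigenvectors are strictly positive, which is what allows them to act as weights in the convergence analysis.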
Further, step 4 specifically includes the following sub-steps:
Step 4-1: the pseudo-gradient
Figure BDA0003476633050000055
is required to be strongly monotone, i.e., there exists a constant l > 0 such that
Figure BDA0003476633050000056
where
Figure BDA0003476633050000057
can be regarded as the objective function of cluster i, and y = [y_1, y_2, ..., y_N]^T is the state of the N virtual players.
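For quadratic cluster objectives the pseudo-gradient is affine, F(y) = M y + c, and strong monotonicity reduces to a matrix test: F is strongly monotone with constant l exactly when the smallest eigenvalue of the symmetric part (M + M^T)/2 is positive, and that eigenvalue is the largest admissible l. The Python sketch below checks this on a hypothetical 3-cluster matrix. The affine form, the matrix M, the vector c, and the function names are assumptions for illustration and are not taken from the patent.

```python
import numpy as np

def strong_monotonicity_constant(M):
    """Smallest eigenvalue of the symmetric part of M (positive means strongly monotone)."""
    M = np.asarray(M, dtype=float)
    return float(np.linalg.eigvalsh(0.5 * (M + M.T)).min())

def monotonicity_gap(M, c, y, z, l):
    """(F(y) - F(z))^T (y - z) - l * ||y - z||^2 for the affine map F(v) = M v + c."""
    diff = y - z
    return float((M @ y + c - (M @ z + c)) @ diff - l * (diff @ diff))

# Hypothetical pseudo-gradient of a 3-cluster quadratic game.
M = np.array([[2.0, 1.0, 0.0],
              [-1.0, 3.0, 1.0],
              [0.0, -1.0, 4.0]])
c = np.array([1.0, -2.0, 0.5])

l = strong_monotonicity_constant(M)
rng = np.random.default_rng(0)
gaps = [monotonicity_gap(M, c, rng.normal(size=3), rng.normal(size=3), l)
        for _ in range(100)]
print(l, min(gaps))  # every gap is nonnegative up to rounding
```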
Step 4-2: the step length parameter requirements for realizing the specified time convergence by the Nash equilibrium search method of the multi-group game under the constraint of decision consistency in a design group are as follows:
Figure BDA0003476633050000058
where
Figure BDA0003476633050000061
σ = max_i{b_1i} + γ_2 max_i{b_2i} + γ_3 max_i{b_3i},
Figure BDA0003476633050000062
Figure BDA0003476633050000063
Figure BDA0003476633050000064
Figure BDA0003476633050000065
Figure BDA0003476633050000066
Figure BDA0003476633050000067
Figure BDA0003476633050000068
Figure BDA0003476633050000069
is a symmetric positive definite matrix that satisfies
Figure BDA00034766330500000610
Figure BDA00034766330500000611
is a symmetric positive definite matrix that satisfies
Figure BDA00034766330500000612
Figure BDA00034766330500000613
is a symmetric positive definite matrix that satisfies
Figure BDA00034766330500000614
Figure BDA00034766330500000615
is the Laplacian matrix of graph
Figure BDA00034766330500000616
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. For the situation in which task-level competition exists between unmanned clusters and cooperation exists between individuals within each cluster, the invention considers a Nash equilibrium search method for multi-cluster games under an intra-cluster decision consistency constraint, and provides a solution to the multi-task decision and control problems of multi-unmanned-cluster systems.
2. In the proposed fast Nash equilibrium solution method, the time-planning-based design uses a convergent infinite series to design the sampling intervals, which plays an important role in achieving the Nash equilibrium solution within the specified time and greatly reduces the communication cost.
3. Compared with existing finite-time and fixed-time results, the proposed fast Nash equilibrium solution method achieves specified-time convergence: its convergence time does not depend on the initial actions or the method parameters, so the convergence time can conveniently be set in advance according to practical requirements.
Drawings
FIG. 1 is a schematic diagram of the steps of the specified-time convergent Nash equilibrium search method for multi-group games under an intra-group decision consistency constraint according to the invention;
FIG. 2 is a detailed flowchart of the specified-time convergent Nash equilibrium search method for multi-group games under an intra-group decision consistency constraint according to the invention;
FIG. 3 is a diagram of the communication topology of the multi-agent system provided by an embodiment of the invention;
FIG. 4 is a graph showing the evolution of the agent states as they converge to the equilibrium at the specified time of 1 second, according to an embodiment of the invention.
Detailed Description
The technical solutions provided by the present invention will be described in detail below with reference to specific examples, and it should be understood that the following specific embodiments are only illustrative of the present invention and are not intended to limit the scope of the present invention.
The invention provides a specified-time convergent Nash equilibrium search method for multi-group games under an intra-group decision consistency constraint. Its steps and flow are shown in FIGS. 1 and 2 and are as follows:
Step 1: for the scenario in which multiple unmanned cluster systems cooperate within each cluster and compete between clusters, a multi-cluster game model subject to a consistency constraint set is constructed for the multi-agent system, specifically comprising the following sub-steps:
Step 1-1: for this scenario, the following multi-cluster game model subject to a consistency constraint set is constructed:
Figure BDA0003476633050000071
Figure BDA0003476633050000072
where N is the number of clusters participating in the game, cluster i comprises n_i agents,
Figure BDA0003476633050000073
is the state of cluster i, the index ij denotes the j-th agent in cluster i,
Figure BDA0003476633050000074
is the state of agent ij,
Figure BDA0003476633050000075
Figure BDA0003476633050000076
represents the joint state of all clusters, and the cluster states obey the consistency constraint set
Figure BDA0003476633050000077
The twice continuously differentiable convex function f_ij(x) represents the cost function of agent j in cluster i, and f_ij(x) has a Lipschitz continuous gradient, i.e., for any
Figure BDA0003476633050000078
it satisfies
Figure BDA0003476633050000079
where l_ij > 0 is the Lipschitz constant. The function f_i(x) is the cost function of cluster i:
Figure BDA00034766330500000710
Step 2: building on step 1, the Nash equilibrium search can be achieved only if the communication topology of the multi-agent system satisfies the corresponding conditions. Constructing a communication topology of the multi-agent system that meets these conditions specifically comprises the following sub-steps:
Step 2-1: the communication topology of the multi-agent system is described as follows:
The communication topology among all agents is modeled as a directed graph
Figure BDA0003476633050000081
whose node set is
Figure BDA0003476633050000082
and whose edge set is
Figure BDA0003476633050000083
where N is the number of clusters participating in the game. Directed communication can take place along topology edges both within a cluster and between different clusters. Specifically, cluster i contains n_i agents, and its agent set is denoted
Figure BDA0003476633050000084
The communication topology within cluster i is represented by the induced subgraph
Figure BDA0003476633050000085
where
Figure BDA0003476633050000086
(i = 1, 2, ..., N). The index ij denotes the j-th agent in cluster i. For agent
Figure BDA0003476633050000087
its in-neighbor set in the network is defined as
Figure BDA0003476633050000088
its in-neighbor set within its cluster is defined as
Figure BDA0003476633050000089
and its out-neighbor set within its cluster is defined as
Figure BDA00034766330500000810
The adjacency matrix of graph
Figure BDA00034766330500000811
is defined as
Figure BDA00034766330500000812
where
Figure BDA00034766330500000813
is the element of matrix A in row
Figure BDA00034766330500000814
and column
Figure BDA00034766330500000815
If (pq, ij) ∈ ε and pq ≠ ij, then
Figure BDA00034766330500000816
otherwise
Figure BDA00034766330500000817
The adjacency matrix of graph
Figure BDA00034766330500000818
is defined as
Figure BDA00034766330500000819
where
Figure BDA00034766330500000820
is an element of matrix A_i. If (il, ij) ∈ ε_i and j ≠ l, then
Figure BDA00034766330500000821
otherwise
Figure BDA00034766330500000822
Obviously, A_1, ..., A_N are the diagonal blocks of matrix A.
Figure BDA00034766330500000823
Define
Figure BDA00034766330500000824
as the Laplacian matrix of graph
Figure BDA00034766330500000825
where
Figure BDA00034766330500000826
is the element of matrix L in row
Figure BDA00034766330500000827
and column
Figure BDA00034766330500000828
If ij = pq, then
Figure BDA00034766330500000829
otherwise
Figure BDA00034766330500000830
Step 2-2: the communication topology of the multi-agent system is required to satisfy the following condition:
The communication graph
Figure BDA00034766330500000831
and the communication subgraphs
Figure BDA00034766330500000832
(i = 1, 2, ..., N) are all strongly connected.
Step 3: combining the time planning method, a fast and accurate equilibrium search method for the multi-task cluster game with specified-time convergence is designed for each agent.
Step 3-1: the time planning method is introduced, and global state information is estimated for each agent based on the leader-follower consistency idea:
Figure BDA00034766330500000833
where
Figure BDA00034766330500000834
represents agent ij's estimate of the global state x, the constant
Figure BDA00034766330500000835
satisfies
Figure BDA00034766330500000836
and d_ij represents the in-degree of agent ij:
Figure BDA00034766330500000837
T_k = t_{k+1} - t_k is the sampling interval. The sequence of sampling intervals
Figure BDA00034766330500000838
is designed as
Figure BDA0003476633050000091
where
Figure BDA0003476633050000092
is a convergent infinite series, i.e.,
Figure BDA0003476633050000093
is finite.
Define
Figure BDA0003476633050000094
Step 3-2: the state update law of each agent and the update law of the auxiliary variable used for gradient information estimation are designed as
Figure BDA0003476633050000095
Figure BDA0003476633050000096
where x_ij(t) represents the state of agent ij at time t,
Figure BDA0003476633050000097
is the term used to estimate gradient information, with
Figure BDA0003476633050000098
initialized as
Figure BDA0003476633050000099
and α is a positive constant to be designed. The matrix
Figure BDA00034766330500000910
is row-stochastic,
Figure BDA00034766330500000911
is its element in row j and column m, and let
Figure BDA00034766330500000912
The matrix
Figure BDA00034766330500000913
is column-stochastic,
Figure BDA00034766330500000914
is its element in row j and column m,
Figure BDA00034766330500000915
Each agent ij selects two sets of positive parameters
Figure BDA00034766330500000916
and
Figure BDA00034766330500000917
that satisfy the following conditions:
Figure BDA00034766330500000918
Figure BDA00034766330500000919
where
Figure BDA00034766330500000920
These two sets of parameters serve as weights for the information that agent ij receives from its in-neighbors within the cluster and the information it sends to its out-neighbors within the cluster, respectively.
Figure BDA00034766330500000921
is defined as the left eigenvector of matrix R_i corresponding to eigenvalue 1, i.e., it satisfies
Figure BDA00034766330500000922
v_i is defined as the right eigenvector of matrix C_i corresponding to eigenvalue 1, i.e., it satisfies
Figure BDA00034766330500000923
A simple choice is:
Figure BDA00034766330500000924
Define
Figure BDA0003476633050000101
It is easy to verify that
Figure BDA0003476633050000102
are Schur matrices.
Step 4: building on step 3, the designed Nash equilibrium search method converges in the specified time when the design parameters satisfy certain conditions. The parameter conditions under which the Nash equilibrium solution method achieves specified-time convergence are given in the following sub-steps:
Step 4-1: the pseudo-gradient
Figure BDA0003476633050000103
is required to be strongly monotone, i.e., there exists a constant l > 0 such that
Figure BDA0003476633050000104
where
Figure BDA0003476633050000105
can be regarded as the objective function of cluster i, and y = [y_1, y_2, ..., y_N]^T is the state of the N virtual players.
Step 4-2: the step-size parameter requirement under which the Nash equilibrium search method for the multi-group game under the intra-group decision consistency constraint achieves specified-time convergence is designed as follows:
Figure BDA0003476633050000106
where
Figure BDA0003476633050000111
σ = max_i{b_1i} + γ_2 max_i{b_2i} + γ_3 max_i{b_3i},
Figure BDA0003476633050000112
Figure BDA0003476633050000113
Figure BDA0003476633050000114
Figure BDA0003476633050000115
Figure BDA0003476633050000116
Figure BDA0003476633050000117
Figure BDA0003476633050000118
Figure BDA0003476633050000119
is a symmetric positive definite matrix that satisfies
Figure BDA00034766330500001110
Figure BDA00034766330500001111
is a symmetric positive definite matrix that satisfies
Figure BDA00034766330500001112
Figure BDA00034766330500001113
is a symmetric positive definite matrix that satisfies
Figure BDA00034766330500001114
Figure BDA00034766330500001115
is the Laplacian matrix of graph
Figure BDA00034766330500001116
Example 1
Step 1: consider a game among 3 clusters containing n_1 = 3, n_2 = 4, and n_3 = 3 agents, respectively. The cost function of agent ij is
Figure BDA00034766330500001117
The relevant coefficients are set as follows: m11=3, m12=11, m13=22, m21=m22=2, m23=64, m24=8, m31=60, m32=m33=4, s11=s12=s13=10, s21=s22=s23=50, s31=s32=s33=20, h11=0.35, h12=0.25, h13=0.15, h21=0.2, h22=0.1, h23=0.05, h24=0.25, h31=0.02, h32=0.08, h33=0.2.
Step 2: the communication topology of the multi-agent system is shown in fig. 3.
Step 3: for the designed specified-time convergent Nash equilibrium search method for multi-group games subject to the consistency constraint, the initial node states are chosen as x(t_0) = [0,10,20,0,10,20,30,0,10,20].
Step 4: the design parameter is set to α = 0.02. The state of each agent is required to converge to the equilibrium at the specified time of 1 second, and the state trajectories are shown in FIG. 4. The simulation results show that the state of each agent converges to the Nash equilibrium at 1 second, with
x* = [6.837,6.837,6.837,26.026,26.026,26.026,26.026,10.412,10.412,10.412]^T
y* = [6.837,26.026,10.412]^T.
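The reported equilibrium can be checked against the intra-cluster consistency constraint: each cluster's agents must share one value, and stacking y* according to the cluster sizes must reproduce x*. The following Python sketch performs that check using only the numbers stated in this example. The function names are hypothetical.

```python
def expand_cluster_states(y, sizes):
    """Repeat each cluster-level state y_i once per agent in cluster i."""
    x = []
    for y_i, n_i in zip(y, sizes):
        x.extend([y_i] * n_i)
    return x

def satisfies_consistency(x, sizes, tol=1e-9):
    """True if all agents within each cluster hold the same state (within tol)."""
    start = 0
    for n_i in sizes:
        block = x[start:start + n_i]
        if max(block) - min(block) > tol:
            return False
        start += n_i
    return start == len(x)

sizes = [3, 4, 3]  # n_1, n_2, n_3 from Step 1
x0 = [0, 10, 20, 0, 10, 20, 30, 0, 10, 20]
x_star = [6.837, 6.837, 6.837, 26.026, 26.026, 26.026, 26.026, 10.412, 10.412, 10.412]
y_star = [6.837, 26.026, 10.412]

print(satisfies_consistency(x0, sizes))       # the initial states disagree within clusters
print(satisfies_consistency(x_star, sizes))   # the reported equilibrium is consistent
print(expand_cluster_states(y_star, sizes) == x_star)
```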
it is to be noted that, in the attached drawings or in the description, the implementation modes not shown or described are all the modes known by the ordinary skilled person in the field of technology, and are not described in detail. Further, the above definitions of the various elements and methods are not limited to the various specific structures, shapes or arrangements of parts mentioned in the examples, which may be easily modified or substituted by those of ordinary skill in the art.
The technical means disclosed in the invention scheme are not limited to the technical means disclosed in the above embodiments, but also include the technical scheme formed by any combination of the above technical features. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also considered to be within the scope of the present invention.

Claims (5)

1. A Nash equilibrium designated time search method for multi-group game with consistent intra-group decision, characterized by comprising the following steps:
step 1: for the scenario in which multiple unmanned cluster systems cooperate within each cluster and compete between clusters, constructing a multi-cluster game model subject to a consistency constraint set for a multi-agent system;
step 2: constructing a communication topology for the multi-agent system;
step 3: combining a time planning method, designing for each agent a fast and accurate equilibrium search method for the multi-task cluster game with specified-time convergence;
step 4: providing the parameter conditions under which the Nash equilibrium search method achieves specified-time convergence.
2. The Nash equilibrium designated time search method for multi-group game with consistent intra-group decision as claimed in claim 1, wherein step 1 specifically comprises the following sub-steps:
step 1-1: for the scenario in which multiple unmanned cluster systems cooperate within each cluster and compete between clusters, the following multi-cluster game model subject to a consistency constraint set is constructed:
Figure FDA0003476633040000011
Figure FDA0003476633040000012
wherein N is the number of clusters participating in the game, cluster i comprises n_i agents,
Figure FDA0003476633040000013
is the state of cluster i, the index ij denotes the j-th agent in cluster i,
Figure FDA0003476633040000014
is the state of agent ij,
Figure FDA0003476633040000015
Figure FDA0003476633040000016
represents the joint state of all clusters, and the cluster states obey the consistency constraint set
Figure FDA0003476633040000017
the twice continuously differentiable convex function f_ij(x) represents the cost function of agent j in cluster i, and f_ij(x) has a Lipschitz continuous gradient, i.e., for any
Figure FDA0003476633040000018
it satisfies
Figure FDA0003476633040000019
wherein l_ij > 0 is the Lipschitz constant, and the function f_i(x) is the cost function of cluster i:
Figure FDA00034766330400000110
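The model in step 1-1 can be illustrated with a toy two-cluster game. This is only a sketch under stated assumptions: the agent cost `make_agent_cost`, the targets, and the scalar states are hypothetical, and the cluster cost is assumed here to be the sum of the costs of the agents in the cluster, since the exact aggregation formula is given only in the figure above.

```python
from typing import Callable, List

State = List[List[float]]  # x[i][j]: scalar state of agent j in cluster i

def satisfies_consistency(x: State, tol: float = 1e-9) -> bool:
    """Consistency constraint: every agent in a cluster holds the same state."""
    return all(abs(x_ij - cluster[0]) <= tol for cluster in x for x_ij in cluster)

def make_agent_cost(i: int, j: int, target: float) -> Callable[[State], float]:
    """Hypothetical cost of agent ij: track a private target, coupled to the rival cluster."""
    def cost(x: State) -> float:
        rival = x[1 - i][0]  # two-cluster toy game: first agent of the other cluster
        return (x[i][j] - target) ** 2 + 0.1 * x[i][j] * rival
    return cost

def cluster_cost(i: int, x: State, costs: List[List[Callable[[State], float]]]) -> float:
    """Assumed aggregation: the cluster cost is the sum of its agents' costs."""
    return sum(f(x) for f in costs[i])

# Hypothetical data: cluster 0 has two agents, cluster 1 has three.
targets = [[0.0, 2.0], [1.0, 3.0, 5.0]]
costs = [[make_agent_cost(i, j, t) for j, t in enumerate(row)]
         for i, row in enumerate(targets)]
x = [[1.0, 1.0], [2.0, 2.0, 2.0]]  # a joint state that satisfies the constraint

if __name__ == "__main__":
    print(satisfies_consistency(x), cluster_cost(0, x, costs), cluster_cost(1, x, costs))
```

In this toy setting each cluster behaves like a single virtual participant whose decision is the common state of its agents, which is the role the consistency constraint set plays in the claim.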
3. The Nash equilibrium designated time search method for multi-group game with consistent decision in group as claimed in claim 1, wherein step 2 specifically comprises the following sub-steps:
step 2-1: the communication topology of the multi-agent system is described as follows:
the communication topology among all agents is modeled as a directed graph
Figure FDA00034766330400000111
with node set
Figure FDA00034766330400000112
and edge set
Figure FDA0003476633040000021
N is the number of clusters participating in the game, and directed communication edges may exist both within a cluster and between different clusters. Specifically, cluster i comprises $N_i$ agents, and its agent set is denoted
Figure FDA0003476633040000022
The communication topology within cluster i is represented by the induced subgraph
Figure FDA0003476633040000023
Figure FDA0003476633040000024
The index ij denotes the j-th agent in cluster i. For an agent
Figure FDA0003476633040000025
its set of in-neighbors in the network is defined as
Figure FDA0003476633040000026
its set of in-neighbors within its cluster is defined as
Figure FDA0003476633040000027
and its set of out-neighbors within its cluster as
Figure FDA0003476633040000028
Define the adjacency matrix of graph
Figure FDA0003476633040000029
as
Figure FDA00034766330400000210
where
Figure FDA00034766330400000211
is the element of matrix A in row
Figure FDA00034766330400000212
and column
Figure FDA00034766330400000213
if $(pq, ij) \in \varepsilon$ and $pq \neq ij$, then
Figure FDA00034766330400000214
otherwise
Figure FDA00034766330400000215
Define the adjacency matrix of graph
Figure FDA00034766330400000216
as
Figure FDA00034766330400000217
where
Figure FDA00034766330400000218
is an element of matrix $A_i$; if $(il, ij) \in \varepsilon_i$ and $j \neq l$, then
Figure FDA00034766330400000219
otherwise
Figure FDA00034766330400000220
Clearly, $A_1, \ldots, A_N$ are the diagonal blocks of matrix A,
Figure FDA00034766330400000221
Define
Figure FDA00034766330400000222
as the Laplacian matrix of graph
Figure FDA00034766330400000223
where
Figure FDA00034766330400000224
is the element of matrix L in row
Figure FDA00034766330400000225
and column
Figure FDA00034766330400000226
if $ij = pq$, then
Figure FDA00034766330400000227
otherwise
Figure FDA00034766330400000228
step 2-2: the communication topology of the multi-agent system must satisfy the following requirement:
the communication graph
Figure FDA00034766330400000229
and the communication subgraphs
Figure FDA00034766330400000230
$i = 1, 2, \ldots, N$, are all strongly connected.
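The graph objects of steps 2-1 and 2-2 can be sketched as follows. The helper names and the four-agent example are hypothetical. The sketch assumes the common convention that the row of the adjacency matrix indexes the receiving agent and the column the sending agent, with 0/1 weights, and that the Laplacian has the in-degree on its diagonal and the negated adjacency entries elsewhere; the patent's exact element definitions are in the figures above.

```python
from typing import List, Set, Tuple

Agent = Tuple[int, int]      # (cluster i, index j within the cluster)
Edge = Tuple[Agent, Agent]   # (sender pq, receiver ij)
Matrix = List[List[int]]

def adjacency(nodes: List[Agent], edges: Set[Edge]) -> Matrix:
    """A[row of receiver][column of sender] = 1 for every edge, no self-loops."""
    idx = {v: k for k, v in enumerate(nodes)}
    A = [[0] * len(nodes) for _ in nodes]
    for sender, receiver in edges:
        if sender != receiver:
            A[idx[receiver]][idx[sender]] = 1
    return A

def diagonal_block(A: Matrix, nodes: List[Agent], cluster: int) -> Matrix:
    """Adjacency matrix of the subgraph induced by one cluster."""
    keep = [k for k, (i, _) in enumerate(nodes) if i == cluster]
    return [[A[r][c] for c in keep] for r in keep]

def laplacian(A: Matrix) -> Matrix:
    """Assumed form: in-degree on the diagonal, -A off the diagonal."""
    n = len(A)
    return [[sum(A[r]) if r == c else -A[r][c] for c in range(n)] for r in range(n)]

def strongly_connected(A: Matrix) -> bool:
    """Node 0 must reach every node in the graph and in the reversed graph."""
    n = len(A)
    def reaches_all(adj: Matrix) -> bool:
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for v in range(n):
                if adj[u][v] and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return len(seen) == n
    reversed_A = [[A[c][r] for c in range(n)] for r in range(n)]
    return reaches_all(A) and reaches_all(reversed_A)

# Hypothetical example: two clusters with two agents each.
nodes = [(1, 1), (1, 2), (2, 1), (2, 2)]
edges = {((1, 1), (1, 2)), ((1, 2), (1, 1)), ((1, 2), (2, 1)),
         ((2, 1), (2, 2)), ((2, 2), (2, 1)), ((2, 2), (1, 1))}
A = adjacency(nodes, edges)
L = laplacian(A)
```

Checking `strongly_connected(A)` together with `strongly_connected(diagonal_block(A, nodes, i))` for each cluster corresponds to the requirement of step 2-2.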
4. The Nash equilibrium designated time search method for multi-group game with consistent decision in group as claimed in claim 1, wherein step 3 specifically comprises the following sub-steps:
step 3-1: combining a time-planning method, each agent estimates the global state information based on the leader-following consensus idea:
Figure FDA00034766330400000231
where
Figure FDA00034766330400000232
represents the estimate of the global state x by agent ij, the constant
Figure FDA00034766330400000233
satisfies
Figure FDA00034766330400000234
$d_{ij}$ represents the in-degree of agent ij:
Figure FDA00034766330400000235
is the sampling interval, and the time sequence of sampling intervals
Figure FDA00034766330400000236
is designed as
Figure FDA00034766330400000237
Figure FDA0003476633040000031
is a convergent infinite series, i.e.
Figure FDA0003476633040000032
is finite;
define
Figure FDA0003476633040000033
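The requirement that the sampling intervals form a convergent infinite series is what allows infinitely many updates to fit inside a finite, pre-assigned time. The sketch below shows this with a geometric sequence. The geometric choice, the function name `sampling_instants`, and the parameters `T_s` and `q` are assumptions made for illustration; the interval sequence actually used in the claim is given only in the figures above.

```python
from typing import List

def sampling_instants(T_s: float, q: float, steps: int) -> List[float]:
    """Sampling instants t_0 = 0, t_{k+1} = t_k + T_k with T_k = T_s * (1 - q) * q**k.

    The intervals T_k are positive and sum to T_s, so every instant lies
    before the specified time T_s, however many steps are taken.
    """
    if not (T_s > 0 and 0 < q < 1):
        raise ValueError("need T_s > 0 and 0 < q < 1")
    t = 0.0
    instants = [t]
    for k in range(steps):
        t += T_s * (1 - q) * q ** k  # k-th sampling interval
        instants.append(t)
    return instants

instants = sampling_instants(T_s=2.0, q=0.5, steps=40)

if __name__ == "__main__":
    print(instants[:4], instants[-1])
```

With this kind of schedule the iteration index can grow without bound while physical time stays below `T_s`, which is the intuition behind designated-time convergence.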
step 3-2: the state iteration law of the agent and the update law of the auxiliary variable used for gradient information estimation are designed in the following form:
Figure FDA0003476633040000034
Figure FDA0003476633040000035
where $x_{ij}(t)$ represents the state of agent ij at time t,
Figure FDA0003476633040000036
is the gradient-information estimation term, and
Figure FDA0003476633040000037
is initialized as
Figure FDA0003476633040000038
α is a positive constant to be designed; the matrix
Figure FDA0003476633040000039
is row-stochastic,
Figure FDA00034766330400000310
is its element in row j and column m, let
Figure FDA00034766330400000311
the matrix
Figure FDA00034766330400000312
is column-stochastic,
Figure FDA00034766330400000313
is its element in row j and column m,
Figure FDA00034766330400000314
each agent ij selects two sets of positive parameters
Figure FDA00034766330400000315
and
Figure FDA00034766330400000316
that satisfy the following conditions:
Figure FDA00034766330400000317
Figure FDA00034766330400000318
where
Figure FDA00034766330400000319
These two sets of parameters serve as the weights for information received from in-neighbors within the cluster and for information sent to out-neighbors within the cluster, respectively;
Figure FDA00034766330400000320
is defined as the left eigenvector of matrix $R_i$ corresponding to eigenvalue 1, i.e. it satisfies
Figure FDA00034766330400000321
$v_i$ is defined as the right eigenvector of matrix $C_i$ corresponding to eigenvalue 1, i.e. it satisfies
Figure FDA00034766330400000322
A simple choice is:
Figure FDA00034766330400000323
Define
Figure FDA00034766330400000324
It is easy to verify that
Figure FDA00034766330400000325
are Schur matrices.
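A row-stochastic matrix and a column-stochastic matrix of the kind used in step 3-2 can be built from the intra-cluster adjacency matrix as sketched below. The uniform weighting (each agent splits weight equally over itself and its in-neighbors, or over itself and its out-neighbors) is an assumed choice for illustration and is not confirmed to be the "simple choice" in the figure above; the helper names and the three-agent graph are hypothetical. The eigenvectors for eigenvalue 1 are approximated by power iteration.

```python
from typing import List

Matrix = List[List[float]]

def row_stochastic(A: List[List[int]]) -> Matrix:
    """Agent j weights itself and its in-neighbors equally (rows sum to 1)."""
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        support = [m for m in range(n) if m == j or A[j][m]]
        for m in support:
            R[j][m] = 1.0 / len(support)
    return R

def column_stochastic(A: List[List[int]]) -> Matrix:
    """Agent m splits weight equally over itself and its out-neighbors (columns sum to 1)."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for m in range(n):
        support = [j for j in range(n) if j == m or A[j][m]]
        for j in support:
            C[j][m] = 1.0 / len(support)
    return C

def transpose(M: Matrix) -> Matrix:
    return [list(col) for col in zip(*M)]

def perron_vector(M: Matrix, iters: int = 500) -> List[float]:
    """Power iteration for M v = v, normalized so the entries sum to 1."""
    n = len(M)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(M[r][c] * v[c] for c in range(n)) for r in range(n)]
        total = sum(v)
        v = [value / total for value in v]
    return v

# Hypothetical strongly connected cluster: edges 0->1, 0->2, 1->2, 2->0
# (row = receiver, column = sender).
A_i = [[0, 0, 1], [1, 0, 0], [1, 1, 0]]
R_i = row_stochastic(A_i)
C_i = column_stochastic(A_i)
u_i = perron_vector(transpose(R_i))  # left eigenvector of R_i: u_i^T R_i = u_i^T
v_i = perron_vector(C_i)             # right eigenvector of C_i: C_i v_i = v_i
```

Because the cluster graph is strongly connected and every agent keeps a positive self-weight, both eigenvectors have strictly positive entries, which is why the power iteration converges in this sketch.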
5. The Nash equilibrium designated time search method for multi-group game with consistent decision in group as claimed in claim 1, wherein step 4 comprises the following sub-steps:
step 4-1: the pseudo-gradient
Figure FDA0003476633040000041
is required to be strongly monotone, i.e. there exists a constant l > 0 such that
Figure FDA0003476633040000042
where
Figure FDA0003476633040000043
can be regarded as the objective function of cluster i, and $y = [y_1, y_2, \ldots, y_N]^T$ is the state of the N virtual participants;
step 4-2: for the designed Nash equilibrium search method for the multi-group game under the intra-group decision consistency constraint to achieve designated-time convergence, the step-size parameter must satisfy the following requirement:
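Strong monotonicity of a pseudo-gradient F means that (y - z)^T (F(y) - F(z)) >= l * ||y - z||^2 for all y and z. For a hypothetical quadratic game with two virtual participants the pseudo-gradient is affine, F(y) = M y + b, and the best constant l is the smallest eigenvalue of the symmetric part of M. The matrix `M`, the vector `b`, and the function names below are illustrative assumptions, not values from the patent.

```python
import math
import random
from typing import List

def pseudo_gradient(M: List[List[float]], b: List[float], y: List[float]) -> List[float]:
    """Affine pseudo-gradient F(y) = M y + b of a quadratic game."""
    n = len(y)
    return [sum(M[r][c] * y[c] for c in range(n)) + b[r] for r in range(n)]

def monotonicity_constant_2x2(M: List[List[float]]) -> float:
    """Smallest eigenvalue of the symmetric part (M + M^T) / 2 of a 2x2 matrix."""
    a, d = M[0][0], M[1][1]
    s = 0.5 * (M[0][1] + M[1][0])
    return 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + s * s)

def monotonicity_gap(M, b, y, z, l) -> float:
    """(y - z)^T (F(y) - F(z)) - l * ||y - z||^2, nonnegative when F is l-strongly monotone."""
    Fy, Fz = pseudo_gradient(M, b, y), pseudo_gradient(M, b, z)
    diff = [yi - zi for yi, zi in zip(y, z)]
    inner = sum(di * (fy - fz) for di, fy, fz in zip(diff, Fy, Fz))
    return inner - l * sum(di * di for di in diff)

# Hypothetical two-participant game.
M = [[2.0, 0.5], [-0.3, 1.5]]
b = [1.0, -1.0]
l = monotonicity_constant_2x2(M)

rng = random.Random(0)
gaps = [monotonicity_gap(M, b,
                         [rng.uniform(-5, 5), rng.uniform(-5, 5)],
                         [rng.uniform(-5, 5), rng.uniform(-5, 5)], l)
        for _ in range(1000)]
```

A positive `l` in this sketch indicates that the toy game has a unique Nash equilibrium, which is the usual reason a strong monotonicity condition is imposed before stating a convergence result.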
Figure FDA0003476633040000044
where
Figure FDA0003476633040000045
$\sigma = \max_i\{b_{1i}\} + \gamma_2 \max_i\{b_{2i}\} + \gamma_3 \max_i\{b_{3i}\}$,
Figure FDA0003476633040000046
Figure FDA0003476633040000047
Figure FDA0003476633040000048
Figure FDA0003476633040000049
Figure FDA00034766330400000410
Figure FDA00034766330400000411
Figure FDA00034766330400000412
$W_{ci}$ is a symmetric positive definite matrix satisfying
Figure FDA00034766330400000413
Figure FDA00034766330400000414
is a symmetric positive definite matrix satisfying
Figure FDA0003476633040000051
Figure FDA0003476633040000052
is a symmetric positive definite matrix satisfying
Figure FDA0003476633040000053
Figure FDA0003476633040000054
denotes the Laplacian matrix of graph
Figure FDA0003476633040000055
CN202210056868.8A 2022-01-18 2022-01-18 Nash equilibrium appointed time searching method for intra-group decision-making consistent multi-group game Active CN114488802B (en)

Publications (2)

Publication Number Publication Date
CN114488802A true CN114488802A (en) 2022-05-13
CN114488802B CN114488802B (en) 2023-11-03

Family

ID=81472207

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150295407A1 (en) * 2014-04-10 2015-10-15 Nec Laboratories America, Inc. Decentralized Energy Management Platform
US20190347371A1 (en) * 2018-05-09 2019-11-14 Volvo Car Corporation Method and system for orchestrating multi-party services using semi-cooperative nash equilibrium based on artificial intelligence, neural network models,reinforcement learning and finite-state automata
US20200160168A1 (en) * 2018-11-16 2020-05-21 Honda Motor Co., Ltd. Cooperative multi-goal, multi-agent, multi-stage reinforcement learning
CN113534660A (en) * 2021-05-27 2021-10-22 山东大学 Multi-agent system cooperative control method and system based on reinforcement learning algorithm
CN113778619A (en) * 2021-08-12 2021-12-10 鹏城实验室 Multi-agent state control method, device and terminal for multi-cluster game

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant