CN112270103B - Cooperative strategy inversion identification method based on multi-agent game - Google Patents
- Publication number
- CN112270103B (application CN202011236015.XA)
- Authority
- CN
- China
- Prior art keywords
- agent
- game
- player
- cooperative
- strategy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/04—Constraint-based CAD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2119/00—Details relating to the type or aim of the analysis or the optimisation
- G06F2119/10—Noise analysis or noise optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2119/00—Details relating to the type or aim of the analysis or the optimisation
- G06F2119/14—Force analysis or force optimisation, e.g. static or dynamic forces
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a cooperative strategy inversion identification method based on a multi-agent game, directed at a multi-agent closed-loop dynamic game decision model in which multiple players participate. Each player detects and monitors the multi-unmanned-aerial-vehicle (multi-UAV) game system by radar, acquires the mixed signal generated by the system and separates it, performs inversion modeling from the acquired system state information and the control input information of each agent, and identifies the cooperative game strategy characterization matrix of each agent in the game.
Description
Technical Field
The invention relates to the technical field of system and parameter identification, and in particular to a cooperative strategy inversion identification method based on a multi-agent game.
Background
Inverse problems (also called inversion problems) form a broad class of problems concerned with tracing observations or measurements back to their causes, that is, determining the parameters (or model parameters) that characterize a problem from observations and some general principles (or models). Inversion identification techniques can recover mechanism-characterizing parameters that cannot be observed directly, and are therefore widely applied in control, communication, medicine, geophysics and other fields. According to the mechanism used, inversion can be divided into linear inversion, generalized linear inversion, nonlinear inversion, iterative inversion, optimization-based inversion, and so on.
Cooperative strategy inversion identification based on multi-agent games is of considerable theoretical interest and has important application value in military affairs. In actual combat, for example, the agents' environment is uncertain, decision information is incomplete, and the conditions for interactive communication are limited, so the cluster behavior of multi-party agent combat units may be better described by an implicit cooperative decision mode: imitating the way humans cooperate, the agents do not rely on direct interaction but reach implicit cooperative decisions through a role-based implicit cooperation framework. This behavior can be described approximately by a non-cooperative dynamic game decision framework for the multi-agent system. If the cooperative game strategies can be identified by inversion from the observed game outcomes of the participating players, the next actions of the other players can be predicted and responded to in advance, improving one's own probability of winning.
For the dynamic game inversion problem, researchers in China and abroad have carried out some targeted studies. In "Inverse Optimal Control for Identification in Non-Cooperative Differential Games" (2017), Simon Rothfuß et al. take a driver assistance system as an example: in a human-machine cooperation setting where the cooperative game strategy of the assistance system is known, the cooperative game strategy of the human is modeled by dynamic game inversion identification. In "Inverse Reinforcement Learning for Identification in Linear-Quadratic Dynamic Games" (2017), Florian Köpf et al. consider a two-player discrete closed-loop game system, use the coupled algebraic Riccati equations as optimization constraints, and design an inverse identification method for a real ball-on-beam model on the premise that one player's cooperative game strategy is known. In "Inverse Open-Loop Noncooperative Differential Games and Inverse Optimal Control" (2019), Timothy L. Molloy et al. propose two inversion algorithms for finite-time open-loop nonlinear differential games based on the minimum principle, and achieve high identification accuracy in a two-agent three-dimensional collision-avoidance game example.
At present, existing optimization-based inversion modeling methods, when applied to cooperative strategy inversion identification for multi-agent game systems, still have the following problems:
1. they are difficult to apply to closed-loop dynamic game decision models in which multiple players participate;
2. they require complete knowledge of the cooperative game strategy of one player in order to make the inversion optimization model well-posed;
3. their inversion identification accuracy is insufficient under noise-free conditions, and their robustness is poor under noise interference.
Disclosure of Invention
To solve the above problems, the invention provides a cooperative strategy inversion identification method based on a multi-agent game for closed-loop dynamic game decision models in which multiple players participate. The method takes the collected motion states and control input information of the multi-UAV game system as input for inversion modeling, obtains the cooperative game strategy characterization matrices $Q_i$ and $R_{ij}$ by inversion identification, and finally uses the identified weight matrices $Q_i$ and $R_{ij}$ to re-solve for the Nash equilibrium of the forward game problem in order to verify the effectiveness of the algorithm.
The invention provides a cooperative strategy inversion identification method based on multi-agent game, which has the following specific technical scheme:
S1: acquiring noisy system state information and control input information of each player;
The mixed signals generated by the multi-agent game system are collected and separated, and the system state information and the control input information of each agent are then obtained by calculation. Their general forms are as follows:
$$x(kT) = x^*(kT) + v(kT), \quad k = 1, \dots, M$$
$$u_i(kT) = u_i^*(kT) + w_i(kT), \quad k = 1, \dots, M$$
where $k$ denotes the $k$-th observation point, $T$ the observation period, and $M$ the total number of observation points in a given time span; $x(kT)$ and $u_i(kT)$ are the observed system motion state and the observed control input of the $i$-th agent, respectively; $x^*(kT)$ and $u_i^*(kT)$ are the corresponding noise-free values; and $v(kT)$ and $w_i(kT)$ are the observation noise at the corresponding time.
S2: identifying an optimal feedback matrix for each player;
From the obtained system state information and the control input information of each agent, an estimate of the optimal feedback matrix of each agent in the game is identified.
S3: constructing a double-layer optimization model;
The dynamic equation of the multi-agent game system is acquired, and a double-layer (bi-level) optimization model is established from the optimal game strategy equation and the identified optimal feedback matrix of each player, posing the inversion identification problem for the cooperative game strategy. The double-layer optimization model is as follows:
where $i, j \in \{1, \dots, N\}$ and $N$ is the number of players; $Q_i$ and $R_{ij}$ are positive definite matrices, and their optimal estimates constitute the inversion identification of the cooperative strategy of the multi-agent game system.
S4: solving the constructed optimization model to obtain an inversion identification result;
The double-layer optimization model is converted into a quadratic programming problem and solved.
S5: verifying the accuracy of the inversion identification result;
The accuracy of the result is verified by calculating the relative error between the true value and the predicted value of the system state.
Further, in step S1, the separation of the collected mixed signals comprises the following cases:
A. when prior knowledge is sufficient, a signal data separator is designed for each agent of each player by the maximum a posteriori (MAP) probability method or by principal component analysis (PCA) to achieve signal separation;
B. when prior knowledge is insufficient, the multi-target data are separated by a blind signal separation algorithm based on independent component analysis (ICA).
Further, in step S2, the noisy system state information and the control input information of each player are obtained through the following steps:
S01: obtaining the cooperative game strategy of each player from that player's objective function;
First, each player determines its cooperative game strategy by minimizing an objective function, which is as follows:
$$x(t_0) = x_0$$
where $i, j \in \{1, \dots, N\}$, $N$ is the number of players, and the cooperative characterization matrices $Q_i$ and $R_{ij}$ represent the cooperative game strategy adopted by the $i$-th player in the game;
S02: obtaining the optimal system state and the control input of each player at the Nash equilibrium of the system, and generating the corresponding noise-corrupted observations;
Each player executes its own cooperative control strategy for the game, obtained from the solution of the coupled algebraic Riccati equations; the calculation formula is as follows:
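For orientation, a standard infinite-horizon linear-quadratic formulation that is consistent with the symbols $Q_i$, $R_{ij}$, $K_i$ and $P_i$ used in this description can be sketched as follows. The system matrices $A$ and $B_j$, the infinite horizon, and the closed-loop matrix $A_c$ are assumptions of this sketch and are not taken from the patent text.

$$
\begin{aligned}
\dot{x}(t) &= A x(t) + \sum_{j=1}^{N} B_j u_j(t), \qquad x(t_0) = x_0, \\
J_i &= \int_{t_0}^{\infty} \Big( x^{T} Q_i x + \sum_{j=1}^{N} u_j^{T} R_{ij} u_j \Big)\, dt, \\
u_i^*(t) &= -K_i x^*(t), \qquad K_i = R_{ii}^{-1} B_i^{T} P_i, \\
0 &= A_c^{T} P_i + P_i A_c + Q_i + \sum_{j=1}^{N} K_j^{T} R_{ij} K_j, \qquad A_c = A - \sum_{j=1}^{N} B_j K_j .
\end{aligned}
$$

In this form each player's gain $K_i$ depends on every other player's gain through $A_c$, which is why the $N$ Riccati equations are coupled and have to be solved jointly.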
Further, in step S2, an estimate of the optimal feedback matrix is obtained by least-squares identification from the noisy system state information and the control input information of each agent; the calculation formula is as follows:
The feedback matrix $K_i$ of the $i$-th player satisfies:
$$u_i(t) = -K_i x(t)$$
Further, in step S3, acquiring the dynamic equation of the multi-agent game system comprises the following cases:
A. if prior information about the agents of each player participating in the game is available, the agent models are discriminated by observation and the system dynamic equation is obtained;
B. if no prior information about the agents of the players is available, the dynamic equations and control inputs of the agents participating in the game are identified by a blind system identification method for multi-input multi-output (MIMO) systems, and the system dynamic equation is obtained.
The dynamic equation is as follows:
$u_i(t)$ and $B_{ii}$ are expressed as follows:
where $i \in \{1, \dots, N\}$ and $N$ is the number of players participating in the game; $x(t) \in \mathbb{R}^n$ is the state of the whole system; $x_i(t)$ is the state information of the agents of the $i$-th player; $u_i(t)$ is the control input information of the agents of the $i$-th player, whose entries are the control inputs of the individual agents of that player, indexed by $m$; and $B_{ii}$ is a diagonal matrix.
Further, in step S4, the model is converted and solved as follows:
Using the relationship between the solution of the inner-level coupled algebraic Riccati equations and the optimal feedback matrix, the model is equivalently transformed into a quadratic programming problem:
$$\text{s.t.}\quad Q_i > 0, \quad R_{ij} > 0$$
Further, in step S5, the error is calculated as follows:
$$e_{\max} = \max(e_1, e_2, \dots, e_n)$$
where $e_j$ is the relative error of the $j$-th state component, computed from the $j$-th component of the estimated state at time $kT$, and $e_{\max}$ indicates the relative error level of the algorithm.
The invention has the following beneficial effects:
1. For multi-agent game systems with more than two participating players, a cooperative strategy inversion optimization model is established from the system states and the players' control inputs observed over a period of time, and the cooperative game strategy of each agent is obtained by solving the model.
2. An algorithm is designed that equivalently converts the complex, nonlinearly constrained double-layer optimization problem into an easily computed quadratic programming problem; the method achieves high identification accuracy under noise-free conditions and a degree of robustness under noise interference.
Drawings
FIG. 1 is a block diagram of the architecture of the multi-agent system closed loop dynamic gaming of the present invention;
FIG. 2 is a schematic diagram of an application scenario of the present invention;
FIG. 3 is a diagram of a system state prediction relative error distribution under a noise-free condition according to the present invention;
FIG. 4 is a system state prediction histogram under noise-free conditions of the present invention;
FIG. 5 is a diagram of the system state prediction relative error distribution under white Gaussian noise of 30dB in accordance with the present invention;
FIG. 6 is a system state prediction histogram in white Gaussian noise of 30dB in accordance with the present invention;
Detailed Description
In the following description, technical solutions in the embodiments of the present invention are clearly and completely described, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The technical contents of the invention are described in detail below with reference to the accompanying drawings and specific embodiments.
The embodiment of the invention provides a cooperative strategy inversion identification method based on a multi-agent game, directed at a multi-agent closed-loop dynamic game system formed by multiple participating players. As shown in figure 1, in a UAV-swarm game scenario with multiple players, there is no cooperative relationship among the forces of the different parties. Each player detects and monitors the multi-UAV game system by radar, acquires the mixed signals generated by the system, separates them, and obtains the system state, the control input of each UAV and the dynamic equation of the multi-agent game system by a blind system identification method for multi-input multi-output systems.
In the course of the game, every player makes the decision most beneficial to itself, and the whole system eventually reaches a Nash equilibrium. The UAV-swarm system with multiple participating players can be approximated as a linear-quadratic non-cooperative closed-loop dynamic game system. The optimal system state and the optimal control input of each player are obtained by solving for the Nash equilibrium of the system, and the cooperative game strategy of each player is characterized by the weight matrices $Q_i$ and $R_{ij}$.
The specific technical scheme of the embodiment of the invention is as follows:
S1: acquiring the optimal system state at the Nash equilibrium and the control input information of each player;
Each player detects and monitors the multi-UAV game system by radar, acquires the mixed signal generated by the system, and separates it to obtain the system state information and the control input information of each agent. In this embodiment, an ICA-based blind signal separation algorithm is used to separate the multi-target data signals, yielding the following system state and control input information for each agent:
$$x(kT) = x^*(kT) + v(kT), \quad k = 1, \dots, M$$
$$u_i(kT) = u_i^*(kT) + w_i(kT), \quad k = 1, \dots, M$$
where $k$ denotes the $k$-th observation point, $T$ the observation period, and $M$ the total number of observation points in a given time span; $x(kT)$ and $u_i(kT)$ are the observed system motion state and the observed control input of the $i$-th agent, respectively; $x^*(kT)$ and $u_i^*(kT)$ are the corresponding noise-free values; and $v(kT)$ and $w_i(kT)$ are the observation noise at the corresponding time.
From the obtained system state information and the control input information of each agent, the multi-UAV closed-loop dynamic game control system is modeled and the dynamic equation of the system is constructed. In this embodiment, three parties take part in the UAV-swarm game, and the dynamic equation of the system is as follows:
$$x(t_0) = x_0$$
The coefficient matrices of the randomly generated stable multi-UAV system are as follows:
The initial value of the system state is $x_0 = [1, 1, 1, 1]^T$.
As shown in fig. 2, the structural block diagram of the multi-agent closed-loop dynamic game problem with $N$ participating players, the control input of each player at the Nash equilibrium depends at every moment not only on that player's own state but is also influenced by the control inputs of the other players in the game system. Each player obtains its cooperative game strategy by minimizing its objective function, which is:
where $i, j \in \{1, \dots, N\}$, $N$ is the number of players, and the cooperative characterization matrices $Q_i$ and $R_{ij}$ represent the cooperative game strategy adopted by the $i$-th player in the game;
According to the players' cooperative game strategy objective functions, the system state in the game and the optimal control input of the agents of the $i$-th player must satisfy the following equation, the optimal game strategy equation:
where $x^*(t)$ and $u_i^*(t)$ are the optimal system state and the optimal control input of the agents of the $i$-th player, respectively; $P_i$ is the solution of the coupled algebraic Riccati equations; and $K_i$ is the optimal feedback matrix of the $i$-th player.
In this embodiment, the solution $P_i$ of the coupled algebraic Riccati equations is computed by an iterative algorithm, with the following steps:
(4) if $j < N$, return to step (2);
(6) set $k = k + 1$, $j = 1$, $e = 0$, and return to step (2).
From the obtained solution $P_i$ of the coupled algebraic Riccati equations and the equation that the system state and the optimal control input of the agents of the $i$-th player must satisfy, the optimal system state and the optimal control input of each player's agents at the Nash equilibrium are obtained.
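As an illustration of how coupled Riccati equations of this kind can be solved iteratively, the sketch below uses a Gauss-Seidel Lyapunov iteration: with all gains held fixed, each player's equation becomes a linear Lyapunov equation, which is solved and then used to refresh that player's gain. This is a generic textbook-style scheme, not necessarily the step list of the embodiment. The names `A`, `B`, `Q`, `R`, the cost structure (`R[i][j]` weighting player j's input in player i's cost) and the stability of `A` are assumptions, and convergence is not guaranteed for arbitrary games.

```python
import numpy as np

def solve_lyapunov(Ac, W):
    """Solve Ac.T @ P + P @ Ac + W = 0 for P using Kronecker products."""
    n = Ac.shape[0]
    I = np.eye(n)
    M = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    P = np.linalg.solve(M, -W.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrise against round-off

def feedback_nash(A, B, Q, R, iters=500, tol=1e-12):
    """Gauss-Seidel Lyapunov iteration for coupled algebraic Riccati equations.

    B[i], Q[i]: input and state-weight matrices of player i.
    R[i][j]:    weight of player j's input in player i's cost (assumed layout).
    Assumes A is stable, so that zero initial gains are stabilising.
    """
    N = len(B)
    n = A.shape[0]
    K = [np.zeros((B[i].shape[1], n)) for i in range(N)]
    P = [np.zeros((n, n)) for _ in range(N)]
    for _ in range(iters):
        change = 0.0
        for i in range(N):
            # Closed-loop matrix and effective state weight for current gains
            Ac = A - sum(B[j] @ K[j] for j in range(N))
            W = Q[i] + sum(K[j].T @ R[i][j] @ K[j] for j in range(N))
            P[i] = solve_lyapunov(Ac, W)
            K_new = np.linalg.solve(R[i][i], B[i].T @ P[i])
            change = max(change, float(np.max(np.abs(K_new - K[i]))))
            K[i] = K_new
        if change < tol:
            break
    return P, K
```

With the gains `K` in hand, the noise-free trajectories follow from simulating the closed loop with `u_i = -K[i] @ x`.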
White Gaussian noise of 30 dB is added to the optimal system state and to the players' control inputs to obtain the corresponding noise-corrupted observations.
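A minimal sketch of this noise-injection step is given below. It assumes that "30 dB" refers to the signal-to-noise ratio, which the text does not state explicitly; the helper `add_awgn` and the exponential test signal are hypothetical stand-ins for the optimal state trajectory.

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Return signal plus white Gaussian noise at the requested SNR (in dB)."""
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 200_000)
x_star = np.exp(-t)                   # stand-in for one optimal state component
x_obs = add_awgn(x_star, 30.0, rng)   # plays the role of x(kT) = x*(kT) + v(kT)
```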
Based on the noisy system state information $x(kT)$ and the control input information $u_i(kT)$ of each player, a least-squares identification method is used to identify an estimate of the optimal feedback matrix of each player in the game. The calculation formula is as follows:
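Given the feedback law $u_i(t) = -K_i x(t)$, one natural least-squares estimate stacks the samples as columns, $X = [x(T), \dots, x(MT)]$ and $U_i = [u_i(T), \dots, u_i(MT)]$, and minimizes $\|U_i + K_i X\|_F^2$, which gives $-U_i X^{T} (X X^{T})^{-1}$ when $X$ has full row rank. The sketch below implements this under those assumptions; the function name and array layout are illustrative, not taken from the patent.

```python
import numpy as np

def estimate_feedback(X, U):
    """Least-squares fit of K in u = -K x.

    X: (n, M) array whose columns are the sampled states x(kT).
    U: (p, M) array whose columns are the sampled inputs u_i(kT).
    """
    # Solve X.T @ G = U.T in the least-squares sense; then G = -K.T
    G, *_ = np.linalg.lstsq(X.T, U.T, rcond=None)
    return -G.T
```

Using `lstsq` instead of forming $(X X^{T})^{-1}$ explicitly is a numerical choice: it behaves better when the sampled states are nearly collinear.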
S3: establishing a double-layer optimization model;
An optimization model is established from the obtained system dynamic equation, the optimal game strategy equation, and the identified optimal feedback matrix of each player, posing the inversion identification problem for the cooperative game strategy:
where $i, j \in \{1, \dots, N\}$; $Q_i$ and $R_{ij}$ are positive definite matrices, and their optimal estimates in the optimization model constitute the inversion identification of the cooperative strategy of the multi-agent game system.
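A plausible way to write such a double-layer model is sketched below. It assumes linear dynamics with system matrices $A$ and $B_j$, an infinite-horizon quadratic cost, and writes the gain identified by least squares as $\hat{K}_i$; none of this notation is confirmed by the patent text.

$$
\begin{aligned}
\min_{Q_i,\,R_{ij}} \quad & \sum_{i=1}^{N} \big\| K_i - \hat{K}_i \big\|_F^2 \\
\text{s.t.} \quad & K_i = R_{ii}^{-1} B_i^{T} P_i, \\
& 0 = A_c^{T} P_i + P_i A_c + Q_i + \sum_{j=1}^{N} K_j^{T} R_{ij} K_j, \qquad A_c = A - \sum_{j=1}^{N} B_j K_j, \\
& Q_i > 0, \quad R_{ij} > 0 .
\end{aligned}
$$

Under these assumptions, fixing $K_j = \hat{K}_j$ makes both equality constraints linear in $(P_i, Q_i, R_{ij})$, so minimizing their squared residuals is a quadratic program; this may be the kind of reduction that the solution step relies on.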
S4: solving the optimization model;
(1) Using the relationship between the solution of the coupled algebraic Riccati equations and the estimate of the optimal feedback matrix, the optimization model is equivalently converted into a quadratic programming problem:
$$\text{s.t.}\quad f_k(\theta_i) \ge 0, \quad k = 1, \dots, K$$
where $f_k(\theta_i)$ denotes the nonlinear constraint terms obtained by converting $Q_i > 0$ and $R_{ij} > 0$, and $K$ denotes the number of nonlinear constraints so obtained;
(2) a logarithmic barrier function is introduced to convert the nonlinearly constrained optimization problem into an unconstrained optimization problem:
(3) the unconstrained optimization problem is solved by Newton's method to obtain the weight matrices $Q_i$ and $R_{ij}$ of each player's cooperative game strategy objective function.
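The barrier-plus-Newton idea in steps (2) and (3) can be illustrated on a deliberately simplified problem. The sketch below minimizes a quadratic objective $\tfrac{1}{2}\theta^{T} H \theta + g^{T}\theta$ subject to $\theta > 0$, using the barrier $-\mu \sum \log \theta_k$ and damped Newton steps while $\mu$ is gradually reduced. The simple positivity constraints stand in for the constraints $f_k(\theta_i) \ge 0$ of the embodiment, and `H`, `g` and all parameter values are hypothetical.

```python
import numpy as np

def barrier_newton(H, g, theta0, mu=1.0, shrink=0.1, outer=8, inner=50, tol=1e-9):
    """Minimise 0.5*th'H th + g'th subject to th > 0 with a log barrier."""
    theta = np.asarray(theta0, dtype=float).copy()

    def phi(th, mu_):
        # Barrier-augmented objective; only called at strictly positive th
        return 0.5 * th @ H @ th + g @ th - mu_ * np.sum(np.log(th))

    for _ in range(outer):
        for _ in range(inner):
            grad = H @ theta + g - mu / theta
            if np.linalg.norm(grad) < tol:
                break
            hess = H + np.diag(mu / theta ** 2)
            step = np.linalg.solve(hess, -grad)
            t = 1.0
            # Backtrack until the trial point is strictly feasible and
            # satisfies the Armijo sufficient-decrease condition.
            while t > 1e-12 and (
                np.any(theta + t * step <= 0)
                or phi(theta + t * step, mu) > phi(theta, mu) + 1e-4 * t * (grad @ step)
            ):
                t *= 0.5
            if t <= 1e-12:
                break  # no acceptable step: treat as converged for this mu
            theta = theta + t * step
        mu *= shrink  # tighten the barrier
    return theta
```

For example, with `H = 2 * np.eye(2)` and `g = np.array([-2.0, 2.0])` the unconstrained minimizer is `(1, -1)`, so the second constraint is active and the barrier iterates approach the constrained minimizer `(1, 0)` from the interior.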
S5: verifying the accuracy of the weight matrices $Q_i$ and $R_{ij}$ obtained by inversion identification;
The weight matrices $Q_i$ and $R_{ij}$ obtained by inversion identification are substituted into the linear-quadratic closed-loop dynamic game problem of step 2, which is solved again to obtain the optimal system state at the Nash equilibrium. The accuracy of the inversion identification algorithm is then verified through the relative error level calculated by the following formula:
$$e_{\max} = \max(e_1, e_2, \dots, e_n)$$
where $e_j$ is the relative error of the $j$-th state component, computed from the $j$-th component of the estimated state at time $kT$, and $e_{\max}$ indicates the relative error level of the algorithm. In this embodiment, the accuracy under noise-free conditions and the robustness under 30 dB white Gaussian noise are verified separately;
(1) Accuracy verification of the inversion identification method under noise-free conditions: in this example, numerical verification is carried out on 100 randomly generated sets of $Q_i$ and $R_{ij}$. The resulting distribution plot and histogram of the relative error of the system state prediction are shown in fig. 3 and fig. 4: under noise-free conditions the relative estimation error of the system state is less than $10^{-9}$, that is, the proposed cooperative strategy inversion identification method for non-cooperative games of multi-agent systems is accurate under noise-free conditions.
(2) Robustness verification of the inversion identification method under 30 dB white Gaussian noise: in this example, numerical verification is carried out on 100 randomly generated sets of $Q_i$ and $R_{ij}$. The resulting distribution plot and histogram of the relative error of the system state prediction are shown in fig. 5 and fig. 6: under 30 dB white Gaussian noise, the probability that the relative estimation error of the system state is less than 0.1 reaches 90%, that is, the proposed cooperative strategy inversion identification method for non-cooperative games of multi-agent systems has a degree of robustness under noise interference.
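The relative error level used in these experiments can be computed along the following lines. The exact per-component formula is not given in the text, so the sketch assumes that $e_j$ is the 2-norm of the prediction error of state component $j$ over all sample times, divided by the 2-norm of the true component; the function name and array layout are illustrative.

```python
import numpy as np

def relative_error_level(x_true, x_est):
    """Return (e, e_max) for trajectories of shape (n, M).

    e[j] is the 2-norm of the error in state component j over all samples,
    divided by the 2-norm of the true component (an assumed definition).
    """
    x_true = np.asarray(x_true, dtype=float)
    x_est = np.asarray(x_est, dtype=float)
    e = np.linalg.norm(x_est - x_true, axis=1) / np.linalg.norm(x_true, axis=1)
    return e, float(e.max())
```

Repeating this over many randomly generated weight sets and collecting `e_max` yields the kind of error distribution and histogram described for the figures.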
The invention is not limited to the foregoing embodiments. The invention extends to any novel feature or any novel combination of features disclosed in this specification and any novel method or process steps or any novel combination of features disclosed.
Claims (7)
1. A cooperative strategy inversion identification method based on a multi-agent game, directed at a multi-agent closed-loop dynamic game decision model with multi-player participation, characterized by comprising the following steps:
S1: acquiring noisy system state information and control input information of each player;
the mixed signals generated by the multi-agent game system are collected and separated, and the system state information and the control input information of each agent are then obtained by calculation, wherein the system state information and the control input information of each agent are as follows:
$$x(kT) = x^*(kT) + v(kT), \quad k = 1, \dots, M$$
$$u_i(kT) = u_i^*(kT) + w_i(kT), \quad k = 1, \dots, M$$
where $k$ denotes the $k$-th observation point, $T$ the observation period, and $M$ the total number of observation points in a given time span; $x(kT)$ and $u_i(kT)$ represent the observed system motion state and the observed control input of the $i$-th agent, respectively; and $v(kT)$ and $w_i(kT)$ represent the observation noise at the corresponding time, respectively;
S2: identifying an optimal feedback matrix for each player;
from the obtained system state information and the control input information of each agent, an estimate of the optimal feedback matrix of each agent in the game is identified;
S3: constructing a double-layer optimization model;
the dynamic equation of the multi-agent game system is acquired, and a double-layer optimization model is established from the optimal game strategy equation and the identified optimal feedback matrix of each player, posing the inversion identification problem for the cooperative game strategy, wherein the double-layer optimization model is as follows:
where $i, j \in \{1, \dots, N\}$ and $N$ is the number of players; $Q_i$ and $R_{ij}$ are positive definite matrices, and their optimal estimates constitute the inversion identification of the cooperative strategy of the multi-agent game system;
S4: solving the constructed optimization model to obtain an inversion identification result;
the double-layer optimization model is converted into a quadratic programming problem and solved;
S5: verifying the accuracy of the inversion identification result;
the accuracy of the result is verified by calculating the relative error between the true value and the predicted value of the system state.
2. The multi-agent game-based cooperative strategy inversion identification method according to claim 1, wherein the separation of the collected mixed signals in step S1 comprises the following steps:
A. under the condition of sufficient prior knowledge, designing each intelligent agent signal data separator of each player by a maximum posterior probability method or a principal component analysis method to realize signal separation;
B. under the condition of insufficient prior knowledge, multi-target data are separated through an ICA-based blind signal separation algorithm.
3. The multi-agent game-based cooperative strategy inversion identification method according to claim 1, wherein in step S2 the noisy system state information and the control input information of each player are obtained through the following steps:
S01: obtaining the cooperative game strategy of each player from that player's objective function;
first, each player determines its cooperative game strategy by minimizing an objective function, which is as follows:
$$x(t_0) = x_0$$
where $i, j \in \{1, \dots, N\}$, $N$ is the number of players, and the cooperative characterization matrices $Q_i$ and $R_{ij}$ represent the cooperative game strategy adopted by the $i$-th player in the game;
S02: obtaining the optimal system state and the control input of each player at the Nash equilibrium of the system, and generating the corresponding noise-corrupted observations;
each player executes its own cooperative control strategy for the game, obtained from the solution of the coupled algebraic Riccati equations, wherein the calculation formula is as follows:
4. The multi-agent game-based cooperative strategy inversion identification method according to claim 1, wherein in step S2 the estimate of the optimal feedback matrix is obtained by least-squares identification from the noisy system state information and the control input information of each agent, and the calculation formula is as follows:
where the result is the estimate of the optimal feedback matrix; $K_i$ is the feedback matrix of the $i$-th player; $k$ denotes the $k$-th observation point, $T$ the observation period, and $M$ the total number of observation points in a given time span; $x(kT)$ and $u_i(kT)$ represent the observed system motion state and the observed control input of the $i$-th agent, respectively; and $v(kT)$ and $w_i(kT)$ represent the observation noise at the corresponding time.
5. The cooperative strategy inversion identification method based on multi-agent gaming as claimed in claim 1, wherein in step S3, the obtaining of the dynamic equation of the multi-agent gaming system comprises the following steps:
A. if the prior information of the intelligent agent of each player participating in the game exists, the model of the intelligent agent is distinguished through an observation means, and a system dynamics equation is obtained;
B. if the prior information of the player agents of all the parties participating in the game does not exist, identifying the dynamic equations and the control inputs of the multiple agents participating in the game by a blind system identification method of a multi-input multi-output system to obtain the dynamic equations of the system;
the dynamic equation is as follows:
where $i \in \{1, \dots, N\}$ and $N$ is the number of players participating in the game; $x(t) \in \mathbb{R}^n$ is the state of the whole system; $x_i(t)$ is the state information of the agents of the $i$-th player; $u_i(t)$ is the control input information of the agents of the $i$-th player, whose entries are the control inputs of the individual agents of that player, indexed by $m$; and $B_i$ is a diagonal matrix.
6. The multi-agent game-based cooperative strategy inversion identification method according to claim 1, wherein in step S4 the model is converted and solved as follows:
using the relationship between the solution of the inner-level coupled algebraic Riccati equations and the optimal feedback matrix, the model is equivalently transformed into a quadratic programming problem, as follows:
$$\text{s.t.}\quad Q_i > 0, \quad R_{ij} > 0$$
where $I_n$ is the $n \times n$ identity matrix and $I_p$ is the $p \times p$ identity matrix.
7. The multi-agent game-based cooperative strategy inversion identification method according to claim 1, wherein in step S5, the error is calculated by the following method:
$e_{\max} = \max(e_1, e_2, \ldots, e_n)$
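The definition of the individual errors $e_1, \ldots, e_n$ is not reproduced in this text. The sketch below shows the maximum step from the claim and, purely as an assumption, uses a relative Frobenius-norm gap between an observed and an identified feedback matrix as the per-item error. The function names and the sample matrices are hypothetical.

```python
import numpy as np

def relative_error(K_obs, K_id):
    """Hypothetical per-item error: relative Frobenius-norm gap."""
    return np.linalg.norm(K_obs - K_id) / np.linalg.norm(K_obs)

def max_error(errors):
    """e_max = max(e_1, ..., e_n), as stated in the claim."""
    return max(errors)

# Illustrative matrices only; they do not come from the patent.
K_obs = [np.eye(2), np.array([[2.0, 0.0]])]
K_id = [np.array([[1.1, 0.0], [0.0, 1.0]]), np.array([[2.0, 0.2]])]

errors = [relative_error(a, b) for a, b in zip(K_obs, K_id)]
e_max = max_error(errors)
print(errors, e_max)
```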
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011236015.XA CN112270103B (en) | 2020-11-09 | 2020-11-09 | Cooperative strategy inversion identification method based on multi-agent game |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112270103A (en) | 2021-01-26 |
CN112270103B (en) | 2023-04-11 |
Family
ID=74339826
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011236015.XA Active CN112270103B (en) | 2020-11-09 | 2020-11-09 | Cooperative strategy inversion identification method based on multi-agent game |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112270103B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111258219B (en) * | 2020-01-19 | 2022-05-03 | 北京理工大学 | Inversion identification method for multi-agent system cooperation strategy |
CN113867418B (en) * | 2021-09-17 | 2022-06-17 | 南京信息工程大学 | Unmanned aerial vehicle cluster autonomous cooperative scout task scheduling method |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111258219A (en) * | 2020-01-19 | 2020-06-09 | 北京理工大学 | Inversion identification method for multi-agent system cooperation strategy |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9737988B2 (en) * | 2014-10-31 | 2017-08-22 | Intelligent Fusion Technology, Inc | Methods and devices for demonstrating three-player pursuit-evasion game |
CN107608366B (en) * | 2017-09-01 | 2021-02-05 | 宁波大学 | Multi-wing umbrella unmanned aerial vehicle system based on event trigger |
CN108958032B (en) * | 2018-07-24 | 2021-09-03 | 湖南工业大学 | Total amount cooperative and consistent control method of nonlinear multi-agent system |
CN111275174B (en) * | 2020-02-13 | 2020-09-18 | 中国人民解放军32802部队 | Game-oriented radar countermeasure generating method |
Also Published As
Publication number | Publication date |
---|---|
CN112270103A (en) | 2021-01-26 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN112270103B (en) | Cooperative strategy inversion identification method based on multi-agent game | |
CN112180967B (en) | Multi-unmanned aerial vehicle cooperative countermeasure decision-making method based on evaluation-execution architecture | |
CN113435644B (en) | Emergency prediction method based on deep bidirectional long-short term memory neural network | |
Shi et al. | Lateral transfer learning for multiagent reinforcement learning | |
CN112198892B (en) | Multi-unmanned aerial vehicle intelligent cooperative penetration countermeasure method | |
CN112180724A (en) | Training method and system for multi-agent cooperative cooperation under interference condition | |
CN114358141A (en) | Multi-agent reinforcement learning method oriented to multi-combat-unit cooperative decision | |
CN111753300B (en) | Method and device for detecting and defending abnormal data for reinforcement learning | |
CN111258219B (en) | Inversion identification method for multi-agent system cooperation strategy | |
CN116225049A (en) | Multi-unmanned plane wolf-crowd collaborative combat attack and defense decision algorithm | |
CN115544714A (en) | Time sequence dynamic countermeasure threat assessment method based on aircraft formation | |
CN106507275A (en) | A kind of robust Distributed filtering method and apparatus of wireless sensor network | |
CN114679729A (en) | Radar communication integrated unmanned aerial vehicle cooperative multi-target detection method | |
Friedrich et al. | Neural optimal feedback control with local learning rules | |
CN113894780A (en) | Multi-robot cooperative countermeasure method and device, electronic equipment and storage medium | |
Liu et al. | Optimal DoS attack scheduling for multi-sensor remote state estimation over interference channels | |
Stella et al. | Bio-inspired evolutionary game dynamics in symmetric and asymmetric models | |
CN117009811A (en) | Multi-agent training method and system based on reinforcement learning | |
CN114866272B (en) | Multi-round data delivery system of true value discovery algorithm in crowd-sourced sensing environment | |
CN116301042A (en) | Unmanned aerial vehicle group autonomous control method based on VGG16 and virtual game | |
CN114757092A (en) | System and method for training multi-agent cooperative communication strategy based on teammate perception | |
CN114662655A (en) | Attention mechanism-based weapon and chess deduction AI hierarchical decision method and device | |
CN114170338A (en) | Image generation method based on adaptive gradient clipping under differential privacy protection | |
CN113807230A (en) | Equipment target identification method based on active reinforcement learning and man-machine intelligent body | |
CN112926746A (en) | Decision-making method and device for multi-agent reinforcement learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||