CN112529110A - Adversary strategy inversion method, system and device - Google Patents

Adversary strategy inversion method, system and device

Info

Publication number
CN112529110A
Authority
CN
China
Prior art keywords
agent
probability
distance
strategy
confrontation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011586486.3A
Other languages
Chinese (zh)
Other versions
CN112529110B (en
Inventor
范国梁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN202011586486.3A
Publication of CN112529110A
Application granted
Publication of CN112529110B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Economics (AREA)
  • Molecular Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of decision deduction, and particularly relates to an adversary strategy inversion method, system and device, aiming at solving the problems that the conventional strategy inversion method cannot effectively estimate the adversary intention and is poor in self-adaptability. The method comprises the steps of acquiring state information of each agent of the confrontation party in a visible range in real time as input information; based on input information, combining with the pre-acquired first probability, acquiring posterior prediction probabilities corresponding to the advancing routes of the agents of the confrontation party through a deep confidence network model; calculating the corresponding predicted maneuvering position of each agent of the confrontation party according to the speed of each agent and the advancing route with the maximum posterior prediction probability; the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through the key location. The invention can effectively estimate the intention of the opponent, and improves the capability and the adaptability of the intelligent game countermeasure.

Description

Adversary strategy inversion method, system and device
Technical Field
The invention belongs to the field of decision deduction, and particularly relates to an adversary strategy inversion method, system and device.
Background
Multi-agent games are characterized by real-time confrontation, group cooperation, incomplete information, a huge search space, multiple complex tasks and spatiotemporal reasoning, and they remain a very challenging problem in artificial intelligence. Research results in this field also have broad application prospects in social management, intelligent transportation, economics, the military and other areas. Situation assessment is the primary link in a game. Many situation assessment models exist, but the most common is the Endsley three-level situation assessment model. Endsley regards situation assessment as the process by which a decision maker understands the meaning of the elements in the surrounding environment and predicts how their state will change within a certain time and space. Following the human cognitive process, situation assessment is therefore divided into three levels: situation perception, situation understanding and situation prediction. 1) Situation perception: the commander acquires battlefield information such as the battlefield environment, force deployment and combat intentions or combat targets through multiple channels. 2) Situation understanding: the perceived information is interpreted in depth in the context of the battlefield environment. 3) Situation prediction: the future development of events after a corresponding action is taken is predicted from the results of situation perception and understanding.
Situation prediction is the most difficult part of situation assessment because it requires estimating and exploring future behavior. In game confrontation in particular, the opponent's strategy and intention must be estimated and inverted, which is the key to winning. Existing strategy inversion methods cannot effectively estimate the opponent's intention.
In addition, distributed multi-agent confrontation has evolved from predefined distributed system protocols designed to achieve a single goal. Classical designs fix a goal and then use a top-down approach to distribute the operations. For example, an operator overseeing the whole battlefield first designs a globally optimal strategy for the agents and then tells each agent how to act based on its local information. However, when an agent leaves the battlefield and the system is thereby changed, the previously designed strategy is no longer globally optimal. In a top-down design, the loss of a single part can therefore undermine the whole. Because the agents' behavior is designed offline, such an approach lacks adaptability.
Disclosure of Invention
In order to solve the above problems in the prior art, that is, to solve the problems that the conventional strategy inversion method cannot effectively estimate the adversary intention and has poor adaptivity, a first aspect of the present invention provides an adversary strategy inversion method, which includes:
step S10, acquiring the state information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
step S20, based on the input information, combining the pre-acquired first probability, and acquiring the posterior prediction probability corresponding to each intelligent agent forward route of the confrontation party through a deep confidence network model;
step S30, calculating the corresponding predicted maneuvering position of each agent of the confrontation party according to the speed of each agent and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
In some preferred embodiments, the first probability is obtained by:
step A10, collecting the historical state information of each agent of the confrontation party;
step A20, performing trajectory clustering on the historical state information in time order using a preset density clustering algorithm; after clustering, taking as key locations the track points of every class whose number of track points is greater than a set number threshold;
step A30, calculating the prior probability of the historical space-time motion trail of each agent of the confrontation party passing through the key location as a first probability.
In some preferred embodiments, in step A20, "performing trajectory clustering on the historical state information in time order using a preset density clustering algorithm" is carried out as follows:
calculating the vertical distance, the horizontal distance and the included angle distance between the historical space-time motion track of the current agent and the sample track; the sample track is a track obtained after clustering historical space-time motion tracks of other agents;
combining preset weights, and performing a weighted summation of the vertical distance, the horizontal distance and the included angle distance to obtain the final distance between the historical space-time motion track of the current agent and the sample track;
and if the final distance is smaller than a set sample spacing threshold, grouping the two tracks into one class.
In some preferred embodiments, the "final distance between the spatiotemporal motion trajectory of the current agent and the sample trajectory" is calculated by:
$\mathrm{dist}(L_i, L_j) = \omega_\perp \cdot d_\perp(L_i, L_j) + \omega_\parallel \cdot d_\parallel(L_i, L_j) + \omega_\theta \cdot d_\theta(L_i, L_j)$
Figure BDA0002867281000000031
$d_\parallel = \mathrm{MIN}(l_{\parallel 1}, l_{\parallel 2})$
Figure BDA0002867281000000032
wherein $\mathrm{dist}(L_i, L_j)$ denotes the final distance; $d_\perp(L_i, L_j)$, $d_\parallel(L_i, L_j)$ and $d_\theta(L_i, L_j)$ denote the vertical distance, the horizontal distance and the included angle distance, respectively; $\omega_\perp$, $\omega_\parallel$ and $\omega_\theta$ denote the weights corresponding to the vertical distance, the horizontal distance and the included angle distance, respectively; $L_i$ and $L_j$ denote the sample trajectory and the historical spatiotemporal motion trajectory of the current agent, respectively; $l_{\perp 1}$ and $l_{\perp 2}$ denote the distances from the two endpoints of $L_j$ to $L_i$; $l_{\parallel 1}$ and $l_{\parallel 2}$ each denote the distance between a projection point of $L_j$ onto $L_i$ and the corresponding endpoint of $L_i$; and $\theta$ denotes the included angle between $L_i$ and $L_j$.
In some preferred embodiments, in step A30, "calculating the prior probability of the historical space-time motion trail of each agent of the confrontation party passing through the key location" is carried out as follows: the prior probability that the historical space-time motion trail of each agent of the confrontation party passes through the key location is computed statistically using a Bayesian model.
In a second aspect of the present invention, an adversary strategy inversion system is provided, which includes: the system comprises an information acquisition module, a probability calculation module and a strategy inversion module;
the information acquisition module is configured to acquire the state information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
the probability calculation module is configured to obtain, based on the input information and in combination with a pre-obtained first probability, a posterior prediction probability corresponding to the forward route of each agent of the confrontation party through a deep confidence network model;
the strategy inversion module is configured to calculate a corresponding predicted maneuvering position of each agent of the confrontation party according to the speed of each agent and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
In a third aspect of the present invention, a storage device is provided, in which a plurality of programs are stored, the programs being adapted to be loaded and executed by a processor to implement the above-mentioned adversary strategy inversion method.
In a fourth aspect of the present invention, a processing apparatus is provided, which includes a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; the program is adapted to be loaded and executed by a processor to implement the above-described adversary strategy inversion method.
The invention has the beneficial effects that:
the invention can effectively estimate the opponent's intention and improves the capability and adaptability of intelligent game confrontation. The invention first clusters the routes of all agents of the confrontation party and establishes a prior Bayesian model. In the current action stage, some of the confrontation party's agents become visible as they move, and the confrontation party's attack route can be estimated from the information on those visible agents. The method can thus effectively estimate the opponent's action trajectory and action strategy, improve real-time estimation of and adaptation to the opponent's scheme, facilitate the subsequent formulation of a more targeted counter-scheme, and effectively improve performance in intelligent game confrontation.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.
FIG. 1 is a schematic flow chart of an adversary strategy inversion method according to an embodiment of the present invention;
FIG. 2 is a block diagram of an adversary strategy inversion system according to one embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the clustering effect of the density clustering algorithm according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of distance calculation according to an embodiment of the present invention;
FIG. 5 is a detailed flow diagram of an adversary strategy inversion method according to an embodiment of the present invention;
FIG. 6 is a diagram of a behavior sequence DBN model according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
An adversary strategy inversion method of the present invention is shown in fig. 1, and the method includes:
step S10, acquiring the state information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
step S20, based on the input information, combining the pre-acquired first probability, and acquiring the posterior prediction probability corresponding to each intelligent agent forward route of the confrontation party through a deep confidence network model;
step S30, calculating the corresponding predicted maneuvering position of each agent of the confrontation party according to the speed of each agent and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
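One possible reading of steps S20 and S30 is a standard Bayesian update followed by a maximum a posteriori choice. The notation below is hypothetical and does not appear in the patent: $r$ is a candidate forward route, $o$ is the observed state information, $P(r)$ is a route prior assumed to be derived from the first probability, and $P(o \mid r)$ is a likelihood assumed to be supplied by the network model.

```latex
% Hypothetical notation: r = candidate forward route, o = observed state information,
% P(r) = route prior assumed to be derived from the first probability.
P(r \mid o) = \frac{P(o \mid r)\, P(r)}{\sum_{r'} P(o \mid r')\, P(r')},
\qquad
r^{*} = \arg\max_{r} P(r \mid o)
```

Under this reading, step S30 advances each agent along $r^{*}$ by a path length of $v \, \Delta t$, where $v$ is the agent's speed and $\Delta t$ is the prediction horizon.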
In order to explain the adversary strategy inversion method of the present invention more clearly, the steps of one embodiment of the method are described in detail below with reference to the drawings.
In the following embodiment, the process of obtaining the prior probability that the spatiotemporal motion trajectory of each agent of the confrontation party passes through a key location is detailed first, and then the process by which the adversary strategy inversion method obtains the predicted maneuvering position of each agent of the confrontation party is detailed, as shown in fig. 5.
1. Obtaining the prior probability that the spatiotemporal motion trajectory of each agent of the confrontation party passes through a key location
Step A10, collecting the historical state information of each agent of the confrontation party;
in the present embodiment, the historical state information of each moving object of the confrontation party is collected. The state information includes: agent ID; side (0-red, 1-blue); type (1-infantry, 2-vehicle, 3-aircraft); name; subtype (0-tank, 1-chariot, 2-personnel, 3-artillery, 4-unmanned chariot, 5-unmanned aerial vehicle, 6-helicopter, 7-cruise missile); base speed; armor type (0-unarmored, 1-light armor, 2-medium armor, 3-heavy armor, 4-composite armor); whether it can fire while on the march; whether it is stacked; carried weapon IDs; number of remaining ammunition; ammunition type (0-non-missile, 100-heavy missile, 101-medium missile, 102-small missile); whether it has guided-fire capability; score; types it can carry; maximum carrying number; capacity; volume; observation distance; maneuvering state (0-normal maneuvering, 1-marching, 2-first-class assault, 3-second-class assault, 4-masking); current coordinate; percentage progress from the current grid cell to the next; current maneuvering speed; remaining time of the maneuver-stop transition; maneuver-availability flag (used only during a stop transition to decide whether maneuvering may continue: it may not after a forced stop and may after a normal stop; 0-no, 1-yes); whether it is stationary (0-no, 1-yes); planned maneuvering path; current blood volume (health); maximum blood volume (health); remaining time of the state switch; passenger list; transmitting unit list; whether control has been lost; remaining survival time of the patrol missile; remaining boarding time; remaining dismounting time; vehicle ID; target state (0-normal maneuvering, 1-marching, 2-first-class assault, 3-second-class assault, 4-masking); remaining weapon cooling time; remaining weapon deployment time; and the list of observed enemy operators.
Step A20, performing trajectory clustering on the historical state information in time order using a preset density clustering algorithm; after clustering, taking as key locations the track points of every class whose number of track points is greater than a set number threshold;
in this embodiment, a density clustering algorithm based on a dedicated trajectory distance function is used. The distance function has the form D(tr1, m1, tr2, m2) and uses the associated markers output by the first filtering step: for a given pair of trajectories tr1, tr2 and their associated markers m1, m2, it outputs a non-negative real number representing the distance between the two trajectories. Based on this distance function, trajectories are assigned to clusters, and trajectories that are not assigned to any cluster are called noise. The output of the clustering operation is a cluster label for each trajectory, $L_i \in \{C_1, C_2, \ldots, C_m, \text{noise}\}$, where $C_1, \ldots, C_m$ are the resulting clusters.
Clustering is performed progressively. Density-based clustering involves two parameters: the neighborhood radius NR and the minimum number of neighbors NN that an object needs in order to become a cluster core object. These two parameters are adjusted to cluster progressively. The densest clusters are determined first, using a larger NN and a smaller NR. The clustering operation is then applied iteratively to the previous clustering result; at each step the parameter settings are relaxed by increasing NR or decreasing NN, which yields a better clustering result. The progressive mode also allows any part of a clustering result that is poorly clustered to be re-clustered on its own. Several attempts are usually needed to obtain a good clustering, and its quality is judged either manually from a visualization of the clusters or from the sum of distances between the clustered trajectories.
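The progressive scheme described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it uses a plain DBSCAN pass over a caller-supplied distance function, and it assumes that each relaxed pass is applied only to the items still labelled as noise. The function names `dbscan` and `progressive_clustering` and the `schedule` parameter are hypothetical.

```python
def dbscan(items, dist, nr, nn):
    """Plain DBSCAN over a distance function; returns cluster ids, -1 = noise.

    nr is the neighborhood radius and nn the minimum number of neighbors
    (the item itself included) required for a core object.
    """
    n = len(items)
    labels = [None] * n

    def neighbors(i):
        return [j for j in range(n) if dist(items[i], items[j]) <= nr]

    cluster_id = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < nn:
            labels[i] = -1          # provisionally noise
            continue
        labels[i] = cluster_id
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id   # border object, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            nb_j = neighbors(j)
            if len(nb_j) >= nn:          # j is a core object: expand
                queue.extend(nb_j)
        cluster_id += 1
    return labels


def progressive_clustering(items, dist, schedule):
    """Run DBSCAN repeatedly with relaxed (nr, nn) pairs on remaining noise.

    schedule lists (nr, nn) from strict (small nr, large nn) to relaxed.
    Assumption: later passes only re-cluster items still marked as noise.
    """
    labels = [-1] * len(items)
    next_id = 0
    for nr, nn in schedule:
        pending = [i for i, lab in enumerate(labels) if lab == -1]
        sub = dbscan([items[i] for i in pending], dist, nr, nn)
        for i, lab in zip(pending, sub):
            if lab != -1:
                labels[i] = next_id + lab
        next_id += max(sub, default=-1) + 1
    return labels


# Toy usage with 1-D "trajectories" and absolute difference as the distance.
points = [0, 1, 2, 50, 56, 90]
labels = progressive_clustering(points, lambda a, b: abs(a - b),
                                schedule=[(1.5, 3), (10, 2)])
```

In the toy run, the strict first pass finds only the dense group 0, 1, 2; the relaxed second pass then groups 50 and 56, and 90 stays noise.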
Trajectory clustering is an important research topic in statistical pattern recognition, data mining and related fields. The motion trajectories of combat entities in a wargame deduction reflect the commander's actual combat intention; trajectory clustering can reveal the overall behavior patterns among combat entities and thus help to judge that intention. The CTECW algorithm is mainly adopted here. It has three parts: trajectory preprocessing, trajectory segment clustering and visual representation. Trajectory preprocessing converts the original trajectory of an entity into a simplified trajectory, which is then further divided into trajectory segments. Within the basic framework of the DBSCAN algorithm, the algorithm introduces the density function concept of the DENCLUE algorithm and clusters the trajectory segments using the proposed similarity measure. The visual representation presents the segment clustering result to the commander in a form that carries military meaning, making the result easier to understand and accept. Theoretical analysis and experimental results show that the CTECW algorithm obtains clustering results close to those of the TRACLUS algorithm, but with higher computational efficiency and without depending on careful selection of user parameters. In this embodiment the CTECW algorithm is improved before clustering by adding horizontal-distance and included-angle elements. The clustering result is shown in FIG. 3: 6 spatiotemporal trajectories form 4 classes over 3 consecutive timestamps:
Figure BDA0002867281000000081
Assuming that the set number threshold m is 4, the only classes whose number of track points after clustering is larger than the set number threshold are:
Figure BDA0002867281000000082
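The thresholding step can be illustrated with a short sketch. It assumes that clustering has already produced one label per track point (with -1 for noise) and keeps the points of every class whose size is strictly greater than the threshold. The function name `key_locations` and the sample data are hypothetical.

```python
from collections import defaultdict


def key_locations(points, labels, min_count):
    """Return {cluster label: track points} for classes larger than min_count.

    points and labels are parallel lists; label -1 marks noise and is skipped.
    """
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        if label != -1:
            groups[label].append(point)
    # Keep only classes whose number of track points exceeds the threshold.
    return {label: pts for label, pts in groups.items() if len(pts) > min_count}


# Toy usage: class 0 has 5 points, class 1 has 4 points, one point is noise.
pts = [(1, 1), (1, 2), (2, 1), (2, 2), (1, 1), (9, 9), (9, 8), (8, 9), (8, 8), (5, 5)]
labs = [0, 0, 0, 0, 0, 1, 1, 1, 1, -1]
keys = key_locations(pts, labs, min_count=4)
```

With a threshold of 4, only class 0 qualifies here, because class 1 has exactly 4 points and the comparison is strict.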
Calculating the distance between the historical spatiotemporal motion trajectory of the current agent and the clustered trajectories (sample trajectories) of the historical spatiotemporal motion trajectories of other agents, as shown in fig. 4:
Case one: two line segments of equal length are parallel to each other, and the line connecting their two start points (or their two end points) is perpendicular to both segments. In this case the perpendicular distance $d_\perp$ between them can be used to measure their similarity: the smaller the distance, the more similar they are, and if the two segments coincide completely they are identical.
$d_\perp$ is calculated as follows:
Figure BDA0002867281000000091
wherein $l_{\perp 1}$ and $l_{\perp 2}$ denote the distances from the two endpoints of $L_j$ to $L_i$.
Case two: the two line segments are offset along their direction, or the length of one of them changes. When $d_\perp$ is the same, the perpendicular distance alone is clearly not enough to measure the similarity of the two segments; the difference in the horizontal direction must also be measured, using the horizontal distance $d_\parallel$:
$d_\parallel = \mathrm{MIN}(l_{\parallel 1}, l_{\parallel 2})$ (2)
wherein $l_{\parallel 1}$ and $l_{\parallel 2}$ each denote the distance between a projection point of $L_j$ onto $L_i$ and the corresponding endpoint of $L_i$.
Case three: when one of the line segments is rotated, the larger the included angle between the two segments, the lower their similarity, so an included angle distance $d_\theta$ needs to be introduced. The definition of the distance is based on this idea. If the direction of the trajectory is not considered, $\|L_j\| \times \sin(\theta)$ can be used as the measure. When the trajectory direction is considered, $d_\theta$ is obtained as follows:
Figure BDA0002867281000000092
wherein $\theta$ denotes the included angle between $L_i$ and $L_j$.
The three parts are combined as:
$\mathrm{dist}(L_i, L_j) = \omega_\perp \cdot d_\perp(L_i, L_j) + \omega_\parallel \cdot d_\parallel(L_i, L_j) + \omega_\theta \cdot d_\theta(L_i, L_j)$ (4)
wherein $\omega_\perp$, $\omega_\parallel$ and $\omega_\theta$ are the weights corresponding to the vertical, horizontal and included angle distances, and $\mathrm{dist}(L_i, L_j)$ denotes the final distance; if the final distance is less than a set sample spacing threshold, the two trajectories are grouped into one class. $L_i$ and $L_j$ denote the sample trajectory and the historical spatiotemporal motion trajectory of the current agent, respectively. In FIG. 4, $s_i$ and $e_i$ denote the endpoints of $L_i$, $s_j$ and $e_j$ denote the endpoints of $L_j$, and the two remaining marked points are the projections of the two endpoints of $L_j$ onto $L_i$.
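A minimal sketch of the weighted distance in equation (4) for 2-D line segments is given below. The equation images for $d_\perp$ and $d_\theta$ are not recoverable from this text, so the sketch assumes the TRACLUS definitions: $d_\perp = (l_{\perp 1}^2 + l_{\perp 2}^2) / (l_{\perp 1} + l_{\perp 2})$, and $d_\theta = \|L_j\| \sin\theta$ for $\theta < 90°$, otherwise $\|L_j\|$. These two formulas, the function names and the default weights are assumptions, not taken from the patent.

```python
import math


def _project(p, s, e):
    """Project point p onto the infinite line through s and e."""
    dx, dy = e[0] - s[0], e[1] - s[1]
    u = ((p[0] - s[0]) * dx + (p[1] - s[1]) * dy) / (dx * dx + dy * dy)
    return (s[0] + u * dx, s[1] + u * dy)


def _dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def segment_distance(li, lj, w_perp=1.0, w_par=1.0, w_theta=1.0):
    """Weighted sum of perpendicular, parallel and angle distances.

    li and lj are ((x, y), (x, y)) segments; both must have non-zero length.
    """
    (si, ei), (sj, ej) = li, lj
    len_i, len_j = _dist(si, ei), _dist(sj, ej)
    if len_i == 0 or len_j == 0:
        raise ValueError("segments must have non-zero length")

    ps, pe = _project(sj, si, ei), _project(ej, si, ei)

    # Perpendicular distance (assumed TRACLUS form).
    l_perp1, l_perp2 = _dist(sj, ps), _dist(ej, pe)
    total = l_perp1 + l_perp2
    d_perp = 0.0 if total == 0 else (l_perp1 ** 2 + l_perp2 ** 2) / total

    # Parallel distance: projection points to the corresponding endpoints of Li.
    d_par = min(_dist(ps, si), _dist(pe, ei))

    # Angle distance, direction-aware (assumed TRACLUS form).
    dot = (ei[0] - si[0]) * (ej[0] - sj[0]) + (ei[1] - si[1]) * (ej[1] - sj[1])
    cos_theta = max(-1.0, min(1.0, dot / (len_i * len_j)))
    theta = math.acos(cos_theta)
    d_theta = len_j * math.sin(theta) if theta < math.pi / 2 else len_j

    return w_perp * d_perp + w_par * d_par + w_theta * d_theta


# Toy usage: a shorter segment parallel to Li, one unit above it.
li = ((0.0, 0.0), (10.0, 0.0))
lj = ((2.0, 1.0), (8.0, 1.0))
d = segment_distance(li, lj)
```

In the toy case the perpendicular part is 1, the parallel part is 2 and the angle part is 0, so the unit-weight distance is 3. Two segments would be grouped into one class when this value falls below the sample spacing threshold.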
Step A30, calculating the prior probability of the historical space-time motion trail of each agent of the confrontation party passing through the key location as a first probability.
In this embodiment, based on the historical space-time motion trails of the confrontation party's agents and on the key locations, the prior probability that the historical space-time motion trail of each agent passes through each key location is computed statistically using a Bayesian probability model.
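The patent does not spell out the Bayesian statistics, so the sketch below shows only the simplest counting estimate: the prior for a key location is the fraction of historical trajectories that pass through it. It assumes trajectories are sequences of grid cells and that "passing through" means containing the key cell. The function name `key_location_priors` and the data are hypothetical.

```python
def key_location_priors(trajectories, key_cells):
    """Estimate P(trajectory passes through key cell) by relative frequency.

    trajectories: list of sequences of hashable grid cells, e.g. (col, row).
    key_cells: iterable of grid cells treated as key locations.
    """
    if not trajectories:
        raise ValueError("at least one historical trajectory is required")
    visited = [set(t) for t in trajectories]
    n = len(visited)
    return {k: sum(1 for cells in visited if k in cells) / n for k in key_cells}


# Toy usage: four historical trajectories and two key locations.
history = [
    [(0, 0), (1, 1), (2, 2)],
    [(0, 1), (1, 1)],
    [(3, 3)],
    [(2, 2), (1, 1)],
]
priors = key_location_priors(history, [(1, 1), (2, 2)])
```

Here cell (1, 1) is visited by three of the four trajectories and cell (2, 2) by two, giving priors of 0.75 and 0.5. A smoothed estimate could be substituted when the history is small.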
2. Adversary strategy inversion method
Step S10, acquiring the state information of each agent of the confrontation party in a visible range in real time as input information;
in this embodiment, the state information of each agent of the confrontation party within the visible range is acquired in real time; the state information is as described in step A10.
Step S20, based on the input information, combining the pre-acquired first probability, and acquiring the posterior prediction probability corresponding to each intelligent agent forward route of the confrontation party through a deep confidence network model;
in this embodiment, the n historical space-time motion trajectories of the confrontation party are clustered; after clustering, the distances D1, D2, ..., Dn between these historical trajectories and the confrontation party's actual space-time motion trajectory at the current time are calculated, and from them the similarity probability between the trajectories. At time 1, the enemy unit follows basic strategy c1 and exhibits the attribute state values $[D1, D2]_1$; at time 2, the basic strategy changes to c2 and the attribute state values become $[D1, D2]_2$; and so on, until at time t the basic strategy changes to ct and the attribute state values become $[D1, D2]_t$. Although our own side cannot observe the strategy sequence c1, c2, c3, ... directly, it can detect the attribute state values and how they change, and can therefore infer the basic strategy state at each time from those changes. This constitutes a BN model over a time series, which is a typical DBN model. The probability membership of the enemy combat unit's basic behavior at any time can be calculated and, according to DBN theory, the basic strategy sequence c1, c2, c3, ... can be further smoothed and filtered and the confrontation party's future behavior can be predicted probabilistically, as shown in fig. 6.
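The inference over the hidden strategy sequence can be sketched as a forward filter for a two-slice temporal model (equivalent to a hidden Markov model). This is an illustration under several assumptions that the patent does not state: the hidden state is the index of a clustered route, the likelihood of an observed distance vector decays as exp(-D_k / scale), and the transition matrix is supplied by the caller. The function name `filter_routes` and all numbers are hypothetical.

```python
import math


def filter_routes(prior, transition, distance_seq, scale=1.0):
    """Forward filtering of route beliefs from observed distance vectors.

    prior: list of prior probabilities per route (the "first probability").
    transition: transition[i][j] = P(route j at t+1 | route i at t).
    distance_seq: per time step, distances [D1, ..., Dk] from the observed
        trajectory to each clustered route.
    Returns the list of posterior belief vectors, one per time step.
    """
    k = len(prior)
    belief = list(prior)
    history = []
    for t, dists in enumerate(distance_seq):
        if t > 0:
            # Prediction step: propagate the belief through the transition model.
            belief = [sum(belief[i] * transition[i][j] for i in range(k))
                      for j in range(k)]
        # Update step: assumed likelihood, smaller distance = more likely.
        unnorm = [b * math.exp(-d / scale) for b, d in zip(belief, dists)]
        z = sum(unnorm)
        if z == 0:
            raise ValueError("all likelihoods underflowed; increase scale")
        belief = [u / z for u in unnorm]
        history.append(belief)
    return history


# Toy usage: two candidate routes, observations consistently closer to route 0.
beliefs = filter_routes(
    prior=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.1, 0.9]],
    distance_seq=[[1.0, 3.0], [0.5, 4.0]],
)
best_route = max(range(2), key=lambda r: beliefs[-1][r])
```

The belief in route 0 grows with each observation, and `best_route` is the index of the route with the maximum posterior probability after the last step.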
And step S30, calculating the corresponding predicted maneuvering position of each agent of the confrontation party according to the speed of each agent and the advancing route with the maximum posterior prediction probability.
In this embodiment, once the forward route with the maximum posterior prediction probability has been selected, the predicted maneuvering position of each agent of the confrontation party can be computed by advancing the agent along that route from the current time at its usual speed.
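A minimal sketch of step S30 follows. It assumes that each candidate route is a polyline of 2-D waypoints starting at the agent's current position, that the agent moves at constant speed, and that the prediction horizon is given in the same time unit as the speed. The function names and the route data are hypothetical.

```python
import math


def predict_position(route, speed, horizon):
    """Advance along a polyline by speed * horizon and return the position.

    route: non-empty list of (x, y) waypoints, first one = current position.
    If the travelled distance exceeds the route length, the last waypoint
    is returned.
    """
    remaining = speed * horizon
    for (x0, y0), (x1, y1) in zip(route, route[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg > 0 and remaining <= seg:
            f = remaining / seg
            return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))
        remaining -= seg
    return route[-1]


def predict_agent_position(routes, posteriors, speed, horizon):
    """Pick the route with the maximum posterior and predict along it."""
    best = max(posteriors, key=posteriors.get)
    return best, predict_position(routes[best], speed, horizon)


# Toy usage: two candidate routes with posterior probabilities.
routes = {
    "north": [(0.0, 0.0), (0.0, 10.0)],
    "east": [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)],
}
posteriors = {"north": 0.3, "east": 0.7}
best, position = predict_agent_position(routes, posteriors, speed=1.0, horizon=5.0)
```

In the toy case the "east" route has the higher posterior; after 5 time units at speed 1 the agent has covered the 3-unit first leg and 2 units of the second, so the predicted position is (3.0, 2.0).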
An adversary strategy inversion system according to a second embodiment of the present invention, as shown in fig. 2, specifically includes: the system comprises an information acquisition module 100, a probability calculation module 200 and a strategy inversion module 300;
the information acquisition module 100 is configured to acquire status information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
the probability calculation module 200 is configured to obtain, based on the input information and in combination with a pre-obtained first probability, the posterior prediction probability of each forward route of each agent of the confrontation party through a deep confidence network model;
the strategy inversion module 300 is configured to calculate, for each agent of the confrontation party, a corresponding predicted maneuvering position according to the speed of the agent and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
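The data flow between the three modules can be sketched as follows. This is only an illustration under stated assumptions: the record `AgentState` mirrors the four state fields listed above, while `route_posteriors` is a plain Bayes-rule stand-in for the probability calculation module and is not the network model of the embodiment. All names and the route labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class AgentState:
    """State information acquired for one agent of the confrontation party."""
    agent_id: str
    trajectory: List[Point]   # space-time motion trail (timestamps omitted for brevity)
    maneuver_state: str
    speed: float


def route_posteriors(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Combine the first probability (prior) with an observation likelihood per route."""
    weighted = {route: prior[route] * likelihood[route] for route in prior}
    total = sum(weighted.values())
    return {route: w / total for route, w in weighted.items()}


def most_probable_route(
    state: AgentState,
    prior: Dict[str, float],
    likelihood_fn: Callable[[AgentState], Dict[str, float]],
) -> Tuple[str, Dict[str, float]]:
    """Return the forward route with the maximum posterior probability, plus the posterior."""
    posterior = route_posteriors(prior, likelihood_fn(state))
    return max(posterior, key=posterior.get), posterior


# Hypothetical usage with two candidate routes and a fixed likelihood.
state = AgentState("a1", [(0.0, 0.0)], "moving", 3.0)
best, post = most_probable_route(
    state, {"north": 0.5, "south": 0.5}, lambda s: {"north": 0.9, "south": 0.1}
)
```

The selected route and the agent's speed would then be passed to the position prediction of the strategy inversion module.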
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.
It should be noted that the adversary strategy inversion system provided in the foregoing embodiment is illustrated only by the division into the functional modules described above. In practical applications, the functions may be allocated to different functional modules as needed; that is, the modules or steps in the embodiments of the present invention may be further decomposed or combined. For example, the modules in the foregoing embodiments may be combined into one module, or may be further split into multiple sub-modules, so as to complete all or part of the functions described above. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the modules or steps, and are not to be construed as unduly limiting the present invention.
A storage device according to a third embodiment of the present invention stores therein a plurality of programs adapted to be loaded by a processor and to implement the above-described adversary strategy inversion method.
A processing apparatus according to a fourth embodiment of the present invention includes a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; the program is adapted to be loaded and executed by a processor to implement the above-described adversary strategy inversion method.
It can be clearly understood by those skilled in the art that, for convenience and brevity, the specific working processes and related descriptions of the storage device and the processing device described above may refer to the corresponding processes in the foregoing method examples, and are not described herein again.
It should be noted that the computer readable medium mentioned above in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.
The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (8)

1. An adversary strategy inversion method is characterized by comprising the following steps:
step S10, acquiring the state information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
step S20, based on the input information and in combination with a pre-acquired first probability, acquiring the posterior prediction probability of each forward route of each agent of the confrontation party through a deep confidence network model;
step S30, calculating the corresponding predicted maneuvering position of each agent of the confrontation party according to the speed of each agent and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
2. The adversary strategy inversion method of claim 1, characterized in that the first probability is obtained by:
step A10, collecting the historical state information of each agent of the confrontation party;
step A20, performing track clustering on the historical state information in time sequence by a preset density clustering algorithm; after clustering, taking the track points of each class whose number of track points is greater than a set number threshold as key locations;
step A30, calculating the prior probability of the historical space-time motion trail of each agent of the confrontation party passing through the key location as a first probability.
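Steps A10 to A20 can be illustrated with a small density-clustering sketch. The claim does not name the algorithm, so DBSCAN is assumed here; summarising each retained class by its centroid is also an assumption, and `eps`, `min_pts` and `count_threshold` are hypothetical parameter names.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, with -1 marking noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # former noise becomes a border point
            if labels[j] is not None:
                continue                 # already assigned, do not expand again
            labels[j] = cluster
            nj = neighbors(j)
            if len(nj) >= min_pts:
                queue.extend(nj)         # core point: keep growing the cluster
    return labels


def key_places(points, eps, min_pts, count_threshold):
    """Centroids of the classes holding more than `count_threshold` track points."""
    groups = {}
    for p, lab in zip(points, dbscan(points, eps, min_pts)):
        if lab >= 0:
            groups.setdefault(lab, []).append(p)
    return [
        (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
        for g in groups.values()
        if len(g) > count_threshold
    ]


# Hypothetical track points: a dense group near the origin, a sparse pair, one outlier.
pts = [(0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5), (0.2, 0.2), (10, 10), (10.4, 10), (50, 50)]
places = key_places(pts, eps=1.0, min_pts=2, count_threshold=3)
```

With these numbers only the dense group near the origin has more than three points, so it is the single key location returned.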
3. The adversary strategy inversion method of claim 2, wherein in step A20, "performing track clustering on the historical state information in time sequence by a preset density clustering algorithm" comprises:
calculating the vertical distance, the horizontal distance and the included angle distance between the historical space-time motion track of the current agent and the sample track; the sample track is a track obtained after clustering historical space-time motion tracks of other agents;
combining preset weight, and performing weighted summation on the vertical distance, the horizontal distance and the included angle distance to serve as a final distance between the historical space-time motion track of the current agent and the sample track;
and if the final distance is smaller than a set sample interval threshold value, classifying the historical space-time motion track of the current agent and the sample track into one class.
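The threshold rule of claim 3 can be sketched as a greedy assignment. This is an illustration under assumptions: the first member of each class is used as its sample track, and the distance function is passed in as a parameter so that any weighted distance can be plugged in. `cluster_by_threshold` is a hypothetical name.

```python
def cluster_by_threshold(trajectories, distance, threshold):
    """Assign each trajectory to the first class whose sample track is closer than `threshold`."""
    clusters = []  # each class is a list; its first member serves as the sample track
    for traj in trajectories:
        for members in clusters:
            if distance(members[0], traj) < threshold:
                members.append(traj)
                break
        else:
            # No existing sample track is close enough: start a new class.
            clusters.append([traj])
    return clusters


# Hypothetical 1-D stand-ins for trajectories, with absolute difference as the distance.
classes = cluster_by_threshold([0.0, 0.5, 10.0, 10.2, 0.3], lambda a, b: abs(a - b), 1.0)
# classes == [[0.0, 0.5, 0.3], [10.0, 10.2]]
```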
4. The adversary strategy inversion method of claim 3, wherein the final distance between the space-time motion trajectory of the current agent and the sample trajectory is calculated by:
$$\mathrm{dist}(L_i, L_j) = \omega_\perp \cdot d_\perp(L_i, L_j) + \omega_\parallel \cdot d_\parallel(L_i, L_j) + \omega_\theta \cdot d_\theta(L_i, L_j)$$
[Figure FDA0002867280990000021: equation image, presumably defining $d_\perp$; not reproduced in the text]
$$d_\parallel = \mathrm{MIN}(l_{\parallel 1}, l_{\parallel 2})$$
[Figure FDA0002867280990000022: equation image, presumably defining $d_\theta$; not reproduced in the text]
wherein $\mathrm{dist}(L_i, L_j)$ denotes the final distance; $d_\perp(L_i, L_j)$, $d_\parallel(L_i, L_j)$ and $d_\theta(L_i, L_j)$ respectively denote the vertical distance, the horizontal distance and the included angle distance; $\omega_\perp$, $\omega_\parallel$ and $\omega_\theta$ respectively denote the weights corresponding to the vertical distance, the horizontal distance and the included angle distance; $L_i$ and $L_j$ respectively denote the sample trajectory and the historical space-time motion trajectory of the current agent; $l_{\perp 1}$ and $l_{\perp 2}$ each denote a distance from $L_j$ to $L_i$; $l_{\parallel 1}$ and $l_{\parallel 2}$ each denote the distance between a projection point of $L_j$ onto $L_i$ and the corresponding end point of $L_i$; and $\theta$ denotes the included angle between $L_i$ and $L_j$.
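The weighted distance of claim 4 can be sketched for two straight line segments. Because the two equation images are not available in the text, the vertical and included angle components below are assumed to follow the definitions commonly used in the TRACLUS trajectory clustering algorithm: $d_\perp = (l_{\perp 1}^2 + l_{\perp 2}^2)/(l_{\perp 1} + l_{\perp 2})$, and $d_\theta = \|L_j\| \sin\theta$ for $\theta < 90°$, otherwise $\|L_j\|$. The function name, the default weights and the pairing of projection points with end points are likewise assumptions, and both segments are assumed to have non-zero length.

```python
import math


def segment_distance(li, lj, w_perp=1.0, w_par=1.0, w_theta=1.0):
    """Weighted sum of vertical, horizontal and included angle distances between segments."""
    (sx, sy), (ex, ey) = li
    dx, dy = ex - sx, ey - sy
    len_i = math.hypot(dx, dy)

    perp, par = [], []
    for (px, py), (cx, cy) in zip(lj, li):
        # Project the end point of lj onto the line through li.
        u = ((px - sx) * dx + (py - sy) * dy) / (len_i ** 2)
        qx, qy = sx + u * dx, sy + u * dy
        perp.append(math.hypot(px - qx, py - qy))
        # Distance from the projection point to the corresponding end point of li.
        par.append(math.hypot(qx - cx, qy - cy))

    d_perp = 0.0 if sum(perp) == 0 else (perp[0] ** 2 + perp[1] ** 2) / (perp[0] + perp[1])
    d_par = min(par)

    jx, jy = lj[1][0] - lj[0][0], lj[1][1] - lj[0][1]
    len_j = math.hypot(jx, jy)
    cos_t = max(-1.0, min(1.0, (dx * jx + dy * jy) / (len_i * len_j)))
    theta = math.acos(cos_t)
    d_theta = len_j * math.sin(theta) if theta < math.pi / 2 else len_j

    return w_perp * d_perp + w_par * d_par + w_theta * d_theta


# Hypothetical segments: lj runs parallel to li, one unit above it.
d = segment_distance(((0, 0), (10, 0)), ((2, 1), (8, 1)))
```

For the parallel example the vertical term is 1, the horizontal term is 2 and the angle term is 0, so `d` is 3 with unit weights.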
5. The adversary strategy inversion method of claim 2, wherein in step A30, "calculating the prior probability of the historical space-time motion trail of each agent of the confrontation party passing through the key location" comprises: counting and calculating, through a Bayesian model, the prior probability of the historical space-time motion trail of each agent of the confrontation party passing through the key location.
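The counting step can be sketched as a Beta-Bernoulli estimate. The claim does not specify the Bayesian model, so this is an assumption: a trajectory is taken to "pass through" a key location when any of its points lies within `radius` of it, and a Beta(`alpha`, `beta`) prior smooths the observed frequency. All names are hypothetical.

```python
import math


def pass_prior(trajectories, place, radius, alpha=1.0, beta=1.0):
    """Posterior-mean estimate of the probability that a trajectory passes `place`."""
    hits = sum(
        any(math.dist(p, place) <= radius for p in traj)
        for traj in trajectories
    )
    # Mean of Beta(alpha + hits, beta + misses).
    return (hits + alpha) / (len(trajectories) + alpha + beta)


# Hypothetical history: two of three trajectories come within 0.5 of the key location.
history = [[(0, 0), (5, 5)], [(4.8, 5.1), (9, 9)], [(20, 20), (30, 30)]]
p = pass_prior(history, place=(5, 5), radius=0.5)
```

With two hits out of three trajectories and a uniform Beta(1, 1) prior, the estimate is (2 + 1) / (3 + 2) = 0.6.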
6. An adversary strategy inversion system, comprising: the system comprises an information acquisition module, a probability calculation module and a strategy inversion module;
the information acquisition module is configured to acquire the state information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
the probability calculation module is configured to obtain, based on the input information and in combination with a pre-obtained first probability, the posterior prediction probability of each forward route of each agent of the confrontation party through a deep confidence network model;
the strategy inversion module is configured to calculate a corresponding predicted maneuvering position of each agent of the confrontation party according to the speed of each agent and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
7. A storage device having stored therein a plurality of programs, wherein said programs are adapted to be loaded and executed by a processor to implement the adversary strategy inversion method of any one of claims 1-5.
8. A processing device comprising a processor and a storage device; the processor being adapted to execute various programs; the storage device being adapted to store a plurality of programs; characterized in that said programs are adapted to be loaded and executed by the processor to implement the adversary strategy inversion method of any one of claims 1-5.
CN202011586486.3A 2020-12-29 2020-12-29 Adversary strategy inversion method, system and device Active CN112529110B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011586486.3A CN112529110B (en) 2020-12-29 2020-12-29 Adversary strategy inversion method, system and device

Publications (2)

Publication Number Publication Date
CN112529110A (en) 2021-03-19
CN112529110B (en) 2023-04-07

Family

ID=74976820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011586486.3A Active CN112529110B (en) 2020-12-29 2020-12-29 Adversary strategy inversion method, system and device

Country Status (1)

Country Link
CN (1) CN112529110B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268281A1 (en) * 2017-03-17 2018-09-20 The Regents Of The University Of Michigan Method And Apparatus For Constructing Informative Outcomes To Guide Multi-Policy Decision Making
CN108153332A (en) * 2018-01-09 2018-06-12 中国科学院自动化研究所 Trace simulation system based on big envelope curve game strategies
CN111093191A (en) * 2019-12-11 2020-05-01 南京邮电大学 Crowd sensing position data issuing method based on differential privacy
CN111221352A (en) * 2020-03-03 2020-06-02 中国科学院自动化研究所 Control system based on cooperative game countermeasure of multiple unmanned aerial vehicles
CN111857134A (en) * 2020-06-29 2020-10-30 江苏大学 Target obstacle vehicle track prediction method based on Bayesian network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
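Zhu Yutao et al.: "Online cooperative task allocation method for UAV swarms based on a distributed parallel negotiation mechanism", 8th China Command and Control Conference *
Duan Yong et al.: "Multi-agent reinforcement learning and its application to role assignment in robot soccer", Control Theory & Applications *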

Also Published As

Publication number Publication date
CN112529110B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN105892480B (en) Self-organizing method for cooperative reconnaissance-and-strike tasks of heterogeneous multi-UAV systems
Zhang et al. An improved constrained differential evolution algorithm for unmanned aerial vehicle global route planning
Yang et al. Decentralized cooperative search by networked UAVs in an uncertain environment
US9240001B2 (en) Systems and methods for vehicle survivability planning
US9488441B2 (en) Method and system of mission planning
US9030347B2 (en) Preemptive signature control for vehicle survivability planning
WO2014021961A2 (en) Systems and methods for vehicle survivability planning
CN112925350A (en) Multi-unmanned aerial vehicle distributed cooperative target searching method
CN111783020A (en) Multidimensional characteristic battlefield entity target grouping method and system
CN109063819B (en) Bayesian network-based task community identification method
US8831793B2 (en) Evaluation tool for vehicle survivability planning
Sun et al. Route evaluation for unmanned aerial vehicle based on type-2 fuzzy sets
CN114397911A (en) Unmanned aerial vehicle cluster confrontation decision-making method based on multiple intelligent agents
CN114676743B (en) Low-speed small target track threat identification method based on hidden Markov model
CN112529110B (en) Adversary strategy inversion method, system and device
CN112749496B (en) Equipment system combat effectiveness evaluation method and system based on time sequence combat ring
McLemore et al. A model for geographically distributed combat interactions of swarming naval and air forces
AU2021102799A4 (en) Method for clustering battlefield entity targets based on multidimensional features and system thereof
CN106996789B (en) Multi-airborne radar cooperative detection airway planning method
CN114358127A (en) Aerial task group identification method
Salmond Tracking and guidance with intermittent obscuration and association uncertainty
Xiao et al. A robust target intention recognition method based on dynamic bayesian network
CN115686071B (en) Multi-unmanned aerial vehicle cooperative attack route real-time planning method and device
CN116628449B (en) Situation assessment method of graph-based adjacency point priority joint tree SAAD-JT algorithm
CN117787625A (en) Task allocation method in manned-unmanned aerial vehicle formation scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant