CN112529110B - Adversary strategy inversion method, system and device - Google Patents

Publication number
CN112529110B
Authority
CN
China
Prior art keywords
distance
agent
probability
track
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011586486.3A
Other languages
Chinese (zh)
Other versions
CN112529110A (en
Inventor
范国梁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN202011586486.3A priority Critical patent/CN112529110B/en
Publication of CN112529110A publication Critical patent/CN112529110A/en
Application granted granted Critical
Publication of CN112529110B publication Critical patent/CN112529110B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Economics (AREA)
  • Molecular Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of decision deduction, and particularly relates to an adversary strategy inversion method, system and device, aiming to solve the problems that conventional strategy inversion methods cannot effectively estimate the opponent's intention and have poor adaptivity. The method comprises: acquiring, in real time, state information of each agent of the confrontation party within the visible range as input information; based on the input information and in combination with a first probability obtained in advance, obtaining the posterior prediction probability of each advancing route of each agent of the confrontation party through a deep belief network model; and calculating the predicted maneuvering position of each agent of the confrontation party according to its speed and the advancing route with the maximum posterior prediction probability. The first probability is the prior probability that the spatiotemporal motion trajectory of each agent of the confrontation party passes through a key location. The invention can effectively estimate the opponent's intention and improves the capability and adaptability of intelligent game confrontation.

Description

Adversary strategy inversion method, system and device
Technical Field
The invention belongs to the field of decision deduction, and particularly relates to an adversary strategy inversion method, system and device.
Background
Multi-agent games are characterized by real-time confrontation, group cooperation, incomplete information, a huge search space, multiple complex tasks, and spatiotemporal reasoning, and they remain a very challenging problem in artificial intelligence. Research results in this field also have broad application prospects in social management, intelligent transportation, economics, the military and other fields. Situation assessment is the primary link in gaming. There are many models for situation assessment, the most common being Endsley's three-level situation assessment model. Endsley regards situation assessment as the process by which a decision maker understands the meaning of elements in the surrounding environment and predicts how their state will change within a certain time and space. Following the human cognitive process, he therefore divides situation assessment into three levels: situation perception, situation understanding and situation prediction. 1) Situation perception: the commander acquires battlefield information through multiple channels, such as the battlefield environment, force deployment, and combat attempts/combat targets. 2) Situation understanding: the perceived information is given deeper meaning and interpreted in conjunction with the battlefield environment. 3) Situation prediction: based on the results of situation perception and understanding, the development of future events after a corresponding action is taken is predicted.
Situation prediction is the most difficult part of situation assessment, because it requires estimating and exploring future behavior. In game confrontation in particular, the opponent's strategy and intention must be estimated and inverted, which is the key to winning. Existing strategy inversion methods cannot effectively estimate the opponent's intention.
In addition, distributed multi-agent confrontation has traditionally been derived from predefined distributed system protocols designed to achieve a single goal. Classical designs fix a goal and then use a top-down approach to distribute the operation. For example, an operator overseeing the whole battlefield first designs an optimal strategy for the agents at the global scale and then tells each agent how to act based on that agent's local information. However, when an agent leaves the battlefield and the system is thereby modified, the previously designed strategy is no longer globally optimal. Thus, in a top-down design, the loss of a single part can undermine the effect of the whole. Because the agents are programmed offline in this approach, adaptability is lost.
Disclosure of Invention
In order to solve the above problems in the prior art, that is, to solve the problems that the conventional strategy inversion method cannot effectively estimate the adversary intention and has poor adaptivity, a first aspect of the present invention provides an adversary strategy inversion method, which includes:
step S10, acquiring the state information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
step S20, based on the input information and in combination with the pre-acquired first probability, acquiring the posterior prediction probability corresponding to each advancing route of each agent of the confrontation party through a deep belief network model;
step S30, calculating corresponding predicted maneuvering positions of the agents of the confrontation party according to the speeds of the agents and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
In some preferred embodiments, the first probability is obtained by:
step A10, collecting historical state information of each agent of the confrontation party;
step A20, performing track clustering on the historical state information in time sequence using a preset density clustering algorithm; after clustering, taking as key locations the track points of each class whose number of track points is greater than a set number threshold;
step A30, calculating prior probability of the historical space-time motion trail of each agent of the confrontation party passing through the key location as first probability.
In some preferred embodiments, in step A20, "performing track clustering on the historical state information in time sequence using a preset density clustering algorithm" is performed by:
calculating the vertical distance, the horizontal distance and the included angle distance between the historical spatiotemporal motion trajectory of the current agent and the sample trajectory; the sample trajectory is a trajectory obtained by clustering the historical spatiotemporal motion trajectories of other agents;
performing, with preset weights, a weighted summation of the vertical distance, the horizontal distance and the included angle distance, the result serving as the final distance between the historical spatiotemporal motion trajectory of the current agent and the sample trajectory;
and if the final distance is smaller than a set sample spacing threshold, grouping the two trajectories into one class.
In some preferred embodiments, the "final distance between the spatiotemporal motion trajectory of the current agent and the sample trajectory" is calculated by:
$$dist(L_i, L_j) = \omega_\perp \cdot d_\perp(L_i, L_j) + \omega_\parallel \cdot d_\parallel(L_i, L_j) + \omega_\theta \cdot d_\theta(L_i, L_j)$$
[Equation image BDA0002867281000000031, defining $d_\perp$ in terms of $l_{\perp 1}$ and $l_{\perp 2}$, is not reproduced in the source text.]
$$d_\parallel = \mathrm{MIN}(l_{\parallel 1}, l_{\parallel 2})$$
[Equation image BDA0002867281000000032, defining $d_\theta$ in terms of $\theta$, is not reproduced in the source text.]
wherein $dist(L_i, L_j)$ denotes the final distance; $d_\perp(L_i, L_j)$, $d_\parallel(L_i, L_j)$ and $d_\theta(L_i, L_j)$ denote the vertical distance, the horizontal distance and the included angle distance, respectively; $\omega_\perp$, $\omega_\parallel$ and $\omega_\theta$ denote the weights corresponding to the vertical distance, the horizontal distance and the included angle distance, respectively; $L_i$ and $L_j$ denote the sample trajectory and the historical spatiotemporal motion trajectory of the current agent, respectively; $l_{\perp 1}$ and $l_{\perp 2}$ denote the distances from the two end points of $L_j$ to $L_i$; $l_{\parallel 1}$ and $l_{\parallel 2}$ each denote the distance between the projection point of an end point of $L_j$ on $L_i$ and the corresponding end point of $L_i$; and $\theta$ denotes the included angle between $L_i$ and $L_j$.
In some preferred embodiments, in step A30, "calculating the prior probability that the spatiotemporal motion trajectory of each agent of the confrontation party passes through the key location" is performed by: statistically calculating, through a Bayesian model, the prior probability that the historical spatiotemporal motion trajectory of each agent of the confrontation party passes through the key location.
In a second aspect of the present invention, an adversary strategy inversion system is provided, which includes: the system comprises an information acquisition module, a probability calculation module and a strategy inversion module;
the information acquisition module is configured to acquire the state information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
the probability calculation module is configured to obtain, based on the input information and in combination with a first probability obtained in advance, the posterior prediction probability corresponding to each advancing route of each agent of the confrontation party through a deep belief network model;
the strategy inversion module is configured to calculate a corresponding predicted maneuvering position of each agent of the confrontation party according to the speed of each agent and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
In a third aspect of the present invention, a storage device is provided, in which a plurality of programs are stored, the programs being adapted to be loaded and executed by a processor to implement the above-mentioned adversary strategy inversion method.
In a fourth aspect of the present invention, a processing apparatus is provided, which includes a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; the program is adapted to be loaded and executed by a processor to implement the above-described adversary strategy inversion method.
The invention has the beneficial effects that:
The method can effectively estimate the opponent's intention and improves the capability and adaptability of intelligent game confrontation. The invention first clusters the routes of all agents of the confrontation party and establishes a prior Bayesian model. In the current action stage, some of the confrontation party's agents become visible while they move, and the confrontation party's attack route can be estimated from the information about those visible agents. The method can effectively estimate the opponent's action trajectory and action strategy, improves the ability to estimate the opponent's plan in real time and the adaptability of that estimate, facilitates the subsequent formulation of a more targeted countermeasure scheme, and effectively improves the capability of intelligent game confrontation.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.
FIG. 1 is a flow diagram of a method for adversary policy inversion according to an embodiment of the present invention;
FIG. 2 is a block diagram of an adversary strategy inversion system according to one embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the clustering effect of the density clustering algorithm according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of distance calculation according to an embodiment of the present invention;
FIG. 5 is a detailed flow diagram of an adversary strategy inversion method according to an embodiment of the present invention;
FIG. 6 is a diagram of a behavior sequence DBN model according to an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict.
An adversary strategy inversion method of the present invention is shown in fig. 1, and the method includes:
step S10, acquiring the state information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
step S20, based on the input information and in combination with the pre-acquired first probability, acquiring the posterior prediction probability corresponding to each advancing route of each agent of the confrontation party through a deep belief network model;
step S30, calculating corresponding predicted maneuvering positions of the agents of the confrontation party according to the speeds of the agents and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
In order to explain the adversary strategy inversion method of the present invention more clearly, the steps of one embodiment of the method are described in detail below with reference to the drawings.
In the following embodiment, the process of obtaining the prior probability that the spatiotemporal motion trajectory of each agent of the confrontation party passes through a key location is detailed first, and then the process by which the adversary strategy inversion method obtains the predicted maneuvering position of each agent of the confrontation party is detailed, as shown in FIG. 5.
1. Process of obtaining the prior probability that the spatiotemporal motion trajectory of each agent of the confrontation party passes through a key location
Step A10, collecting historical state information of each agent of the confrontation party;
In the present embodiment, the historical state information of each moving object of the opponent is collected. The state information includes: agent ID; side (0-red 1-blue); type (1-infantry 2-vehicle 3-airplane); name; subdivision type (0-tank 1-combat vehicle 2-personnel 3-artillery 4-unmanned combat vehicle 5-unmanned aerial vehicle 6-helicopter 7-cruise missile); base speed; armor type (0-unarmored 1-light armor 2-medium armor 3-heavy armor 4-composite armor); whether it can fire while on the move; whether it is stacked; carried weapon IDs; remaining ammunition count; ammunition type (0-non-guided ammunition, 100-heavy missile, 101-medium missile, 102-small missile); whether it has guided shooting capability; score value; carriable type; maximum carrying number; capacity; observation distance; maneuvering state (0-normal maneuvering 1-marching 2-primary assault 3-secondary assault 4-masking); current coordinate; percentage progress from the current frame to the next frame; current maneuvering speed; remaining time of the maneuver-to-stop transition; flag bit of the maneuver-to-stop transition (used only to judge whether maneuvering can continue during the transition: it cannot continue after a forced stop and can continue after a normal stop; 0-no 1-yes); whether the agent is stationary (0-no, 1-yes); planned maneuvering path; current health; maximum health; remaining time of state switching; passenger list; launching unit list; whether control has been lost; remaining survival time of flying missiles; remaining boarding time; remaining alighting time; vehicle ID; target state (0-normal maneuvering 1-marching 2-primary assault 3-secondary assault 4-masking); remaining weapon cooling time; remaining weapon deployment time; and the list of observed enemy operators.
Step A20, performing track clustering on the historical state information in time sequence using a preset density clustering algorithm; after clustering, taking as key locations the track points of each class whose number of track points is greater than a set number threshold;
In this embodiment, a density clustering algorithm based on a specific trajectory distance function is used. The distance function is D(tr1, m1, tr2, m2) → R, R ≥ 0, where the associated labels are those output by the first-step filtering. For a given pair of trajectories tr1, tr2 and their associated labels m1, m2, the output is a non-negative real number representing the distance between the two trajectories. Based on this distance function, trajectories are grouped into clusters, and trajectories that belong to no cluster are called noise. The output of the clustering operation is a set of cluster labels for the trajectories, L = {C1, C2, ..., Cm, noise}, where C1 to Cm are the resulting clusters. Clustering is performed progressively. Density-based clustering involves two parameters: the neighborhood radius NR and the minimum number of neighbors NN that an object needs in order to become a cluster core object. These two parameters are modified to cluster progressively. The densest clusters are determined first, using a larger NN and a smaller NR. The clustering operation is then applied iteratively to the previous clustering results. In each step, the parameter settings are relaxed by increasing NR or decreasing NN, yielding better clustering results. The progressive mode also allows a part of a clustering result that is poorly clustered to be re-clustered on its own. Several attempts are usually needed to reach a good clustering; its quality is judged either manually, from a visual representation of the clusters, or by computing the sum of distances among the clustered trajectories.
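The progressive scheme described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `density_cluster` and `progressive_cluster`, the precomputed distance matrix `dist`, and the decision to re-cluster only the trajectories still marked as noise at each step are all assumptions made for the example.

```python
def density_cluster(dist, items, nr, nn):
    """DBSCAN-style clustering restricted to the indices in `items`.

    dist: symmetric matrix of pairwise trajectory distances (assumed given).
    nr:   neighbourhood radius NR.
    nn:   minimum number of neighbours NN for a core object.
    Returns (clusters, noise), each a list of index lists / indices.
    """
    neighbours = {i: [j for j in items if j != i and dist[i][j] <= nr]
                  for i in items}
    core = {i for i in items if len(neighbours[i]) >= nn}
    assigned = set()
    clusters = []
    for i in items:
        if i not in core or i in assigned:
            continue
        cluster, stack = [], [i]
        while stack:
            p = stack.pop()
            if p in assigned:
                continue
            assigned.add(p)
            cluster.append(p)
            if p in core:  # only core objects expand the cluster
                stack.extend(q for q in neighbours[p] if q not in assigned)
        clusters.append(sorted(cluster))
    noise = [i for i in items if i not in assigned]
    return clusters, noise


def progressive_cluster(dist, schedule):
    """Apply density clustering with progressively relaxed (NR, NN) pairs.

    schedule: list of (nr, nn), strictest first (small NR, large NN).
    Assumption: each step re-clusters only what is still noise.
    """
    remaining = list(range(len(dist)))
    all_clusters = []
    for nr, nn in schedule:
        clusters, remaining = density_cluster(dist, remaining, nr, nn)
        all_clusters.extend(clusters)
    return all_clusters, remaining


# Toy example: 1-D positions stand in for trajectories.
positions = [0, 1, 2, 50, 55, 60, 200]
dist = [[abs(a - b) for b in positions] for a in positions]
clusters, noise = progressive_cluster(dist, [(1.5, 2), (6, 1)])
print(clusters, noise)
```

In the toy run, the strict first pass finds only the dense group, the relaxed second pass picks up the sparser group, and the isolated item stays noise.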
Trajectory clustering is an important research topic in statistical pattern recognition, data mining and related fields. The motion trajectories of combat entities in a wargame deduction reflect the commander's actual combat intention; trajectory clustering can reveal the overall behavior pattern among combat entities and thus help to judge that intention. The CTECW algorithm is mainly adopted. It is divided into three parts: trajectory preprocessing, trajectory segment clustering and visual representation. Trajectory preprocessing converts the original trajectory of an entity into a simplified trajectory, which is then further processed into trajectory segments. Within the basic framework of the DBSCAN algorithm, the algorithm introduces the density function concept of the DENCLUE algorithm and clusters trajectory segments based on the proposed similarity measurement function. The visual representation presents the segment clustering result to the commander in a form that carries military meaning, so that the commander can understand and accept it more easily. Theoretical analysis and experimental results show that the CTECW algorithm obtains clustering results close to those of the TRACLUS algorithm with higher computational efficiency, and that its results do not depend on careful selection of user parameters. Here the CTECW algorithm is improved before clustering, namely by adding the horizontal and included angle components. The clustering results are shown in FIG. 3: 6 spatiotemporal trajectories form 4 classes on 3 consecutive timestamps.
[Inline image BDA0002867281000000081, listing the 4 classes, is not reproduced in the source text.]
Assuming that the set number threshold is m = 4, only the cluster classes whose number of track points is greater than the set number threshold are retained.
[Inline image BDA0002867281000000082, listing the retained classes, is not reproduced in the source text.]
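The threshold filter in this step can be illustrated with a short sketch. The function name `select_key_locations`, the per-point `labels` list and the use of `None` to mark noise are hypothetical conventions chosen for the example; the text only specifies that classes with more track points than the threshold are kept.

```python
from collections import defaultdict


def select_key_locations(points, labels, min_points):
    """Keep the track points of every class with more than `min_points` points.

    points: list of (x, y) track points.
    labels: cluster label for each point; None marks noise (assumed convention).
    Returns {label: [points]} for the retained classes.
    """
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        if label is not None:
            groups[label].append(point)
    # "greater than a set number threshold": strict comparison
    return {label: pts for label, pts in groups.items()
            if len(pts) > min_points}


points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (9, 9), (9, 8), (4, 4)]
labels = ["A", "A", "A", "A", "A", "B", "B", None]
key_locations = select_key_locations(points, labels, min_points=4)
print(sorted(key_locations))
```

With m = 4, class A (5 points) is retained, class B (2 points) is dropped, and the noise point is ignored.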
The distance between the historical spatiotemporal motion trajectory of the current agent and the sample trajectories (the trajectories obtained by clustering the historical spatiotemporal motion trajectories of other agents) is calculated as shown in FIG. 4:
Case one: given two line segments of equal length that are parallel to each other, where the line connecting the two start points or the line connecting the two end points is perpendicular to both segments, the perpendicular distance $d_\perp$ between them can be used to measure their similarity. The smaller the distance, the more similar the two line segments; if the two segments coincide completely, they are identical.
$d_\perp$ is calculated as follows:
[Equation image BDA0002867281000000091, equation (1) defining $d_\perp$ in terms of $l_{\perp 1}$ and $l_{\perp 2}$, is not reproduced in the source text.]
wherein $l_{\perp 1}$ and $l_{\perp 2}$ denote the distances from the two end points of $L_j$ to $L_i$.
Case two: the two line segments are offset along their direction, or the length of one of them changes. With $d_\perp$ equal, the vertical distance alone is clearly not enough to measure the similarity of the two segments, and the difference in the horizontal direction must also be measured, namely by the horizontal distance $d_\parallel$:
$$d_\parallel = \mathrm{MIN}(l_{\parallel 1}, l_{\parallel 2}) \qquad (2)$$
wherein $l_{\parallel 1}$ and $l_{\parallel 2}$ each denote the distance between the projection point of an end point of $L_j$ on $L_i$ and the corresponding end point of $L_i$.
Case three: when one of the line segments is rotated, the larger the included angle between the two segments, the smaller their similarity, so the included angle distance $d_\theta$ needs to be introduced. The definition of this distance is based on that idea. If the direction of the trajectory is not considered, $\|L_j\| \times \sin(\theta)$ can be used. When the trajectory direction is considered, $d_\theta$ is obtained as follows:
[Equation image BDA0002867281000000092, equation (3) defining $d_\theta$ in terms of $\theta$, is not reproduced in the source text.]
wherein $\theta$ denotes the included angle between $L_i$ and $L_j$.
The three parts are combined as:
$$dist(L_i, L_j) = \omega_\perp \cdot d_\perp(L_i, L_j) + \omega_\parallel \cdot d_\parallel(L_i, L_j) + \omega_\theta \cdot d_\theta(L_i, L_j) \qquad (4)$$
wherein $\omega_\perp$, $\omega_\parallel$ and $\omega_\theta$ denote the weights corresponding to the vertical distance, the horizontal distance and the included angle distance; $dist(L_i, L_j)$ denotes the final distance, and if the final distance is less than a set sample spacing threshold, the two trajectories are grouped into one class; $L_i$ and $L_j$ denote the sample trajectory and the historical spatiotemporal motion trajectory of the current agent, respectively. In FIG. 4, $s_i$ and $e_i$ denote the end points of $L_i$, $s_j$ and $e_j$ denote the end points of $L_j$, and the remaining two marked points are the projections of the two end points of $L_j$ onto $L_i$.
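A minimal sketch of the weighted distance in equation (4) is given below. Equations (1) and (3) are not reproduced in the source text, so the sketch assumes the forms used by the TRACLUS algorithm that the description mentions: $d_\perp = (l_{\perp 1}^2 + l_{\perp 2}^2)/(l_{\perp 1} + l_{\perp 2})$, and $d_\theta = \|L_j\| \sin\theta$ for $\theta < 90°$, otherwise $\|L_j\|$. The function names, the 2-D point representation, and the rule that each horizontal term takes the nearer end point of $L_i$ are likewise assumptions.

```python
import math


def distance_components(si, ei, sj, ej):
    """Return (d_perp, d_par, d_theta) between L_i=(si, ei) and L_j=(sj, ej)."""
    vi = (ei[0] - si[0], ei[1] - si[1])
    vj = (ej[0] - sj[0], ej[1] - sj[1])
    len_i_sq = vi[0] ** 2 + vi[1] ** 2
    if len_i_sq == 0:
        raise ValueError("L_i must have non-zero length")

    def project(p):
        # Orthogonal projection of p onto the line through L_i
        u = ((p[0] - si[0]) * vi[0] + (p[1] - si[1]) * vi[1]) / len_i_sq
        return (si[0] + u * vi[0], si[1] + u * vi[1])

    ps, pe = project(sj), project(ej)

    # Vertical distance (assumed TRACLUS form of equation (1))
    l_perp1, l_perp2 = math.dist(sj, ps), math.dist(ej, pe)
    total = l_perp1 + l_perp2
    d_perp = 0.0 if total == 0 else (l_perp1 ** 2 + l_perp2 ** 2) / total

    # Horizontal distance, equation (2); nearer end point of L_i is assumed
    l_par1 = min(math.dist(ps, si), math.dist(ps, ei))
    l_par2 = min(math.dist(pe, si), math.dist(pe, ei))
    d_par = min(l_par1, l_par2)

    # Included angle distance (assumed TRACLUS form of equation (3))
    len_j = math.hypot(vj[0], vj[1])
    if len_j == 0:
        d_theta = 0.0
    else:
        cos_t = (vi[0] * vj[0] + vi[1] * vj[1]) / (math.sqrt(len_i_sq) * len_j)
        cos_t = max(-1.0, min(1.0, cos_t))
        # theta < 90 degrees exactly when cos(theta) > 0
        d_theta = len_j * math.sqrt(1 - cos_t ** 2) if cos_t > 0 else len_j
    return d_perp, d_par, d_theta


def segment_distance(si, ei, sj, ej, w_perp=1.0, w_par=1.0, w_theta=1.0):
    """Weighted sum of the three components, as in equation (4)."""
    d_perp, d_par, d_theta = distance_components(si, ei, sj, ej)
    return w_perp * d_perp + w_par * d_par + w_theta * d_theta


# Two parallel segments 2 units apart, same direction and then reversed.
same = segment_distance((0, 0), (10, 0), (0, 2), (10, 2))
reversed_dir = segment_distance((0, 0), (10, 0), (10, 2), (0, 2), w_theta=0.5)
print(same, reversed_dir)
```

In the example, the two parallel segments differ only in vertical distance, while reversing the direction of $L_j$ adds an angle term proportional to its length, which is why direction-aware weighting separates opposing routes that share the same corridor.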
Step A30, calculating prior probability of the historical space-time motion trail of each agent of the confrontation party passing through the key location as first probability.
In this embodiment, based on the historical spatiotemporal motion trajectories of the agents of the confrontation party and the key locations, the prior probability that the historical spatiotemporal motion trajectory of each agent passes through each key location is calculated statistically through a Bayesian probability model.
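One simple way to read this statistical step is as a relative frequency over the historical trajectories. The sketch below takes that reading; it is an assumption, not the patent's stated model. The names `passes_through` and `prior_probabilities`, the representation of a key location as a single centre point, and the `radius` used to decide whether a trajectory passes through it are all hypothetical.

```python
import math


def passes_through(trajectory, location, radius):
    """True if any track point lies within `radius` of the key location."""
    return any(math.dist(point, location) <= radius for point in trajectory)


def prior_probabilities(trajectories, key_locations, radius):
    """Fraction of historical trajectories passing through each key location.

    trajectories:  list of trajectories, each a list of (x, y) points.
    key_locations: {name: (x, y)} centre of each key location (assumed form).
    """
    n = len(trajectories)
    if n == 0:
        raise ValueError("at least one historical trajectory is required")
    return {
        name: sum(passes_through(t, loc, radius) for t in trajectories) / n
        for name, loc in key_locations.items()
    }


history = [
    [(0, 0), (5, 5), (10, 10)],
    [(0, 1), (5, 4), (10, 9)],
    [(0, 0), (5, 0), (10, 0)],
    [(1, 0), (5, 5), (9, 9)],
]
priors = prior_probabilities(history, {"hill": (5, 5), "bridge": (5, 0)}, radius=1.5)
print(priors)
```

Here three of the four historical trajectories pass near the hypothetical "hill" location and one passes near "bridge", giving priors of 0.75 and 0.25.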
2. Adversary strategy inversion method
Step S10, acquiring the state information of each agent of the confrontation party in a visible range in real time as input information;
In this embodiment, the state information of each agent of the confrontation party within the visible range is acquired in real time; the state information is as described in step A10.
Step S20, based on the input information and in combination with the pre-acquired first probability, acquiring the posterior prediction probability corresponding to each advancing route of each agent of the confrontation party through a deep belief network model;
In this embodiment, the n historical spatiotemporal motion trajectories of the opponent are clustered; after clustering, the distances D1, D2, ..., Dn between these historical trajectories and the opponent's actual spatiotemporal motion trajectory at the current moment are calculated, and the similarity probability between the trajectories is computed. At time 1, the enemy unit follows basic strategy c1 and exhibits the attribute state values [D1, D2, ..., Dn]_1; at time 2, the basic strategy changes to c2 and the attribute state values become [D1, D2, ..., Dn]_2; and so on, until at time t the basic strategy changes to ct and the attribute state values become [D1, D2, ..., Dn]_t. Although our side cannot directly observe the sequence c1, c2, c3, ..., ct, it can detect the attribute state values and how they change, and can therefore infer the basic strategy state at each moment from those changes. This constitutes a BN model over a time series, which is a typical DBN model. The probability membership of the basic behavior of the enemy combat unit at any moment can be calculated and, according to the theory of DBN models, the basic strategy sequence c1, c2, c3, ..., ct can be further smoothed and filtered, so that a probabilistic prediction of the opponent's future behavior can be made, as shown in FIG. 6.
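The inference described above can be illustrated with the simplest time-series Bayesian network, a hidden Markov model with a forward filter. This is a sketch under stated assumptions rather than the patent's model: it reads "BN model over a time series" as a dynamic Bayesian network, treats each basic strategy as one candidate route, uses a hypothetical transition matrix, and converts the distance vector [D1, ..., Dn] into observation likelihoods with an exponential weighting (`distances_to_likelihoods`) that the source does not specify. Only filtering is shown; smoothing is omitted.

```python
import math


def distances_to_likelihoods(distances, scale=1.0):
    """Assumed observation model: smaller distance, higher likelihood."""
    weights = [math.exp(-d / scale) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def forward_filter(prior, transition, distance_seq, scale=1.0):
    """Posterior over hidden strategies after each observation.

    prior:        prior probability of each strategy (the "first probability").
    transition:   transition[i][j] = P(c_t = j | c_{t-1} = i), hypothetical.
    distance_seq: one distance vector [D1, ..., Dn] per time step.
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for t, distances in enumerate(distance_seq):
        if t > 0:
            # Prediction step: propagate the belief through the transition model
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        likelihood = distances_to_likelihoods(distances, scale)
        unnormalised = [b * l for b, l in zip(belief, likelihood)]
        z = sum(unnormalised)
        if z == 0:
            raise ValueError("observation has zero probability under the model")
        belief = [u / z for u in unnormalised]  # update step
        history.append(belief)
    return history


prior = [0.5, 0.5]
transition = [[0.9, 0.1], [0.1, 0.9]]
observed = [[1.0, 5.0], [0.5, 6.0]]  # the agent stays close to route 0
posteriors = forward_filter(prior, transition, observed)
best_route = max(range(len(prior)), key=lambda k: posteriors[-1][k])
print(posteriors[-1], best_route)
```

The index with the largest final posterior plays the role of the advancing route with the maximum posterior prediction probability used in step S30.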
And S30, calculating the corresponding predicted maneuvering positions of the agents of the confrontation party according to the speeds of the agents and the advancing route with the maximum posterior prediction probability.
In this embodiment, the advancing route with the maximum posterior prediction probability is selected, and the predicted maneuvering position of each agent of the confrontation party is obtained by moving the agent along that route from the current time at its usual speed.
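A minimal sketch of this position prediction follows, assuming the selected route is a polyline whose first waypoint is the agent's current position and that the agent moves at constant speed. The function name `predict_position` and the choice to stop at the last waypoint when the route is exhausted are illustrative assumptions.

```python
import math


def predict_position(route, speed, dt):
    """Position reached after moving `speed * dt` along the polyline `route`.

    route: list of (x, y) waypoints; route[0] is the current position (assumed).
    speed: agent speed in distance units per time unit.
    dt:    prediction horizon in time units.
    """
    if not route:
        raise ValueError("route must contain at least one waypoint")
    remaining = speed * dt
    for start, end in zip(route, route[1:]):
        segment = math.dist(start, end)
        if segment > 0 and remaining <= segment:
            fraction = remaining / segment
            return (start[0] + fraction * (end[0] - start[0]),
                    start[1] + fraction * (end[1] - start[1]))
        remaining -= segment
    return route[-1]  # route exhausted: stay at the final waypoint


route = [(0, 0), (10, 0), (10, 10)]
predicted = predict_position(route, speed=2, dt=6)
print(predicted)
```

In the example, the agent travels 12 units: the full first leg of 10 units and then 2 units up the second leg.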
An adversary strategy inversion system according to a second embodiment of the present invention, as shown in fig. 2, specifically includes an information acquisition module 100, a probability calculation module 200 and a strategy inversion module 300;
the information acquisition module 100 is configured to acquire status information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
the probability calculation module 200 is configured to obtain, based on the input information and in combination with the pre-obtained first probability, a posterior prediction probability corresponding to each intelligent agent forward route of the confrontation party through a deep belief network model;
the strategy inversion module 300 is configured to calculate, for each agent of the confrontation party, a corresponding predicted maneuvering position according to the speed of the agent and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
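The first probability defined above can be illustrated with a simple counting estimate over historical trajectories. The sketch below is an assumption-laden illustration rather than the patent's Bayesian model: it treats a trajectory as "passing through" a key location if any track point lies within a hypothetical `radius`, and it uses add-one (Laplace) smoothing controlled by the hypothetical parameter `alpha`.

```python
import math

def passes_through(trajectory, key_point, radius):
    """True if any track point lies within `radius` of the key location."""
    kx, ky = key_point
    return any(math.hypot(x - kx, y - ky) <= radius for x, y in trajectory)

def prior_pass_probability(trajectories, key_points, radius, alpha=1.0):
    """Estimate, for each key location, the prior probability that one
    historical trajectory of an agent passes through it.

    Uses the smoothed estimate (hits + alpha) / (n + 2 * alpha), i.e. the
    posterior mean of a pass / no-pass rate under a symmetric Beta prior.
    """
    n = len(trajectories)
    priors = {}
    for name, point in key_points.items():
        hits = sum(passes_through(t, point, radius) for t in trajectories)
        priors[name] = (hits + alpha) / (n + 2 * alpha)
    return priors

# Hypothetical history of one agent and two hypothetical key locations.
history = [
    [(0, 0), (5, 5), (10, 10)],
    [(0, 1), (5, 5.2), (9, 9)],
    [(0, 0), (0, 5), (0, 10)],
]
key_points = {"bridge": (5, 5), "ford": (20, 20)}
priors = prior_pass_probability(history, key_points, radius=0.5)
```

With three trajectories, two of which pass the hypothetical "bridge", the smoothed estimate is (2 + 1) / (3 + 2) = 0.6, while the unvisited "ford" keeps a small non-zero prior of 0.2.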
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiment, and details are not described herein again.
It should be noted that the adversary strategy inversion system provided in the foregoing embodiment is illustrated only by the division into the functional modules described above. In practical applications, the functions may be allocated to different functional modules as needed; that is, the modules or steps in the embodiments of the present invention may be further decomposed or combined. For example, the modules in the foregoing embodiment may be combined into one module, or may be further split into multiple sub-modules, so as to complete all or part of the functions described above. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the modules or steps, and are not to be construed as unduly limiting the present invention.
A storage device according to a third embodiment of the present invention stores therein a plurality of programs adapted to be loaded by a processor and to implement the above-described adversary strategy inversion method.
A processing apparatus according to a fourth embodiment of the present invention includes a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; the program is adapted to be loaded and executed by a processor to implement the above-described adversary strategy inversion method.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes and related descriptions of the storage device and the processing device described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
It should be noted that the computer readable medium mentioned above in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.
The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is apparent to those skilled in the art that the scope of the present invention is not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (4)

1. An adversary strategy inversion method is characterized by comprising the following steps:
step S10, acquiring the state information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
step S20, based on the input information and in combination with the pre-acquired first probability, acquiring the posterior prediction probability corresponding to each advancing route of each agent of the confrontation party through a deep belief network model;
the first probability is obtained by the following method:
step A10, collecting historical state information of each agent of the confrontation party;
step A20, performing track clustering on the historical state information in time sequence according to a preset density clustering algorithm; after clustering, taking as key locations the track points of each class whose number of track points is greater than a set number threshold; wherein performing track clustering on the historical state information in time sequence according to the preset density clustering algorithm comprises:
calculating the vertical distance, the horizontal distance and the included angle distance between the historical space-time motion track of the current agent and the sample track; the sample track is a track obtained after clustering historical space-time motion tracks of other agents;
and combining preset weight, and performing weighted summation on the vertical distance, the horizontal distance and the included angle distance to serve as a final distance between the historical space-time motion trajectory of the current agent and the sample trajectory:
$$\mathrm{dist}(L_i, L_j) = \omega_\perp \cdot d_\perp(L_i, L_j) + \omega_\parallel \cdot d_\parallel(L_i, L_j) + \omega_\theta \cdot d_\theta(L_i, L_j)$$
[Figure FDA0003972473740000011: equation image, formula for $d_\perp$]
$$d_\parallel = \min(l_{\parallel 1}, l_{\parallel 2})$$
[Figure FDA0003972473740000012: equation image, formula for $d_\theta$]
wherein $\mathrm{dist}(L_i, L_j)$ denotes the final distance; $d_\perp(L_i, L_j)$, $d_\parallel(L_i, L_j)$ and $d_\theta(L_i, L_j)$ respectively denote the vertical distance, the horizontal distance and the included angle distance; $\omega_\perp$, $\omega_\parallel$ and $\omega_\theta$ respectively denote the weights corresponding to the vertical distance, the horizontal distance and the included angle distance; $L_i$ and $L_j$ respectively denote the sample trajectory and the historical spatio-temporal motion trajectory of the current agent; $l_{\perp 1}$ and $l_{\perp 2}$ each denote a distance from $L_j$ to $L_i$; $l_{\parallel 1}$ and $l_{\parallel 2}$ each denote the distance between a projection point of $L_j$ onto $L_i$ and the corresponding end point of $L_i$; and $\theta$ denotes the included angle between $L_i$ and $L_j$;
if the final distance is smaller than a set sample interval threshold value, the two trajectories are classified into one class;
step A30, calculating prior probability of the historical space-time motion trail of each agent of the confrontation party passing through a key location as a first probability:
calculating the prior probability of the historical space-time motion trail of each agent of the confrontation party passing through the key location through Bayesian model statistics;
step S30, calculating corresponding predicted maneuvering positions of the agents of the confrontation party according to the speeds of the agents and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
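The weighted trajectory distance and the threshold test of step A20 can be sketched as follows for two straight line segments. Only the weighted sum and $d_\parallel = \min(l_{\parallel 1}, l_{\parallel 2})$ come from the claim text; the forms used here for $d_\perp$ and $d_\theta$ are borrowed from the commonly used TRACLUS-style line segment distance and are assumptions, since the claim gives those two formulas only as equation images. All function names and default weights are hypothetical, and both segments are assumed to have non-zero length.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def _norm(a):
    return math.hypot(a[0], a[1])

def _project(p, seg):
    """Orthogonal projection of point p onto the line through segment seg."""
    s, e = seg
    d = _sub(e, s)
    t = _dot(_sub(p, s), d) / _dot(d, d)
    return (s[0] + t * d[0], s[1] + t * d[1])

def segment_distance(li, lj, w_perp=1.0, w_par=1.0, w_theta=1.0):
    """Weighted sum of vertical, horizontal and included-angle distances."""
    ps, pe = _project(lj[0], li), _project(lj[1], li)
    # Vertical distance (assumed form): contraharmonic mean of the two offsets.
    l_perp1, l_perp2 = _norm(_sub(lj[0], ps)), _norm(_sub(lj[1], pe))
    total = l_perp1 + l_perp2
    d_perp = (l_perp1 ** 2 + l_perp2 ** 2) / total if total > 0 else 0.0
    # Horizontal distance: MIN of projection-to-end-point distances (from the claim).
    d_par = min(_norm(_sub(ps, li[0])), _norm(_sub(pe, li[1])))
    # Included-angle distance (assumed form): |Lj| * sin(theta) below 90 degrees.
    vi, vj = _sub(li[1], li[0]), _sub(lj[1], lj[0])
    len_j = _norm(vj)
    cos_t = max(-1.0, min(1.0, _dot(vi, vj) / (_norm(vi) * len_j)))
    d_theta = len_j * math.sqrt(1.0 - cos_t ** 2) if cos_t > 0 else len_j
    return w_perp * d_perp + w_par * d_par + w_theta * d_theta

def same_class(li, lj, threshold, **weights):
    """Two trajectories fall into one class if their final distance is below the threshold."""
    return segment_distance(li, lj, **weights) < threshold

# Hypothetical parallel segments one unit apart.
sample = ((0.0, 0.0), (10.0, 0.0))
current = ((2.0, 1.0), (8.0, 1.0))
d = segment_distance(sample, current)
```

For these two segments the vertical part is 1, the horizontal part is 2 and the angle part is 0, so the final distance with unit weights is 3. A full density clustering pass would apply this distance between all pairs of trajectory segments.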
2. An adversary strategy inversion system, comprising an information acquisition module, a probability calculation module and a strategy inversion module;
the information acquisition module is configured to acquire the state information of each agent of the confrontation party in a visible range in real time as input information; the state information comprises ID, space-time motion trail, maneuvering state and speed;
the probability calculation module is configured to obtain, based on the input information and in combination with a pre-acquired first probability, the posterior prediction probability corresponding to each advancing route of each agent of the confrontation party through a deep belief network model;
the first probability is obtained by the following method:
step A10, collecting historical state information of each agent of the confrontation party;
step A20, performing track clustering on the historical state information in time sequence according to a preset density clustering algorithm; after clustering, taking as key locations the track points of each class whose number of track points is greater than a set number threshold; wherein performing track clustering on the historical state information in time sequence according to the preset density clustering algorithm comprises:
calculating the vertical distance, the horizontal distance and the included angle distance between the historical space-time motion track of the current agent and the sample track; the sample track is a track obtained after clustering historical space-time motion tracks of other agents;
and combining preset weight, and performing weighted summation on the vertical distance, the horizontal distance and the included angle distance to serve as a final distance between the historical space-time motion trajectory of the current agent and the sample trajectory:
$$\mathrm{dist}(L_i, L_j) = \omega_\perp \cdot d_\perp(L_i, L_j) + \omega_\parallel \cdot d_\parallel(L_i, L_j) + \omega_\theta \cdot d_\theta(L_i, L_j)$$
[Figure FDA0003972473740000031: equation image, formula for $d_\perp$]
$$d_\parallel = \min(l_{\parallel 1}, l_{\parallel 2})$$
[Figure FDA0003972473740000032: equation image, formula for $d_\theta$]
wherein $\mathrm{dist}(L_i, L_j)$ denotes the final distance; $d_\perp(L_i, L_j)$, $d_\parallel(L_i, L_j)$ and $d_\theta(L_i, L_j)$ respectively denote the vertical distance, the horizontal distance and the included angle distance; $\omega_\perp$, $\omega_\parallel$ and $\omega_\theta$ respectively denote the weights corresponding to the vertical distance, the horizontal distance and the included angle distance; $L_i$ and $L_j$ respectively denote the sample trajectory and the historical spatio-temporal motion trajectory of the current agent; $l_{\perp 1}$ and $l_{\perp 2}$ each denote a distance from $L_j$ to $L_i$; $l_{\parallel 1}$ and $l_{\parallel 2}$ each denote the distance between a projection point of $L_j$ onto $L_i$ and the corresponding end point of $L_i$; and $\theta$ denotes the included angle between $L_i$ and $L_j$;
if the final distance is smaller than a set sample interval threshold value, the two trajectories are classified into one class;
step A30, calculating prior probability of the historical space-time motion trail of each agent of the confrontation party passing through a key location as a first probability:
calculating the prior probability of the historical space-time motion trail of each agent of the confrontation party passing through a key place through Bayesian model statistics;
the strategy inversion module is configured to calculate a corresponding predicted maneuvering position of each agent of the confrontation party according to the speed of each agent and the advancing route with the maximum posterior prediction probability;
the first probability is the prior probability that the space-time motion trail of each agent of the confrontation party passes through a key place.
3. A storage device having stored therein a plurality of programs, wherein said programs are adapted to be loaded and executed by a processor to implement the adversary strategy inversion method of claim 1.
4. A processing device comprising a processor, a storage device; a processor adapted to execute programs; a storage device adapted to store a plurality of programs; wherein the program is adapted to be loaded and executed by a processor to implement the adversary strategy inversion method of claim 1.
CN202011586486.3A 2020-12-29 2020-12-29 Adversary strategy inversion method, system and device Active CN112529110B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011586486.3A CN112529110B (en) 2020-12-29 2020-12-29 Adversary strategy inversion method, system and device

Publications (2)

Publication Number Publication Date
CN112529110A CN112529110A (en) 2021-03-19
CN112529110B true CN112529110B (en) 2023-04-07

Family

ID=74976820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011586486.3A Active CN112529110B (en) 2020-12-29 2020-12-29 Adversary strategy inversion method, system and device

Country Status (1)

Country Link
CN (1) CN112529110B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant