US20110178978A1 - Characterizing and predicting agents via multi-agent evolution - Google Patents


Info

Publication number
US20110178978A1
Authority
US
United States
Prior art keywords
agent
agents
environment
behavior
simulation
Prior art date
Legal status
Abandoned
Application number
US13/079,766
Inventor
H. Van Dyke Parunak
Sven Brueckner
Robert S. Matthews
John A. Sauter
Steven M. Brophy
Robert J. Bisson
Current Assignee
Individual
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Priority to US13/079,766
Publication of US20110178978A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/004: Artificial life, i.e. computing arrangements simulating life
    • G06N 3/006: Artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Definitions

  • This invention relates generally to agent behavior and, in particular, to a system and method that characterizes an agent's internal state by evolution against observed behavior, and predicts future behavior, taking into account the dynamics of agents' interaction with their environment.
  • An agent's beliefs are propositions about the state of the world that it considers true, based on its perceptions. Its desires are propositions about the world that it would like to be true. Desires are not necessarily consistent with one another: an agent might desire both to be rich and not to work at the same time.
  • An agent's intentions, or goals, are a subset of its desires that it has selected, based on its beliefs, to guide its future actions. Unlike desires, goals must be consistent with one another (or at least believed to be consistent by the agent).
  • An agent's goals guide its actions. Thus one ought to be able to learn something about an agent's goals by observing its past actions, and knowledge of the agent's goals in turn enables conclusions about what the agent may do in the future.
  • Plan recognition is seldom pursued for its own sake. It usually supports a higher-level function. For example, in human-computer interfaces, recognizing a user's plan can enable the system to provide more appropriate information and options for user action. In a tutoring system, inferring the student's plan is a first step to identifying buggy plans and providing appropriate remediation. In many cases, the higher-level function is predicting likely future actions by the entity whose plan is being inferred.
  • Domains that exhibit these constraints can often be characterized as adversarial, and include military combat, competitive business tactics, and multi-player computer games.
  • the present invention comprises a method of predicting the behavior of software agents in a simulated environment.
  • the method involves modeling a plurality of software agents representing entities to be analyzed, which may be human beings.
  • the internal state of at least one of the agents is estimated by its behavior in the simulation, including its movement within the environment. This facilitates a prediction of the likely future behavior of the agent based solely upon its internal state; that is, without recourse to any intentional agent communications.
  • the simulated environment is based upon a digital pheromone infrastructure.
  • the digital pheromones are scalar variables that agents can sense and which they deposit at their current location in the environment. The agents respond to the local concentrations of the digital pheromones tropistically through climbing or descending local gradients.
  • the pheromone infrastructure runs on the nodes of a graph-structured environment, preferably a rectangular lattice. Each agent is capable of aggregating pheromone deposits from individual agents, thereby fusing information across multiple agents over time. Each agent is further capable of evaporating pheromones over time to remove inconsistencies that result from changes in the simulation, and diffusing pheromones to nearby places, thereby disseminating information for access by nearby agents.
  • this invention is capable of providing an estimate of the entity's internal state, and extrapolating that estimate into a prediction of the entity's likely future behavior.
  • the system and method called BEE (Behavioral Evolution and Extrapolation)
  • This simulation integrates knowledge of threat regions, a cognitive analysis of the agent's beliefs, desires, and intentions, a model of the agent's emotional disposition and state, and the dynamics of interactions with the environment.
  • By evolving agents in this rich environment their internal state can be fitted to their observed behavior. In realistic wargame scenarios, the system successfully detects deliberately played emotions and makes reasonable predictions about the entities' future behavior.
  • FIG. 2 is a diagram of the Behavioral Evolution and Extrapolation (BEE) Integrated Rational and Emotive Personality Model.
  • FIG. 3 is a graphical representation of an exemplary embodiment of the BEE model, wherein each avatar generates a stream of ghosts that sample the personality space of the entity it represents. These ghosts evolve against the observed behavior of the entity in the recent past, and the fittest ghosts then run into the future to generate predictions.
  • FIG. 4 is a Delta Disposition chart for a “Chicken's ghosts” embodiment.
  • FIG. 5 is a Delta Disposition chart for a “Rambo” embodiment.
  • FIG. 6 shows a table for evaluating predictions, where each row corresponds to a successive prediction for a given unit, and each column to a time in the real world that is covered by some set of these predictions.
  • the shaded cells show which predictions cover which time periods.
  • Each cell contains the location error, that is, the distance between where the unit actually is at the time indicated by the column and where the prediction indicated by the row said it would be.
  • FIG. 7 shows a graphic representation of path characteristics: angle α, straight-line radius ρ, and actual length λ.
  • FIG. 8 shows graphs for exemplary stepwise metrics, including, from left to right, average prospective, retrospective, and horizon error.
  • the thin line is the average of metrics from 100 random walks.
  • the vertical line indicates when the unit dies. Since these are error curves, lower is better.
  • FIG. 9 shows graphs for exemplary component metrics.
  • the thin line is the random baseline. Since these metrics indicate degree of agreement between prediction and baseline, higher is better.
  • the present system provides a Behavioral Evolution and Extrapolation (BEE) method and approach to addressing the recognition of the rational and emotional state of multiple interacting agents based solely on their behavior, without recourse to intentional communications from them.
  • embodiments of the present invention focus on plan recognition in support of prediction.
  • An agent's plan is a necessary input to a prediction of its future behavior, but hardly a sufficient one. At least two other influences, one internal and one external, need to be taken into account.
  • the external influence is the dynamics of the environment, which may include other agents.
  • the dynamics of the real world impose significant constraints.
  • the environment is autonomous (it may do things on its own that interfere with the desires of the agent) [3, 8].
  • Most interactions among agents, and between agents and the world, are nonlinear. When iterated, these can generate rapid divergence of trajectories (“chaos,” sensitivity to initial conditions).
  • BEE predicts the future by observing the emergent behavior of agents representing the entities of interest in a fine-grained agent simulation.
  • Key elements of the BEE architecture include the model of an individual agent, the pheromone infrastructure through which agents interact, the information sources that guide them, and the overall evolutionary cycle that they execute.
  • the agents in BEE are inspired by two bodies of work. The first is our own previous work on fine-grained agents that coordinate their actions stigmergically, through digital pheromones in a shared environment [1, 11, 13, 14, 16]. The second inspiration is the success of previous agent-based combat modeling in EINSTein and MAUI.
  • Digital pheromones are scalar variables that agents deposit at their current location in the environment, and that they can sense. Agents respond to the local concentrations of these variables tropistically, typically climbing or descending local gradients. Their movements in turn change the deposit patterns. This feedback loop, together with processes of evaporation and propagation in the environment, can support complex patterns of interaction and coordination among the agents [12].
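The deposit, evaporation, propagation, and gradient-climbing loop can be sketched on a one-dimensional field. The 1-D layout, the rate constants, and the API are illustrative assumptions, not the patent's implementation.

```python
class PheromoneField:
    """Minimal 1-D digital pheromone field (illustrative sketch only)."""

    def __init__(self, size, evaporation=0.1, propagation=0.05):
        self.levels = [0.0] * size
        self.evaporation = evaporation   # fraction lost per step
        self.propagation = propagation   # fraction diffused to each neighbor

    def deposit(self, cell, strength):
        self.levels[cell] += strength

    def step(self):
        # Evaporate everywhere, then propagate a share to each neighbor.
        old = [v * (1.0 - self.evaporation) for v in self.levels]
        new = [v * (1.0 - 2 * self.propagation) for v in old]
        for i, v in enumerate(old):
            share = v * self.propagation
            if i > 0:
                new[i - 1] += share
            if i < len(old) - 1:
                new[i + 1] += share       # edge cells lose the outward share
        self.levels = new

    def climb(self, cell):
        """Tropistic move: step toward the higher-concentration neighbor."""
        left = self.levels[cell - 1] if cell > 0 else -1.0
        right = self.levels[cell + 1] if cell < len(self.levels) - 1 else -1.0
        if max(left, right) <= self.levels[cell]:
            return cell                   # already at a local peak
        return cell - 1 if left > right else cell + 1

field = PheromoneField(10)
field.deposit(7, 5.0)        # an agent marks its location
field.step()                 # evaporation and propagation spread the signal
agent_cell = 5
for _ in range(3):           # another agent climbs the resulting gradient
    agent_cell = field.climb(agent_cell)
```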
  • Table 1 shows the pheromone flavors currently used in the BEE.
  • ghosts take into account their distance from distinguished static locations, a mechanism that we call “virtual pheromones,” since it has the same effect as propagating a pheromone field from such a location, but with lower computational costs.
  • EINSTein [5] represents an agent as a set of six weights, each in [−1, 1], describing the agent's response to six kinds of information. Four of these describe the number of alive friendly, alive enemy, injured friendly, and injured enemy troops within the agent's sensor range. The other two weights relate to the model's use of a childhood game, "capture the flag," as a prototype of combat. Each team has a flag, and seeks to protect it from the other team while capturing the other team's flag. The fifth and sixth weights describe how far the agent is from its own and its adversary's flag. A positive weight indicates that the agent is attracted to the entity described by the weight, while a negative weight indicates that it is repelled.
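The six-weight personality can be illustrated as a simple weighted scoring function. This is a sketch, not EINSTein's code: the weight names, the observation dictionary, and the use of negated distance for flag attraction are illustrative assumptions.

```python
def einstein_move_utility(weights, observation):
    """Score a candidate situation by six EINSTein-style personality weights.

    weights: six values in [-1, 1]; positive = attraction, negative = repulsion.
    observation: troop counts in sensor range plus distances to the two flags.
    Names here are illustrative, not EINSTein's own identifiers.
    """
    score = 0.0
    score += weights["alive_friendly"] * observation["alive_friendly"]
    score += weights["alive_enemy"] * observation["alive_enemy"]
    score += weights["injured_friendly"] * observation["injured_friendly"]
    score += weights["injured_enemy"] * observation["injured_enemy"]
    # Flags: attraction should grow as distance shrinks, so score -distance.
    score += weights["own_flag"] * -observation["own_flag_distance"]
    score += weights["enemy_flag"] * -observation["enemy_flag_distance"]
    return score

# A flag-seeking personality prefers positions closer to the enemy flag.
weights = {"alive_friendly": 0.2, "alive_enemy": -0.5, "injured_friendly": 0.1,
           "injured_enemy": 0.3, "own_flag": 0.1, "enemy_flag": 0.9}
near = {"alive_friendly": 0, "alive_enemy": 0, "injured_friendly": 0,
        "injured_enemy": 0, "own_flag_distance": 8, "enemy_flag_distance": 2}
far = dict(near, enemy_flag_distance=6)
```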
  • MANA [7] extends the concepts in EINSTein. Friendly and enemy flags are replaced by the waypoints being pursued by each side. MANA adds further components, including low-, medium-, and high-threat enemies. In addition, it defines a set of triggers (e.g., reaching a waypoint, being shot at, making contact with the enemy, being injured) that shift the agent from one personality vector to another. A default state defines the personality vector when no trigger state is active.
  • the personality vectors in MANA and EINSTein reflect both rational and emotive aspects of decision-making.
  • the notion of being attracted or repelled by friendly or adversarial forces in various states of health is an important component of what we informally think of as emotion (e.g., fear, compassion, aggression), and the use of the term “personality” in both EINSTein and MANA suggests that the system designers are thinking anthropomorphically, though they do not use “emotion” to describe the effect they are trying to achieve.
  • the notion of waypoints to which an agent is attracted reflects goal-oriented rationality.
  • BEE embodies an integrated rational-emotive personality model.
  • a BEE agent's rationality is modeled as a vector of seven desires, which are values in [−1, +1]: ProtectRed (the adversary), ProtectBlue (friendly forces), ProtectGreen (civilians), ProtectKeySites, AvoidCombat, AvoidDetection, and Survive. Negative values reverse the sense suggested by the label. For example, a negative value of ProtectRed indicates a desire to harm Red.
  • Table 2 shows which pheromones A(ttract) or R(epel) an agent with a given desire, and how that tendency translates into action.
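The desire-to-pheromone mapping can be sketched as a lookup table combined with the desire vector. The specific (desire, flavor) entries below are illustrative assumptions and do not reproduce the patent's Table 2.

```python
# Illustrative sketch: combine a BEE-style desire vector with an assumed
# desire -> pheromone attract/repel table (NOT the patent's actual Table 2).
# +1: the desire attracts the agent toward the pheromone; -1: it repels.
RESPONSE = {
    ("ProtectBlue", "BlueThreat"): +1,   # move toward threatened friends
    ("AvoidCombat", "RedForce"): -1,     # stay away from the adversary
    ("Survive", "RedThreat"): -1,        # avoid regions of danger
}

def net_response(desires, flavor):
    """Net attraction (+) or repulsion (-) toward one pheromone flavor."""
    total = 0.0
    for name, value in desires.items():
        sign = RESPONSE.get((name, flavor), 0)
        total += sign * value            # negative desires reverse the sense
    return total

desires = {"ProtectBlue": 0.8, "AvoidCombat": 0.5, "Survive": 0.9}
```

A ghost would evaluate `net_response` for each flavor at each neighboring cell and combine the results into its movement decision.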
  • the emotive component of a BEE's personality is based on the Ortony-Clore-Collins (OCC) framework [9], and described in detail elsewhere [10].
  • An important advance in BEE's emotional model with respect to MANA and EINSTein is the recognition that agents may differ in how sensitive they are to triggers. For example, threatening situations tend to stimulate the emotion of fear, but a given level of threat will produce more fear in a new recruit than in a seasoned combat veteran.
  • the present model includes not only Emotions, but Dispositions. Each Emotion has a corresponding Disposition. Dispositions are relatively stable, and considered constant over the time horizon of a run of the BEE, while Emotions vary based on the agent's disposition and the stimuli to which it is exposed.
  • the effect of a non-zero emotion is to modify actions.
  • An elevated level of Anger will increase movement likelihood, weapon firing likelihood, and tendency toward an exposed posture.
  • An increasing level of Fear will decrease these likelihoods.
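The modulation of action likelihoods by emotion can be sketched as follows. The linear shift and the `gain` parameter are assumptions for illustration; the text specifies only the directions of the effects (Anger raises the likelihoods, Fear lowers them).

```python
def action_likelihoods(base, anger=0.0, fear=0.0, gain=0.5):
    """Shift movement/firing/exposure likelihoods with emotion levels.

    Anger increases the likelihoods and Fear decreases them, as in the
    text; the linear form and the gain value are illustrative assumptions.
    """
    def clamp(p):
        return max(0.0, min(1.0, p))     # keep results valid probabilities
    shift = gain * (anger - fear)
    return {action: clamp(p + shift) for action, p in base.items()}

base = {"move": 0.5, "fire": 0.3, "exposed_posture": 0.2}
angry = action_likelihoods(base, anger=0.6)
afraid = action_likelihoods(base, fear=0.6)
```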
  • FIG. 2 summarizes one embodiment of the BEE's personality model.
  • the left two columns are a straightforward BDI model (where we prefer the term “goal” to “intention”).
  • the right-hand column is the emotive component, where an appraisal of the agent's beliefs, moderated by the disposition, leads to an emotion that in turn influences the BDI analysis.
  • a major innovation in BEE is an extension of the nonlinear systems technique described herein to characterize agents based on their past behavior and extrapolate their future behavior based on this characterization. This section describes this process at a high level, then discusses in more detail the multi-page pheromone infrastructure that implements it.
  • FIG. 3 is an overview of one embodiment of the BEE process.
  • Each active entity in the battlespace has an avatar that continuously generates a stream of ghost agents representing itself. Ghosts live on a timeline indexed by τ that begins in the past at the insertion horizon and runs into the future to the prediction horizon. τ is offset with respect to the current time t in the domain being modeled.
  • the timeline is divided into discrete "pages," each representing a successive value of τ.
  • each ghost's behavioral parameters (desires and dispositions) are sampled from distributions to explore alternative personalities of the entity it represents.
  • the fittest ghosts have three functions:
  • fittest ghosts are bred genetically and their offspring are reintroduced at the insertion horizon to continue the fitting process.
  • the fittest ghosts for each entity form the basis for a population of ghosts that are allowed to run past the avatar's present into the future.
  • Each ghost that is allowed to run into the future explores a different possible future of the battle, analogous to how some people plan ahead by mentally simulating different ways that a situation might unfold. Analysis of the behaviors of these different possible futures yields predictions.
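The evolve-and-extrapolate loop above can be compressed into a toy sketch. Here a ghost's entire "personality" is a single speed parameter for a unit moving in one dimension, fitness is agreement with the observed trajectory, and the fittest ghosts breed and run into the future; all of these simplifications are illustrative assumptions, since BEE evolves full desire and disposition vectors in a pheromone environment.

```python
import random

def evolve_ghosts(observed, n_ghosts=32, generations=20, rng=None):
    """Toy ghost cycle: fit a 1-D 'personality' (a speed) to an observed
    trajectory, then extrapolate the fittest ghost into the future."""
    rng = rng or random.Random(0)
    ghosts = [rng.uniform(-2.0, 2.0) for _ in range(n_ghosts)]

    def fitness(speed):
        # Negative total error of the replayed past against observation.
        return -sum(abs(speed * t - x) for t, x in enumerate(observed))

    for _ in range(generations):
        ghosts.sort(key=fitness, reverse=True)
        fittest = ghosts[: n_ghosts // 4]           # survivors breed...
        children = [rng.choice(fittest) + rng.gauss(0.0, 0.1)
                    for _ in range(n_ghosts - len(fittest))]
        ghosts = fittest + children                 # ...and are reinserted

    best = max(ghosts, key=fitness)
    horizon = len(observed) + 5
    return [best * t for t in range(len(observed), horizon)]

observed = [0.0, 1.0, 2.0, 3.0]                     # unit moving at speed 1
prediction = evolve_ghosts(observed)
```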
  • Domain time t is the current time in the domain being modeled. This time may be the same as real-world time, if BEE is being applied to a real-world situation. In our current experiments, we apply BEE to a battle taking place in a simulator, the OneSAF Test Bed (OTB), and domain time is the time stamp published by OTB. During actual runs, OTB is often paused, so domain time runs slower than real time. When we replay logs from simulation runs, we can speed them up so that domain time runs faster than real time.
  • BEE time τ for a specific page records the domain time corresponding to the state of the world represented on that page, and is offset from the current domain time.
  • Shift time is incremented every time the ghosts move from one page to the next.
  • the relation between shift time and real time depends on the processing resources available.
  • BEE must operate very rapidly in order to keep pace with an ongoing evolution of a battle or other complex situation.
  • each pheromone flavor over the environment forms a scalar field that represents some aspect of the state of the world at an instant in time.
  • Each page of the timeline discussed in the previous section is a complete pheromone field for the world at the BEE time τ represented by that page.
  • the behavior of the pheromones on each page depends on whether the page represents the past or the future.
  • Execution of the pheromone infrastructure proceeds on two time scales, running in separate threads.
  • the first thread updates the book of pages each time the domain time advances past the next page boundary. At each step:
  • the former “now+1” page is replaced with a new current page, whose pheromones correspond to the locations and strengths of observed units;
  • the second thread moves the ghosts from one page to the next, as fast as the processor allows.
  • the system computes the next state of each page, including executing the actions elected by the ghosts, and (in future pages) evaporating pheromones and recording new deposits from the recently arrived ghosts.
  • Ghost movement based on pheromone gradients is a very simple process, so this system can support realistic agent populations without excessive computer load.
  • each avatar generates eight ghosts per shift. Since there are about 50 entities in the battlespace (about 20 units each of Red and Blue and about 5 of Green), we must support about 400 ghosts per page, or about 24000 over the entire book.
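The first thread's page update can be sketched structurally. The `PageBook` class, page counts, and the representation of a page as a dictionary of pheromone strengths are all assumptions; only the sliding-window behavior (drop the insertion-horizon page, replace the former "now + 1" page with fresh observations) comes from the text.

```python
from collections import deque

class PageBook:
    """Structural sketch of BEE's book of pages (details assumed)."""

    def __init__(self, past_pages, future_pages):
        # Index past_pages - 1 is "now"; later entries are future pages.
        self.pages = deque([{} for _ in range(past_pages + future_pages)])
        self.now = past_pages - 1

    def advance(self, observed_pheromones):
        """Domain time crossed a page boundary: slide the window forward."""
        self.pages.popleft()       # insertion-horizon page ages out
        self.pages.append({})      # open a blank prediction-horizon page
        # The page that was "now + 1" has shifted into the "now" slot;
        # replace it with pheromones built from the observed units.
        self.pages[self.now] = dict(observed_pheromones)

book = PageBook(past_pages=3, future_pages=2)
book.advance({"RedAlive": 1.0})
```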
  • the integration of AI-based plan recognition into BEE is motivated by the recognition that prediction requires not only analysis of an entity's intentions, but also of its internal emotional state and the dynamics it experiences in interacting with the environment. While plan recognition alone is not sufficient for effective prediction, it is a valuable input.
  • a Bayes net is dynamically configured based on heuristics to identify the likely goals that each entity may hold. This process is known as KIP (Knowledge-based Intention Projection).
  • the destinations of these goals function as “virtual pheromones.” As described below, ghosts include their distance to such points in their action decisions, achieving the result of gradient following without the computational expense of maintaining a pheromone field.
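A virtual pheromone can be sketched as a function that computes, directly from distance, the strength a propagated field would have at the ghost's position. The exponential-decay form and parameters are illustrative assumptions; the point is that no per-cell field needs to be maintained.

```python
import math

def virtual_pheromone_strength(pos, site, deposit=10.0, decay=0.5):
    """Strength a pheromone field propagated from `site` would have at
    `pos`, computed on demand from distance instead of being stored on
    the lattice. The exponential-decay form is an assumption."""
    d = math.dist(pos, site)
    return deposit * math.exp(-decay * d)
```

A ghost can fold this value into its movement decision exactly as it would a sensed pheromone concentration, achieving gradient following toward the goal destination without field maintenance.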
  • BEE has been tested in a series of experiments in which human wargamers make decisions that are played out in a real-time battlefield simulator.
  • the commander for each side (Red and Blue) has at his disposal a team of pucksters, human operators who set waypoints for individual units in the simulator. Each puckster is responsible for four to six units.
  • the simulator moves the units, determines firing actions, and resolves the outcome of conflicts.
  • FIG. 4 shows the delta disposition for each of the eight fittest ghosts at each time step, plotted against the time step in seconds, for a unit played as a Chicken in an actual run. The values clearly trend negative.
  • FIG. 5 shows a similar plot for a Rambo. Units played with an aggressive personality tend to die very soon, and often do not give their ghosts enough time to evolve a clear picture of their personality, but in this case the positive Delta Disposition is clearly evident before the unit's demise.
  • Table 4 shows the percentages of emotional units detected in a recent series of experiments.
  • a Rambo is never called a Chicken, and examination of the logs for the one case where a Chicken is called a Rambo shows that in fact the unit was being played aggressively, rushing toward oncoming Blue forces. Because the brave die young, we almost never detect units played intentionally as Rambos.
  • Each ghost that runs into the future generates a possible future path that its unit might follow.
  • the set of such paths for all ghosts embodies a number of distinct predictions, including the most or least likely future, the future that poses the greatest or least risk to the opposite side, the future that poses the greatest or least risk to one's own side, and so forth.
  • the future whose ghost received the most guidance from pheromones in the environment at each step along the way is selected. In this sense, it is the most likely future.
  • Assessing the accuracy of these predictions requires a set of metrics, and a baseline against which they can be compared.
  • two sets of metrics may be used. One set evaluates predictions in terms of their individual steps. The other examines several characteristics of an entire prediction.
  • the step-wise evaluations are based on the structure summarized schematically in FIG. 6 .
  • Each row in the matrix is a successive prediction.
  • Each column describes a real-world time step.
  • a given cell records the distance between where the row's prediction indicated the unit would be at the column's time, and where it actually was.
  • the figure shows how these cells can be averaged meaningfully to yield three different measures: the prospective accuracy of a single prediction issued at a point in time, the retrospective accuracy of all predictions concerning a given point in time, or the offset accuracy showing how predictions vary as a function of look-ahead depth.
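The three averaging directions can be sketched over a small error matrix. The matrix shapes and the use of `None` for uncovered cells are illustrative assumptions; the three averages (per prediction, per real-world time, per look-ahead depth) follow the FIG. 6 description.

```python
def stepwise_metrics(errors, issue_times):
    """Average a prediction-error matrix three ways (FIG. 6 style).

    errors[i][j]: location error of prediction i at real-world time j,
    or None where prediction i does not cover time j. issue_times[i] is
    the time at which prediction i was issued.
    """
    def mean(vals):
        vals = [v for v in vals if v is not None]
        return sum(vals) / len(vals) if vals else None

    n_rows, n_cols = len(errors), len(errors[0])
    prospective = [mean(row) for row in errors]                 # per prediction
    retrospective = [mean([errors[i][j] for i in range(n_rows)])
                     for j in range(n_cols)]                    # per time
    horizon = {}                                                # per look-ahead
    for i in range(n_rows):
        for j in range(n_cols):
            if errors[i][j] is not None:
                horizon.setdefault(j - issue_times[i], []).append(errors[i][j])
    horizon = {d: mean(v) for d, v in sorted(horizon.items())}
    return prospective, retrospective, horizon

# Two predictions issued at times 0 and 1, each covering two time steps.
errors = [[1.0, 2.0, None], [None, 1.0, 3.0]]
prospective, retrospective, horizon = stepwise_metrics(errors, [0, 1])
```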
  • the second set of metrics is based on characteristics of an entire prediction.
  • FIG. 7 summarizes three such characteristics of a path (whether real or predicted): the overall angle α it subtends, the straight-line radius ρ from start to end, and the actual length λ integrated along the path.
  • a fourth characteristic of interest is the number of time intervals τ during which the unit was moving. Each of these four values provides a basis of comparison between a prediction and a unit's actual movement (or between any two paths).
  • AScore (Angle Score). The angle score is (with angles expressed in degrees) AScore = 1 − Min(α, 360 − α)/180.
  • RScore (Range Score). Let ρp be the straight-line distance from the current position to the end of the prediction, and ρa the straight-line distance for the actual path. RScore indicates what percentage the shorter distance is of the longer.
  • LScore (Length Score). Let λp be the sum of path segment distances for the prediction, and λa the sum of path segment distances for the actual path. LScore indicates what percentage the shorter path length is of the longer. Special logic returns an LScore of 0 if just one of the lengths is 0, and 1 if both are 0.
  • TScore (Time Score). Let τp be the number of minutes that the unit is predicted to move, and τa the number of minutes that it actually moves. TScore indicates what percentage the shorter time is of the longer. Special logic returns a TScore of 0 if just one of the times is 0, and 1 if both are 0.
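The four scores can be sketched directly from the definitions above. Reading the AScore formula's α as the angular difference between predicted and actual headings is an interpretive assumption; the Min/Max ratio with the zero special cases follows the LScore and TScore descriptions.

```python
def ratio_score(p, a):
    """Percentage the smaller value is of the larger (used for RScore,
    LScore, and TScore), with the text's special logic for zeros."""
    if p == 0 and a == 0:
        return 1.0
    if p == 0 or a == 0:
        return 0.0
    return min(p, a) / max(p, a)

def ascore(angle_pred, angle_actual):
    """Angle score: 1 - Min(d, 360 - d)/180 for angular difference d,
    reading the text's alpha as that difference (an assumption)."""
    d = abs(angle_pred - angle_actual) % 360
    return 1.0 - min(d, 360 - d) / 180.0
```

For a prediction and an actual path, RScore is `ratio_score(rho_p, rho_a)`, LScore is `ratio_score(lambda_p, lambda_a)`, and TScore is `ratio_score(tau_p, tau_a)`.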
  • a random-walk predictor can be implemented. This process starts at a unit's current location, then takes 30 random steps.
  • a random step consists of picking a random number uniformly distributed between 0 and 120 indicating the next cell to move to in an 11-by-11 grid with the current position at the center. (The grid was size 11 because the BEE movement model allows the ghosts to move from 0 to 5 cells in the x and y directions at each step.)
  • the random prediction is generated 100 times, and each of these runs is used to generate one of the metrics discussed above.
  • the baseline reported is the average of these 100 instances.
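The random-walk baseline can be sketched as follows. The row-major mapping of the uniform draw in 0..120 onto the 11-by-11 grid of offsets is an assumed detail; the text specifies only the draw range, the grid, and the 30 steps.

```python
import random

def random_walk_prediction(start, steps=30, rng=None):
    """Baseline predictor: each step draws uniformly from 0..120 to pick
    one of the 121 cells of an 11x11 grid centered on the current
    position (offsets -5..5 in x and y; mapping is an assumption)."""
    rng = rng or random.Random()
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        cell = rng.randint(0, 120)              # uniform over the grid
        dx, dy = cell % 11 - 5, cell // 11 - 5  # row-major cell -> offset
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

path = random_walk_prediction((0, 0), rng=random.Random(1))
```

Generating 100 such paths and scoring each with the metrics above yields the averaged baseline the text describes.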
  • FIG. 8 illustrates the three stepwise metrics for a single unit in a single run.
  • BEE was able to formulate good predictions, which are superior to the baseline in all three metrics. It is particularly encouraging that the horizon error increases so gradually. In a complex nonlinear system, trajectories may diverge at some point, making prediction physically impossible. One would expect to see a discontinuity in the horizon error if the system were reaching this limit. The gentle increase of the horizon error suggests that we are not near this position.
  • FIG. 9 illustrates the four component metrics for the same unit and the same run. In general, these metrics support the conclusion that these predictions are superior to the baseline, and make clear which characteristics of the prediction are most reliable.
  • the BEE architecture lends itself to extension in several areas.
  • the various inputs being integrated by the BEE are only an example of the kinds of information that can be handled.
  • the basic principle of using a dynamical simulation to integrate a wide range of influences can be extended to other inputs as well, requiring much less additional engineering than more traditional ways of reasoning about how different knowledge sources combine to shape an agent's behavior.
  • the initial limited repertoire of emotions is a small subset of those that have been distinguished by psychologists, and that might be useful for understanding and projecting behavior.
  • the set of emotions and supporting dispositions that BEE can detect can be extended.
  • mapping between an agent's psychological (cognitive and emotional) state and its outward behavior is not one-to-one.
  • Several different internal states might be consistent with a given observed behavior under one set of environmental conditions, but might yield distinct behaviors under other conditions. If the environment in the recent past is one that confounds such distinct internal states, one will be unable to distinguish them, and if the environment shifts to a condition in which they yield different behaviors, any predictions will suffer.

Abstract

A method of predicting the behavior of software agents in a simulated environment involving modeling a plurality of software agents representing entities to be analyzed, which may be human beings. Using a set of parameters that governs the behavior of the agents, the internal state of at least one of the agents is estimated by its behavior in the simulation, including its movement within the environment. This facilitates a prediction of the likely future behavior of the agent based solely upon its internal state; that is, without recourse to any intentional agent communications. In one embodiment, the simulated environment is based upon a digital pheromone infrastructure. The simulation integrates knowledge of threat regions, a cognitive analysis of the agent's beliefs, desires, and intentions, a model of the agent's emotional disposition and state, and the dynamics of interactions with the environment.

Description

  • This application is a continuation of and claims priority to U.S. application Ser. No. 11/548,909, which claims benefit of and priority to U.S. Provisional Application No. 60/725,854, filed Oct. 12, 2005, and is entitled to that filing date for priority. The specification, figures and complete disclosures of U.S. Provisional Application No. 60/725,854 and application Ser. No. 11/548,909 are incorporated herein by specific reference for all purposes.
  • This application is based in part upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. NBCHC040153. Any opinions, findings and conclusions or recommendations expressed in this material are those of the inventors and do not necessarily reflect the views of DARPA or the Department of Interior-National Business Center (DOI-NBC). Distribution Statement "A" (Approved for Public Release, Distribution Unlimited).
  • FIELD OF INVENTION
  • This invention relates generally to agent behavior and, in particular, to a system and method that characterizes an agent's internal state by evolution against observed behavior, and predicts future behavior, taking into account the dynamics of agents' interaction with their environment.
  • BACKGROUND OF THE INVENTION
  • Reasoning about agents that we observe in the world must integrate two disparate levels. Our observations are often limited to the agent's external behavior, which can frequently be summarized numerically as a trajectory in space-time (perhaps punctuated by actions from a fairly limited vocabulary). However, this behavior is driven by the agent's internal state, which (in the case of a human) may involve high-level psychological and cognitive concepts such as intentions and emotions. A central challenge in many application domains is reasoning from external observations of an agent's behavior to an estimate of its internal state. Such reasoning is motivated by a desire to predict the agent's behavior. Work to date focuses almost entirely on recognizing the rational state (as opposed to the emotional state) of a single agent (as opposed to an interacting community), and frequently takes advantage of explicit communications between agents (as in managing conversational protocols).
  • It is increasingly common in agent theory to describe the cognitive state of an agent in terms of its beliefs, desires, and intentions (the so-called “BDI” model [4, 15]). An agent's beliefs are propositions about the state of the world that it considers true, based on its perceptions. Its desires are propositions about the world that it would like to be true. Desires are not necessarily consistent with one another: an agent might desire both to be rich and not to work at the same time. An agent's intentions, or goals, are a subset of its desires that it has selected, based on its beliefs, to guide its future actions. Unlike desires, goals must be consistent with one another (or at least believed to be consistent by the agent).
  • An agent's goals guide its actions. Thus one ought to be able to learn something about an agent's goals by observing its past actions, and knowledge of the agent's goals in turn enables conclusions about what the agent may do in the future.
  • There is a considerable body of work in the AI and multi-agent community on reasoning from an agent's actions to the goals that motivate them. This process is known as “plan recognition” or “plan inference.” A recent survey is available at [2]. This body of work is rich and varied. It covers both single-agent and multi-agent (e.g., robot soccer team) plans, intentional vs. non-intentional actions, speech vs. non-speech behavior, adversarial vs. cooperative intent, complete vs. incomplete world knowledge, and correct vs. faulty plans, among other dimensions.
  • Plan recognition is seldom pursued for its own sake. It usually supports a higher-level function. For example, in human-computer interfaces, recognizing a user's plan can enable the system to provide more appropriate information and options for user action. In a tutoring system, inferring the student's plan is a first step to identifying buggy plans and providing appropriate remediation. In many cases, the higher-level function is predicting likely future actions by the entity whose plan is being inferred.
  • Many realistic problems deviate from these conditions:
      • Increasing the number of agents leads to a combinatorial explosion of possibilities that can swamp conventional analysis.
      • The dynamics of the environment can frustrate the intentions of an agent.
      • The agents often are trying to hide their intentions (and even their presence), rather than intentionally sharing information.
      • An agent's emotional state may be at least as important as its rational state in determining its behavior.
  • Domains that exhibit these constraints can often be characterized as adversarial, and include military combat, competitive business tactics, and multi-player computer games.
  • SUMMARY OF INVENTION
  • In various embodiments, the present invention comprises a method of predicting the behavior of software agents in a simulated environment. The method involves modeling a plurality of software agents representing entities to be analyzed, which may be human beings. Using a set of parameters that governs the behavior of the agents, the internal state of at least one of the agents is estimated from its behavior in the simulation, including its movement within the environment. This facilitates a prediction of the likely future behavior of the agent based solely upon its internal state; that is, without recourse to any intentional agent communications.
  • In one embodiment, the simulated environment is based upon a digital pheromone infrastructure. The digital pheromones are scalar variables that agents can sense and which they deposit at their current location in the environment. The agents respond to the local concentrations of the digital pheromones tropistically, climbing or descending local gradients. The pheromone infrastructure runs on the nodes of a graph-structured environment, preferably a rectangular lattice. Each node is capable of aggregating pheromone deposits from individual agents, thereby fusing information across multiple agents over time. Each node is further capable of evaporating pheromones over time, to remove inconsistencies that result from changes in the simulation, and of diffusing pheromones to nearby places, thereby disseminating information for access by nearby agents.
  • By reasoning from an entity's observed behavior, this invention is capable of providing an estimate of the entity's internal state, and extrapolating that estimate into a prediction of the entity's likely future behavior. The system and method, called BEE (Behavioral Evolution and Extrapolation), performs these and other tasks using a faster-than-real-time simulation of lightweight swarming agents, coordinated through digital pheromones. This simulation integrates knowledge of threat regions, a cognitive analysis of the agent's beliefs, desires, and intentions, a model of the agent's emotional disposition and state, and the dynamics of interactions with the environment. By evolving agents in this rich environment, their internal state can be fitted to their observed behavior. In realistic wargame scenarios, the system successfully detects deliberately played emotions and makes reasonable predictions about the entities' future behavior.
  • DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a graphic model of a tracking nonlinear dynamical system wherein a=system state space; b=system trajectory over time; c=recent measurements of system state; and d=short-range prediction.
  • FIG. 2 is a diagram of a Behavioral Evolution and Extrapolation (BEE) Integrated Rational and Emotive Personality Model.
  • FIG. 3 is a graphical representation of an exemplary embodiment of the BEE model, wherein each avatar generates a stream of ghosts that sample the personality space of the entity it represents. The ghosts evolve against the observed behavior of the entity in the recent past, and the fittest ghosts then run into the future to generate predictions.
  • FIG. 4 is a Delta Disposition chart for a “Chicken's Ghosts” embodiment.
  • FIG. 5 is a Delta Disposition chart for a “Rambo” embodiment.
  • FIG. 6 shows a table for evaluating predictions, where each row corresponds to a successive prediction for a given unit, and each column to a time in the real world that is covered by some set of these predictions. The shaded cells show which predictions cover which time periods. Each cell (a) contains the location error, that is, how far the unit is at the time indicated by the column from where the prediction indicated by the row said it would be. One can average these errors across a single prediction (b) to estimate the prospective accuracy of a single prediction, across a single time (c) to estimate the retrospective accuracy of all previous predictions referring to a given time, or across a given offset from the start of the prediction (d) to estimate the horizon error, i.e., how prediction accuracy varies with look-ahead depth.
  • FIG. 7 shows a graphic representation of path characteristics: angle θ, straight-line radius ρ, and actual length λ.
  • FIG. 8 shows graphs for exemplary stepwise metrics, including, from left to right, average prospective, retrospective, and horizon error. The thin line is the average of metrics from 100 random walks. The vertical line indicates when the unit dies. Since these are error curves, lower is better.
  • FIG. 9 shows graphs for exemplary component metrics. The thin line is the random baseline. Since these metrics indicate degree of agreement between prediction and baseline, higher is better.
  • DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
  • In one exemplary embodiment, the present system provides a Behavioral Evolution and Extrapolation (BEE) method and approach to addressing the recognition of the rational and emotional state of multiple interacting agents based solely on their behavior, without recourse to intentional communications from them. It is inspired by techniques used to predict the behavior of nonlinear dynamical systems, in which a representation of the system is continually fit to its recent past behavior. In such analysis of nonlinear dynamical systems, the representation takes the form of a closed form mathematical equation. In BEE, it takes the form of a set of parameters governing the behavior of software agents representing the individuals being analyzed.
  • In contrast to previous research in AI (plan recognition) and nonlinear dynamical systems (trajectory prediction), embodiments of the present invention focus on plan recognition in support of prediction. An agent's plan is a necessary input to a prediction of its future behavior, but hardly a sufficient one. At least two other influences, one internal and one external, need to be taken into account.
  • The external influence is the dynamics of the environment, which may include other agents. The dynamics of the real world impose significant constraints. The environment is autonomous (it may do things on its own that interfere with the desires of the agent) [3, 8]. Most interactions among agents, and between agents and the world, are nonlinear. When iterated, these can generate rapid divergence of trajectories (“chaos,” sensitivity to initial conditions).
  • A rational analysis of an agent's goals may enable one to predict what it will attempt, but any nontrivial plan with several steps will depend sensitively at each step on the reaction of the environment, and predictions must take this into account as well. Actual simulation of futures is one way to deal with these influences.
  • In the case of human agents, an internal influence also comes into play. The agent's emotional state can modulate its decision process and its focus of attention (and thus its perception of the environment). In extreme cases, emotion can lead an agent to choose actions that from the standpoint of a logical analysis may appear irrational.
  • Current work on plan recognition for prediction focuses on the rational plan, and does not take into account either external environmental influences or internal emotional biases. BEE integrates all three elements into its predictions.
  • Real-Time Fitting in Nonlinear Systems Analysis
  • Many systems of interest can be described in terms of a vector of real numbers that changes as a function of time. The dimensions of the vector define the system's state space. One typically analyzes such systems as vector differential equations, e.g., dx/dt=ƒ(x).
  • When ƒ is nonlinear, the system can be formally chaotic, and starting points arbitrarily close to one another can lead to trajectories that diverge exponentially rapidly, becoming uncorrelated. Long-range prediction of the behavior of such a system is impossible in principle. However, it is often useful to anticipate the system's behavior a short distance into the future. To do so, a common technique is to fit a convenient functional form for ƒ to the system's trajectory in the recent past, and then extrapolate this fit into the future, as seen in FIG. 1. [6] This process is repeated constantly, in real time, providing the user with a limited look-ahead into the system's future.
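  • The fit-and-extrapolate loop described above can be sketched as follows. This is an illustrative Python sketch only: the low-order polynomial stands in for "a convenient functional form for ƒ," and the sample data are assumptions; the invention itself fits agent behavioral parameters with a genetic algorithm rather than a closed-form equation.

```python
import numpy as np

def fit_and_extrapolate(history, degree=2, steps_ahead=5):
    """Fit a convenient functional form (here, a low-order polynomial)
    to the system's recent trajectory by least squares, then extrapolate
    that fit a short distance into the future."""
    t = np.arange(len(history), dtype=float)
    coeffs = np.polyfit(t, history, degree)          # least-squares fit
    future_t = np.arange(len(history), len(history) + steps_ahead, dtype=float)
    return np.polyval(coeffs, future_t)              # short-range prediction

# Repeated constantly, in real time: refit as each new measurement arrives.
recent = [0.0, 0.9, 2.1, 2.9, 4.2, 5.0]              # recent measurements (hypothetical)
prediction = fit_and_extrapolate(recent, degree=1, steps_ahead=3)
```

Because the underlying dynamics may be chaotic, only the first few extrapolated points are trustworthy; the constant refitting is what keeps the look-ahead useful.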
  • While this approach is robust and widely applied, it requires systems that can efficiently be described in terms of mathematical equations that can be fit using optimization methods such as least squares. BEE applies this approach to agent behaviors, which it fits to observed behavior using a genetic algorithm.
  • Architecture
  • BEE predicts the future by observing the emergent behavior of agents representing the entities of interest in a fine-grained agent simulation. Key elements of the BEE architecture include the model of an individual agent, the pheromone infrastructure through which agents interact, the information sources that guide them, and the overall evolutionary cycle that they execute.
  • Agent Model
  • The agents in BEE are inspired by two bodies of work. The first is our own previous work on fine-grained agents that coordinate their actions stigmergically, through digital pheromones in a shared environment [1, 11, 13, 14, 16]. The second inspiration is the success of previous agent-based combat modeling in EINSTein and MANA.
  • Digital pheromones are scalar variables that agents deposit at their current location in the environment, and that they can sense. Agents respond to the local concentrations of these variables tropistically, typically climbing or descending local gradients. Their movements in turn change the deposit patterns. This feedback loop, together with processes of evaporation and propagation in the environment, can support complex patterns of interaction and coordination among the agents [12]. Table 1 shows the pheromone flavors currently used in the BEE. In addition, ghosts take into account their distance from distinguished static locations, a mechanism that we call “virtual pheromones,” since it has the same effect as propagating a pheromone field from such a location, but with lower computational costs.
  • TABLE 1
    PHEROMONE FLAVORS IN RAID
    RedAlive        Emitted by a living or dead entity of
    RedCasualty     the appropriate group
    BlueAlive       (Red = enemy, Blue = friendly, Green = neutral)
    BlueCasualty
    GreenAlive
    GreenCasualty
    WeaponsFire     Emitted by a firing weapon
    KeySite         Emitted by a site of particular importance to Red
    Cover           Emitted by locations that afford cover from fire
    Mobility        Emitted by roads and other structures that enhance agent mobility
    RedThreat       Determined by an external process
    BlueThreat
  • The use of agents to model combat is inspired by EINSTein and MANA. EINSTein [5] represents an agent as a set of six weights, each in [−1, 1], describing the agent's response to six kinds of information. Four of these describe the number of alive friendly, alive enemy, injured friendly, and injured enemy troops within the agent's sensor range. The other two weights relate to the model's use of a childhood game, "capture the flag," as a prototype of combat. Each team has a flag, and seeks to protect it from the other team while capturing the other team's flag. The fifth and sixth weights describe how far the agent is from its own and its adversary's flag. A positive weight indicates that the agent is attracted to the entity described by the weight, while a negative weight indicates that it is repelled.
  • MANA [7] extends the concepts in EINSTein. Friendly and enemy flags are replaced by the waypoints being pursued by each side. MANA adds further components, including enemies at low, medium, and high threat levels. In addition, it defines a set of triggers (e.g., reaching a waypoint, being shot at, making contact with the enemy, being injured) that shift the agent from one personality vector to another. A default state defines the personality vector when no trigger state is active.
  • The personality vectors in MANA and EINSTein reflect both rational and emotive aspects of decision-making. The notion of being attracted or repelled by friendly or adversarial forces in various states of health is an important component of what we informally think of as emotion (e.g., fear, compassion, aggression), and the use of the term “personality” in both EINSTein and MANA suggests that the system designers are thinking anthropomorphically, though they do not use “emotion” to describe the effect they are trying to achieve. The notion of waypoints to which an agent is attracted reflects goal-oriented rationality.
  • BEE embodies an integrated rational-emotive personality model. In one embodiment, a BEE agent's rationality is modeled as a vector of seven desires, which are values in [−1, +1]: ProtectRed (the adversary), ProtectBlue (friendly forces), ProtectGreen (civilians), ProtectKeySites, AvoidCombat, AvoidDetection, and Survive. Negative values reverse the sense suggested by the label. For example, a negative value of ProtectRed indicates a desire to harm Red.
  • Table 2 shows which pheromones A(ttract) or R(epel) an agent with a given desire, and how that tendency translates into action.
  • The emotive component of a BEE's personality is based on the Ortony-Clore-Collins (OCC) framework [9], and described in detail elsewhere [10]. OCC define emotions as "valenced reactions to agents, states, or events in the environment." This notion of reaction is captured in MANA's trigger states. An important advance in BEE's emotional model with respect to MANA and EINSTein is the recognition that agents may differ in how sensitive they are to triggers. For example, threatening situations tend to stimulate the emotion of fear, but a given level of threat will produce more fear in a new recruit than in a seasoned combat veteran. Thus, the present model includes not only Emotions, but Dispositions. Each Emotion has a corresponding Disposition. Dispositions are relatively stable, and considered constant over the time horizon of a run of the BEE, while Emotions vary based on the agent's disposition and the stimuli to which it is exposed.
  • Based on interviews with military domain experts we identified the two most crucial emotions for combat behavior as Anger (with the corresponding disposition Irritability) and Fear (whose disposition is Cowardice). Table 3 shows which pheromones trigger which emotions. Emotions are modeled as agent hormones (internal pheromones) that are augmented in the presence of the triggering environmental condition and evaporate over time.
  • TABLE 3
    INTERACTIONS OF PHEROMONES AND DISPOSITIONS/EMOTIONS
                    Red Perspective           Blue Perspective          Green Perspective
                    Irritability/ Cowardice/  Irritability/ Cowardice/  Irritability/ Cowardice/
    Pheromone       Anger         Fear        Anger         Fear        Anger         Fear
    RedAlive                                  X             X
    RedCasualty     X             X
    BlueAlive       X             X                                     X             X
    BlueCasualty                              X             X
    GreenCasualty                             X             X           X             X
    WeaponsFire     X             X           X             X           X             X
    KeySites        X             X
  • The effect of a non-zero emotion is to modify actions. An elevated level of Anger will increase movement likelihood, weapon firing likelihood, and tendency toward an exposed posture. An increasing level of Fear will decrease these likelihoods.
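  • The disposition-moderated emotion dynamic described above can be sketched as follows. The update rule, rates, and gain are illustrative assumptions; the text specifies only that emotions are augmented in the presence of their triggers, evaporate over time, and modulate action likelihoods.

```python
def update_emotion(emotion, disposition, stimulus, evaporation=0.9):
    """One time step of an emotion modeled as an internal hormone: it
    evaporates geometrically and is augmented by the triggering stimulus,
    scaled by the agent's fixed disposition. Rates are illustrative."""
    return emotion * evaporation + disposition * stimulus

def move_probability(base, anger, fear, gain=0.5):
    """Elevated Anger increases movement likelihood; Fear decreases it."""
    p = base + gain * (anger - fear)
    return min(1.0, max(0.0, p))

# A seasoned veteran (low Cowardice) and a new recruit (high Cowardice)
# exposed to the same sustained WeaponsFire stimulus:
veteran_fear = recruit_fear = 0.0
for _ in range(10):
    veteran_fear = update_emotion(veteran_fear, disposition=0.2, stimulus=1.0)
    recruit_fear = update_emotion(recruit_fear, disposition=0.9, stimulus=1.0)
```

The same level of threat produces more Fear in the recruit than in the veteran, which is the behavior the Disposition mechanism is meant to capture.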
  • FIG. 2 summarizes one embodiment of the BEE's personality model. The left two columns are a straightforward BDI model (where we prefer the term “goal” to “intention”). The right-hand column is the emotive component, where an appraisal of the agent's beliefs, moderated by the disposition, leads to an emotion that in turn influences the BDI analysis.
  • The BEE Cycle
  • A major innovation in BEE is an extension of the nonlinear systems technique described herein to characterize agents based on their past behavior and extrapolate their future behavior based on this characterization. This section describes this process at a high level, then discusses in more detail the multi-page pheromone infrastructure that implements it.
  • Overview
  • FIG. 3 is an overview of one embodiment of the BEE process. Each active entity in the battlespace has an avatar that continuously generates a stream of ghost agents representing itself. Ghosts live on a timeline indexed by τ that begins in the past at the insertion horizon and runs into the future to the prediction horizon. τ is offset with respect to the current time t in the domain being modeled. The timeline is divided into discrete “pages,” each representing a successive value of τ. The avatar inserts the ghosts at the insertion horizon. In our current system, the insertion horizon is at τ−t=−30, meaning that ghosts are inserted into a page representing the state of the world 30 minutes ago. At the insertion horizon, each ghost's behavioral parameters (desires and dispositions) are sampled from distributions to explore alternative personalities of the entity it represents.
  • Each page between the insertion horizon and τ=t ("now," the page corresponding to the state of the world at the current domain time) records the historical state of the world at the point in the past to which it corresponds. As ghosts move from page to page, they interact with this past state, based on their behavioral parameters. These interactions mean that their fitness depends not just on their own actions, but also on the behaviors of the rest of the population, which is also evolving. Because τ advances faster than domain time t, eventually τ=t (actual time). At this point, each ghost is evaluated based on its location compared with the actual location of its corresponding real-world entity.
  • The fittest ghosts have three functions:
  • 1. The personality of the fittest ghost for each entity is reported to the rest of the system as the likely personality of the corresponding entity. This information enables us to characterize individual warriors as unusually cowardly or brave.
  • 2. The fittest ghosts are bred genetically and their offspring are reintroduced at the insertion horizon to continue the fitting process.
  • 3. The fittest ghosts for each entity form the basis for a population of ghosts that are allowed to run past the avatar's present into the future. Each ghost that is allowed to run into the future explores a different possible future of the battle, analogous to how some people plan ahead by mentally simulating different ways that a situation might unfold. Analysis of the behaviors of these different possible futures yields predictions.
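  • The evaluate-and-breed step above can be sketched as follows. The fitness measure (closeness to the entity's observed location) follows the text; the uniform-crossover and Gaussian-mutation operators, and all names, are illustrative assumptions, since no particular genetic scheme is fixed here.

```python
import random

def fitness(ghost_pos, actual_pos):
    """Fitness when a ghost reaches the tau = t page: the closer its
    location to its entity's observed location, the fitter it is."""
    return -abs(ghost_pos - actual_pos)

def breed(parents, n_children, mutation=0.05):
    """Offspring personalities (desire/disposition vectors): uniform
    crossover of two sampled parents plus Gaussian mutation, with each
    component clamped to [-1, +1]."""
    children = []
    for _ in range(n_children):
        a, b = random.sample(parents, 2)
        child = [(x if random.random() < 0.5 else y) + random.gauss(0.0, mutation)
                 for x, y in zip(a, b)]
        children.append([max(-1.0, min(1.0, v)) for v in child])
    return children
```

The offspring are what gets reintroduced at the insertion horizon to continue the fitting process.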
  • A review of this process shows that BEE has three distinct notions of time, all of which may be distinct from real-world time.
  • 1. Domain time t is the current time in the domain being modeled. This time may be the same as real-world time, if BEE is being applied to a real-world situation. In our current experiments, we apply BEE to a battle taking place in a simulator, the OneSAF Test Bed (OTB), and domain time is the time stamp published by OTB. During actual runs, OTB is often paused, so domain time runs slower than real time. When we replay logs from simulation runs, we can speed them up so that domain time runs faster than real time.
  • 2. BEE time τ for a specific page records the domain time corresponding to the state of the world represented on that page, and is offset from the current domain time.
  • 3. Shift time is incremented every time the ghosts move from one page to the next. The relation between shift time and real time depends on the processing resources available.
  • Pheromone Infrastructure
  • BEE must operate very rapidly in order to keep pace with an ongoing evolution of a battle or other complex situation. Thus we use simple agents coordinated using pheromone mechanisms. We have described the basic dynamics of our pheromone infrastructure elsewhere [1]. This infrastructure runs on the nodes of a graph-structured environment (in the case of BEE, a rectangular lattice). Each node maintains a scalar value for each flavor of pheromone, and provides three functions:
  • 1. It aggregates deposits from individual agents, fusing information across multiple agents and through time.
  • 2. It evaporates pheromones over time. This dynamic is an innovative alternative to traditional truth maintenance in artificial intelligence. Traditionally, knowledge bases remember everything they are told unless they have a reason to forget something, and expend large amounts of computation in the NP-complete problem of reviewing their holdings to detect inconsistencies that result from changes in the domain being modeled. Pheromone-based systems, by contrast, immediately begin to forget everything they learn, unless it is continually reinforced. Thus inconsistencies automatically remove themselves within a known period.
  • 3. It diffuses pheromones to nearby places, disseminating information for access by nearby agents.
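  • The three node functions can be sketched as follows. This is a minimal Python sketch: the Place class and the evaporation and diffusion constants are illustrative assumptions, not values taken from the embodiment.

```python
class Place:
    """One node of the graph-structured environment. Each node keeps a
    scalar per pheromone flavor and supports the three functions above."""
    EVAPORATION = 0.9      # fraction of each pheromone retained per step
    DIFFUSION = 0.1        # fraction of the retained amount shared per neighbor

    def __init__(self):
        self.pheromones = {}       # flavor -> scalar strength
        self.neighbors = []        # adjacent Places (4 on a rectangular lattice)

    def deposit(self, flavor, amount):
        # 1. Aggregate deposits from individual agents, fusing information
        #    across multiple agents and through time.
        self.pheromones[flavor] = self.pheromones.get(flavor, 0.0) + amount

    def step(self):
        outflow = {}
        for flavor, strength in self.pheromones.items():
            # 2. Evaporate: unreinforced information fades, so
            #    inconsistencies remove themselves within a known period.
            kept = strength * self.EVAPORATION
            share = kept * self.DIFFUSION
            self.pheromones[flavor] = kept - share * len(self.neighbors)
            outflow[flavor] = share
        # 3. Diffuse: disseminate information for access by nearby agents.
        for nbr in self.neighbors:
            for flavor, share in outflow.items():
                nbr.deposit(flavor, share)
```

With one deposit of 1.0 on a node, a single step leaves 0.81 locally (0.9 kept, minus 0.09 diffused to one neighbor) and 0.09 on the neighbor, illustrating both forgetting and spreading.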
  • The distribution of each pheromone flavor over the environment forms a scalar field that represents some aspect of the state of the world at an instant in time. Each page of the timeline discussed in the previous section is a complete pheromone field for the world at the BEE time τ represented by that page. The behavior of the pheromones on each page depends on whether the page represents the past or the future.
  • In pages representing the future (τ>t), the usual pheromone mechanisms apply. Ghosts deposit pheromone each time they move to a new page, and pheromones evaporate and propagate from one page to the next.
  • In pages representing the domain past (τ≤t), one has an observed state of the real world. This has two consequences for pheromone management. First, we can generate the pheromone fields directly from the observed locations of individual entities, so there is no need for the ghosts to make deposits. Second, we can adjust the pheromone intensities based on the changed locations of entities from page to page, so we do not need to evaporate or propagate the pheromones. Both of these simplifications reflect the fact that in our current system, we have complete knowledge of the past. When we introduce noise and uncertainty, we will probably need to introduce dynamic pheromones in the past as well as the future.
  • Execution of the pheromone infrastructure proceeds on two time scales, running in separate threads.
  • The first thread updates the book of pages each time the domain time advances past the next page boundary. At each step:
  • 1. The former “now+1” page is replaced with a new current page, whose pheromones correspond to the locations and strengths of observed units;
  • 2. An empty page is added at the prediction horizon; and
  • 3. The oldest page is discarded, since it has passed the insertion horizon.
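  • The three steps of this page-book update can be sketched as follows, treating a page as an opaque pheromone-field object. The function and argument names are illustrative assumptions.

```python
def advance_page_book(book, now, observed_page, empty_page):
    """One first-thread update as domain time crosses a page boundary.
    'book' lists pages from the insertion horizon (index 0) to the
    prediction horizon (last index); 'now' indexes the tau = t page."""
    book.pop(0)                   # 3. oldest page passes the insertion horizon
    book[now] = observed_page     # 1. former "now+1" page becomes the new current
                                  #    page, built from observed unit locations
    book.append(empty_page)       # 2. empty page added at the prediction horizon
    return book
```

The book therefore stays a fixed length while sliding forward one page per domain-time step.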
  • The second thread moves the ghosts from one page to the next, as fast as the processor allows. At each step:
  • 1. Ghosts reaching the τ=t page are evaluated for fitness and removed or evolved;
  • 2. New ghosts from the avatars and from the evolutionary process are inserted at the insertion horizon;
  • 3. A population of ghosts based on the fittest ghosts is inserted at τ=t to run into the future;
  • 4. Ghosts that have moved beyond the prediction horizon are removed;
  • 5. All ghosts plan their next actions based on the pheromone field in the pages they currently occupy;
  • 6. The system computes the next state of each page, including executing the actions elected by the ghosts, and (in future pages) evaporating pheromones and recording new deposits from the recently arrived ghosts.
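  • The second-thread shift can be condensed into a single loop body. This sketch covers only the bookkeeping of steps 1-4 over a ghost population keyed by page index; the fitness evaluation, breeding, planning, and pheromone updates are represented by stand-in callables, and all names are illustrative assumptions.

```python
def shift_step(ghosts, now_index, horizon_index, evaluate, spawn):
    """One shift: ghosts is a dict mapping page index -> list of ghosts.
    'evaluate' keeps the fittest arrivals and returns (fittest, offspring);
    'spawn' produces fresh ghosts from the avatars."""
    # 1. Ghosts reaching the tau = t page are evaluated and removed/evolved.
    arrived = ghosts.pop(now_index, [])
    fittest, offspring = evaluate(arrived)
    # 2. Offspring and fresh avatar ghosts enter at the insertion horizon.
    ghosts.setdefault(0, []).extend(offspring + spawn())
    # 3. Copies of the fittest ghosts run past "now" into the future.
    ghosts.setdefault(now_index + 1, []).extend(list(fittest))
    # 4. Ghosts beyond the prediction horizon are removed.
    for idx in [i for i in ghosts if i > horizon_index]:
        del ghosts[idx]
    # 5.-6. Each remaining ghost would now plan and move one page forward
    #       (movement and pheromone dynamics elided in this sketch).
    return ghosts

pop = {5: ["g1", "g2", "g3"], 9: ["stale"]}
evaluate = lambda arrived: (arrived[:1], arrived[:1])   # keep/clone the first
shift_step(pop, now_index=5, horizon_index=8, evaluate=evaluate, spawn=lambda: ["new"])
```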
  • Ghost movement based on pheromone gradients is a very simple process, so this system can support realistic agent populations without excessive computer load. In our current system, each avatar generates eight ghosts per shift. Since there are about 50 entities in the battlespace (about 20 units each of Red and Blue and about 5 of Green), we must support about 400 ghosts per page, or about 24000 over the entire book.
  • How fast a processor do we need? Let p be the real-time duration of a page in seconds. If each page represents 60 seconds of domain time, and we are replaying a simulation at 2× domain time, p=30. Let n be the number of pages between the insertion horizon and τ=t. In our current system, n=30. Then a shift rate of n/p shifts per second will permit ghosts to run from the insertion horizon to the current time at least once before a new page is generated. Empirically, we have found this rate a reasonable lower bound for acceptable performance, and easily achievable on stock WinTel platforms.
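  • The sizing arithmetic above can be checked directly; the values are the ones given in the text, and the function name is an assumption.

```python
def min_shift_rate(page_domain_seconds, replay_speedup, n_pages):
    """Minimum shifts per second so that ghosts can traverse the pages
    from the insertion horizon to tau = t at least once per new page:
    p = real-time seconds per page, and the required rate is n/p."""
    p = page_domain_seconds / replay_speedup
    return n_pages / p

rate = min_shift_rate(60, 2.0, 30)    # the text's example: p = 30, n = 30
ghosts_per_page = 50 * 8              # ~50 entities, 8 ghosts per avatar per shift
total_ghosts = ghosts_per_page * 60   # 60-page book -> ghosts over the whole book
```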
  • Information Sources
  • The flexibility of the BEE's pheromone infrastructure permits the integration of numerous information sources as input to our characterizations of entity personalities and predictions of their future behavior. Our current system draws on three sources of information, but others can readily be added.
  • Real-world observations.—Observations from the real world are encoded into the pheromone field each increment of BEE time, as a new “current page” is generated. Table 1 identifies the entities that generate each flavor of pheromone.
  • Statistical estimates of threat regions.—An independent process (known as SAD (Statistical Anomaly Detection) developed by Rafael Alonso, Hua Li, and John Asmuth at Sarnoff Corporation) uses statistical techniques to estimate the level of threat to each force (Red or Blue), based on the topology of the battlefield and the known disposition of forces. For example, a broad open area with no cover is particularly threatening, especially if the opposite force occupies its margins. The results of this process are posted to the pheromone pages as "RedThreat" pheromone (representing a threat to Red) and "BlueThreat" pheromone (representing a threat to Blue).
  • AI-based plan recognition.—BEE is motivated by the recognition that prediction requires not only analysis of an entity's intentions, but also its internal emotional state and the dynamics it experiences externally in interacting with the environment. While plan recognition is not sufficient for effective prediction, it is a valuable input. In the current system, a Bayes net is dynamically configured based on heuristics to identify the likely goals that each entity may hold. This process is known as KIP (Knowledge-based Intention Projection). The destinations of these goals function as “virtual pheromones.” As described below, ghosts include their distance to such points in their action decisions, achieving the result of gradient following without the computational expense of maintaining a pheromone field.
  • Experimental Results
  • BEE has been tested in a series of experiments in which human wargamers make decisions that are played out in a real-time battlefield simulator. The commander for each side (Red and Blue) has at his disposal a team of pucksters, human operators who set waypoints for individual units in the simulator. Each puckster is responsible for four to six units. The simulator moves the units, determines firing actions, and resolves the outcome of conflicts.
  • Fitting Dispositions
  • To test the system's ability to fit personalities based on behavior, one Red puckster responsible for four units was designated the "emotional" puckster. His instructions were to select two of his units to be cowardly ("chickens") and two to be irritable ("Rambos"). He did not disclose this assignment during the run. He was to move each unit according to the commander's orders until the unit encountered circumstances that would trigger the emotion associated with the unit's disposition. Then he would manipulate chickens as though they were fearful (typically avoiding combat and moving away from Blue), and would move Rambos into combat as quickly as possible.
  • It has been found that the difference between the two disposition values (Cowardice-Irritability) of the fittest ghosts is a better indicator of the emotional state of the corresponding entity than either value by itself.
  • FIG. 4 shows the delta disposition for each of the eight fittest ghosts at each time step, plotted against the time step in seconds, for a unit played as a Chicken in an actual run. The values clearly trend negative.
  • FIG. 5 shows a similar plot for a Rambo. Units played with an aggressive personality tend to die very soon, and often do not give their ghosts enough time to evolve a clear picture of their personality, but in this case the positive Delta Disposition is clearly evident before the unit's demise.
  • To distill such a series of points into a characterization of a unit's personality, we maintain an 800-second exponentially weighted moving average of the Delta Disposition, and declare the unit to be a Chicken or Rambo if this value passes a negative or positive threshold, respectively. Currently, this threshold is set at 0.25. Other filters may be used. For example, a rapid rate of increase enhances the likelihood of calling a unit a Rambo; units that seek to avoid detection and avoid combat are more readily called Chickens.
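  • A sketch of this filter follows. Only the ~800-second window and the 0.25 threshold come from the text; the particular exponential-smoothing form, the sampling interval, and the function name are assumptions.

```python
import math

def classify(deltas, dt, window=800.0, threshold=0.25):
    """Exponentially weighted moving average of a Delta Disposition
    series sampled every dt seconds, with an ~window-second time
    constant, followed by a symmetric threshold test."""
    alpha = 1.0 - math.exp(-dt / window)   # per-sample smoothing weight
    ewma = 0.0
    for d in deltas:
        ewma += alpha * (d - ewma)
    if ewma > threshold:
        return "Rambo"
    if ewma < -threshold:
        return "Chicken"
    return "Undetermined"
```

A sustained negative Delta Disposition crosses the negative threshold and yields a Chicken call; a short or noisy series stays Undetermined, which matches the difficulty of calling short-lived Rambos.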
  • Table 4 shows the percentages of emotional units detected in a recent series of experiments. A Rambo is never called a Chicken, and examination of the logs for the one case where a Chicken is called a Rambo shows that in fact the unit was being played aggressively, rushing toward oncoming Blue forces. Because the brave die young, we almost never detect units played intentionally as Rambos.
  • TABLE 4
    EXPERIMENTAL RESULTS ON FITTING DISPOSITIONS (16 runs)
                 Called Correctly   Called Incorrectly   Not Called
    Chickens     68%                 5%                  27%
    Rambos        5%                 0%                  95%
  • In addition to these results on units intentionally played as emotional, there are a number of cases where other units were detected as cowardly or brave. Analysis of the behavior of these units shows that these characterizations were appropriate: units that flee in the face of enemy forces or weapons fire are detected as Chickens, while those that stand their ground or rush the adversary are denominated as Rambos.
  • Integrated Predictions
  • Each ghost that runs into the future generates a possible future path that its unit might follow. The set of such paths for all ghosts embodies a number of distinct predictions, including the most or least likely future, the future that poses the greatest or least risk to the opposite side, the future that poses the greatest or least risk to one's own side, and so forth. In the experiments reported here, the future whose ghost receives the most guidance from pheromones in the environment was selected at each step along the way. In this sense, it is the most likely future.
  • Assessing the accuracy of these predictions requires a set of metrics, and a baseline against which they can be compared.
  • Metrics for Predictions
  • In one embodiment, two sets of metrics may be used. One set evaluates predictions in terms of their individual steps. The other examines several characteristics of an entire prediction.
  • The step-wise evaluations are based on the structure summarized schematically in FIG. 6. Each row in the matrix is a successive prediction. Each column describes a real-world time step. A given cell records the distance between where the row's prediction indicated the unit would be at the column's time, and where it actually was.
  • The figure shows how these cells can be averaged meaningfully to yield three different measures: the prospective accuracy of a single prediction issued at a point in time, the retrospective accuracy of all predictions concerning a given point in time, or the offset accuracy showing how predictions vary as a function of look-ahead depth.
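  • As an illustrative sketch of this averaging scheme (names and conventions are assumptions, not from the patent), the three measures can be computed from the prediction-error matrix of FIG. 6 as follows:

```python
def stepwise_measures(errors):
    """errors[i][j] is the distance between where prediction i said the
    unit would be at real-world time step j and where it actually was;
    None marks steps a prediction does not cover.  Returns the three
    averages of FIG. 6: prospective (per prediction, i.e. per row),
    retrospective (per time step, i.e. per column), and offset accuracy
    (per look-ahead depth j - i)."""
    def mean(xs):
        xs = [x for x in xs if x is not None]
        return sum(xs) / len(xs) if xs else None

    n_cols = len(errors[0])
    prospective = [mean(row) for row in errors]
    retrospective = [mean(row[j] for row in errors) for j in range(n_cols)]
    by_offset = {}
    for i, row in enumerate(errors):
        for j, e in enumerate(row):
            if e is not None:
                by_offset.setdefault(j - i, []).append(e)
    offset = {d: mean(v) for d, v in sorted(by_offset.items())}
    return prospective, retrospective, offset
```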
  • The second set of metrics is based on characteristics of an entire prediction. FIG. 7 summarizes three such characteristics of a path (whether real or predicted): the overall angle θ it subtends, the straight-line radius ρ from start to end, and the actual length λ integrated along the path. A fourth characteristic of interest is the number of time intervals τ during which the unit was moving. Each of these four values provides a basis of comparison between a prediction and a unit's actual movement (or between any two paths).
  • AScore (Angle Score).—Let θp be the angle associated with the prediction, and θa the angle associated with the unit's actual path over the period covered by the prediction. Let Δθ=|θp−θa|. The angle score is (with angles expressed in degrees) AScore=1−Min(Δθ, 360−Δθ)/180.
  • If Δθ=0, AScore=1. If Δθ=180, AScore=0. The average of a set of random predictions will produce a score approaching 0.5.
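  • A minimal sketch of the angle score in Python (the function name is illustrative, and the modulo guard is an added safeguard for inputs outside [0, 360)):

```python
def ascore(theta_p, theta_a):
    """Angle score: 1 for identical headings, 0 for opposite headings.
    Angles are in degrees; min(delta, 360 - delta) measures the angular
    difference the short way around the circle."""
    delta = abs(theta_p - theta_a) % 360
    return 1.0 - min(delta, 360.0 - delta) / 180.0
```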
  • RScore (Range Score).—Let ρp be the straight-line distance from the current position to the end of the prediction, and ρa the straight-line distance for the actual path. The range score is: RScore=1.0−|ρp−ρa|/Max(ρp, ρa).
  • If the prediction is perfect, ρp = ρa, and RScore=1. If the ranges are different, RScore gives the percentage that the shorter range is of the longer one. Special logic returns an RScore of 0 if just one of the ranges is 0, and 1 if both are 0.
  • LScore (Length Score).—Let λp be the sum of path segment distances for the prediction, and λa the sum of path segment distances for the actual path. The length score is: LScore=1.0−|λp−λa|/Max(λp, λa).
  • If the prediction is perfect, λp = λa, and LScore=1. If both lengths are non-zero, LScore indicates what percentage the shorter path length is of the longer path length. Special logic returns an LScore of 0 if just one of the lengths is 0, and 1 if both are 0.
  • TScore (Time Score).—Let τp be the number of minutes that the unit is predicted to move, and τa the number of minutes that it actually moves. The time score is: TScore=1.0−|τp−τa|/Max(τp, τa).
  • If the prediction is perfect, τp = τa, and TScore=1. If both times are non-zero, TScore indicates what percentage the shorter time is of the longer time. Special logic returns a TScore of 0 if just one of the times is 0, and 1 if both are 0.
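  • RScore, LScore, and TScore all share the same form, so a single helper can illustrate all three, including the special-case logic for zero values (an illustrative sketch; the function name is not from the patent):

```python
def ratio_score(predicted, actual):
    """Shared form of RScore, LScore, and TScore: the shorter value
    expressed as a fraction of the longer.  Special cases: 1 if both
    values are 0 (a perfect prediction of no movement), 0 if exactly
    one value is 0."""
    if predicted == 0 and actual == 0:
        return 1.0
    if predicted == 0 or actual == 0:
        return 0.0
    return 1.0 - abs(predicted - actual) / max(predicted, actual)
```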
  • Baseline
  • As a baseline for comparison, a random-walk predictor can be implemented. This process starts at a unit's current location, then takes 30 random steps. A random step consists of picking a random integer r, uniformly distributed between 0 and 120, indicating the next cell to move to in an 11-by-11 grid with the current position at the center. (The grid was size 11 because the BEE movement model allows the ghosts to move from 0 to 5 cells in the x and y directions at each step.) The random integer r is translated into x and y steps, Δx and Δy, using the equations Δx = ⌊r/11⌋ − 5, Δy = (r mod 11) − 5.
  • To compile a baseline, the random prediction is generated 100 times, and each run is scored with the metrics discussed above. The baseline reported is the average over these 100 instances.
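  • The random-walk baseline described above can be sketched as follows (illustrative Python; the metric argument stands for any of the scores defined earlier, and the names are assumptions):

```python
import random

def random_walk(x0, y0, n_steps=30, max_step=5):
    """One random-walk prediction: each step draws r uniformly from the
    (2*max_step + 1)**2 cells of the grid centred on the current
    position (0..120 for the 11-by-11 grid) and decodes it into x and y
    offsets in -max_step..+max_step."""
    side = 2 * max_step + 1
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(n_steps):
        r = random.randrange(side * side)   # uniform integer on 0..120
        dx = r // side - max_step           # floor division, as in the text
        dy = r % side - max_step
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

def baseline(metric, x0, y0, n_runs=100):
    """Average a path metric over 100 independent random-walk predictions."""
    return sum(metric(random_walk(x0, y0)) for _ in range(n_runs)) / n_runs
```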
  • EXAMPLES
  • FIG. 8 illustrates the three stepwise metrics for a single unit in a single run. In the case of this unit, BEE was able to formulate good predictions, which are superior to the baseline in all three metrics. It is particularly encouraging that the horizon error increases so gradually. In a complex nonlinear system, trajectories may diverge at some point, making prediction physically impossible. One would expect to see a discontinuity in the horizon error if the system were reaching this limit. The gentle increase of the horizon error suggests that we are not near this position.
  • FIG. 9 illustrates the four component metrics for the same unit and the same run. In general, these metrics support the conclusion that these predictions are superior to the baseline, and make clear which characteristics of the prediction are most reliable.
  • The BEE architecture lends itself to extension in several areas. The inputs integrated by BEE in these experiments are only examples of the kinds of information that can be handled. The basic principle of using a dynamical simulation to integrate a wide range of influences extends naturally to other inputs, and requires far less additional engineering than more traditional ways of reasoning about how different knowledge sources jointly affect an agent's behavior.
  • The initial limited repertoire of emotions is a small subset of those that have been distinguished by psychologists, and that might be useful for understanding and projecting behavior. The set of emotions and supporting dispositions that BEE can detect can be extended.
  • The mapping between an agent's psychological (cognitive and emotional) state and its outward behavior is not one-to-one. Several different internal states might be consistent with a given observed behavior under one set of environmental conditions, but might yield distinct behaviors under other conditions. If the environment in the recent past is one that confounds such distinct internal states, one will be unable to distinguish them, and if the environment shifts to a condition in which they yield different behaviors, any predictions will suffer. One can probe the real world, perturbing it in ways that would stimulate distinct behaviors from entities whose psychological state is otherwise indistinguishable. BEE's faster-than-real-time simulation can allow the user to identify appropriate probing actions, greatly increasing the effectiveness of intelligence efforts.
  • While BEE has been developed in the context of adversarial reasoning in urban warfare, it is applicable in a much wider range of applications, including computer games, business strategy, and sensor fusion.
  • Thus, it should be understood that the embodiments and examples described herein have been chosen and described in order to best illustrate the principles of the invention and its practical applications to thereby enable one of ordinary skill in the art to best utilize the invention in various embodiments and with various modifications as are suited for particular uses contemplated. Even though specific embodiments of this invention have been described, they are not to be taken as exhaustive. There are several variations that will be apparent to those skilled in the art.

Claims (20)

1. A method of predicting the behavior of an agent in an environment, comprising the steps of:
executing a computer simulation of an environment including a plurality of software agents;
estimating the internal state of at least one of the agents based upon its behavior in the simulation, including its movement within the environment; and
predicting the likely future behavior of the agent based upon the estimate of its internal state.
2. The method of claim 1, wherein the agent's internal state is estimated by examining changes in the agent's observed behavior.
3. The method of claim 1, wherein the agent's internal state is estimated in conjunction with a model of the environment.
4. The method of claim 1, wherein the prediction of the agent's future behavior is based in part on the agent's interaction with the environment.
5. The method of claim 1, wherein the agents represent human beings.
6. The method of claim 1, wherein the simulated environment comprises digital pheromones.
7. The method of claim 6, wherein the digital pheromones are scalar variables that agents can sense and which they deposit at their current location in the environment.
8. The method of claim 7, wherein the agents respond to the local concentrations of the digital pheromones tropistically through climbing or descending local gradients.
9. The method of claim 6, wherein the pheromones run on the nodes of a graph-structured environment.
10. The method of claim 9, wherein the graph-structured environment is a rectangular lattice.
11. The method of claim 6, wherein each agent is capable of aggregating pheromone deposits from individual agents, thereby fusing information across multiple agents over time.
12. The method of claim 6, wherein each agent is capable of evaporating pheromones over time to remove inconsistencies that result from changes in the simulation.
13. The method of claim 6, wherein each agent is capable of diffusing pheromones to nearby places, thereby disseminating information for access by nearby agents.
14. The method of claim 6, wherein the movements of the agents change their deposit patterns.
15. The method of claim 6, wherein the simulation integrates knowledge of threat regions, a cognitive analysis of the agent's beliefs, desires, and intentions, a model of the agent's emotional disposition and state, and the dynamics of interactions with the environment.
16. The method of claim 1, wherein the simulation involves urban warfare.
17. The method of claim 1, wherein the simulation involves a computer game.
18. The method of claim 1, wherein the simulation involves a business strategy.
19. The method of claim 1, wherein the simulation involves sensor fusion.
20. A system for predicting the behavior of an agent in an environment, comprising:
a processor or microprocessor coupled to a memory, wherein the processor or microprocessor is programmed to predict the behavior of the agent by:
executing a computer simulation of an environment including a plurality of software agents;
estimating the internal state of at least one of the agents based upon its behavior in the simulation, including its movement within the environment; and
predicting the likely future behavior of the agent based upon the estimate of its internal state.
US13/079,766 2005-10-12 2011-04-04 Characterizing and predicting agents via multi-agent evolution Abandoned US20110178978A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/079,766 US20110178978A1 (en) 2005-10-12 2011-04-04 Characterizing and predicting agents via multi-agent evolution

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US72585405P 2005-10-12 2005-10-12
US11/548,909 US7921066B2 (en) 2005-10-12 2006-10-12 Characterizing and predicting agents via multi-agent evolution
US13/079,766 US20110178978A1 (en) 2005-10-12 2011-04-04 Characterizing and predicting agents via multi-agent evolution

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US11/548,909 Continuation US7921066B2 (en) 2005-10-12 2006-10-12 Characterizing and predicting agents via multi-agent evolution

Publications (1)

Publication Number Publication Date
US20110178978A1 true US20110178978A1 (en) 2011-07-21

Family

ID=38233879

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/548,909 Expired - Fee Related US7921066B2 (en) 2005-10-12 2006-10-12 Characterizing and predicting agents via multi-agent evolution
US13/079,766 Abandoned US20110178978A1 (en) 2005-10-12 2011-04-04 Characterizing and predicting agents via multi-agent evolution

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US11/548,909 Expired - Fee Related US7921066B2 (en) 2005-10-12 2006-10-12 Characterizing and predicting agents via multi-agent evolution

Country Status (1)

Country Link
US (2) US7921066B2 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120302351A1 (en) * 2011-05-27 2012-11-29 Microsoft Corporation Avatars of friends as non-player-characters
US9336481B1 (en) * 2015-02-02 2016-05-10 James Albert Ionson Organically instinct-driven simulation system and method
US9369543B2 (en) 2011-05-27 2016-06-14 Microsoft Technology Licensing, Llc Communication between avatars in different games
WO2017015183A1 (en) * 2015-07-21 2017-01-26 Homeward Health, Llc Psychoscocial construction system
US10248957B2 (en) * 2011-11-02 2019-04-02 Ignite Marketing Analytics, Inc. Agent awareness modeling for agent-based modeling systems
US11316977B2 (en) 2017-10-27 2022-04-26 Tata Consultancy Services Limited System and method for call routing in voice-based call center
US11481658B2 (en) 2017-10-01 2022-10-25 Pontificia Universidad Javeriana Real-time multi-agent BDI architecture with agent migration and methods thereof

Families Citing this family (14)

Publication number Priority date Publication date Assignee Title
US8157640B2 (en) * 2007-06-27 2012-04-17 Wms Gaming Inc. Swarming behavior in wagering game machines
US20090144214A1 (en) * 2007-12-04 2009-06-04 Aditya Desaraju Data Processing System And Method
US7764628B2 (en) * 2008-01-18 2010-07-27 Alexandre Gerber Method for controlling traffic balance between peering networks
US8634982B2 (en) * 2009-08-19 2014-01-21 Raytheon Company System and method for resource allocation and management
US9542038B2 (en) 2010-04-07 2017-01-10 Apple Inc. Personalizing colors of user interfaces
TWI439960B (en) 2010-04-07 2014-06-01 Apple Inc Avatar editing environment
US8692830B2 (en) * 2010-06-01 2014-04-08 Apple Inc. Automatic avatar creation
USRE49044E1 (en) * 2010-06-01 2022-04-19 Apple Inc. Automatic avatar creation
US8494981B2 (en) * 2010-06-21 2013-07-23 Lockheed Martin Corporation Real-time intelligent virtual characters with learning capabilities
US8396730B2 (en) * 2011-02-14 2013-03-12 Raytheon Company System and method for resource allocation and management
WO2013076615A1 (en) 2011-11-22 2013-05-30 Koninklijke Philips Electronics N.V. Mental balance or imbalance estimation system and method
CN104731603A (en) * 2015-04-02 2015-06-24 西安电子科技大学 System self-adaptation dynamic evolution method facing complex environment
CN105045976B (en) * 2015-07-01 2018-01-30 中国人民解放军信息工程大学 A kind of method that war game map terrain properties are established with grid matrix
CN113987842B (en) * 2021-12-24 2022-04-08 湖南高至科技有限公司 BDI modeling method, device, equipment and medium based on knowledge graph

Citations (1)

Publication number Priority date Publication date Assignee Title
US20060059113A1 (en) * 2004-08-12 2006-03-16 Kuznar Lawrence A Agent based modeling of risk sensitivity and decision making on coalitions

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JP4661074B2 (en) * 2004-04-07 2011-03-30 ソニー株式会社 Information processing system, information processing method, and robot apparatus
US7284228B1 (en) * 2005-07-19 2007-10-16 Xilinx, Inc. Methods of using ant colony optimization to pack designs into programmable logic devices


Non-Patent Citations (1)

Title
International Conference on Autonomous Agents, Digital pheromone mechanisms for coordination of unmanned vehicles, Parunak, Brueckner et al. 2002, all pages. *

Cited By (10)

Publication number Priority date Publication date Assignee Title
US20120302351A1 (en) * 2011-05-27 2012-11-29 Microsoft Corporation Avatars of friends as non-player-characters
US8814693B2 (en) * 2011-05-27 2014-08-26 Microsoft Corporation Avatars of friends as non-player-characters
US9369543B2 (en) 2011-05-27 2016-06-14 Microsoft Technology Licensing, Llc Communication between avatars in different games
US10248957B2 (en) * 2011-11-02 2019-04-02 Ignite Marketing Analytics, Inc. Agent awareness modeling for agent-based modeling systems
US11270315B2 (en) * 2011-11-02 2022-03-08 Ignite Marketing Analytics, Inc. Agent awareness modeling for agent-based modeling systems
US20220148020A1 (en) * 2011-11-02 2022-05-12 Ignite Marketing Analytics, Inc. Agent Awareness Modeling for Agent-Based Modeling Systems
US9336481B1 (en) * 2015-02-02 2016-05-10 James Albert Ionson Organically instinct-driven simulation system and method
WO2017015183A1 (en) * 2015-07-21 2017-01-26 Homeward Health, Llc Psychoscocial construction system
US11481658B2 (en) 2017-10-01 2022-10-25 Pontificia Universidad Javeriana Real-time multi-agent BDI architecture with agent migration and methods thereof
US11316977B2 (en) 2017-10-27 2022-04-26 Tata Consultancy Services Limited System and method for call routing in voice-based call center

Also Published As

Publication number Publication date
US20070162405A1 (en) 2007-07-12
US7921066B2 (en) 2011-04-05

Similar Documents

Publication Publication Date Title
US7921066B2 (en) Characterizing and predicting agents via multi-agent evolution
Ladosz et al. Exploration in deep reinforcement learning: A survey
Michel et al. Multi-agent systems and simulation: A survey from the agent community's perspective
Abbass et al. Computational red teaming: Past, present and future
Shum et al. Theory of minds: Understanding behavior in groups through inverse planning
KR102523888B1 (en) Method, Apparatus and Device for Scheduling Virtual Objects in a Virtual Environment
Parunak et al. Concurrent modeling of alternative worlds with polyagents
Van Dyke Parunak et al. Real-time agent characterization and prediction
Johnson Predictive analytics in the naval maritime domain
Zakharov et al. Episodic memory for subjective-timescale models
Abbass et al. Smart shepherding: Towards transparent artificial intelligence enabled human-swarm teams
Yang et al. WISDOM-II: A network centric model for warfare
Parunak et al. Hybrid multi-agent systems: integrating swarming and BDI agents
Parunak et al. Real-Time Evolutionary Agent Characterization and Prediction
Chatty et al. Adaptation capability of cognitive map improves behaviors of social robots
Meng et al. Self-adaptive distributed multi-task allocation in a multi-robot system
Parunak et al. Representing dispositions and emotions in simulated combat
Hu Context-dependent adaptability in crowd behavior simulation
Sanchez et al. VIBES: bringing autonomy to virtual characters
Parunak et al. Evolving Agents to Recognize Plans and Emotions
Conforth et al. Reinforcement learning for neural networks using swarm intelligence
Tinguy et al. Home run: Finding your way home by imagining trajectories
Ruas et al. Modeling artificial life through multi-agent based simulation
Yang A networked multi-agent combat model: Emergence explained
Castro et al. Swarm Intelligence applied in synthesis of hunting strategies in a three-dimensional environment

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION