EP3380994A1 - System and method for decision support - Google Patents
System and method for decision support
- Publication number
- EP3380994A1 (application EP16801202.9A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- competitive
- elementary
- function
- entity
- learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/02—Computing arrangements based on specific mathematical models using fuzzy logic
- G06N7/06—Simulation on general purpose computers
Definitions
- the present invention generally relates to data management systems and in particular to a decision support system and method.
- Decision support systems are used in many areas where strategic decisions need to be made, for example in the military field.
- such systems may be useful for optimizing the defense strategy in response to an attack triggered by an attacking device.
- the attacking device can be controlled by one or more operators via a control interface. Modeling the behavior of the attacking device is a key issue in order to predict future actions and adapt the defense strategy accordingly.
- There are known simple strategic decision models that can be applied to provide information on the gain (positive or negative) that a defender device can predict with respect to the actions of the attacking device.
- the notion of gain quantifies the advantages that can be obtained by choosing one action over another, this advantage depending on the choice of the opponent.
- a known modeling approach based on game theory has been used to model the strategic decision in the context of security issues.
- a “game” is made up of a set of competitive entities (also called “players”), a set of movements / actions (also called “strategies”) available for these competitive entities, and specifications of expected gains for each combination of actions.
- equilibrium states can be defined. This means that, to define the security game, it is necessary to know all the actions and the possible gain values. Equilibria are situations in which the players (including the attacking devices and the defending devices in the case of a security game) have no interest in changing their choice of actions (i.e. their strategies).
- the theory of John Nash (1957) has shown that there are always "mixed" equilibria in a game. This theory means that for any type of game there is always a probability distribution over the players' strategies which leads to an equilibrium. Determining equilibria is not always a simple problem and is not always desirable: in some cases it may be preferable to determine the solution closest to the "social" optimum rather than the equilibrium.
- the invention improves the situation by proposing a decision support method for determining an action to be implemented by a given competitive entity in a competitive system comprising the competitive entity and at least one other opposing competitive entity, the competitive entity being able to implement one of a set of predefined actions, each action providing a different expected gain depending on the actions implemented by said opposing competitive entities, each entity further being able to implement a learning method from a set of predefined learning methods for learning the actions of the opposing entities, the method comprising:
- the method may for example comprise the generation of an action recommendation comprising an identifier of the determined action.
- the method may include a prior step of modeling the strategic situation of the given competitive entity in the form of a game model comprising the set of possible actions of the competitive entities and the gain function applying to said actions, the gain function associating an expected gain with each action or combination of actions of the competitive entities.
- the probability parameter may be a weight value.
- the elementary probability functions may correspond to a component of a probability vector defined as a function of a probability distribution. Each component of the probability vector can then depend on predefined elementary weights.
- the method may include a step of calculating the gain function from training data.
- the gain function can in this case depend on at least one of the following multicriteria models: a weighted sum, a Choquet integral, a generalized additive utility model, or a neural network.
- the gain function may depend on a probabilistic model.
- the method may comprise a step of updating at least one elementary probability function using an update function, in response to the reception of learning data obtained by executing or simulating, at least once, the action actually chosen in the competitive system with the selected elementary probability function.
- the update step may include updating the selected elementary probability function.
- the update step may further include updating at least one of said other elementary probability functions.
- the update step may also include applying a different update function for each elementary probability function.
- the update functions may include at least one update function depending on the gain obtained.
- the update functions may comprise at least one update function dependent on elementary weights, each elementary weight being associated with a given action and the elementary weights being a function of the gains obtained.
- the update function of a given elementary probability function may comprise one component per action, each component depending on the ratio between the elementary weight associated with that action at the decision step considered and the sum of the elementary weights corresponding to the different components of the elementary probability function at that decision step.
- the update step may furthermore comprise updating the elementary weights as a function of the loss incurred by using the learning method associated with the elementary probability function to be updated, at a given decision step.
- the updating step may include updating the elementary weights by applying a Boltzmann distribution function to the gains obtained.
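- By way of illustration only, a Boltzmann (softmax) distribution applied to the gains accumulated by each action could be computed as sketched below; the temperature parameter tau and the example gain values are assumptions, not values taken from the patent:

```python
import math

def boltzmann_weights(gains, tau=1.0):
    """Map a list of accumulated gains to selection probabilities
    with a Boltzmann (softmax) distribution; tau is a temperature
    parameter (assumption) controlling exploration."""
    exps = [math.exp(g / tau) for g in gains]
    total = sum(exps)
    return [e / total for e in exps]

# Example: three actions with accumulated gains 2.0, 0.5 and -1.0
print(boltzmann_weights([2.0, 0.5, -1.0], tau=0.5))
```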
- the updating step may include updating the elementary weights based on a parameter representing the state of the environment, an exploration rate, and a discount factor.
- the update functions may comprise at least one update function depending on received gain parameters which measure the regret of having chosen the learning method associated with the elementary probability function rather than another learning method, at a given decision step.
- the invention further provides a computer program product, wherein the computer program includes code instructions for performing the steps of the method according to any one of the preceding features, when the program is run on a computer.
- the invention also proposes a decision support system for determining an action to be implemented by a given competitive entity in a competitive system comprising said competitive entity and at least one other opposing competitive entity, the competitive entity being able to implement an action among a set of predefined actions, each action providing a different expected gain depending on the actions implemented by said opposing competitive entities, each entity further being able to implement a learning method among a set of predefined learning methods for learning the actions of the opposing entities, each learning method being associated with an elementary probability function associating a probability parameter with each possible action of the given competitive entity.
- the decision support system comprises a global learning module configured to determine a global probability function capable of associating a probability parameter with each elementary probability function, the learning module further comprising a selection unit configured to select one of said elementary probability functions using the global probability function, the global learning module being adapted to apply the selected elementary probability function to determine an action among the actions that can be implemented by said given competitive entity, for example to generate a recommendation including an identifier of the action.
- the invention thus provides a meta-learning method that selects the most appropriate learning method in a decision support system in a strategic decision context. When a command and control unit must make a decision whose advantages or disadvantages depend on the actions of the opposing devices, the decision support system according to the invention is able to provide optimal recommendations.
- Embodiments according to the invention thus make it possible to determine the optimal learning scheme among a set of predefined learning schemes for determining such optimal recommendations.
- the proposed embodiments make it possible in particular to learn, from among a set of learning methods, the one which provides maximum gains for the system.
- the method and the system according to the invention are able to handle uncertainty about the gains of opposing entities. They also adapt dynamically to the addition of new actions in the competitive system or of other learning schemes, such as Markov Decision Process algorithms (e.g. Q-learning, SARSA), which are particularly advantageous for handling dynamic games, that is to say games for which the context can also impact the expected gains.
- FIG. 1 is a diagram of an exemplary architecture implementing a decision support system, according to some embodiments;
- FIG. 2 is a flow chart showing the steps implemented by a learning method according to the prior art during a decision cycle;
- FIG. 3 represents an example of a competitive system of telecommunication system type including a set of transmitters and a set of receivers to which the embodiments of the invention can be applied;
- FIG. 4 represents an exemplary gain matrix corresponding to an example of a competitive system in which a competitive entity is threatened by electronic attacks implemented by one or more attacking entities;
- FIG. 5 is a schematic view of the decision support system comprising a meta-learning device according to some embodiments
- FIG. 6 is a general flow chart showing the main steps of the decision support method, according to some embodiments.
- FIG. 7 shows an example of an environment in which the decision support system can be implemented according to one embodiment
- FIG. 8 is an example of a gain matrix corresponding to the same embodiment as that of FIG. 4;
- FIG. 9 is a diagram representing the evolution of the gains of two competitive entities over time (average over 50 executions), according to an exemplary embodiment, when the two competitive entities use the Brown algorithm;
- FIG. 10 is a diagram representing the evolution of the probability value associated with the choice of each action for each entity, in the example of FIG. 9;
- FIG. 11 is an example of a pay table illustrating the average gain that each competitive entity has obtained after 100 games, according to an exemplary embodiment
- FIG. 12 illustrates the results obtained with a particular type of learning method
- FIG. 13 is a schematic view of a computer system that can be used to implement the decision support system according to some embodiments.
- FIG. 1 schematically shows an exemplary architecture implementing a decision support system 10 according to some embodiments.
- the decision support system 10 interacts with a requesting device 11 (also called a control device).
- the control device 11 is arranged to control one or more competitive entities 20 A in a competitive system 101.
- the decision support system 10 can receive a request sent by the control device to provide recommendations of actions to be implemented by a given competitive entity 20A of the competitive system (static mode).
- the decision support system 10 may also generate recommendations for actions to be implemented by a dynamically given competitive entity, for example in response to a change in the competitive context detected in the competitive system, or periodically.
- competitive entities or "competitive agents" means competing agents or entities, that is, agents having opposing goals, where the success of an entity (or agent) may be achieved through the defeat of one or more other entities (or agents).
- Competitive entities can thus include attacking entities and opposing entities.
- An entity may itself be a device or system.
- Competitive entities are associated with a competitive environment or system (environment or "multi-agent” system) that may include one or more competitive entities.
- the competitive environment may also include independent entities (whose goals are not related to the competitive entities) and / or collaborative entities.
- the control device 11 of each competitive entity may be able to trigger actions in the environment of the controlled competitive entity 20A and collect data from the environment, for example by means of sensors.
- the sensors may be arranged at the competitive entity 20A or in its environment.
- Each entity can be, for example:
- a device whose actions the control device 11 can control by means of different commands;
- a software device that can implement actions in its environment, for example by sending messages via a network, the data of its environment being collected from mouse movements, network messages, etc.
- the behavior of a competitive entity is described by a strategy that defines one or more actions to be implemented by the entity.
- action refers to a "logical” action, that is, an action modeled by the system. This action can correspond to one or more "physical” sub-actions. For example, for a "send a message” action, several physical sub-actions can be implemented such as “choose the frequency", “establish a connection", “send the message”.
- This action strategy can be defined by the decision support system 10 and implemented by a device of the entity 20A.
- the decision support system 10 may implement the chosen action instead of issuing a recommendation, which corresponds to an automated decision.
- the decision support system 10 comprises a meta-learning module 50 (also called a "global learning module" or "global learning device") configured to select a learning algorithm among a set of predefined learning algorithms and to apply the selected algorithm to determine an action choice to be implemented for a given competitive entity.
- a learning algorithm (or learning method) in a multi-agent system is configured to determine for a given entity (the one that is learning) a strategy that can offer maximum gain relative to the opposing entities, using the experience acquired on the entity's environment (strategic situation data, hereinafter referred to as "learning data").
- a multi-agent learning algorithm thus tries to learn a model represented by a matrix of gains if the opposing strategies are known, or a vector of gains if they are not.
- a matrix of gains associated with a competitive system is represented as a tuple (C_{1..N}, R_{1..N}) where N denotes the number of competitive entities, C_n is the set of actions that the entity n can choose and R_n the matrix of dimensions M×N which gives the possible gains for each of the possible combinations of the M possible actions of the N entities.
- the term "gain" of a competitive entity refers to the profit or loss achieved by that entity as a result of the actions applied by all the entities. It thus denotes a quantitative datum, which may however be derived from a qualitative analysis of the situation. Moreover, the gains can be measured (they are then called "observable") or calculated from several parameters characteristic of the situation, combined into a multicriteria function (for example a weighted sum, a Choquet integral, a generalized additive utility model, etc.) or with other methods (e.g. Bayesian network, neural network, etc.). Whatever the mode of definition of the gains, it can be determined prior to the use of the method. A gain can be positive, negative (a gain corresponding to a loss) or zero.
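- As an illustrative sketch of the simplest multicriteria form mentioned above (a weighted sum of situation metrics), the gain could be computed as follows; the criterion names, values and weights are assumptions chosen only to show the principle:

```python
def weighted_sum_gain(metrics, weights):
    """Combine several situation metrics into a single scalar gain.
    `metrics` and `weights` map criterion names to values; criteria
    absent from `weights` are ignored."""
    return sum(weights[name] * value
               for name, value in metrics.items()
               if name in weights)

# Hypothetical criteria for a combat situation (assumed names and values)
metrics = {"active_units": 8, "terrain_gained": -2, "maneuver_cost": 3}
weights = {"active_units": 1.0, "terrain_gained": 0.5, "maneuver_cost": -0.2}
print(weighted_sum_gain(metrics, weights))
```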
- the term "strategy" for a competitive entity refers to the choice made by the entity among a set of actions; the strategy is said to be "pure" if it is based on a single deterministic choice, or "mixed" if it is based on a probabilistic choice among the actions.
- Known multi-agent learning methods can rely on various known models:
- a learning method learns, from the parameters, the probabilities of choosing an action for a certain state of the entity and provides a probability distribution over the actions (choices).
- a learning method is associated with an elementary probability function corresponding to a distribution of elementary probabilities over the actions that can be implemented by a competitive entity.
- This function of elementary probabilities can be in the form of a probability vector, each component of the probability vector corresponding to the probability of choosing an action of the given entity.
- Known learning algorithms can implement different types of equations or models. The probability vectors associated with each learning algorithm are therefore different from one type of algorithm to another.
- the stable point to be reached by a learning algorithm is called the Nash equilibrium, this point corresponding to the best response.
- the Nash equilibrium represents a collection of strategies comprising a probability vector for each entity n, such that the vector p_n is a best response to the vectors p_{-n} of the competitive entities opposing entity n.
- the environment of a given competitive entity may be variable.
- These algorithms can also allow a competitive entity to adapt to the effects of other entities on the learning data.
- the training data may comprise a set of data observed and / or calculated following the execution or simulation of actions in the context of the entities.
- the observation of learning data can be achieved by applying actions and observing the result obtained following the application of these actions.
- the learning data may include gain data obtained by the competitive entities (learning about the failure/success of the actions).
- Multi-agent learning methods can be characterized by several properties such as rationality (the entities seek to maximize their gains over a certain time scale), convergence (a learning algorithm stabilizes into a stationary probability vector), security, or "no regret".
- Some learning algorithms may be based on the assumption that the gain matrices of all competitive entities are known and/or that the strategies or actions of the opposing entities are known.
- a learning method can be implemented conventionally according to the steps of the flowchart of Figure 2, using a single learning method throughout the decision cycle including decision steps (or epochs).
- for a given learning method (block 200), as long as the game is not completed (condition 201), an action is chosen according to the learning method in step 202.
- in step 203, the gain achieved by applying the action is calculated.
- in step 204, the probability function of the learning method is updated using the gain.
- a learning method uses exploitation information (use of past information) and exploration data (testing of new strategies or of strategies already used) in a balanced way.
- the meta-learning module 50 is not limited to the use of a single learning algorithm throughout the decision cycle but uses a set of learning algorithms to determine the action to be implemented by a given entity at a given moment.
- the meta-learning module 50 is thus configured to select a learning algorithm from the set of predefined learning algorithms using the training data, and thus to improve the decision process and the performance of the entity.
- the meta-learning module 50 makes it possible to dynamically modify the parameters of the decision module as a function of the learning data acquired.
- the learning process includes a set of interaction cycles between the agent, its environment, and the opposing entities.
- the decision support system 10 can receive training data (observation phase), analyze this data to determine the context and the gain of the previously chosen action, and dynamically determine a new choice of action using its meta-learning module.
- new learning data can be collected. New decision cycles can then be implemented by iteration of the method.
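- A minimal sketch of such a decision cycle is given below. It only illustrates the general flow (select a learning method with the global probability function, let that method propose an action, observe the gain, then update); the class and method names are assumptions and the update rules are placeholders, not the patent's equations:

```python
import random

class MetaLearner:
    def __init__(self, methods):
        # `methods` is a list of objects exposing choose_action(actions)
        # and update(action, gain); one global weight per learning method.
        self.methods = methods
        self.weights = [1.0] * len(methods)

    def select_method(self):
        # Draw a learning method in proportion to its global weight.
        total = sum(self.weights)
        return random.choices(range(len(self.methods)),
                              weights=[w / total for w in self.weights])[0]

    def decision_cycle(self, actions, execute):
        # One observe / decide / act / learn iteration.
        k = self.select_method()
        action = self.methods[k].choose_action(actions)
        gain = execute(action)                 # observed in the environment
        self.methods[k].update(action, gain)   # elementary update
        # Placeholder meta-update: reinforce the method that just played.
        self.weights[k] = max(1e-6, self.weights[k] * (1.0 + 0.1 * gain))
        return action, gain
```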
- the decision support system 10 may return the result to the control device 11 as a recommendation including an identifier of the selected action.
- the control device 11 may then apply the recommendation or not based on criteria specific to the control device, the competitive environment and / or additional information collected.
- control device 11 can be an integral part of the competitive entity in a competitive system, for example for a simulator purpose.
- control device 11 can be a combat management system capable of controlling the actions of combat devices (the opposing competitive entity) vis-à-vis enemy combat devices (the attacking competitive entities) whose actions may be detrimental to the success of the actions of the controlled combat devices.
- the decision support method may be repeated until t reaches a predefined threshold Ts or indefinitely.
- the decision support system 10 can itself determine when to stop the update steps of the learning methods by observing a stabilization (or convergence) of the probabilities of the elementary probability functions and of the meta-learning function, for example if these probabilities do not change by more than a threshold value ε between two steps t and t+1.
- the gain can be defined by a function or observed directly as a value (for example: the number of combat units still active). It should be noted that the gain function models the advantage of making a decision (ie making a choice) with respect to the decisions (or choices) of the opposing entities. In some embodiments, the gain function may be impacted by some uncertainty as to the characteristics of the environment or the sensors that are used to collect environmental data. It then takes into account a probability distribution (stochastic game theory).
- the gain function can also cover several characteristics of the given situation and / or the resources of the considered entity (for example: the number of combat units still active + the land gained / lost + the cost of the maneuver + etc.), this is called a multi-criteria function.
- Such functions may take the form of Choquet Integral or Generalized Additive Utility Model.
- Embodiments of the invention may for example be implemented in a competitive system of telecommunication system type including a set of transmitters 20A and a set of receivers 20B / 20C as shown in Figure 3.
- such a system comprises one or more transmitters 20A and one or more receivers 20B/20C interconnected in a communication network and forming competitive entities, a transmitter 20A being able to constitute an opposing entity and one or more receivers 20B being able to constitute the attacking entities.
- the competitive system 10 comprises an attacking receiver 20B and an opposing transmitter 20A.
- the transmitter 20A wishes to send a message on a public transmission channel to a target receiver 20C.
- the exchanged message 30 may be a clear message (i.e. not encrypted) or an encrypted message.
- the purpose of the attacking receiver 20B is to attempt to block the message.
- the transmitters 20A and the receivers 20B and 20C may for example be mobile user equipment such as mobile phones or smart phones, in a mobile communication network.
- the competitive system may include transmitters 20A and receivers 20B/20C of client/server type exchanging HTTP messages over an Internet network according to the Internet protocol, the entity 20B attempting to block the messages sent by the entity 20A to a destination device 20C (computer, smart phone, tablet computer, etc.).
- the opposing entities 20B can attempt to hinder the routing of the message sent by a transmitter 20A by means of many techniques such as attack techniques:
- intrusion: exploiting system vulnerabilities to execute unauthorized commands, such as exploiting configuration errors or bugs.
- the invention is not limited to this type of competitive system and encompasses any type of competitive system comprising at least two opposing competitive entities.
- the environment itself can be considered as an opposing competitive entity if it is the only one to impact the gains of the system.
- the environment may be the competitive entity itself.
- the traffic of the users can be considered as the competitive entity, the users constituting an environment whose objective is to maximize its flow in the network.
- the invention is not limited to the application examples mentioned in the description above.
- the decision support system of the invention can be used in a combat system in which the competitive entities consist of military devices, for choosing a firing strategy, command maneuvers, radio frequencies, etc.
- the decision support system of the invention can be used in an energy management system including energy production entities and energy consumption entities, the decision support system 10 being usable by a production entity to decide between an action of storing energy or reselling energy to consuming entities.
- the decision support system of the invention may be used in a transport management system in which the entity in question is configured to allocate resources (number of wagons or buses, traffic-light waiting times, etc.), or in a safety management system to determine security strategies by simulating intrusions by attackers.
- the method and the decision support system in the embodiments of the invention make it possible to control the actions of a competitive entity by determining an optimal choice of action for a given entity by selecting a learning algorithm among a set of predefined learning algorithms at each decision stage.
- FIG. 4 represents a gain matrix (also called a pay table) corresponding to an example of a competitive system where a competitive entity 20A is threatened by electronic attacks implemented by one or more attacking entities 20B.
- an opposing entity 20A can "win" by sending data over an unblocked communication means or "lose" by sending the message over a blocked communication means.
- the table of FIG. 4 corresponds to an example where only one means of communication (for example of antenna type) is used with the application of a cost c when an attacking entity blocks the communication means.
- the example table of FIG. 4 corresponds to a single opposing entity.
- the opposing entity 20A can choose to send or not to send the data by various means of communication (antennas, satellites).
- the adverse entity or entities 20B can choose to block or not to block one or more of these means of communication.
- the possible actions of the opposing entity 20A ("send" or "do not send" the message) in the competitive environment are indicated in the first column 40, while the possible actions of the attacking entity 20B ("block" or "do not block" the means of communication) are indicated in the first line 40.
- in the left-hand part is indicated the estimated gain for the opposing entity 20A, while in the right-hand part is indicated the estimated gain for the attacking entity 20B.
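- For illustration, such a two-entity payoff table can be represented as a pair of matrices indexed by the actions of each entity; the numerical values below (with an assumed blocking cost c = 2) only show the structure and are not the values of FIG. 4:

```python
# Rows: actions of the opposing entity 20A ("send", "don't send")
# Columns: actions of the attacking entity 20B ("block", "don't block")
c = 2  # assumed cost applied when a sent message is blocked
gain_20A = [[-c, 1],   # 20A sends:        blocked -> loss c, not blocked -> +1
            [0,  0]]   # 20A doesn't send: 0 in both cases
gain_20B = [[ c, -1],  # 20A sends:        blocking -> +c, not blocking -> -1
            [ 0,  0]]  # 20A doesn't send: 0 in both cases

def gains(a_send, b_block):
    """Return the pair of gains for a joint choice of actions."""
    i, j = (0 if a_send else 1), (0 if b_block else 1)
    return gain_20A[i][j], gain_20B[i][j]

print(gains(True, False))   # 20A sends, 20B does not block -> (1, -1)
```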
- the Nash equilibrium is represented by a probability vector for each competitive entity.
- the invention is of particular interest for decision support in contexts of non-cooperative decisions. Indeed, in such contexts, the gain function takes into account the gains perceived by the so-called "friendly" competitive entities.
- the control device 11 can then have the ability to observe learning data of the "friendly" competitive entities, either by sensors or by a communication module through which the "friendly" competitive entities can send this data.
- Figure 5 is a schematic view of the decision support system 10 including a meta-learning module according to some embodiments.
- the decision support system 10 is configured to determine a choice of action to be implemented by a competitive entity, for example an opposing entity 20A, in a competitive environment, using a set of predefined learning methods.
- the meta-learning module 50 determines and uses a learning meta-function to select a learning method from the predefined learning methods 52 and uses the selected learning method to determine an action (i.e. a strategy) to be implemented by the competitive entity.
- the decision support system 10 includes the meta-learning module 50 (also known as the global learning module) for selecting a learning method, by meta-learning, from among the set of K learning methods 52.
- the meta-learning module 50 may further comprise:
- a game model generator 51 configured to generate a game model (also called a "strategic situation") according to the context of use considered.
- the model includes the set of possible actions of the entity that uses the system 10 and the gain function that applies to these actions. The gains can be observed, or observed and then calculated.
- the gain function makes it possible to calculate the gains that are a priori unknown (relative to the given situation, ie the actions of the entity competitively considered, those of adversary entities, or other information on the state).
- the gain function may or may not be modeled and is an input of the decision support system 10.
- the model thus generated may be used without the update phase to receive or determine the training data.
- An initialization unit 53 for initializing the learning methods of the predefined set 52; and
- a learning method selection unit 54 for determining a learning meta-function (also called a "global" learning function) and selecting a learning method among the K learning methods using the learning meta-function;
- An action determination unit 55 for determining an action choice to be implemented by a given competitive entity from the selected learning method.
- the terms "context" or "situation" refer to the application environment in which the decision support system 10 is used and on which the control device 11 depends.
- the context may be, for example, a military context using a control device 11 implementing situation tracking.
- the application context may be a telecom context using a control device 11 of the surveillance device type.
- a context-related device, which may be the control device 11 itself or a separate device, is configured to collect the training data (or request it) once the chosen action is executed, and to provide it to the decision support system 10.
- the method and the decision support system according to some embodiments of the invention can be implemented either: in a decision phase for determining an action (hereinafter also called “strategy” or “choice” or “strategic action”) to be implemented by a competitive entity 20A to obtain an optimal gain compared to the adverse entities 20B;
- FIG. 6 is a general flow chart showing the main steps of the decision support method, according to some embodiments, that can be implemented in a competitive system comprising a set of competitive entities.
- Each learning method M k of the set of learning methods 52 corresponds to a learning method able to "learn" which actions are likely to bring the best gain with regard to the opponents' choice of actions.
- This best-response determination strategy is known to converge to a pure Nash equilibrium if one exists. If it does not exist, the learning methods may be more or less well suited to finding a mixed Nash equilibrium or the probability vector maximizing the gains, the invention ultimately converging towards the best-suited method.
- To each learning method is associated an elementary probability function PEk which associates a probability p_ik with each action Ci among the m actions that can be implemented by a given competitive entity 20A of the competitive system.
- the elementary probability functions PEk can be defined by a probability distribution.
- the probability distribution can be in the form of a probability vector, each component of which corresponds to one of the elementary probability functions PEk.
- p_k(t) = (p_1k(t), ..., p_ik(t), ..., p_mk(t))
- the probability parameters can for example be calculated using weights. The remainder of the description will be made with reference to weight type probability parameters, by way of non-limiting example.
- a prior step 600 may be implemented to load a set of learning algorithms {1, ..., K} for use by the decision support method.
- one or more learning algorithms may be hot added or deleted at any point in the decision process.
- a trigger condition relating to a given competitive entity 20A in a competitive system is detected.
- the triggering condition can be detected in response to the reception of a request sent by a control device 11 controlling the competitive entity, the request comprising the identification of the competitive entity and data on the context of the entity and on the opposing entities 20B.
- the request may be issued by the control device 11 to obtain a recommendation of an action Ci (hereinafter called "strategic choice") to be implemented by the competitive entity 20A vis-à-vis the adverse entities 20B of the competitive system 101, such that the action Ci optimizes the gain of the opposing entity 20A vis-à-vis these adverse entities (also called attacking entities).
- the recommended action Ci is associated with an expected gain which may depend on one or more adverse choices if the system comprises several opposing entities 20B.
- elementary probability functions {PE1, ..., PEK} are initialized (604) or updated (605 and 606) in correspondence with each learning algorithm k (ME1, ..., MEK).
- Each elementary probability function PEk associates a probability parameter with each possible action of the competitive entity considered 20A, these probability parameters corresponding to a probability distribution over the set of possible actions of the competitive entity 20A.
- each probability parameter may be a weight or score.
- each probability function can be defined by a probability vector comprising a set of components, each component of the probability vector representing the probability parameter associated with one of the actions Ci.
- each elementary probability function PEk is initialized (604).
- the elementary probability functions can be initialized to the same value (i.e. the weights w_k1(t), ..., w_km(t) are the same for all the functions PEk), according to a uniform probability distribution.
- step 604 can include the initialization of the meta-function of probabilities (also called "global function of probabilities") which associates a weight (or more generally a parameter of probabilities) with each of the functions of elementary probabilities.
- the basic probability functions PEk can be updated according to learning data or to a change in the set of learning algorithms (addition or deletion), in steps 605 and 606.
- in step 607, the global probability function ("probability meta-function") MF, noted p(t), is updated using the gain obtained following the implementation of the chosen action.
- the weights w_ik(t) and w_k(t) are calculated at each decision step t and can for example be calculated from equations using the gains obtained by applying the gain function to the training data that can be provided by the competitive system 101 via the control device 11.
- one of the elementary probability functions PEk is selected using the probability meta-function MF. To do this, the system randomly draws a value between 0 and 1 and compares this value with the probabilities of the meta-function: the probabilities of the meta-function are summed over the elementary functions PEj, and when the cumulative sum exceeds the randomly drawn value at a function PEj, that function PEj is selected as the elementary probability function.
- the selected elementary probability function PEk is used to determine the strategic choice Ci (action) of the competitive entity 20A with respect to the opposing entities 20B.
- the selected elementary probability function PEk can choose the action Ci using a probability distribution (for example, if the weights are probabilities, a random draw can be made and the result of the random draw compared with the probability distribution). It should be noted that the weights can be reduced to probabilities by dividing each weight by the sum of the weights of the probability vector p_k(t).
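- A possible sketch of steps 609 and 610 described above (a cumulative-sum draw over the meta-function, then normalisation of the selected method's weights and a draw of an action) is given below, under the assumption that the probability parameters are non-negative weights; the numerical weights are illustrative only:

```python
import random

def draw_index(weights):
    """Roulette-wheel draw: normalise the weights and return the index
    whose cumulative probability first exceeds a uniform random value."""
    total = sum(weights)
    r = random.random()
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w / total
        if r <= cumulative:
            return i
    return len(weights) - 1  # guard against rounding

# Step 609: select an elementary probability function PEk via the meta-function
meta_weights = [3.0, 1.0, 2.0]          # assumed weights w_k of K = 3 methods
k = draw_index(meta_weights)

# Step 610: use the weights w_ik of the selected function to draw an action Ci
elementary_weights = [[1.0, 4.0], [2.0, 2.0], [5.0, 1.0]]  # assumed w_ik
action = draw_index(elementary_weights[k])
print(k, action)
```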
- a recommendation may be sent to the control device 11 (or directly to the competitive entity 20A if the control device is an integral part of the entity), the recommendation possibly including an identifier of the choice of the action Ci determined in step 610.
- the control device 11 can trigger the application of the action Ci to the situation (or context) of the competitive entity 20A that it controls, or take another control decision based on a set of information on the environment and/or the context of the entity 20A.
- the control device 11 can trigger the execution of the selected strategic choice Ci in the actual situation (actual execution) or in a simulated situation (simulated execution).
- the control device 11 may also be configured to estimate or measure the gain obtained and other auxiliary data (the gain data and auxiliary data forming learning data) as a result of the execution of the action.
- the "gain" obtained can represent the ratio of the observed result to the expected result, a measurement provided by a sensor, etc. It can be calculated from a multicriteria function involving data relating to several observed metrics as well as the expected values of these metrics. It can also involve methods that take into account an uncertainty on the observation (for example the error rate).
- the control device 11 can then transmit the learning data including data on the gain obtained to the decision support system 10 (in feedback mode).
- control device 11 may be an integral part of the decision support system 10. More specifically, in some embodiments, the decision support method may further comprise a step of updating at least one elementary probability function, in step 607, in response to the reception of learning data collected as a result of the execution of the strategic choice Ci in the situation of the given competitive entity 20A (605), and after extracting from these data the metrics involved in calculating the gain function (606).
- the update step 607 includes updating the selected elementary probability function, and may also include updating one or more other elementary probability functions.
- the update of basic probability functions can also be triggered in response to an addition or removal of learning methods.
- the training data collected by the control device 11 (605) can thus be used in the gain function which gives the gain and / or in the step of updating the elementary probability functions.
- the update step 607 can comprise updating the elementary probability functions PEk from the training data (notably the gain obtained) and using an update function which can depend on the learning method associated with each elementary probability function to be updated.
- the update function can be configured to update the components of the probability vector or the values of the probability parameters associated with the actions (weight for example).
- the same updating function can be defined for all the elementary probability functions of the set 52.
- an update function can be defined for a single elementary probability function or for a subgroup of elementary probability functions of the set 52.
- the meta-learning module 50 represented in FIG. 5 may notably comprise a logical proxy 59 (represented schematically) able to implement the update function defined for each elementary probability function.
- steps 601 to 611 can be iterated several times using a complete gain matrix associated with the situation (or a matrix supplemented by interpolation of some values) to train the learning meta-function and accelerate convergence towards the optimal probabilities (the meta-learning module 50 learns about the learning methods).
- a single iteration of the decision support method of FIG. 6 can be implemented for each decision step t, using the training data (in particular measurements of the obtained gain provided by one or more sensors).
- Such sensors can also be used to provide information on the action actually performed.
- the nature and positioning of the sensors may depend on the application context and / or the competitive environment 101.
- the sensors can for example include satellites.
- they may for example comprise a probe configured to duplicate the packets to be examined.
- the learning data (notably on the gain measurements and the actions performed by the competing entities) may be absent or uncertain.
- the method and the meta-learning module thus make it possible to determine the optimal learning method for a given competitive entity among the set of K learning methods (52) from the gain functions of the competitive entities, the gain functions of the opposing entities possibly being unknown (for example, when the decision support system 10 does not have the gain matrix).
- the execution of the initialization step 604 of the method of FIG. 6 (condition 603) can be initiated in response to the checking of a condition relating to the decision step t.
- if the condition is satisfied, step 604 is executed.
- the gain function can be for example a multi-criteria mathematical function of the Choquet Integral type, of the Generalized Additive Utility model type or of the neural network type.
- the gain function can be calculated using a probabilistic Bayesian network model if some criteria are uncertain. For example, when the training data are collected from various sensors, the sensors may have a non-deterministic level of precision (error rate, etc.) and/or may not be able to obtain the information.
- the probability vector p(t) and each elementary probability vector p_k(t) (corresponding to an elementary probability function PEk associated with a given learning method Mk) can be updated respectively in steps 607 and 608, using an update function for the components p_k(t) dependent on the elementary weights w_k(t), or an update function for the components p_ik(t) dependent on the elementary weights w_ik(t), the elementary weights being a function of the gains obtained (56).
- the variable p_k represents the probability that the meta-function proposes the elementary function k at step 609;
- the variable p_ik represents the probability that the elementary function k proposes the action i at step 610;
- the variable w_ik represents the weight of the elementary function k corresponding to the action i;
- the variable w_k represents the total weight associated with each elementary probability function k (sum of the w_ik);
- the variable w represents the sum of the variables w_k.
- steps 601 to 611 can be re-iterated T times.
- the above variables are then noted by associating the expression "(t)".
- the update function of the components of the overall probability function, in step 608, may depend on the ratio between the elementary weights w_k(t) and the sum of the elementary weights w(t), for example according to equation (2): p_k(t) = w_k(t) / w(t).
- each probability vector p(t) or p_k(t) can be updated at step 608 so as to guarantee a certain amount of exploration (testing of new actions or replaying of certain actions), as follows:
- the parameter 0 ≤ γ_t ≤ 1 may decrease over time to stabilize the exploration, or may be constant.
- the probability distribution can be updated at steps 607 and / or 608 directly from gain parameters that measure the regret of having chosen a learning method at a given decision step.
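- As an illustration of a regret-driven update (the patent's exact formulas are not reproduced here), a regret-matching style rule could be used as sketched below; the rule and the example regret values are assumptions:

```python
def regret_matching_probs(cumulative_regret):
    """Turn cumulative regrets (one per learning method or action) into a
    probability distribution: positive regrets are normalised, and a
    uniform distribution is used when no regret is positive."""
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    if total == 0.0:
        return [1.0 / len(positive)] * len(positive)
    return [p / total for p in positive]

# Regret of not having used each of three methods (assumed values)
print(regret_matching_probs([0.4, -0.1, 1.1]))
```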
- each component w_ik of the probability vector p_k(t) can then be updated in step 607 according to the following update function, where Ct denotes the action chosen by the system at the decision step t:
- each component w_k of the global vector p(t) can be updated at step 608 according to the following update function, where Ct denotes the action chosen by the system at the decision step t:
- each component w_ik of the elementary vector p_k(t) can be updated, in step 607, directly from the gain obtained according to equation [3], with b ≤ 0.5 and b possibly decreasing over time, according to the following update function: w_ik(t+1) = w_ik(t) + b (1 − w_ik(t)) u(i, t) if Ci ≠ Ct is not chosen at step t.
- each component w_k of the global vector p(t) can be updated, in step 608, directly from the gain obtained according to equation [3], with b ≤ 0.5 and b possibly decreasing over time, according to the following update function:
- step 607 of updating the elementary probability functions (respectively of the meta-function at step 608) according to equations [1] (respectively [2]) and [3] (respectively [4]) can comprise the updating of the elementary weights w_ik(t+1) (respectively w_k(t+1)) using the gains obtained or the formula of equation [9] (respectively [10] for updating the overall probability vector at step 608):
- in equation [9], η is a parameter and l_i(t) denotes the loss incurred by choosing the action Ci at the decision step t.
- the parameter l_k(t) denotes the loss incurred using the learning method k at the decision step t.
- equations [7] and [8] are particularly suitable when the distribution of the gain over time is not known, in particular if a significant variation of the gain is observed for the same given actions chosen by one or more learning methods.
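- A sketch of an exponential (Hedge-style) weight update driven by losses, in the spirit of equations [9] and [10], is shown below; the learning rate eta and the example loss values are assumptions:

```python
import math

def hedge_update(weights, losses, eta=0.1):
    """Multiplicatively decrease the weight of each option according to
    the loss it incurred at the current decision step (Hedge-style)."""
    return [w * math.exp(-eta * l) for w, l in zip(weights, losses)]

def to_probabilities(weights):
    total = sum(weights)
    return [w / total for w in weights]

weights = [1.0, 1.0, 1.0]                 # initial elementary weights
weights = hedge_update(weights, [0.2, 0.9, 0.0])
print(to_probabilities(weights))
```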
- the weights may take into account the state of the environment S defined by the decision support system 10.
- w_sk(t + 1) = (1 − α) w_sk(t) + α [u_t(a_k) + γ max_{s'k'} w_{s'k'}(t + 1)]   [14]
- the parameter α refers to the exploration rate, which can also decrease over time;
- γ is a discount factor (it makes it possible to weight the importance of future gains).
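- A sketch of a state-dependent update in the spirit of equation [14] (a Q-learning-like rule combining the received gain, the rate α described above and the discount factor γ) is given below; the table layout and parameter values are assumptions:

```python
def q_style_update(w, state, k, gain, next_state, alpha=0.1, gamma=0.9):
    """Update the weight w[state][k] of option k in `state` from the
    observed gain and the best weight reachable from `next_state`."""
    best_next = max(w[next_state])
    w[state][k] = (1 - alpha) * w[state][k] + alpha * (gain + gamma * best_next)

# Two environment states, three learning methods (assumed sizes and values)
w = [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
q_style_update(w, state=0, k=1, gain=1.0, next_state=1)
print(w)
```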
- the training data collected by the decision support system 10 may be average gain data.
- the elementary probability functions PEk can then be defined by using, as the probability parameter associated with each action Ci, a score.
- the score associated with an action Ci can be determined from the average of the gains obtained for this action in response to the execution of the corresponding learning method at least once.
- a random draw can be used to choose according to a uniform probability distribution over the actions.
- the score associated with each action Ci can be determined from the average of the gains received by using the learning method k corresponding to the elementary probability function PEk, taking into account an exploration factor.
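- One classical way of combining an average gain with an exploration factor, given purely as an illustration (a UCB-style bonus, which is an assumption and not necessarily the patent's formula), is the following:

```python
import math

def score(avg_gain, times_chosen, total_steps, c=1.0):
    """Average observed gain plus an exploration bonus that grows for
    rarely-chosen actions; `c` (assumption) balances the two terms."""
    if times_chosen == 0:
        return float("inf")          # force at least one trial
    return avg_gain + c * math.sqrt(math.log(total_steps) / times_chosen)

print(score(avg_gain=0.6, times_chosen=4, total_steps=50))
```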
- the learning meta-function can be used to perform a random draw, which makes it possible to select an elementary probability function.
- the decision support system 10 is configured to collect information on the actions of the opposing entities 20B and to receive information on the number of times each action Ci was chosen.
- the training data may then include data on the actions of the adverse entities 20B.
- the decision support system 10 can determine probability distributions relative to the actions chosen by the opposing entities and making it possible to determine the probable actions of the opposing entities.
- Each elementary probability function PEk can then associate a probability parameter not only with the possible actions of the opposing entity 20A but also with the possible actions of the adverse entities 20B.
- the action chosen at step 610 therefore corresponds to the action that maximizes the gain of the competitive entity 20A against the strategy of the opposing entities.
- Fig. 7 shows an exemplary environment in which the decision support system 10 may be implemented according to one embodiment.
- the competitive system 101 comprises competitive entities of the computer type connected via a network 102.
- the entity 20A (opposing entity) seeks to send a message (for example an http message) to a recipient computer 20C via an Internet network 102.
- the competitive entity 20B tries to block the sending of the message.
- the decision support system 10 is similar to that of FIG. 5. However, in the embodiment of Figure 7, the decision support system 10 further includes a context monitoring unit 57 configured to monitor context changes.
- the monitoring unit 57 may comprise an action detector 570 for detecting new actions of a competitive entity and a gain divergence detector 571 for detecting a divergence of the gain of a competitive entity with respect to a target gain (e.g. an average gain).
- the decision support system 10 may also include a learning method update unit 58 configured to update one or more of the K learning methods as a function of a change detected by the context monitoring unit 57, such as for example the appearance of new actions detected by the action detector 570 or a sharp gain divergence detected by the gain detector 571.
- the gain detector 571 can apply a set of statistical tests relating to the gains, such as, for example, a limit test on the variance of the gain, a test such as ARIMA ("Autoregressive Integrated Moving Average") or the Page-Hinkley test.
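- A minimal sketch of the Page-Hinkley change-detection test applied to the observed gains is given below; the thresholds delta and lam are assumptions:

```python
class PageHinkley:
    """Signal a change when the gains drift downwards from their running
    mean by more than the threshold `lam` (cumulative deviation test)."""
    def __init__(self, delta=0.005, lam=5.0):
        self.delta, self.lam = delta, lam
        self.n, self.mean, self.cum, self.cum_max = 0, 0.0, 0.0, 0.0

    def update(self, gain):
        self.n += 1
        self.mean += (gain - self.mean) / self.n
        self.cum += gain - self.mean - self.delta
        self.cum_max = max(self.cum_max, self.cum)
        return (self.cum_max - self.cum) > self.lam  # True = gain divergence

detector = PageHinkley()
for g in [1.0, 1.1, 0.9, 1.0, -2.0, -2.2, -1.8, -2.1, -2.0, -2.3]:
    if detector.update(g):
        print("gain divergence detected")
        break
```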
- the update unit of the learning methods may trigger a reset of the training data. This reset can be implemented in the form of a restart, for example by setting to 1 the elementary weights associated with the elementary functions PEk or by modifying the weights according to a uniform probability distribution.
- the reset can be implemented by modifying the elementary probability functions PEk corresponding to the learning functions Mk so that they associate a probability parameter (e.g. a weight) with each new action detected, by setting the probability parameter to an initial value.
- Initial values can be determined by a random draw according to the probability meta-function.
- the inventor has compared the performance of the system and of the decision support method of the invention with the classic Brown game algorithm, as illustrated by the payout matrix of FIG. 8.
- a first set of experiments was conducted 50 times for 100 decision steps in a competitive system comprising two entities 1 and 2 using the Brown algorithm.
- the two competitive entities were first observed under the assumption that each entity may have information about the choices of the other competitive entity and that the gain matrix of the other competitive entity is known.
- FIG. 9 represents an example of the evolution of the gains of entities 1 and 2 over time (average over 50 executions) when the two competitive entities 1 and 2 use the Brown algorithm: in the long term, entity 2 gains more than entity 1.
- FIG. 10 represents an example of evolution of the probability value associated with the choice of each action A or B for each entity 1 or 2 during the experiment:
- a first curve C1 represents the evolution of the probability value associated with the choice of action A by entity 1 during the experiment;
- a second curve C2 represents the evolution of the probability value associated with the choice of action B by entity 1 during the experiment;
- a third curve C3 represents the evolution of the probability value associated with the choice of action A by entity 2 during the experiment;
- a fourth curve C4 represents the evolution of the probability value associated with the choice of action B by entity 2 during the experiment.
- Figure 10 shows that each entity follows a mixed strategy associating probability values with the actions {A, B}.
- FIGs 9 and 10 thus illustrate that Brown's algorithm converges to a mixed Nash equilibrium.
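- As an illustration, the Brown (fictitious play) algorithm used as a baseline in these experiments can be sketched as follows; the 2×2 payoff values are assumptions, and each entity best-responds to the empirical frequency of the opponent's past actions:

```python
# Assumed 2x2 payoffs (matching-pennies style); rows = own action, cols = opponent's
R1 = [[ 1, -1], [-1,  1]]   # gains of entity 1
R2 = [[-1,  1], [ 1, -1]]   # gains of entity 2

def best_response(payoff, opponent_counts):
    """Best response to the empirical frequency of the opponent's past actions."""
    total = sum(opponent_counts) or 1
    freq = [c / total for c in opponent_counts]
    expected = [sum(payoff[a][b] * freq[b] for b in range(2)) for a in range(2)]
    return max(range(2), key=lambda a: expected[a])

counts1, counts2 = [0, 0], [0, 0]   # times each entity played action A or B
for _ in range(100):
    a1 = best_response(R1, counts2)
    a2 = best_response(R2, counts1)
    counts1[a1] += 1
    counts2[a2] += 1
print("empirical strategies:", counts1, counts2)
```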
- the decision support method has been implemented for the example of competitive context illustrated by the table of FIG. 8, varying the learning method for each competitive entity. It has been observed that the best-performing learning method for this example is the learning method corresponding to equation [5].
- the invention has been compared with Brown's algorithm and with a conventional learning algorithm according to equation [5]. It should also be noted that other experiments have shown that the learning algorithm according to equation [5] is more efficient against Brown's algorithm. It is recalled that, like Brown's algorithm, formula [5] assumes that the environment has an impact on the competitive context. The environment can be modeled in different ways, for example using the strategy of the opposing entity.
- the payout matrix corresponds to the same embodiment that Figure 3.
- FIG. 9 represents an example of the evolution over time of the gains of entities 1 and 2 (averaged over 50 executions) when the two competitive entities 1 and 2 use Brown's algorithm: in the long term, entity 2 obtains better gains than entity 1.
- FIG. 10 represents an example of the evolution of the probability value associated with the choice of action A or B for each entity 1 or 2 during the experiment.
- the table in FIG. 11 shows the average gain obtained by each competitive entity after 100 games (the results for entity 1 are indicated in the left-hand columns 900 and those obtained for entity 2 in the right-hand columns 902).
- the best strategy for competitive entity 1 is to choose a learning method based on formula [5], while that of competitive entity 2 is to use the decision support method according to the invention, which uses a learning meta-function.
- Table 12 shows the results obtained with a more "blind" learning method, based on equation [4]. In this case, equilibrium is reached when both entities use the decision support method according to the invention (in this example the equilibrium is the social optimum); one possible form of meta-learning selection is sketched below.
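- As announced above, the following sketch illustrates one possible form of a learning meta-function: a wrapper that records the gains obtained with each elementary learning method and selects, at each decision cycle, the method whose recent average gain is highest, with occasional exploration. The class names, the selection rule and the trivial elementary method are assumptions made only to keep the example runnable; they are not the exact procedure claimed by the invention.

```python
import random
import numpy as np

class RandomMethod:
    """Trivial elementary learning method, used only to make the sketch runnable."""
    def __init__(self, n_actions=2):
        self.n_actions = n_actions
    def choose_action(self, observation):
        return random.randrange(self.n_actions)
    def update(self, gain):
        pass

class MetaLearner:
    """Hypothetical learning meta-function: keeps the gains observed while using
    each elementary learning method and selects the method whose recent average
    gain is highest, with occasional exploration."""
    def __init__(self, methods, explore=0.1, window=20):
        self.methods = methods                  # elementary learning methods
        self.scores = [[] for _ in methods]     # gains observed per method
        self.explore = explore
        self.window = window                    # number of recent gains considered
    def select_method(self):
        if random.random() < self.explore or any(not s for s in self.scores):
            return random.randrange(len(self.methods))
        return max(range(len(self.methods)),
                   key=lambda i: np.mean(self.scores[i][-self.window:]))
    def decide(self, observation=None):
        i = self.select_method()
        return i, self.methods[i].choose_action(observation)
    def feedback(self, method_index, gain):
        self.scores[method_index].append(gain)    # score used for meta-selection
        self.methods[method_index].update(gain)   # elementary method's own update

# Usage: one decision cycle with two elementary methods.
meta = MetaLearner([RandomMethod(), RandomMethod()])
method_index, action = meta.decide()
meta.feedback(method_index, gain=1.0)
print("selected method:", method_index, "chosen action:", action)
```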
- the decision support method according to the embodiments can be implemented in various ways by hardware, software, or a combination of hardware and software, in particular in the form of program code that can be distributed as a program product in a variety of forms.
- the program code may be distributed using computer readable media, which may include computer readable storage media and communication media.
- the methods described in the present description can in particular be implemented in the form of computer program instructions executable by one or more processors in a computer computing device. These computer program instructions may also be stored in a computer readable medium.
- the decision support system 10 and / or the control device 11 and / or each competitive entity 20A or 20B can be implemented in the form of one or more devices or computer systems 70 (hereinafter referred to as computer).
- the computer 70 may comprise a processor 71, a memory 72, a mass storage memory device 75, and an input/output (I/O) interface 77 (for example, a video screen, a touch screen, input and control devices such as an alphanumeric keyboard, a pointing device, keypads, push buttons, control buttons, microphones, etc.).
- the computer 70 may also be operatively coupled to one or more external resources via a network 76 and / or an I / O interface 77.
- External resources 79 may include, but are not limited to, servers, databases, mass storage devices, peripheral devices, cloud-based network services, or any other appropriate computer resource that can be used by the computer 70.
- the processor 71 may include one or more processor devices such as microprocessors, microcontrollers, central processing units, or any other device that manipulates signals (analog or digital) according to operating instructions stored in the memory 72.
- the processor 71 can operate under the control of an operating system 73 that resides in the memory 72.
- the operating system 73 can manage computer resources such as computer program code integrated in the form of one or more software applications 74 residing in the memory 72.
- the invention is not limited to the embodiments described above by way of non-limiting example. It encompasses all the embodiments that may be envisaged by those skilled in the art. In particular, the invention is not limited to a particular competitive system and includes any competitive system including at least two opposing competitive entities.
- the set of learning methods can include any type of learning method without limitation.
- This set is also not limited by a particular number of learning methods.
- the invention is also not limited to particular functions for updating learning methods. These update functions may differ for each learning method. They may also change for a given learning method between each iteration of the decision support method.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Algebra (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Molecular Biology (AREA)
- Fuzzy Systems (AREA)
- Biomedical Technology (AREA)
- Medical Informatics (AREA)
- Automation & Control Theory (AREA)
- Operations Research (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- User Interface Of Digital Computer (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Feedback Control In General (AREA)
Abstract
Description
Claims
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1502483A FR3044438A1 (fr) | 2015-11-27 | 2015-11-27 | Systeme et procede d'aide a la decision |
PCT/EP2016/078634 WO2017089443A1 (fr) | 2015-11-27 | 2016-11-24 | Systeme et procede d'aide a la decision |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3380994A1 true EP3380994A1 (fr) | 2018-10-03 |
Family
ID=56369003
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP16801202.9A Pending EP3380994A1 (fr) | 2015-11-27 | 2016-11-24 | Systeme et procede d'aide a la decision |
Country Status (6)
Country | Link |
---|---|
US (1) | US11120354B2 (fr) |
EP (1) | EP3380994A1 (fr) |
CN (1) | CN108701260B (fr) |
CA (1) | CA3006383A1 (fr) |
FR (1) | FR3044438A1 (fr) |
WO (1) | WO2017089443A1 (fr) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA3067903A1 (fr) * | 2017-06-05 | 2018-12-13 | Balanced Media Technology, LLC | Plateforme destinee au traitement collaboratif de taches informatiques |
EP3746890A4 (fr) | 2018-03-26 | 2021-12-08 | Balanced Media Technology, LLC | Interface rendue abstraite pour ludification d'algorithmes d'apprentissage automatique |
WO2020147074A1 (fr) * | 2019-01-17 | 2020-07-23 | Alibaba Group Holding Limited | Schémas d'échantillonnage pour une recherche de stratégie dans une interaction stratégique entre des parties |
WO2020098822A2 (fr) | 2019-12-12 | 2020-05-22 | Alipay (Hangzhou) Information Technology Co., Ltd. | Détermination de politiques de sélection d'action d'un dispositif d'exécution |
WO2020098821A2 (fr) | 2019-12-12 | 2020-05-22 | Alipay (Hangzhou) Information Technology Co., Ltd. | Détermination de politiques de sélection d'action d'un dispositif d'exécution |
WO2020098823A2 (fr) * | 2019-12-12 | 2020-05-22 | Alipay (Hangzhou) Information Technology Co., Ltd. | Détermination de politiques de sélection d'actions d'un dispositif d'exécution |
CN111311947B (zh) * | 2020-03-02 | 2021-01-08 | 清华大学 | 一种网联环境下考虑驾驶人意图的行车风险评估方法和装置 |
CN111405646B (zh) * | 2020-03-17 | 2022-06-03 | 重庆邮电大学 | 异构蜂窝网络中基于Sarsa学习的基站休眠方法 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7483867B2 (en) * | 2001-06-26 | 2009-01-27 | Intuition Intelligence, Inc. | Processing device with intuitive learning capability |
EP2249292A1 (fr) * | 2009-04-03 | 2010-11-10 | Siemens Aktiengesellschaft | Mécanisme de prise de décision, procédé, module, et robot configuré pour décider d'au moins une action respective du robot |
WO2013059517A1 (fr) * | 2011-10-18 | 2013-04-25 | Causata Inc. | Apprentissage de différences temporelles en ligne à partir d'historiques incomplets d'interactions avec des clients |
CN102724257A (zh) * | 2011-11-07 | 2012-10-10 | 李宗诚 | 互联网全息协同系统配置机制设计 |
US9679258B2 (en) * | 2013-10-08 | 2017-06-13 | Google Inc. | Methods and apparatus for reinforcement learning |
US10325202B2 (en) * | 2015-04-28 | 2019-06-18 | Qualcomm Incorporated | Incorporating top-down information in deep neural networks via the bias term |
-
2015
- 2015-11-27 FR FR1502483A patent/FR3044438A1/fr active Pending
-
2016
- 2016-11-24 WO PCT/EP2016/078634 patent/WO2017089443A1/fr active Application Filing
- 2016-11-24 CA CA3006383A patent/CA3006383A1/fr active Pending
- 2016-11-24 US US15/778,600 patent/US11120354B2/en active Active
- 2016-11-24 EP EP16801202.9A patent/EP3380994A1/fr active Pending
- 2016-11-24 CN CN201680079996.7A patent/CN108701260B/zh active Active
Also Published As
Publication number | Publication date |
---|---|
WO2017089443A1 (fr) | 2017-06-01 |
CN108701260A (zh) | 2018-10-23 |
CN108701260B (zh) | 2022-09-27 |
US11120354B2 (en) | 2021-09-14 |
FR3044438A1 (fr) | 2017-06-02 |
US20180349783A1 (en) | 2018-12-06 |
CA3006383A1 (fr) | 2017-06-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017089443A1 (fr) | Systeme et procede d'aide a la decision | |
Dahiya et al. | A reputation score policy and Bayesian game theory based incentivized mechanism for DDoS attacks mitigation and cyber defense | |
US11250322B2 (en) | Self-healing machine learning system for transformed data | |
EP3542322A1 (fr) | Gestion et évaluation de modèles appris par apprentissage machine sur la base de données journalisées localement | |
US11558403B2 (en) | Quantum computing machine learning for security threats | |
CN110235149B (zh) | 神经情节控制 | |
Pauna et al. | Qrassh-a self-adaptive ssh honeypot driven by q-learning | |
Bidgoly et al. | Modelling and quantitative verification of reputation systems against malicious attackers | |
Kachavimath et al. | A deep learning-based framework for distributed denial-of-service attacks detection in cloud environment | |
CN115841366B (zh) | 物品推荐模型训练方法、装置、电子设备及存储介质 | |
Zhang et al. | Detecting Insider Threat from Behavioral Logs Based on Ensemble and Self‐Supervised Learning | |
CN115481441A (zh) | 面向联邦学习的差分隐私保护方法及装置 | |
CN117150566B (zh) | 面向协作学习的鲁棒训练方法及装置 | |
Musman et al. | Steps toward a principled approach to automating cyber responses | |
Nalayini et al. | A new IDS for detecting DDoS attacks in wireless networks using spotted hyena optimization and fuzzy temporal CNN | |
CN113379536A (zh) | 一种基于引力搜索算法优化递归神经网络的违约概率预测方法 | |
Mourad et al. | Machine assisted human decision making | |
Booker et al. | A model-based, decision-theoretic perspective on automated cyber response | |
CN114666107A (zh) | 移动雾计算中一种高级持续性威胁防御方法 | |
CN117999554A (zh) | 一种蜜罐实体及其操作方法 | |
Lisas et al. | Sequential Learning for Modelling Video Quality of Delivery Metrics | |
FR3105489A3 (fr) | Dispositif et procede de detection de fraude | |
Ishitaki et al. | Performance evaluation of a neural network based intrusion detection system for tor networks considering different hidden units | |
Zhang et al. | Gaussian process learning for cyber‐attack early warning | |
CN113794699B (zh) | 一种网络分析处理方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: UNKNOWN |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
|
17P | Request for examination filed |
Effective date: 20180524 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
17Q | First examination report despatched |
Effective date: 20201008 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |