CN107135224A - Network defense strategy selection method and device based on Markov evolutionary game - Google Patents

Network defense strategy selection method and device based on Markov evolutionary game

Info

Publication number
CN107135224A
Authority
CN
China
Prior art keywords
game
stage
attacking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710334463.5A
Other languages
Chinese (zh)
Other versions
CN107135224B (en)
Inventor
张恒巍
王娜
黄健明
韩继红
王衡军
李涛
寇广
王晋东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PLA Information Engineering University
Original Assignee
PLA Information Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PLA Information Engineering University filed Critical PLA Information Engineering University
Priority to CN201710334463.5A priority Critical patent/CN107135224B/en
Publication of CN107135224A publication Critical patent/CN107135224A/en
Application granted granted Critical
Publication of CN107135224B publication Critical patent/CN107135224B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • H04L63/205 Network architectures or network communication protocols for network security for managing network security; network security policies in general involving negotiation or determination of the one or more network security mechanisms to be used, e.g. by negotiation between the client and the server or between peers or by selection according to the capabilities of the entities involved
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H04L41/147 Network analysis or design for predicting network behaviour

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Algebra (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention relates to a network defense strategy selection method and device based on a Markov evolutionary game. The method comprises: constructing a multi-stage Markov attack-defense evolutionary game model according to the dynamic attack-defense game in network confrontation, the model comprising multiple sub-game stages; and, for this model, solving and outputting the optimal defense strategy of each stage of the attack-defense game using an optimal defense strategy selection algorithm. The invention uses the multi-stage Markov attack-defense evolutionary game model to simulate the dynamic evolution of network attack and defense. From the perspective of attack-defense confrontation, it describes the state transitions between evolution stages as a stochastic process and, drawing on Markov processes, constructs a multi-stage Markov evolutionary game. It takes the discounted total payoff of the game as the objective function and introduces a discount factor ξ to discount the payoffs of the different stages. The work explores network security analysis methods and defense technology and has practical significance.

Description

Network defense strategy selection method and device based on Markov evolutionary game
Technical field
The invention belongs to the technical field of network security, and in particular relates to a network defense strategy selection method and device based on a Markov evolutionary game.
Background
Confronting the challenges of cyberspace security, strengthening network defense capability, and ensuring cyberspace security have become pressing problems. The essence of network security lies in attack-defense confrontation, so studying network security analysis methods and defense technology from the confrontation perspective has practical significance and has become a research focus in recent years. Game theory studies decision-making when the behaviors of decision-makers interact directly. The opposing objectives, non-cooperative relationship, and strategic interdependence of network attack and defense match the basic features of game theory, so applying game models to attack-defense analysis has become a new research method and hot topic, with some results already achieved. On the whole, however, game-theoretic research on cyberspace security started late and current methods are not yet systematic. Most existing results are based on traditional game models, which assume completely rational players; real network attack and defense does not satisfy this assumption, which reduces the practical value of the results. At present, game-theoretic network security defense mostly relies on traditional game theory, which presupposes fully rational actors: both players maximize their own interests, and the optimal defense strategy is selected during the game. The signaling game is one kind of traditional game, in which the players are the sender and the receiver of a signal. The sender's type is unknown to the receiver, but the receiver holds a prior belief about it. The receiver uses the signal to revise this belief into a posterior and then selects the optimal action.
The assumption of fully rational actors in traditional game theory does not match reality. Because human decision-making ability is limited, decision-makers are in fact boundedly rational individuals. Ignoring bounded rationality significantly affects the final payoffs and makes the resulting equilibrium differ considerably from practice, which reduces the validity of the model and method. Traditional game theory is also based mainly on matrix games and fails to capture the dynamic evolution of real network attack-defense games. It derives the game equilibrium by analyzing and computing a payoff matrix and uses the result to select a network security defense strategy, but this analysis covers only a single stage of the game. In real network confrontation, the equilibrium between attacker and defender is broken as the attack and defense strategy sets and the system operating environment change, which starts the evolution of the next stage. Given this dynamic character of real attack-defense confrontation, the application value of traditional game theory is limited.
Summary of the invention
To address the deficiencies of the prior art, the invention provides a network defense strategy selection method and device based on a Markov evolutionary game. It uses game theory to build an effective active network security defense technique that compensates for the shortcomings of traditional passive defense, can analyze the dynamic confrontation between boundedly rational attackers and defenders, and makes optimal defense strategy selection more practical and instructive.
According to the design provided by the invention, a network defense strategy selection method based on a Markov evolutionary game comprises:
constructing a multi-stage Markov attack-defense evolutionary game model according to the dynamic attack-defense game in network confrontation, the model comprising multiple sub-game stages;
for the multi-stage Markov attack-defense evolutionary game model, solving and outputting the optimal defense strategy of each stage of the attack-defense game using an optimal defense strategy selection algorithm.
In the above, the multi-stage Markov attack-defense evolutionary game model is expressed as M2ADE = (N, T, B, P, ξ, S0, S, η, U), where: $N = (N_D, N_A)$ is the participant space of the evolutionary game, $N_D$ being the defender and $N_A$ the attacker; T is the total number of stages of the multi-stage game, the game process of the current stage is denoted G(k), and $k \in \{1, 2, \ldots, T\}$; B = (DS, AS) is the attack-defense action space, where $AS_k$ is the set of strategies available to the attacker in the k-th stage and $DS_k$ is the set of strategies available to the defender in the k-th stage; P is the game belief set, which gives for the k-th stage the probability of selecting each attack strategy and the probability of selecting each defense strategy, each probability lying in [0, 1] and the probabilities over each side's strategy set summing to 1; ξ is the discount factor, the ratio by which the payoff in game stage k is discounted relative to the initial stage, with 0 ≤ ξ ≤ 1; S0 is the initial security state set of the game process, $S = \{S_1, \ldots, S_k, \ldots, S_T\}$ is the security state set of the game process, and the states in S0 and S correspond one-to-one with the game stages; η is the security state transition probability, with $\eta_{ij} = \eta(S_j \mid S_i)$ the probability that the system jumps from state $S_i$ to state $S_j$; U is the set of game payoff functions, containing the payoff functions of the defender and of the attacker in the k-th game stage.
Preferably, solving the optimal defense strategy of each stage of the attack-defense game using the optimal defense strategy selection algorithm comprises the following:
A) according to the multi-stage Markov attack-defense evolutionary game model, solving the payoffs of the attacker and defender in each sub-game stage;
B) converting the payoffs of future stages into discounted payoffs relative to the initial stage by introducing a discount factor, so that the multi-stage game equilibrium problem becomes a dynamic programming problem whose objective is the overall payoff;
C) solving the dynamic programming problem to obtain and output the multi-stage game equilibrium strategy set.
Preferably, step A comprises the following: building the evolutionary game tree of each sub-game stage and computing the game payoff of that stage, iterating until the evolutionary game trees and game payoffs of all sub-game stages have been completed.
Preferably, building the evolutionary game tree of each sub-game stage and computing the game payoff of that stage comprises:
A1, building the game belief set of the sub-game stage in the current iteration;
A2, for each attack-defense strategy pair of the sub-game stage, computing the payoff values of the attacker and defender;
A3, computing the expected payoffs of the attacker and defender from the game belief set and the payoff values;
A4, computing the average payoffs of the attacker and defender from the game belief set and the expected payoffs.
Preferably, step B comprises the following:
B1, computing the discounted payoff of future stages from the discount factor, the security state transition probability, and the objective criterion function;
B2, using the discounted payoff of future stages and the solved payoffs of each sub-game stage, converting the game equilibrium problem into a dynamic programming problem by the dynamic programming method.
Preferably, the objective criterion function is the discounted expected criterion function. In its expression, $U_A$ and $U_D$ are the payoff values of the attacker and the defender in sub-game stage G(k), and the remaining term is the discounted payoff of future stages.
Preferably, the dynamic programming method converts the game equilibrium problem into a dynamic programming problem stated for every stage $k \in \{1, 2, \ldots, T\}$. In that statement, for the k-th sub-game stage, the replicator dynamic equations of the defender and of the attacker appear together with the selection probabilities of each defense strategy and each attack strategy.
A network defense strategy selection device based on a Markov evolutionary game comprises an evolutionary game model construction module and a model solving and output module.
The evolutionary game model construction module is configured to construct a multi-stage Markov attack-defense evolutionary game model according to the dynamic attack-defense game in network confrontation, the model comprising multiple sub-game stages.
The model solving and output module is configured to solve and output, for the multi-stage Markov attack-defense evolutionary game model built by the construction module, the optimal defense strategy of each stage of the attack-defense game using an optimal defense strategy selection algorithm.
In the above device, the model solving and output module comprises a stage payoff solving unit, a problem conversion unit, and a strategy solving and output unit.
The stage payoff solving unit is configured to solve the payoffs of the attacker and defender in each sub-game stage according to the multi-stage Markov attack-defense evolutionary game model built by the construction module.
The problem conversion unit is configured to take the per-stage payoffs from the stage payoff solving unit, convert the payoffs of future stages into discounted payoffs relative to the initial stage by introducing a discount factor, and thereby convert the multi-stage game equilibrium problem into a dynamic programming problem whose objective is the overall payoff.
The strategy solving and output unit is configured to solve the dynamic programming problem from the problem conversion unit, obtain the multi-stage game equilibrium strategy set, determine the defender's optimal defense strategy according to game theory, and output it.
Beneficial effects of the invention:
The invention addresses two problems of applying traditional game models to network defense strategy selection: the assumption of fully rational actors, and the inability to describe the dynamic evolution of real network attack and defense. From the perspective of attack-defense confrontation, it explores network security analysis methods and defense technology. By describing the state transitions between evolution stages as a stochastic process and drawing on Markov processes, it combines the multi-stage evolutionary game with Markov decision methods to construct a multi-stage Markov evolutionary game. Because payoffs decay over a multi-stage game, it takes the discounted total payoff as the objective function and introduces a discount factor ξ to discount the payoffs of the different stages. On the basis of solving and analyzing the multi-stage game equilibrium, it designs an optimal defense strategy selection algorithm and verifies the validity of the model and method by simulation experiments. Compared with existing research results, the invention can analyze the dynamic confrontation between boundedly rational attackers and defenders, and its optimal defense strategy selection is more practical and instructive.
Brief description of the drawings:
Fig. 1 is a schematic flowchart of the method of the invention;
Fig. 2 is a schematic diagram of the signaling game tree;
Fig. 3 is a schematic diagram of the architecture of the multi-stage Markov evolutionary game model;
Fig. 4 is a schematic diagram of the device of the invention;
Fig. 5 is a schematic diagram of the structure of the experimental system in the simulation example;
Fig. 6 is a schematic diagram of the network attack-defense state transition process in the simulation example.
Detailed description of the embodiments:
To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and technical solutions. Where no conflict arises, the embodiments in this application and the features in them may be combined with one another. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the invention.
Markov analysis: also called the Markov transition matrix method. It is a forecasting method that, under the assumption of a Markov process, predicts the future changes of random variables by analyzing their current changes. It is mainly used to study the trend of random events and has important significance in current scientific research.
Markov chain: a Markov process in which both time and state are discrete, denoted $\{X_n = X(n),\ n = 0, 1, 2, \ldots\}$. It can be regarded as the result of successive observations of a discrete-state Markov process on the time set T = {0, 1, 2, …}.
State transition probability: Suppose that on the time set T the state space of the Markov chain is $I = \{a_1, a_2, \ldots\}$, $a_i \in \mathbb{R}$. For a chain, the Markov property is usually expressed through conditional distributions: for any positive integers $n$, $r$ and any $0 \le t_1 < t_2 < \cdots < t_r < m$ with $t_i, m, n+m \in T$,
$$P\{X_{m+n} = a_j \mid X_{t_1} = a_{i_1}, X_{t_2} = a_{i_2}, \ldots, X_{t_r} = a_{i_r}, X_m = a_i\} = P\{X_{m+n} = a_j \mid X_m = a_i\},$$
where $a_i \in I$. Denote the right-hand side by $P_{ij}(m, m+n)$. The conditional probability $P_{ij}(m, m+n) = P(X_{m+n} = a_j \mid X_m = a_i)$ is called the transition probability of the Markov chain from state $a_i$ at time $m$ to state $a_j$ at time $m+n$.
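As an illustration of the definition above, and not as part of the claimed method, the following Python sketch computes n-step transition probabilities from a one-step transition matrix. It assumes a time-homogeneous chain, so that $P_{ij}(m, m+n)$ does not depend on $m$; the definition in the text does not require this. The function names and the three-state matrix `P` are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def n_step_transition(p, n):
    """Return the n-step transition matrix of a time-homogeneous chain."""
    size = len(p)
    # Start from the identity matrix (zero steps taken).
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


# Hypothetical one-step transition matrix; each row sums to 1.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]]

P2 = n_step_transition(P, 2)  # P2[i][j] = probability of going from i to j in 2 steps
```

Each row of the result is again a probability distribution, which gives a quick sanity check on any transition matrix supplied by hand.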
Evolutionary game theory: derived from Darwin's theory of biological evolution, it inherits biology's theoretical account of the evolution of species. Starting from the bounded rationality of individuals and taking group behavior as the object of study, it explains the evolutionary game process of biological behavior in terms of the development and evolutionary selection of species. Through long-term trial and error, imitation, and improvement, all players tend toward some stable strategy, which may persist in the group for a long time. This stable strategy equilibrium closely resembles the evolutionarily stable strategy of biological evolution and yields a relatively harmonious game equilibrium state.
Evolutionarily stable strategy (ESS): a strategy that, under a precise definition, cannot be invaded by mutants. It is the equilibrium strategy with true stability and strong predictive power in evolutionary games. It is a robust equilibrium concept from biological evolution theory, with strong resistance to disturbance and the ability to recover after being disturbed, and it is the core equilibrium concept in evolutionary game analysis.
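Evolutionary stability is commonly examined with replicator dynamics, in which a strategy's share grows when its expected payoff exceeds the population average. The sketch below is a minimal two-population version using Euler integration. It is an illustration under stated assumptions, not the patent's algorithm: the names `replicator_step`, `ATT`, `DEF`, `p`, `q`, the step size, and the payoff values are all hypothetical.

```python
def replicator_step(p, q, att_payoff, def_payoff, dt=0.01):
    """One Euler step of two-population replicator dynamics.

    p[i]: share of attackers using attack strategy i.
    q[j]: share of defenders using defense strategy j.
    att_payoff[i][j], def_payoff[i][j]: payoffs for the pair (attack i, defense j).
    """
    m, n = len(p), len(q)
    # Expected payoff of each pure strategy against the other population.
    u_att = [sum(q[j] * att_payoff[i][j] for j in range(n)) for i in range(m)]
    u_def = [sum(p[i] * def_payoff[i][j] for i in range(m)) for j in range(n)]
    # Population-average payoffs.
    avg_att = sum(p[i] * u_att[i] for i in range(m))
    avg_def = sum(q[j] * u_def[j] for j in range(n))
    # Shares grow in proportion to the payoff advantage over the average.
    new_p = [p[i] + dt * p[i] * (u_att[i] - avg_att) for i in range(m)]
    new_q = [q[j] + dt * q[j] * (u_def[j] - avg_def) for j in range(n)]
    return new_p, new_q


# Hypothetical 2x2 payoffs in which attack 0 and defense 0 strictly dominate.
ATT = [[3.0, 4.0], [1.0, 2.0]]
DEF = [[2.0, 1.0], [3.0, 2.0]]

p, q = [0.5, 0.5], [0.5, 0.5]
for _ in range(5000):
    p, q = replicator_step(p, q, ATT, DEF)
```

With these payoffs both populations converge toward their dominant strategies, so the pair (attack 0, defense 0) is the stable rest point of the simulated dynamics.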
An embodiment of the invention provides a network defense strategy selection method based on a Markov evolutionary game. As shown in Fig. 1, it comprises:
101, constructing a multi-stage Markov attack-defense evolutionary game model according to the dynamic attack-defense game in network confrontation, the model comprising multiple sub-game stages;
102, for the multi-stage Markov attack-defense evolutionary game model, solving and outputting the optimal defense strategy of each stage of the attack-defense game using an optimal defense strategy selection algorithm.
This addresses the problems that traditional game models applied to network defense strategy selection assume fully rational actors and cannot describe the dynamic evolution of real network attack and defense. Simulating the dynamic evolution of network attack and defense with the multi-stage Markov attack-defense evolutionary game model, and exploring network security analysis methods and defense technology from the perspective of attack-defense confrontation, has practical significance.
In another embodiment of the invention, the multi-stage Markov attack-defense evolutionary game model is expressed as M2ADE = (N, T, B, P, ξ, S0, S, η, U), where: $N = (N_D, N_A)$ is the participant space of the evolutionary game, $N_D$ being the defender and $N_A$ the attacker; T is the total number of stages of the multi-stage game, the game process of the current stage is denoted G(k), and $k \in \{1, 2, \ldots, T\}$; B = (DS, AS) is the attack-defense action space, where $AS_k$ is the set of strategies available to the attacker in the k-th stage and $DS_k$ is the set of strategies available to the defender in the k-th stage; P is the game belief set, which gives for the k-th stage the probability of selecting each attack strategy and the probability of selecting each defense strategy, each probability lying in [0, 1] and the probabilities over each side's strategy set summing to 1; ξ is the discount factor, the ratio by which the payoff in game stage k is discounted relative to the initial stage, with 0 ≤ ξ ≤ 1; S0 is the initial security state set of the game process, $S = \{S_1, \ldots, S_k, \ldots, S_T\}$ is the security state set of the game process, and the states in S0 and S correspond one-to-one with the game stages; η is the security state transition probability, with $\eta_{ij} = \eta(S_j \mid S_i)$ the probability that the system jumps from state $S_i$ to state $S_j$; U is the set of game payoff functions, containing the payoff functions of the defender and of the attacker in the k-th game stage.
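The constraints stated in this definition can be checked mechanically. The sketch below represents a model instance as a plain dictionary and validates the discount factor range, the belief distributions, and the transition matrix rows. The dictionary keys, the function names, and the example values are hypothetical, and requiring each row of the transition matrix to sum to 1 is the usual property of transition probabilities rather than a statement taken from the patent.

```python
def is_distribution(values, tol=1e-9):
    """True if every value lies in [0, 1] and the values sum to 1."""
    return all(0.0 <= v <= 1.0 for v in values) and abs(sum(values) - 1.0) < tol


def validate_model(model):
    """Check structural constraints of an M2ADE-style model instance.

    `model` is a dict with hypothetical keys:
      "xi":     discount factor,
      "eta":    state transition matrix, eta[i][j] = eta(S_j | S_i),
      "stages": list of dicts with strategy names "AS", "DS" and beliefs "p", "q".
    """
    if not 0.0 <= model["xi"] <= 1.0:
        return False
    if not all(is_distribution(row) for row in model["eta"]):
        return False
    for stage in model["stages"]:
        # One belief entry per available strategy.
        if len(stage["p"]) != len(stage["AS"]) or len(stage["q"]) != len(stage["DS"]):
            return False
        if not (is_distribution(stage["p"]) and is_distribution(stage["q"])):
            return False
    return True


# Hypothetical two-stage instance.
EXAMPLE = {
    "xi": 0.5,
    "eta": [[0.5, 0.5], [0.0, 1.0]],
    "stages": [
        {"AS": ["AS1", "AS2"], "DS": ["DS1", "DS2"], "p": [0.5, 0.5], "q": [0.25, 0.75]},
        {"AS": ["AS1", "AS2"], "DS": ["DS1"], "p": [1.0, 0.0], "q": [1.0]},
    ],
}

is_valid = validate_model(EXAMPLE)
```

Validating inputs up front keeps later payoff computations from silently using beliefs that are not probability distributions.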
The multi-stage evolutionary game model is combined with the Markov decision process to construct the multi-stage Markov evolutionary game model, which extends the single agent of a Markov decision process to multiple agents, as shown in Fig. 3. In network attack and defense, under the assumption of bounded rationality, attack and defense decision-makers update and improve their own strategies by learning from the methods of others in order to optimize their own payoffs. The process is therefore a multi-state evolutionary process whose state changes dynamically over time. Within a given period, the attack-defense system can reach a stable equilibrium state, but that state cannot be maintained indefinitely. As time passes, the game elements and the system environment may change and the stable equilibrium is destroyed: the network system jumps with probability η from the stable state to another, unstable state, the equilibrium is broken, and the evolution of the next stage begins. Viewed globally, the system is in a dynamic process of "evolution, jump, evolution". The multi-stage Markov attack-defense evolutionary game model is built on an analysis of this multi-stage network attack-defense game process.
In yet another embodiment of the invention, solving the optimal defense strategy of each stage of the attack-defense game using the optimal defense strategy selection algorithm comprises the following:
A) according to the multi-stage Markov attack-defense evolutionary game model, solving the payoffs of the attacker and defender in each sub-game stage;
B) converting the payoffs of future stages into discounted payoffs relative to the initial stage by introducing a discount factor, so that the multi-stage game equilibrium problem becomes a dynamic programming problem whose objective is the overall payoff;
C) solving the dynamic programming problem to obtain and output the multi-stage game equilibrium strategy set.
The network attack-defense game model contains multiple sub-game processes. In another embodiment of the invention, solving the payoffs of the attacker and defender in each sub-game stage according to the multi-stage Markov attack-defense evolutionary game model comprises the following: building the evolutionary game tree of each sub-game stage and computing the game payoff of that stage, iterating until the evolutionary game trees and game payoffs of all sub-game stages have been completed.
Game-theoretic network security defense mostly relies on traditional game theory, which presupposes fully rational actors: both players maximize their own interests, and the optimal defense strategy is selected during the game. The signaling game is one kind of traditional game, in which the players are the sender and the receiver of a signal. The sender's type is unknown to the receiver, but the receiver holds a prior belief about it. The receiver uses the signal to revise this belief into a posterior and then selects the optimal action. The signaling game tree is shown in Fig. 2, and the perfect Bayesian equilibrium of the attacker and defender is solved in the following steps.
(1) Solve the subgame perfect equilibrium strategy that depends on the attacker's inference.
When $m = m_1$, the attacker's expected payoff is evaluated, using $o_1 + o_2 + o_3 = 1$ to simplify the expression. Three cases then arise:
in the first case, the expression equals $a_{12} \cdot o_1 + a_{42} \cdot o_2 + a_{72} \cdot o_3$ and $a(m_1) = A_1$;
in the second case, the expression equals $a_{22} \cdot o_1 + a_{52} \cdot o_2 + a_{82} \cdot o_3$ and $a(m_1) = A_2$;
in the third case, the expression equals $a_{32} \cdot o_1 + a_{62} \cdot o_2 + a_{92} \cdot o_3$ and $a(m_1) = A_3$.
Similarly, for $m = m_2$ the three cases give $a(m_2) = A_1$, $a(m_2) = A_2$, or $a(m_2) = A_3$.
Similarly, for $m = m_3$ the three cases give $a(m_3) = A_1$, $a(m_3) = A_2$, or $a(m_3) = A_3$.
(2) Solve the subgame perfect equilibrium strategy inferred by the defender.
When $t = t_1$, under a given case from step (1), the expression equals $\max\{U_D(m_1, a(m_1), t_1),\ U_D(m_2, a(m_2), t_1),\ U_D(m_3, a(m_3), t_1)\} = \max\{a_{11}, a_{23}, a_{35}\}$, from which $m(t_1)$ can be obtained.
$m(t_1)$ in the other cases is obtained in the same way.
The subgame perfect equilibrium strategies for types $t_2$ and $t_3$ are obtained likewise.
(3) Solve the perfect Bayesian equilibrium of the signaling game.
Given $m^*(t)$ and $a^*(m)$, the attacker's inference about the defender's type that satisfies Bayes' rule can be obtained. If $P(t \mid m)$ does not conflict with this inference, the perfect Bayesian equilibrium strategy of the signaling game follows.
According to game theory, the mixed strategy under the perfect Bayesian Nash equilibrium is the optimal choice for both sides.
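The belief revision in step (3) is an application of Bayes' rule: the receiver (here the attacker) updates a prior over the sender's type after observing a signal. The following sketch shows only that update, not the full equilibrium computation. The function name, the prior, and the signaling probabilities are hypothetical values chosen for illustration.

```python
def posterior(prior, likelihood, signal):
    """Bayes update of the receiver's belief about the sender's type.

    prior[t]: prior probability that the sender has type t.
    likelihood[t][signal]: probability that a sender of type t emits `signal`.
    """
    joint = {t: prior[t] * likelihood[t][signal] for t in prior}
    total = sum(joint.values())
    if total == 0:
        # Bayes' rule does not define a posterior for a zero-probability signal.
        raise ValueError("signal has zero probability under the prior")
    return {t: joint[t] / total for t in joint}


# Hypothetical prior over three defender types and their signaling behavior.
PRIOR = {"t1": 0.5, "t2": 0.3, "t3": 0.2}
LIKELIHOOD = {
    "t1": {"m1": 0.8, "m2": 0.2},
    "t2": {"m1": 0.5, "m2": 0.5},
    "t3": {"m1": 0.0, "m2": 1.0},
}

belief = posterior(PRIOR, LIKELIHOOD, "m1")  # attacker's belief after seeing m1
```

In this example type t3 never sends m1, so observing m1 rules it out and shifts the remaining probability toward t1.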
Preferably, in another embodiment of the invention, building the evolutionary game tree of each sub-game stage and computing the game payoff of that stage comprises:
A1, building the game belief set of the sub-game stage in the current iteration;
A2, for each attack-defense strategy pair of the sub-game stage, computing the payoff values of the attacker and defender;
A3, computing the expected payoffs of the attacker and defender from the game belief set and the payoff values;
A4, computing the average payoffs of the attacker and defender from the game belief set and the expected payoffs.
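Steps A1 to A4 can be sketched for a single stage as follows. This is a minimal illustration, assuming the beliefs are probability vectors and the payoff values of step A2 are already available as matrices; the names `stage_payoffs`, `p`, `q`, `ATT`, `DEF` and all numeric values are hypothetical.

```python
def stage_payoffs(p, q, att_payoff, def_payoff):
    """Expected payoff per pure strategy and average payoff per side.

    p[i]: belief that the attacker plays attack strategy i (step A1).
    q[j]: belief that the defender plays defense strategy j (step A1).
    att_payoff[i][j], def_payoff[i][j]: payoff values for the strategy
    pair (attack i, defense j) (step A2).
    """
    m, n = len(p), len(q)
    # Step A3: expected payoff of each pure strategy against the opponent's belief.
    exp_att = [sum(q[j] * att_payoff[i][j] for j in range(n)) for i in range(m)]
    exp_def = [sum(p[i] * def_payoff[i][j] for i in range(m)) for j in range(n)]
    # Step A4: average payoff of each side, weighting by its own belief.
    avg_att = sum(p[i] * exp_att[i] for i in range(m))
    avg_def = sum(q[j] * exp_def[j] for j in range(n))
    return exp_att, exp_def, avg_att, avg_def


# Hypothetical stage with two attack and two defense strategies.
p = [0.5, 0.5]
q = [0.25, 0.75]
ATT = [[4.0, 0.0], [2.0, 2.0]]
DEF = [[-4.0, 0.0], [-2.0, -2.0]]

exp_att, exp_def, avg_att, avg_def = stage_payoffs(p, q, ATT, DEF)
```

Looping this function over all sub-game stages corresponds to the iteration described for step A.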
To solve for the multi-stage game equilibrium strategy, in another embodiment of the invention, converting the payoffs of future stages into discounted payoffs relative to the initial stage by introducing a discount factor, and converting the multi-stage game equilibrium problem into a dynamic programming problem whose objective is the overall payoff, comprises the following:
B1, computing the discounted payoff of future stages from the discount factor, the security state transition probability, and the objective criterion function;
B2, using the discounted payoff of future stages and the solved payoffs of each sub-game stage, converting the game equilibrium problem into a dynamic programming problem by the dynamic programming method.
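One way to read steps B1 and B2 is as a backward recursion in which the value of a stage equals its own payoff plus the discounted, probability-weighted value of the next stage. The sketch below implements that recursion for one player. The recursion form, the function name, and the example numbers are assumptions made for illustration; the patent's exact objective expression is not reproduced here.

```python
def discounted_values(stage_payoff, eta, xi):
    """Backward recursion V_k(s) = u_k(s) + xi * sum_j eta[s][j] * V_{k+1}(j).

    stage_payoff[k][s]: payoff earned in stage k when the system is in state s.
    eta[s][j]: probability of jumping from state s to state j.
    xi: discount factor in [0, 1].
    Returns values[k][s], the discounted total payoff from stage k onward.
    """
    n_states = len(eta)
    future = [0.0] * n_states  # nothing is earned after the final stage
    values = []
    for payoff in reversed(stage_payoff):
        current = [payoff[s] + xi * sum(eta[s][j] * future[j]
                                        for j in range(n_states))
                   for s in range(n_states)]
        values.append(current)
        future = current
    values.reverse()
    return values


# Hypothetical example: two stages, two security states.
PAYOFF = [[10.0, 4.0], [8.0, 2.0]]
ETA = [[0.5, 0.5], [0.0, 1.0]]
V = discounted_values(PAYOFF, ETA, 0.5)
```

Here `V[0][0]` is 10 + 0.5 × (0.5 × 8 + 0.5 × 2) = 12.5, which shows how both the discount factor and the transition probabilities enter the total.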
The objective criterion function R is used to judge the quality of the strategies of the attacker and defender. Commonly used objective criterion functions include the discounted expected criterion function and the average payoff criterion function. Because game payoffs depend on time, in yet another embodiment of the invention a discount factor is introduced and the discounted expected criterion function is used.
In its expression, $U_A$ and $U_D$ are the payoff values of the attacker and the defender in sub-game stage G(k), and the remaining term is the discounted payoff of future stages. The objective criterion function is thus the sum of the stage payoffs $U_A$, $U_D$ in game stage G(k) and the future discounted payoff, and the goal of each side is to maximize its own objective function.
In network confrontation, the attacker and the defender each try to maximize their own payoff. For the multi-stage Markov attack-defense evolutionary game, in evolutionary game stage G(k) the attacker and the defender each adopt a strategy. According to the evolutionary game equilibrium theorem, if a strategy pair is the evolutionarily stable strategy of the k-th stage, then against any other attack-defense strategy it satisfies:
The multi-stage network attack-defense process is studied on the basis of the M2ADE model. Because network attack and defense consists of multiple sub-game processes, and each state is affected by the behavior strategies chosen in the attack-defense game of the preceding stage, the Markov decision criterion implies that each participant must have a Markov optimal response strategy. Therefore, if a strategy pair is a Markov optimal response strategy, it makes the objective functions $R_D^k$ and $R_A^k$ reach their maximum, that is, the following condition is satisfied in every stage k:
The state transition probability η directly affects the calculation of the discounted payoff and hence the value of the objective function, so it has a direct effect on the selection of the optimal strategies of the attacker and the defender. If η changes by a large amount, the optimal strategy may change as a direct result. Some existing research adopts fixed state transition probabilities whose specific values are determined from historical data and expert experience; the present invention uses the same method.
Because the attack-defense process consists of a finite number k of stage games, the strategy sets DSk and ASk of each stage are finite, and the game payoffs of both sides are also finite, the multi-stage Markov attack-defense evolutionary game model M2ADE is a finite multi-state, multi-agent Markov evolutionary game model. The M2ADE game consists of multiple independent and similar single-stage evolutionary games. On the one hand, each independent single-stage evolutionary game is a finite game, so a Nash equilibrium in mixed strategies necessarily exists. On the other hand, from the definition of the multi-stage Markov evolutionary game model and from the transition probabilities and payoff functions, there exists a finite stochastic game equivalent to M2ADE whose payoff function is convex. According to the existence result for equilibrium strategies of finite stochastic games, that finite stochastic game has a Nash equilibrium in mixed strategies. In summary, M2ADE has a Nash equilibrium in mixed strategies.
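The evolutionary game equilibrium of each sub-game stage is solved first. For the payoff calculation in the multi-stage attack-defense process, a discount factor is introduced to convert the payoffs of future stages into discounted payoffs relative to the initial stage. On this basis, in other embodiments of the present invention, the game equilibrium problem is converted into a dynamic programming problem by the dynamic programming method, specifically: for k = {1, 2, ..., T},
$$
\begin{cases}
\max R_D^k(S_0^k, S_k) = \max\Big[\, U_D(DS_k^j, AS_k^i) + \sum\limits_{e,h\in[k,T]} \xi^h\, \eta(S_h \mid S_e)\, R_D^h(S_0^h, S_h) \Big] \\[4pt]
\max R_A^k(S_0^k, S_k) = \max\Big[\, U_A(DS_k^j, AS_k^i) + \sum\limits_{e,h\in[k,T]} \xi^h\, \eta(S_h \mid S_e)\, R_A^h(S_0^h, S_h) \Big] \\[4pt]
D_k(q_k^j) = \dfrac{dq_k^j(t)}{dt} = q_k^j\big(U_{DS_k^j} - \overline{U}_{D_k}\big) = 0 \\[4pt]
A_k(p_k^i) = \dfrac{dp_k^i(t)}{dt} = p_k^i\big(U_{AS_k^i} - \overline{U}_{A_k}\big) = 0 \\[4pt]
\sum\limits_{j=1}^{n} q_k^j = 1,\quad \sum\limits_{i=1}^{m} p_k^i = 1;\quad q_k^j \in [0,1],\ p_k^i \in [0,1]
\end{cases}
$$
where k denotes the k-th sub-game stage, $D_k(q_k^j)$ and $A_k(p_k^i)$ denote the replicator dynamics equations of the defender and the attacker respectively, and $q_k^j$ and $p_k^i$ denote the selection probabilities of defense strategy $DS_k^j$ and attack strategy $AS_k^i$ respectively. Solving the above dynamic programming problem yields the optimal solution set, which is the equilibrium strategy set of the multi-stage game. According to game theory, the mixed strategy in this solution set is the optimal choice of the attacker and the defender in the k-th stage, so the defender should take its defense component as the optimal defense strategy.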
Corresponding to the above method, an embodiment of the present invention further provides a network defense strategy selection device based on the Markov evolutionary game, as shown in Fig. 4, comprising an evolutionary game model building module 201 and a model solving output module 202.
The evolutionary game model building module 201 is configured to build a multi-stage Markov attack-defense evolutionary game model according to the dynamic attack-defense game in the network attack and defense process, the model comprising multiple sub-game stages.
The model solving output module 202 is configured to solve, for the multi-stage Markov attack-defense evolutionary game model built by the evolutionary game model building module, the optimal defense strategy of each stage of the attack-defense game using the optimal defense strategy selection algorithm, and to output it.
In another embodiment of the present invention, the model solving output module comprises a stage payoff solving unit, a problem conversion unit and a strategy solving output unit.
The stage payoff solving unit is configured to solve the attack and defense payoffs of each sub-game stage according to the multi-stage Markov attack-defense evolutionary game model built by the evolutionary game model building module.
The problem conversion unit is configured to take the attack and defense payoffs of each sub-game stage from the stage payoff solving unit, convert the payoffs of future stages into discounted payoffs relative to the initial stage by introducing a discount factor, and convert the multi-stage game equilibrium problem into a dynamic programming problem with the overall payoff as the objective.
The strategy solving output unit is configured to solve the dynamic programming problem produced by the problem conversion unit, obtain the multi-stage game equilibrium strategy set, determine the defender's optimal defense strategy according to game theory, and output it.
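The data flow between the three units of the model solving output module can be sketched as a simple composition. The class name, field names and callable signatures below are hypothetical; the patent describes the units only functionally.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ModelSolvingOutputModule:
    """Hypothetical wiring of the three units described above."""
    solve_stage_payoffs: Callable[[Any], Any]   # stage payoff solving unit
    convert_problem: Callable[[Any, Any], Any]  # problem conversion unit
    solve_strategy: Callable[[Any], Any]        # strategy solving output unit

    def run(self, model: Any) -> Any:
        payoffs = self.solve_stage_payoffs(model)          # per-stage payoffs
        dp_problem = self.convert_problem(model, payoffs)  # discounted DP problem
        return self.solve_strategy(dp_problem)             # optimal defense strategies
```

Keeping the units as separate callables mirrors the statement that each module or unit may be realized in hardware or as a software functional module.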
On the basis of the above analysis, the optimal defense strategy selection algorithm for the multi-stage Markov attack-defense evolutionary game is described as follows:
Input: multi-stage Markov attack-defense evolutionary game model M2ADE
Output: optimal defense strategy of each stage
BEGIN
1. Initialize M2ADE = (N, T, B, P, ξ, S0, S, η, U);
2. Build the defense action space DS and the attack action space AS;
3. Build the state sets of each stage of the attack-defense game, S0 and S = {S1…Sk…ST};
4. Initialize the state transition probabilities ηij = η(Sj|Si);
5. For (k = 1; k ≤ T; k++)
{ // build the evolutionary game tree of each stage and calculate the game payoffs
6. Build the game belief set of the stage, with $\sum_{i=1}^{m} p_k^i = 1$ and $\sum_{j=1}^{n} q_k^j = 1$;
7. For each attack-defense strategy pair, calculate the attack and defense payoff values;
8. Calculate the expected payoff of each strategy, $U_{AS_k^i}$ and $U_{DS_k^j}$;
9. Calculate the average payoffs of the attacker and the defender, $\overline{U}_{A_k}$ and $\overline{U}_{D_k}$;
}
10. Using the discount factor ξ, calculate the discounted payoffs;
11. Based on the dynamic programming method, with $R_D^k$ and $R_A^k$ as objective functions, solve for the game equilibrium solution;
12. Return the equilibrium solution // output the optimal defense strategy of each stage of the attack-defense game
END
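The overall flow of the algorithm can be sketched in Python under strong simplifying assumptions that are not part of the patent: every stage is a 2×2 game with an interior mixed equilibrium found from the indifference conditions, the stages form a single chain, transition probabilities do not depend on the chosen actions, and one factor of ξ is applied per transition. All names and numbers are hypothetical.

```python
def mixed_equilibrium_2x2(UA, UD):
    """Interior mixed equilibrium of a 2x2 bimatrix game.

    UA[i][j] / UD[i][j]: attacker / defender payoff for attack i vs defense j.
    Returns (p, q), the probabilities of attack 0 and of defense 0.
    """
    den_p = UD[0][0] - UD[0][1] - UD[1][0] + UD[1][1]
    den_q = UA[0][0] - UA[0][1] - UA[1][0] + UA[1][1]
    if den_p == 0 or den_q == 0:
        raise ValueError("no interior mixed equilibrium")
    p = (UD[1][1] - UD[1][0]) / den_p  # makes the defender indifferent
    q = (UA[1][1] - UA[0][1]) / den_q  # makes the attacker indifferent
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("no interior mixed equilibrium")
    return p, q


def stage_payoff(U, p, q):
    """Expected payoff under mixed strategies (p, 1-p) and (q, 1-q)."""
    pa, qd = (p, 1 - p), (q, 1 - q)
    return sum(pa[i] * qd[j] * U[i][j] for i in range(2) for j in range(2))


def select_defense_strategies(stages, eta, xi):
    """stages: list of (UA, UD); eta[k]: probability of moving from stage k to k+1."""
    results = []
    for UA, UD in stages:  # steps 5-9: solve each stage game
        p, q = mixed_equilibrium_2x2(UA, UD)
        results.append({"defense_mix": (q, 1 - q),
                        "defense_payoff": stage_payoff(UD, p, q)})
    value = 0.0
    for k in reversed(range(len(stages))):  # steps 10-11: backward discounting
        continuation = xi * eta[k] * value if k < len(stages) - 1 else 0.0
        value = results[k]["defense_payoff"] + continuation
        results[k]["discounted_value"] = value
    return results


# Hypothetical two-stage example with the same zero-sum game in both stages.
game_a = [[2.0, -1.0], [-1.0, 1.0]]
game_d = [[-2.0, 1.0], [1.0, -1.0]]
results = select_defense_strategies([(game_a, game_d), (game_a, game_d)], eta=[0.8], xi=0.5)
```

Because the transition probabilities here are fixed and independent of the actions, the continuation value is a constant offset within each stage, which is why the stage equilibria can be solved first and the discounted values accumulated afterwards.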
The time complexity of the algorithm is O(k(m+n)^2). Space consumption is concentrated mainly in storing the payoff values and the intermediate results of the equilibrium solution, and the space complexity is O(knm). With this algorithm, the way single-stage strategy selection and strategy payoffs change over time can be analyzed; the discounted payoffs of the k stages are then obtained through the discounting criterion, and dynamic programming is used to solve the optimal defense decision of the multi-stage attack-defense process, for use in behavior analysis and defense decision-making in multi-stage network attack and defense.
To further verify the effectiveness of the present invention, a concrete example is explained below.
An information system as shown in Fig. 5 is built for experimental verification. The experimental system consists mainly of security defense equipment, a Web server, file servers, database servers and client terminals, running operating systems such as Windows and Linux. The Web server provides HTTP service, the file servers provide FTP service, the database servers run Oracle 11, and the client terminals provide functions such as e-mail, file download and video on demand.
For the experimental network information system built, the network attack-defense process is divided into eight stages. The specific network attack-defense state transition process is shown in Fig. 6, where $S_0^k$ is the initial state and Sk is the evolved state. Solid arrows represent the attack-defense game process within a single stage, and dashed arrows represent a jump from a given stage to the initial state of the next stage. The specific states are:
Sk = {S1: obtain access privilege on the Web server; S2: obtain access privilege on F2; S3: obtain access privilege on F1 through D1; S4: obtain user privilege on D1; S5: obtain root privilege on F1; S6: obtain root privilege on C2; S7: obtain access privilege on D2 through C1; S8: obtain root privilege on F2}.
For state transitions between different stages, the state transition probabilities are assumed to be fixed, and the transition probabilities between stages are determined from historical data and expert experience, as shown in Table 1, where ηij = η(Sj|Si) denotes the probability of jumping from state Si to state Sj.
Table 1 State transition probabilities between stages
Access control rules restrict hosts outside the local network to accessing only the Web server; within the system, the Web server and the application server can access the database server. The experimental information system is scanned with Nessus and, combined with information from the China National Vulnerability Database of Information Security (CNNVD), the attack and defense strategy sets of each game stage are built on the basis of an analysis of routing files and vulnerability information, as shown in Table 2:
Table 2 Candidate attack and defense actions in the different game stages
For each attack-defense strategy pair (ASi, DSj), the attack and defense payoff matrices of each stage are given according to the strategy cost/return quantification method and the strategy payoff calculation method, as shown in Table 3:
Table 3 Attack and defense payoff matrices of each stage
With the discount factor set to ξ = 0.5, the strategy selection algorithm of the present invention is implemented in Matlab 2012 and the equilibrium strategies of each stage are calculated, as shown in Table 4, which includes the defender's optimal defense strategy in each stage.
Table 4 Game equilibrium strategies of each stage
In the above experiment the number of system states is 16, the game is divided into 8 stages, the attacker and the defender each have 3 strategies in every stage, and there are 15 state transition probabilities between stages. Selecting the optimal defense strategies of the different stages requires calculating 16 payoff matrices, and the algorithm running time is 32.7 s.
Experimental analysis
File servers F1 and F2 are taken as the attack targets; the attack goal is considered reached once the attacker obtains the root privilege of server F1 or F2. Analysis of the above simulation shows two main attack paths, ① and ②: the first attack path obtains the root privilege of server F2, and the second obtains the root privilege of server F1.
Path ①:
Stage 1: the candidate attack actions and candidate defense actions are shown in Table 3 (likewise below); the optimal defense strategy is the mixed strategy given in Table 4, and the defense payoff is -30.5.
Stage 2: the game of this stage starts after the system jumps, with probability η(2|1) = 0.8, from state S1 to the initial state $S_0^2$; the optimal defense strategy is the mixed strategy given in Table 4, and the defense payoff is -20.4.
Stage 3: the game of this stage starts after the system jumps, with probability η(8|2) = 0.4, from state S2 to the initial state $S_0^8$; the optimal defense strategy is the mixed strategy given in Table 4, and the defense payoff is -27.5.
Path ②:
Stage 1: the candidate attack actions and candidate defense actions are shown in Table 3 (likewise below); the optimal defense strategy is the mixed strategy given in Table 4, and the defense payoff is -30.5.
Stage 2: the game of this stage starts after the system jumps, with probability η(6|1) = 0.6, from state S1 to the initial state $S_0^6$; the optimal defense strategy is the mixed strategy given in Table 4, and the defense payoff is -40.3.
Stage 3: the game of this stage starts after the system jumps, with probability η(3|6) = 0.8, from state S6 to the initial state $S_0^3$; the optimal defense strategy is the mixed strategy given in Table 4, and the defense payoff is -43.2.
Stage 4: the game of this stage starts after the system jumps, with probability η(5|3) = 0.9, from state S3 to the initial state $S_0^5$; the optimal defense strategy is the mixed strategy given in Table 4, and the defense payoff is -19.5.
For the two attack paths, the total attack payoff and the total defense payoff are calculated separately for path ① and for path ②. Comparing the totals shows that path ① clearly better matches the defender's expectation, so the defender should avoid path ② as far as possible. A comparison of paths ① and ② shows that their first stage is identical: both evolve from the initial system state to state S1. At the jump to stage 2, however, path ① jumps from state S1 to $S_0^2$, whereas path ② jumps from state S1 to $S_0^6$. To reduce the likelihood of path ②, the probability of jumping from S1 to $S_0^6$ must be reduced; if the system cannot reach state $S_0^6$, path ② cannot be realized, the defender's expectation is met, and the attack-defense confrontation proceeds along path ①. Further analysis shows that after the stage-1 game ends, path ① jumps to state $S_0^2$ and path ② jumps to state $S_0^6$, and the attack action sets differ in these two cases (see Table 2). Because changes in the strategy sets and in the system operating environment are an important cause of state transitions, for the attack set corresponding to $S_0^6$, AS = {Oracle TNS Listener, Wu-Ftp Sockprintf, install SQL Listener program}, the defender can change the access control rules or add new targeted defense strategies during the attack-defense game, for example by dynamically adjusting network access ports or setting up white lists. This reduces the feasibility of that attack set, lowers the probability of jumping to $S_0^6$, and thus reduces the likelihood of path ②.
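The comparison of the two paths can be illustrated with the stage values quoted in the experimental analysis above. This is only a sketch: the totals computed in the patent are not reproduced in this text, so the code uses an undiscounted sum of the per-stage defense payoffs and the product of the quoted transition probabilities as a rough path likelihood, both of which are assumptions made for illustration.

```python
from math import prod

# (transition probability into the stage, defense payoff of the stage);
# stage 1 has no incoming transition, marked with None.
path_1 = [(None, -30.5), (0.8, -20.4), (0.4, -27.5)]
path_2 = [(None, -30.5), (0.6, -40.3), (0.8, -43.2), (0.9, -19.5)]


def path_summary(path):
    """Return (product of transition probabilities, undiscounted defense total)."""
    likelihood = prod(eta for eta, _ in path if eta is not None)
    total_defense = sum(payoff for _, payoff in path)
    return likelihood, total_defense


likelihood_1, defense_total_1 = path_summary(path_1)
likelihood_2, defense_total_2 = path_summary(path_2)
defender_prefers_path_1 = defense_total_1 > defense_total_2
```

Under these assumptions the defender loses less along path ①, which agrees with the conclusion above that the defender should steer the confrontation away from path ②, for example by lowering η(6|1).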
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, see the description of the method.
The units and method steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of function. Whether these functions are performed in hardware or in software depends on the specific application and design constraints of the technical solution. A person of ordinary skill in the art may use different methods to implement the described functions for each specific application, but such an implementation should not be considered beyond the scope of the present invention.
A person of ordinary skill in the art will understand that all or some of the steps of the above method can be completed by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk or an optical disc. Alternatively, all or some of the steps of the above embodiments can be implemented with one or more integrated circuits; accordingly, each module or unit in the above embodiments can be implemented in the form of hardware or in the form of a software functional module. The present invention is not limited to any particular combination of hardware and software.
The above description of the disclosed embodiments enables a person skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A network defense strategy selection method based on the Markov evolutionary game, characterized by comprising:
building a multi-stage Markov attack-defense evolutionary game model according to the dynamic attack-defense game in the network attack and defense process, the model comprising multiple sub-game stages;
for the multi-stage Markov attack-defense evolutionary game model, solving the optimal defense strategy of each stage of the attack-defense game using an optimal defense strategy selection algorithm, and outputting it.
2. The network defense strategy selection method based on the Markov evolutionary game according to claim 1, characterized in that the multi-stage Markov attack-defense evolutionary game model is expressed as M2ADE = (N, T, B, P, ξ, S0, S, η, U), where: N = (ND, NA) is the participant space of the evolutionary game, ND being the defender and NA the attacker; T is the total number of stages of the multi-stage game, the game process of the current stage being denoted G(k), k = {1, 2, ..., T}; B = (DS, AS) is the attack-defense action space, in which $AS_k$ denotes the optional strategies of the attacker in the k-th stage and $DS_k$ denotes the optional strategies of the defender in the k-th stage; P is the game belief set, in which $p_k^i$ denotes the probability of selecting attack strategy $AS_k^i$ in the k-th stage, with $p_k^i \in [0,1]$ and $\sum_{i=1}^{m} p_k^i = 1$, and $q_k^j$ denotes the probability of selecting defense strategy $DS_k^j$ in the k-th stage, with $q_k^j \in [0,1]$ and $\sum_{j=1}^{n} q_k^j = 1$; ξ is the discount factor, which represents the discount ratio of the payoff in game stage k relative to the initial stage, 0 ≤ ξ ≤ 1; S0 is the set of initial security states in the attack-defense process and S = {S1…Sk…ST} is the set of security states of the attack-defense process, the states in S0 and S corresponding one-to-one with the game stages; η denotes the security state transition probability, ηij = η(Sj|Si) being the probability that the system jumps from state Si to state Sj; and U is the set of game payoff functions, containing the payoff functions of the defender and of the attacker in the k-th game stage.
3. The network defense strategy selection method based on the Markov evolutionary game according to claim 2, characterized in that solving the optimal defense strategy of each stage of the attack-defense game using the optimal defense strategy selection algorithm comprises the following:
A) solving the attack and defense payoffs of each sub-game stage according to the multi-stage Markov attack-defense evolutionary game model;
B) converting the payoffs of future stages into discounted payoffs relative to the initial stage by introducing a discount factor, and converting the multi-stage game equilibrium problem into a dynamic programming problem with the overall payoff as the objective;
C) solving the dynamic programming problem to obtain the multi-stage game equilibrium strategy set, and outputting it.
4. The network defense strategy selection method based on the Markov evolutionary game according to claim 3, characterized in that step A comprises the following: building the evolutionary game tree of each sub-game stage and calculating the game payoffs of that stage, iterating in a loop until the evolutionary game trees of all sub-game stages have been built and their game payoffs calculated.
5. The network defense strategy selection method based on the Markov evolutionary game according to claim 4, characterized in that building the evolutionary game tree of each sub-game stage and calculating the game payoffs of that stage comprises:
A1, building the game belief set of the sub-game stage of the current iteration;
A2, for the attack-defense strategy pairs of the sub-game stage, calculating the attack and defense payoff values respectively;
A3, calculating the expected payoffs of the attacker and the defender, respectively, from the game belief set and the attack and defense payoff values;
A4, calculating the average payoffs of the attacker and the defender, respectively, from the game belief set and the expected payoffs of both sides.
6. The network defense strategy selection method based on the Markov evolutionary game according to claim 3, characterized in that step B comprises the following:
B1, calculating the discounted payoff of future stages from the discount factor, the security state transition probability and the objective criterion function;
B2, from the discounted payoff of future stages and the attack and defense results solved for each sub-game stage, converting the game equilibrium problem into a dynamic programming problem by means of the dynamic programming method.
7. The network defense strategy selection method based on the Markov evolutionary game according to claim 6, characterized in that the objective criterion function adopts the discounted expectation criterion function, expressed as:
$$
\begin{aligned}
R_D^k(S_0^k, S_k) &= U_D(DS_k^j, AS_k^i) + \sum_{e,h\in[k,T]} \xi^h\, \eta(S_h \mid S_e)\, R_D^h(S_0^h, S_h) \\
R_A^k(S_0^k, S_k) &= U_A(DS_k^j, AS_k^i) + \sum_{e,h\in[k,T]} \xi^h\, \eta(S_h \mid S_e)\, R_A^h(S_0^h, S_h)
\end{aligned}
$$
where $U_A$ and $U_D$ are the payoff values of the attacker and the defender in sub-game stage G(k), and the summation term represents the discounted payoff of future stages.
8. The network defense strategy selection method based on the Markov evolutionary game according to claim 6, characterized in that converting the game equilibrium problem into a dynamic programming problem by the dynamic programming method is specifically expressed as: for k = {1, 2, ..., T},
$$
\begin{cases}
\max R_D^k(S_0^k, S_k) = \max\Big[\, U_D(DS_k^j, AS_k^i) + \sum\limits_{e,h\in[k,T]} \xi^h\, \eta(S_h \mid S_e)\, R_D^h(S_0^h, S_h) \Big] \\[4pt]
\max R_A^k(S_0^k, S_k) = \max\Big[\, U_A(DS_k^j, AS_k^i) + \sum\limits_{e,h\in[k,T]} \xi^h\, \eta(S_h \mid S_e)\, R_A^h(S_0^h, S_h) \Big] \\[4pt]
D_k(q_k^j) = \dfrac{dq_k^j(t)}{dt} = q_k^j\big(U_{DS_k^j} - \overline{U}_{D_k}\big) = 0 \\[4pt]
A_k(p_k^i) = \dfrac{dp_k^i(t)}{dt} = p_k^i\big(U_{AS_k^i} - \overline{U}_{A_k}\big) = 0 \\[4pt]
\sum\limits_{j=1}^{n} q_k^j = 1,\quad \sum\limits_{i=1}^{m} p_k^i = 1;\quad q_k^j \in [0,1],\ p_k^i \in [0,1]
\end{cases}
$$
where k denotes the k-th sub-game stage, $D_k(q_k^j)$ and $A_k(p_k^i)$ denote the replicator dynamics equations of the defender and the attacker respectively, and $q_k^j$ and $p_k^i$ denote the selection probabilities of defense strategy $DS_k^j$ and attack strategy $AS_k^i$ respectively.
9. A network defense strategy selection device based on the Markov evolutionary game, characterized by comprising an evolutionary game model building module and a model solving output module, wherein:
the evolutionary game model building module is configured to build a multi-stage Markov attack-defense evolutionary game model according to the dynamic attack-defense game in the network attack and defense process, the model comprising multiple sub-game stages;
the model solving output module is configured to solve, for the multi-stage Markov attack-defense evolutionary game model built by the evolutionary game model building module, the optimal defense strategy of each stage of the attack-defense game using an optimal defense strategy selection algorithm, and to output it.
10. The network defense strategy selection device based on the Markov evolutionary game according to claim 9, characterized in that the model solving output module comprises a stage payoff solving unit, a problem conversion unit and a strategy solving output unit, wherein:
the stage payoff solving unit is configured to solve the attack and defense payoffs of each sub-game stage according to the multi-stage Markov attack-defense evolutionary game model built by the evolutionary game model building module;
the problem conversion unit is configured to take the attack and defense payoffs of each sub-game stage from the stage payoff solving unit, convert the payoffs of future stages into discounted payoffs relative to the initial stage by introducing a discount factor, and convert the multi-stage game equilibrium problem into a dynamic programming problem with the overall payoff as the objective;
the strategy solving output unit is configured to solve the dynamic programming problem produced by the problem conversion unit, obtain the multi-stage game equilibrium strategy set, determine the defender's optimal defense strategy according to game theory, and output it.
CN201710334463.5A 2017-05-12 2017-05-12 Network defense strategy selection method and device based on Markov evolution game Active CN107135224B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710334463.5A CN107135224B (en) 2017-05-12 2017-05-12 Network defense strategy selection method and device based on Markov evolution game

Publications (2)

Publication Number Publication Date
CN107135224A true CN107135224A (en) 2017-09-05
CN107135224B CN107135224B (en) 2020-01-10

Family

ID=59731541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710334463.5A Active CN107135224B (en) 2017-05-12 2017-05-12 Network defense strategy selection method and device based on Markov evolution game

Country Status (1)

Country Link
CN (1) CN107135224B (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107483486A (en) * 2017-09-14 2017-12-15 中国人民解放军信息工程大学 Cyber-defence strategy choosing method based on random evolution betting model
CN107819785A (en) * 2017-11-28 2018-03-20 东南大学 A kind of double-deck defence method towards power system false data injection attacks
CN108287473A (en) * 2017-12-20 2018-07-17 中国人民解放军海军大连舰艇学院 The sub- Dispatching Control System of fleet air defense resource and method of a kind of task based access control collection and system mode EVOLUTION ANALYSIS
CN108494810A (en) * 2018-06-11 2018-09-04 中国人民解放军战略支援部队信息工程大学 Network security situation prediction method, apparatus and system towards attack
CN108629422A (en) * 2018-05-10 2018-10-09 浙江大学 A kind of intelligent body learning method of knowledge based guidance-tactics perception
CN108833402A (en) * 2018-06-11 2018-11-16 中国人民解放军战略支援部队信息工程大学 A kind of optimal defence policies choosing method of network based on game of bounded rationality theory and device
CN108898010A (en) * 2018-06-25 2018-11-27 北京计算机技术及应用研究所 A method of establishing the attacking and defending Stochastic Game Model towards malicious code defending
CN109327427A (en) * 2018-05-16 2019-02-12 中国人民解放军战略支援部队信息工程大学 A kind of dynamic network variation decision-making technique and its system in face of unknown threat
CN109379322A (en) * 2018-05-16 2019-02-22 中国人民解放军战略支援部队信息工程大学 The decision-making technique and its system that network dynamic converts under the conditions of a kind of Complete Information
CN109496305A (en) * 2018-08-01 2019-03-19 东莞理工学院 Nash equilibrium strategy on continuous action space and social network public opinion evolution model
CN109617863A (en) * 2018-11-27 2019-04-12 杭州电子科技大学 A method of the mobile target based on game theory defends optimal defence policies to choose
CN109829566A (en) * 2018-12-26 2019-05-31 中国人民解放军国防科技大学 Method for generating combat action sequence
CN110166428A (en) * 2019-04-12 2019-08-23 中国人民解放军战略支援部队信息工程大学 Intelligent defense decision-making method and device based on reinforcement learning and attack and defense game
CN110225019A (en) * 2019-06-04 2019-09-10 腾讯科技(深圳)有限公司 A kind of network security processing method and device
CN110300106A (en) * 2019-06-24 2019-10-01 中国人民解放军战略支援部队信息工程大学 Moving target defense decision selection method, device and system based on Markov time game
CN110460572A (en) * 2019-07-06 2019-11-15 中国人民解放军战略支援部队信息工程大学 Moving target defense strategy selection method and equipment based on Markov signaling game
CN110874470A (en) * 2018-12-29 2020-03-10 北京安天网络安全技术有限公司 Method and device for predicting network space security based on network attack
CN111064702A (en) * 2019-11-16 2020-04-24 中国人民解放军战略支援部队信息工程大学 Active defense strategy selection method and device based on bidirectional signal game
WO2020093201A1 (en) * 2018-11-05 2020-05-14 北京大学深圳研究生院 Security modeling quantisation method for cyberspace mimic defence based on gspn and martingale theory
CN111224966A (en) * 2019-12-31 2020-06-02 中国人民解放军战略支援部队信息工程大学 Optimal defense strategy selection method based on evolutionary network game
CN111245857A (en) * 2020-01-17 2020-06-05 安徽师范大学 Channel network steady state evolution game method in blockchain environment
CN112422573A (en) * 2020-11-19 2021-02-26 北京天融信网络安全技术有限公司 Attack path restoration method, device, equipment and storage medium
CN112417751A (en) * 2020-10-28 2021-02-26 清华大学 Anti-interference fusion method and device based on graph evolution game theory
CN112434922A (en) * 2020-11-13 2021-03-02 北方工业大学 Urban power grid system security control method and device based on zero sum game
CN112966273A (en) * 2021-03-09 2021-06-15 中国人民解放军空军工程大学 Multi-stage platform dynamic defense method based on Markov evolution model
CN110110857B (en) * 2019-04-10 2021-10-08 浙江锐文科技有限公司 Hybrid remote measuring method based on game theory
US11418533B2 (en) * 2020-04-20 2022-08-16 Prince Mohammad Bin Fahd University Multi-tiered security analysis method and system
US11552965B2 (en) * 2017-12-28 2023-01-10 Hitachi, Ltd Abnormality cause specification support system and abnormality cause specification support method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103152345A (en) * 2013-03-07 2013-06-12 南京理工大学常熟研究院有限公司 Network safety optimum attacking and defending decision method for attacking and defending game
US20140274246A1 (en) * 2013-03-15 2014-09-18 University Of Southern California Localized shortest-paths estimation of influence propagation for multiple influencers

Cited By (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107483486A (en) * 2017-09-14 2017-12-15 中国人民解放军信息工程大学 Cyber-defence strategy choosing method based on random evolution game model
CN107483486B (en) * 2017-09-14 2020-04-03 中国人民解放军信息工程大学 Network defense strategy selection method based on random evolution game model
CN107819785B (en) * 2017-11-28 2020-02-18 东南大学 Double-layer defense method for false data injection attack of power system
CN107819785A (en) * 2017-11-28 2018-03-20 东南大学 A kind of double-deck defence method towards power system false data injection attacks
CN108287473A (en) * 2017-12-20 2018-07-17 中国人民解放军海军大连舰艇学院 A fleet air defense resource allocation and scheduling control system and method based on task sets and system state evolution analysis
US11552965B2 (en) * 2017-12-28 2023-01-10 Hitachi, Ltd Abnormality cause specification support system and abnormality cause specification support method
CN108629422A (en) * 2018-05-10 2018-10-09 浙江大学 A kind of intelligent body learning method of knowledge based guidance-tactics perception
CN108629422B (en) * 2018-05-10 2022-02-08 浙江大学 Intelligent learning method based on knowledge guidance-tactical perception
CN109379322A (en) * 2018-05-16 2019-02-22 中国人民解放军战略支援部队信息工程大学 The decision-making technique and its system that network dynamic converts under the conditions of a kind of Complete Information
CN109327427A (en) * 2018-05-16 2019-02-12 中国人民解放军战略支援部队信息工程大学 A kind of dynamic network variation decision-making technique and its system in face of unknown threat
CN108494810A (en) * 2018-06-11 2018-09-04 中国人民解放军战略支援部队信息工程大学 Network security situation prediction method, apparatus and system towards attack
CN108494810B (en) * 2018-06-11 2021-01-26 中国人民解放军战略支援部队信息工程大学 Attack-oriented network security situation prediction method, device and system
CN108833402B (en) * 2018-06-11 2020-11-24 中国人民解放军战略支援部队信息工程大学 Network optimal defense strategy selection method and device based on bounded rationality game theory
CN108833402A (en) * 2018-06-11 2018-11-16 中国人民解放军战略支援部队信息工程大学 A kind of optimal defence policies choosing method of network based on game of bounded rationality theory and device
CN108898010A (en) * 2018-06-25 2018-11-27 北京计算机技术及应用研究所 A method of establishing the attacking and defending Stochastic Game Model towards malicious code defending
CN109496305A (en) * 2018-08-01 2019-03-19 东莞理工学院 Nash equilibrium strategy on continuous action space and social network public opinion evolution model
CN109496305B (en) * 2018-08-01 2022-05-13 东莞理工学院 Social network public opinion evolution method
WO2020024170A1 (en) * 2018-08-01 2020-02-06 东莞理工学院 Nash equilibrium strategy and social network consensus evolution model in continuous action space
CN112313915A (en) * 2018-11-05 2021-02-02 北京大学深圳研究生院 Security modeling quantification method for cyberspace mimic defense based on GSPN and martingale theory
CN112313915B (en) * 2018-11-05 2021-08-31 北京大学深圳研究生院 Security modeling quantification method for cyberspace mimic defense based on GSPN and martingale theory
WO2020093201A1 (en) * 2018-11-05 2020-05-14 北京大学深圳研究生院 Security modeling quantisation method for cyberspace mimic defence based on gspn and martingale theory
CN109617863B (en) * 2018-11-27 2020-02-18 杭州电子科技大学 Method for selecting optimal defense strategy for moving target defense based on game theory
CN109617863A (en) * 2018-11-27 2019-04-12 杭州电子科技大学 Method for selecting optimal defense strategy for moving target defense based on game theory
CN109829566A (en) * 2018-12-26 2019-05-31 中国人民解放军国防科技大学 Method for generating combat action sequence
CN110874470A (en) * 2018-12-29 2020-03-10 北京安天网络安全技术有限公司 Method and device for predicting network space security based on network attack
CN110110857B (en) * 2019-04-10 2021-10-08 浙江锐文科技有限公司 Hybrid remote measuring method based on game theory
CN110166428B (en) * 2019-04-12 2021-05-07 中国人民解放军战略支援部队信息工程大学 Intelligent defense decision-making method and device based on reinforcement learning and attack and defense game
CN110166428A (en) * 2019-04-12 2019-08-23 中国人民解放军战略支援部队信息工程大学 Intelligent defense decision-making method and device based on reinforcement learning and attack and defense game
CN110225019A (en) * 2019-06-04 2019-09-10 腾讯科技(深圳)有限公司 A kind of network security processing method and device
CN110225019B (en) * 2019-06-04 2021-08-31 腾讯科技(深圳)有限公司 Network security processing method and device
CN110300106B (en) * 2019-06-24 2021-11-23 中国人民解放军战略支援部队信息工程大学 Moving target defense decision selection method, device and system based on Markov time game
CN110300106A (en) * 2019-06-24 2019-10-01 中国人民解放军战略支援部队信息工程大学 Moving target defense decision selection method, device and system based on Markov time game
CN110460572B (en) * 2019-07-06 2021-11-02 中国人民解放军战略支援部队信息工程大学 Mobile target defense strategy selection method and equipment based on Markov signal game
CN110460572A (en) * 2019-07-06 2019-11-15 中国人民解放军战略支援部队信息工程大学 Moving target defense strategy selection method and equipment based on Markov signaling game
CN111064702B (en) * 2019-11-16 2021-09-24 中国人民解放军战略支援部队信息工程大学 Active defense strategy selection method and device based on bidirectional signal game
CN111064702A (en) * 2019-11-16 2020-04-24 中国人民解放军战略支援部队信息工程大学 Active defense strategy selection method and device based on bidirectional signal game
CN111224966A (en) * 2019-12-31 2020-06-02 中国人民解放军战略支援部队信息工程大学 Optimal defense strategy selection method based on evolutionary network game
CN111224966B (en) * 2019-12-31 2021-11-02 中国人民解放军战略支援部队信息工程大学 Optimal defense strategy selection method based on evolutionary network game
CN111245857B (en) * 2020-01-17 2021-11-26 安徽师范大学 Channel network steady state evolution game method in blockchain environment
CN111245857A (en) * 2020-01-17 2020-06-05 安徽师范大学 Channel network steady state evolution game method in blockchain environment
US11418533B2 (en) * 2020-04-20 2022-08-16 Prince Mohammad Bin Fahd University Multi-tiered security analysis method and system
CN112417751A (en) * 2020-10-28 2021-02-26 清华大学 Anti-interference fusion method and device based on graph evolution game theory
CN112417751B (en) * 2020-10-28 2024-03-29 清华大学 Anti-interference fusion method and device based on graph evolution game theory
CN112434922A (en) * 2020-11-13 2021-03-02 北方工业大学 Urban power grid system security control method and device based on zero sum game
CN112434922B (en) * 2020-11-13 2021-08-24 北方工业大学 Urban power grid system security control method and device based on zero sum game
CN112422573A (en) * 2020-11-19 2021-02-26 北京天融信网络安全技术有限公司 Attack path restoration method, device, equipment and storage medium
CN112422573B (en) * 2020-11-19 2022-02-25 北京天融信网络安全技术有限公司 Attack path restoration method, device, equipment and storage medium
CN112966273A (en) * 2021-03-09 2021-06-15 中国人民解放军空军工程大学 Multi-stage platform dynamic defense method based on Markov evolution model

Also Published As

Publication number Publication date
CN107135224B (en) 2020-01-10

Similar Documents

Publication Publication Date Title
CN107135224A (en) Cyber-defence strategy choosing method and its device based on Markov evolutionary Games
CN112329348B (en) Intelligent decision-making method for military countermeasure game under incomplete information condition
CN107483486B (en) Network defense strategy selection method based on random evolution game model
Dereszynski et al. Learning probabilistic behavior models in real-time strategy games
CN108833401A (en) Network active defensive strategy choosing method and device based on Bayes's evolutionary Game
Cardoso et al. Competing against nash equilibria in adversarially changing zero-sum games
WO2021159779A1 (en) Information processing method and apparatus, computer-readable storage medium and electronic device
Kanellopoulos et al. Non-equilibrium dynamic games and cyber–physical security: A cognitive hierarchy approach
CN110099045A (en) Network security threats method for early warning and device based on qualitative differential game and evolutionary Game
CN108696534B (en) Real-time network security threat early warning analysis method and device
Aguilar Adaptive random fuzzy cognitive maps
Huang et al. Markov evolutionary games for network defense strategy selection
Yablochnikov et al. Modelling of informational counteraction between objects in economy
Chen et al. GAIL-PT: An intelligent penetration testing framework with generative adversarial imitation learning
CN116112278A (en) Q-learning-based network optimal attack path prediction method and system
Pratt et al. Rebel with many causes: A computational model of insurgency
Cheng et al. Off-policy deep reinforcement learning based on Steffensen value iteration
Jain et al. Soccer result prediction using deep learning and neural networks
CN111767991B (en) Measurement and control resource scheduling method based on deep Q learning
Dahl The lagging anchor algorithm: Reinforcement learning in two-player zero-sum games with imperfect information
Kardes Robust stochastic games and applications to counter-terrorism strategies
CN115174173A (en) Global security game decision method of industrial information physical system in cloud environment
CN113344071B (en) Intrusion detection algorithm based on depth strategy gradient
Tang et al. Regret-minimizing double oracle for extensive-form games
Varma et al. Analysis of opinion dynamics under binary exogenous and endogenous signals

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant