CN113315763B - Network security defense method based on heterogeneous group evolution game

Info

Publication number: CN113315763B
Application number: CN202110557062.2A
Authority: CN
Inventors: 王刚; 张恩宁; 马润年; 伍维甲; 严丽娜; 唐剑
Original assignee: Air Force Engineering University of PLA
Current assignee: Air Force Engineering University of PLA
Priority date: 2021-05-21
Filing date: 2021-05-21
Publication date: 2022-12-09
Anticipated expiration: 2041-05-21
Also published as: CN113315763A

Abstract

The disclosure relates to a network security defense method based on heterogeneous group evolution game, which comprises the following steps: dividing an attacker and a defender into different game groups according to the difference of decision behaviors of the attacker and the defender; constructing a heterogeneous group evolution game model according to the game group; constructing a heterogeneous group replication dynamic equation according to the heterogeneous group evolution game model; and determining an optimal defense strategy through the heterogeneous population replication dynamic equation. The method and the device improve the accuracy of network security defense decisions.

Description

Network security defense method based on heterogeneous group evolution game

Technical Field

The disclosure relates to the technical field of computer network information security, in particular to a network security defense method based on heterogeneous group evolutionary gaming.

Background

Information network technologies such as 5G and blockchain are accelerating the shift from informatization to intelligent systems. At the same time, covert, efficient, and targeted network attacks, typified by the Advanced Persistent Threat (APT), are making the network security situation and defense decisions increasingly complex. Network security defense decision-making is a prerequisite and key link in applying network defense technology and tactics, and it depends on an accurate grasp of factors such as the characteristic patterns of network attack and defense actions and the dynamic demands of network service load.

In the related art, network security defense methods are limited by their assumptions about the game type, do not fully consider the reference value of experience or the intelligence requirements of decision behavior, and cannot reflect the differences between the attacker and the defender, so the resulting network security defense decisions are inaccurate. Therefore, there is a need to address one or more of the above problems in the related art, so as to improve the efficiency of platform dynamic defense under persistent and staged attacks.

It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.

Disclosure of Invention

The embodiment of the disclosure aims to provide a network security defense method based on heterogeneous group evolution game, so as to improve the accuracy of network security defense decision.

The invention provides a network security defense method based on heterogeneous group evolutionary game, which comprises the following steps:

dividing an attacker and a defender into different game groups according to the difference of decision behaviors of the attacker and the defender;

constructing a heterogeneous group evolution game model according to the game groups;

constructing a heterogeneous population replication dynamic equation according to the heterogeneous population evolution game model;

and determining an optimal defense strategy through the heterogeneous population replication dynamic equation.

In one embodiment of the present disclosure, the heterogeneous population evolution game model is a 4-tuple model (N, S, P, U), wherein,

$N = (N_A, N_D)$ is the space of game participants. $N_A = (N_{A1}, N_{A2}, \ldots, N_{Aj})$ is the total space of attacker participants, where $N_{A1}, N_{A2}, \ldots, N_{Aj}$ are the attacker sub-populations. $N_D = (N_{D1}, N_{D2}, \ldots, N_{Di})$ is the total space of defender participants, where $N_{D1}, N_{D2}, \ldots, N_{Di}$ are the defender sub-populations;

$S = (S_A, S_D)$ is the mixed strategy space of the attack and defense participant populations. $S_A = (S_{A1}, S_{A2}, \ldots, S_{Aj})$ is the total pure strategy space of the attacker participants, where $S_{A1}, S_{A2}, \ldots, S_{Aj}$ are the pure strategies selected by the attacker sub-populations. $S_D = (S_{D1}, S_{D2}, \ldots, S_{Di})$ is the total pure strategy space of the defender participants, where $S_{D1}, S_{D2}, \ldots, S_{Di}$ are the pure strategies selected by the defender sub-populations;

$P = (P_A, P_D)$ is the game belief set. $P_A = (P_{A1}, P_{A2}, \ldots, P_{Aj})$ is the game belief set of the attacker, where $P_{Aj}$ is the probability of selecting strategy $S_{Aj}$. $P_D = (P_{D1}, P_{D2}, \ldots, P_{Di})$ is the game belief set of the defender, where $P_{Di}$ is the probability of selecting strategy $S_{Di}$;

$U = (U_A, U_D)$ is the game payoff set. $U_A = (U_{A1}, U_{A2}, \ldots, U_{Aj})$ is the attacker payoff set, where $U_{Aj}$ is the expected payoff obtained by sub-population $N_{Aj}$ in a one-stage game by adopting pure strategy $S_{Aj}$. $U_D = (U_{D1}, U_{D2}, \ldots, U_{Di})$ is the defender payoff set, where $U_{Di}$ is the expected payoff obtained by sub-population $N_{Di}$ in a one-stage game by adopting pure strategy $S_{Di}$.

In an embodiment of the present disclosure, in the game payoff set $U = (U_A, U_D)$,

$\bar{U}_A = \sum_{j} P_{Aj} U_{Aj}$ is the average payoff of the attacker participant population space, and

$\bar{U}_D = \sum_{i} P_{Di} U_{Di}$ is the average payoff of the defender participant population space.

In an embodiment of the disclosure, the payoff of the defender is $U_D = \delta \cdot C_r - O_{cost}$ and the payoff of the attacker is $U_A = \lambda \cdot C_r - A_{cost}$, wherein:

$C_r$ is the importance of the target resource attacked by the attacker in a complete attack and defense process;

$O_{cost}$ is the cost incurred by the defender in making targeted adjustments to defeat the attack;

$A_{cost}$ is the cost paid by the attacker to launch the attack;

$\lambda$ is the probability that the attacker successfully exploits the vulnerability to infect the defender;

$\delta$ is the probability that the defender successfully clears the virus with the defensive action.
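The two payoff formulas can be evaluated directly. The following is a minimal Python sketch; the function names and the numeric inputs are illustrative assumptions and are not values taken from the disclosure.

```python
def defender_payoff(delta: float, c_r: float, o_cost: float) -> float:
    """Defender payoff U_D = delta * C_r - O_cost."""
    return delta * c_r - o_cost


def attacker_payoff(lam: float, c_r: float, a_cost: float) -> float:
    """Attacker payoff U_A = lambda * C_r - A_cost."""
    return lam * c_r - a_cost


# Hypothetical inputs, chosen only to illustrate the arithmetic.
u_d = defender_payoff(delta=0.5, c_r=8.0, o_cost=1.0)
u_a = attacker_payoff(lam=0.25, c_r=8.0, a_cost=1.5)
print(u_d, u_a)
```

Both payoffs grow with the importance $C_r$ of the contested resource, so for a fixed resource the comparison between the two sides is driven by the success probabilities and the respective costs.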

In an embodiment of the disclosure, the step of constructing the heterogeneous population replication dynamic equation according to the heterogeneous population evolution game model includes:

obtaining a basic replication dynamic equation according to the game belief set and the time derivative of the sub-population;

and improving the basic replication dynamic equation to obtain the heterogeneous population replication dynamic equation.

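In one embodiment of the present disclosure, the basic replication dynamic equation is

$$P'_{Di}(t) = P_{Di}(t) \cdot \left[U_{Di}(t) - \bar{U}_D(t)\right]$$

where $\bar{U}_D(t) = \sum_{i} P_{Di}(t) U_{Di}(t)$ is the average payoff of the defender population and $P'_{Di}(t)$ is the rate of change of the defender game belief $P_{Di}$ at time $t$.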

In an embodiment of the present disclosure, the step of improving the basic replication dynamical equation to obtain the heterogeneous population replication dynamical equation includes:

and establishing a system dynamic equation according to a preset strategy learning mechanism, and improving the basic replication dynamic equation.

In an embodiment of the disclosure, the preset strategy learning mechanism is that, after each stage of the game ends, each sub-population of the attacker and of the defender randomly selects another sub-population from its own population as the object of reflection and learns its strategy.

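In one embodiment of the present disclosure, the heterogeneous population replication dynamic equation is

$$P'_{Di} = b \cdot P_{Di} \cdot \left(U_{Di} - \bar{U}_D\right)$$

where $\bar{U}_D = \sum_{i} P_{Di} U_{Di}$ is the average payoff of the defender population and $b$ is the reflection ability rate.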

In an embodiment of the present disclosure, the heterogeneous group evolution game model is a dual heterogeneous group evolution game model.

The technical scheme provided by the disclosure can comprise the following beneficial effects:

in the embodiment of the disclosure, a heterogeneous population evolution game model is constructed through differential analysis of decision behaviors of an attacker and a defender, a heterogeneous population replication dynamic equation consistent with the heterogeneous population evolution game model is established, an optimal defense strategy is determined through the heterogeneous population replication dynamic equation, and the accuracy of network security defense decisions is improved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure. It is apparent that the drawings in the following description are only some embodiments of the disclosure, and that other drawings may be derived from those drawings by a person of ordinary skill in the art without inventive effort.

FIG. 1 is a schematic diagram illustrating the steps of a network security defense method based on heterogeneous population evolutionary game in an exemplary embodiment of the present disclosure;

FIG. 2 is a schematic diagram illustrating the steps of constructing a heterogeneous population replication dynamic equation according to a heterogeneous population evolutionary game model in an exemplary embodiment of the present disclosure;

FIG. 3 illustrates convergence trajectories of the evolutionarily stable solution in an exemplary embodiment of the disclosure;

FIG. 4 illustrates convergence trajectories of the evolutionarily stable solution of the classical model in an exemplary embodiment of the present disclosure;

FIG. 5 is a schematic diagram illustrating the topological environment of a network information system in an exemplary embodiment of the present disclosure;

FIG. 6 shows the trend of the strategy selection probabilities of the attacker and the defender in an exemplary embodiment of the disclosure;

FIG. 7 shows the trend of the strategy selection probabilities of the attacker and the defender under different values of b in an exemplary embodiment of the disclosure.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.

First, in this exemplary embodiment, a network security defense method based on heterogeneous group evolution game is provided, and referring to fig. 1, the method may include the following steps:

step S101: dividing an attacker and a defender into different game groups according to the difference of decision behaviors of the attacker and the defender;

step S102: constructing a heterogeneous group evolution game model according to the game groups;

step S103: constructing a heterogeneous population replication dynamic equation according to the heterogeneous population evolution game model;

step S104: determining an optimal defense strategy through the heterogeneous population replication dynamic equation.

Hereinafter, each step of the above-described method in the present exemplary embodiment will be described in more detail.

In step S101, the concept of a population in evolutionary games is derived from the population concept in biology. In biology, different populations of the same species differ in traits because of their different living environments, so research subjects must be distinguished into heterogeneous populations. In academic work, the biological population is mapped onto the population in game theory, where different populations represent game participants that share the same attribute type but differ in decision mode. In some network attack and defense game scenarios, both parties can be modeled as boundedly rational participants, yet their decision modes differ. For example, in terms of decision criteria, the defender must weigh the importance of the protected node's resources, the security deployment cost, and the defense operation cost, while the attacker must consider factors such as attack cost and attack payoff. Therefore, a traditional evolutionary game in which all participants are assumed to use the same decision mode is essentially a homogeneous population evolutionary game. By contrast, a heterogeneous population evolutionary game better reflects the influence of the participants' differing decision modes on the game equilibrium, and a network attack and defense game in which the attacker and the defender have different payoff functions is a dual heterogeneous population evolutionary game.

Specifically, in a real environment, incomplete network situation information and the bounded rationality of decision makers make it difficult for attackers and defenders to obtain accurate real-time information about their opponents. Under incomplete information, the two sides differ in cognition and decision mode, which gives rise to different attack and defense behaviors and to the heterogeneous population evolutionary game character of attack and defense decisions.

In step S102, concepts from biology are mapped into the game model. A population in the game model is a set of individuals of the same category. A sub-population is a set of individuals with the same traits, and each sub-population belongs to a population.

The network attack and defense game is a symmetrical game, and all game participants are divided into network attackers and network defenders according to their attributes. A Dual Heterogeneous Population Evolutionary Game Model (DHPEGM) is constructed from the game populations. Specifically, the dual heterogeneous population evolutionary game model can be represented as a 4-tuple $(N, S, P, U)$, wherein:

$N = (N_A, N_D)$ is the space of game participants. $N_A = (N_{A1}, N_{A2}, \ldots, N_{Aj})$ is the total space of attacker participants, where $N_{A1}, N_{A2}, \ldots, N_{Aj}$ are the attacker sub-populations. $N_D = (N_{D1}, N_{D2}, \ldots, N_{Di})$ is the total space of defender participants, where $N_{D1}, N_{D2}, \ldots, N_{Di}$ are the defender sub-populations;

$S = (S_A, S_D)$ is the mixed strategy space of the attack and defense participant populations. $S_A = (S_{A1}, S_{A2}, \ldots, S_{Aj})$ is the total pure strategy space of the attacker participants, where $S_{A1}, S_{A2}, \ldots, S_{Aj}$ are the pure strategies selected by the attacker sub-populations. $S_D = (S_{D1}, S_{D2}, \ldots, S_{Di})$ is the total pure strategy space of the defender participants, where $S_{D1}, S_{D2}, \ldots, S_{Di}$ are the pure strategies selected by the defender sub-populations;

$P = (P_A, P_D)$ is the game belief set. $P_A = (P_{A1}, P_{A2}, \ldots, P_{Aj})$ is the game belief set of the attacker, where $P_{Aj}$ is the probability of selecting strategy $S_{Aj}$. $P_D = (P_{D1}, P_{D2}, \ldots, P_{Di})$ is the game belief set of the defender, where $P_{Di}$ is the probability of selecting strategy $S_{Di}$;

$U = (U_A, U_D)$ is the game payoff set. $U_A = (U_{A1}, U_{A2}, \ldots, U_{Aj})$ is the attacker payoff set, where $U_{Aj}$ is the expected payoff obtained by sub-population $N_{Aj}$ in a one-stage game by adopting pure strategy $S_{Aj}$. $U_D = (U_{D1}, U_{D2}, \ldots, U_{Di})$ is the defender payoff set, where $U_{Di}$ is the expected payoff obtained by sub-population $N_{Di}$ in a one-stage game by adopting pure strategy $S_{Di}$.

In one embodiment, in the game payoff set $U = (U_A, U_D)$,

$\bar{U}_A = \sum_{j} P_{Aj} U_{Aj}$ is the average payoff of the attacker participant population space, and

$\bar{U}_D = \sum_{i} P_{Di} U_{Di}$ is the average payoff of the defender participant population space.

In particular, the payoff refers to the incremental effect of the game on the fitness of the game participants. Taking the defender as an example, when $U_{Di} = 1$ the game does not influence strategy selection, when $U_{Di} > 1$ it influences strategy selection positively, and when $U_{Di} < 1$ it influences strategy selection negatively.
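One side of the 4-tuple can be held in a small data structure. The sketch below is a minimal Python illustration, not the disclosure's implementation: the class name `Population`, its field names, and the sample numbers are hypothetical, and the average payoff is assumed to be the belief-weighted mean of the sub-population payoffs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Population:
    """One side (attacker or defender) of the 4-tuple (N, S, P, U)."""
    subgroups: List[str]    # N: sub-population labels
    strategies: List[str]   # S: pure strategy of each sub-population
    beliefs: List[float]    # P: probability of selecting each pure strategy
    payoffs: List[float]    # U: expected one-stage payoff of each sub-population

    def __post_init__(self) -> None:
        n = len(self.subgroups)
        if not (len(self.strategies) == len(self.beliefs) == len(self.payoffs) == n):
            raise ValueError("N, S, P and U need one entry per sub-population")
        if abs(sum(self.beliefs) - 1.0) > 1e-9:
            raise ValueError("game beliefs must sum to 1")

    def average_payoff(self) -> float:
        """Belief-weighted mean payoff (assumed form of the population average)."""
        return sum(p * u for p, u in zip(self.beliefs, self.payoffs))


# Hypothetical defender population with two sub-populations.
defender = Population(
    subgroups=["N_D1", "N_D2"],
    strategies=["S_D1", "S_D2"],
    beliefs=[0.25, 0.75],
    payoffs=[2.0, 4.0],
)
print(defender.average_payoff())
```

Keeping the four components in parallel lists mirrors the notation of the model, where the k-th belief and the k-th payoff both refer to the k-th sub-population.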

In step S103, a heterogeneous population replication dynamic equation is constructed according to the heterogeneous population evolution game model. A pure strategy stable feasible solution and an optimal defense pure strategy selection algorithm based on a potential function are designed by combining the deterministic requirement of network security defense single decision and the limitation of the traditional Nash equilibrium solution.

In one embodiment, referring to fig. 2, step S103 further includes steps S201 and S202.

Step S201: obtaining a basic replication dynamic equation according to the game belief set and the time derivative of the sub-population;

step S202: and improving the basic replication dynamic equation to obtain the heterogeneous population replication dynamic equation.

Specifically, the network defense game is a multi-stage game, and in each stage one individual is drawn at random from each party's sub-populations to play. In each later stage, every game participant imitates the game strategy of the previous stage. For each stage, the natural birth rate of the game participants is β (β ≥ 0) and the natural death rate is δ (δ ≥ 0). These rates characterize the adaptability of the game participants to the environment of that stage, for example the probability that an attacker or defender quits the game before or during the stage because of network outage, disconnection, or other force majeure.

Then, in step S201, the time derivative $N'_{Di}(t)$ of sub-population $N_{Di}(t)$ at time $t$ is:

$$N'_{Di}(t) = \left(\beta + U_{Di}(t) - \delta\right) \cdot N_{Di}(t) \quad (1)$$

From the meaning of the game belief set, at any time $t$:

$$P_{Di}(t) \cdot N_D(t) = N_{Di}(t) \quad (2)$$

Differentiating both sides of formula (2) with respect to $t$, substituting formula (1), and using $N_D(t) = \sum_{i} N_{Di}(t)$ gives:

$$P'_{Di}(t) = P_{Di}(t) \cdot \left[U_{Di}(t) - \bar{U}_D(t)\right] \quad (3)$$

where $\bar{U}_D(t) = \sum_{i} P_{Di}(t) U_{Di}(t)$ is the average payoff of the defender population. Formula (3) is the basic replication dynamic equation, and $P'_{Di}(t)$ is the rate of change of the defender game belief at time $t$. For example, in a given stage of the game, the payoff of the defender can be expressed as:

$$U_D = \delta \cdot C_r - O_{cost} \quad (4)$$

The attacker's payoff comes from the gain obtained after infecting the platform and depends on the infection probability, so it can be expressed as:

$$U_A = \lambda \cdot C_r - A_{cost} \quad (5)$$

where $C_r$ is the importance of the target resource attacked by the attacker in a complete attack and defense process. $O_{cost}$ is the cost the defender may incur in making targeted adjustments to defeat the attack, such as increased system overhead or reduced quality of service. $A_{cost}$ is the cost paid by the attacker to launch the attack. In this embodiment the attack cost is related to the threat level of the vulnerability: the higher the threat level, the lower the attack cost. $\lambda$ is the probability that the attacker successfully exploits the vulnerability to infect the defender. $\delta$ is the probability that the defender successfully clears the virus with the defensive action.

In step S202, a system dynamics equation may be established according to a preset strategy learning mechanism, and the basic replication dynamics equation may be modified.

In a multi-stage game, the parties are generally not satisfied with the payoff of their current-stage strategy and believe that a better strategy exists. Under this dissatisfaction assumption, each party looks for other strategies to learn from and adopts a new strategy in the next stage of the game. This is the strategy "reflection-learning" mechanism. In a specific embodiment, the preset strategy learning mechanism is that, after each stage of the game ends, each sub-population of the attacker and of the defender randomly selects another sub-population from its own population as the object of reflection and learns its strategy.

Real network attack and defense game decisions are by nature based on a "reflection-learning" mechanism. After each stage of the game ends, the reflection-learning mechanism can be combined with modeling analysis to establish a consistent evolutionary game model and system dynamics equation. Under bounded rationality, the strategy adjustment of the attack and defense sub-populations under the reflection-learning mechanism can be regarded as an independent-increment process that counts the occurrences of random events, namely a Poisson process. The reflection-learning times of a sub-population can be approximated as the arrival times of a Poisson process whose arrival rate is the average reflection rate $R_s$. Assuming that the Poisson processes of the sub-populations are statistically independent of each other, the aggregate of the reflection-learning times of the sub-populations adopting defense strategy $S_{Di}$ is itself a Poisson process, with arrival rate $P_{arrive}$:

$$P_{arrive} = P_{Di} \cdot R_s(N_{Di}) \quad (6)$$

Define the defense strategy transition probability $P_{Dj \to Di}$ as the probability that a sub-population $N_{Dj}$ of the defender population, other than $N_{Di}$, converts its defense strategy to $S_{Di}$. In this embodiment, reflection is driven by a sub-population's dissatisfaction with its own strategy, and the sub-population sampled in each reflection is drawn at random, and therefore

$$P_{Dj \to Di} = P_{Di}$$

If the transitions of defense strategies are also statistically independent, the aggregate Poisson process of transitions in the population from another strategy $S_{Dj}$ to $S_{Di}$ has arrival rate $P_{arrive}$:

$$P_{arrive} = P_{Dj} \cdot R_s(N_{Dj}) \cdot P_{Di}$$

By the law of large numbers, the population-level random process is treated as a deterministic flow, so the inflow $P_{in}$ into sub-population $N_{Di}$ from the sub-populations $N_{Dj}$ that selected defense strategy $S_{Dj}$ is:

$$P_{in} = \sum_{j \neq i} P_{Dj} \cdot R_s(N_{Dj}) \cdot P_{Di}$$

The outflow $P_{out}$ from sub-population $N_{Di}$ is:

$$P_{out} = \sum_{j \neq i} P_{Di} \cdot R_s(N_{Di}) \cdot P_{Dj}$$

The game belief $P_{Di}$ of the defense strategy then changes as:

$$P'_{Di} = P_{in} - P_{out} = \Big[\sum_{j} P_{Dj} \cdot R_s(N_{Dj}) - R_s(N_{Di})\Big] \cdot P_{Di}$$

If sub-populations with less successful strategies reflect at a higher rate than sub-populations with more successful strategies, the reflection rate is strictly monotonically decreasing in payoff and a payoff-driven selection dynamic results. Introducing a Lipschitz continuous potential function $\rho(x)$ that is strictly monotonically decreasing in its argument $x$, the average reflection rate can be expressed as:

$$R_s(N_{Di}) = \rho(U_{Di}) \quad (11)$$

The rate of change of the selection probability $P_{Di}$ of defense strategy $S_{Di}$ can then be expressed as:

$$P'_{Di} = \Big[\sum_{j} P_{Dj} \cdot \rho(U_{Dj}) - \rho(U_{Di})\Big] \cdot P_{Di} \quad (12)$$

Assuming that the reflection rate of a sub-population decreases linearly in its current payoff, then

$$\rho(U_{Di}) = a - b \cdot U_{Di} \quad (a, b \in R) \quad (13)$$

Requiring the reflection rate $R_s(N_{Di})$ to be non-negative, the heterogeneous population replication dynamic equation is obtained as

$$P'_{Di} = b \cdot P_{Di} \cdot \left(U_{Di} - \bar{U}_D\right) \quad (14)$$

where $\bar{U}_D = \sum_{j} P_{Dj} U_{Dj}$ is the average payoff of the defender population. In this case the dynamics of $P_{Di}$ admit strict Nash equilibria, and apart from a change of time scale, formula (14) is a constant multiple of formula (3). $b$ is the reflection ability rate, which sets the speed at which the strategies adjust toward the corresponding steady state.
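The statement that the linear reflection rate turns the inflow and outflow balance into a constant multiple of the basic replication dynamic can be checked numerically. The following Python sketch assumes the belief change has the form $P'_i = P_i \cdot [\sum_j P_j \rho(U_j) - \rho(U_i)]$ with $\rho(U) = a - b \cdot U$; the function names and the sample beliefs and payoffs are hypothetical.

```python
from typing import List


def reflection_flow(P: List[float], U: List[float], a: float, b: float) -> List[float]:
    """Inflow minus outflow with reflection rate rho(U) = a - b * U."""
    rho = [a - b * u for u in U]
    mean_rho = sum(p * r for p, r in zip(P, rho))
    return [p * (mean_rho - r) for p, r in zip(P, rho)]


def scaled_replicator(P: List[float], U: List[float], b: float) -> List[float]:
    """b * P_i * (U_i - average payoff)."""
    mean_u = sum(p * u for p, u in zip(P, U))
    return [b * p * (u - mean_u) for p, u in zip(P, U)]


P = [0.2, 0.5, 0.3]     # hypothetical game beliefs, summing to 1
U = [1.0, -0.5, 2.0]    # hypothetical sub-population payoffs
lhs = reflection_flow(P, U, a=3.0, b=0.7)
rhs = scaled_replicator(P, U, b=0.7)
print(max(abs(x - y) for x, y in zip(lhs, rhs)))  # difference at round-off level
```

The intercept `a` cancels because the beliefs sum to 1, which is why only the slope `b` survives as a speed factor in the final equation.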

A stability analysis of the method is carried out next. Starting from the necessary and sufficient condition for the evolutionary stability of the heterogeneous population evolutionary game, the credibility and rationality of the decision are verified through a stability proof and an example analysis of the pure strategy evolutionary equilibrium solution of the game model.

1. Mathematical proof

First, the definitions of an evolutionarily stable strategy and of the optimal strategy set are introduced.

For different mixed strategies $S_x, S_y$ of the game participants, if there exists $\varepsilon_y \in (0, 1)$ such that the inequality $U(S_x, S_\omega) \geq U(S_y, S_\omega)$ holds for all $\varepsilon \in (0, \varepsilon_y)$, then $S_x$ is an evolutionarily stable strategy. Here $S_\omega = \varepsilon S_y + (1 - \varepsilon) S_x$ is the new mixed strategy formed after the mixed strategy $S_y$ invades the original mixed strategy space, $\varepsilon_y$ is the probability that the invading strategy $S_y$ is selected in a game, $U(S_x, S_\omega)$ is the payoff of the original strategy after invasion by $S_y$, and $U(S_y, S_\omega)$ is the payoff of the invading strategy.

The optimal strategy set is the set of the evolutionarily stable strategies $S_i$ of all game participants $N_i$. Evidently, this set is a strict Nash equilibrium of the game.

Theorem 1: A necessary and sufficient condition for the heterogeneous population $N$ to be evolutionarily stable is that $N$ has a strict Nash equilibrium.

Sufficiency: Suppose the heterogeneous population $N$ is evolutionarily stable, and fix the position $N_i$ of a game participant in the total game space. Let

and let $S_{yj} = S_{xj}$ for all $j \neq i$. Take the mixed strategy $S_\omega = \varepsilon S_y + (1 - \varepsilon) S_x$, where $\varepsilon \in (0, \varepsilon_y)$. Then for any $i$, $U(S_{xi}, S_{\omega i}) = U(S_{yi}, S_{\omega i})$, and for all $j \neq i$, $U(S_{xj}, S_{-\omega j}) = U(S_{yj}, S_{-\omega j})$, where $S_{-\omega i}$ is the complement, in the strategy space, of the mixed strategy $S_{\omega i}$ of game party $N_i$. By evolutionary stability, $S_y = S_x$ and

Thus

so a strict Nash equilibrium exists for $N$.

Necessity: Suppose the heterogeneous population $N$ has a strict Nash equilibrium, fix the position $N_i$ of a game participant in the total game space, and let $S_y \neq S_x$. For any $i$, $U(S_{xi}, S_{-xi}) = U(S_{xi}) > U(S_{yi}, S_{-xi})$. Because the payoff $U(S_{xi})$ is a continuous function, there must exist $\varepsilon_y \in (0, 1)$ such that for any $\varepsilon \in (0, \varepsilon_y)$ and $S_\omega = \varepsilon S_y + (1 - \varepsilon) S_x$, $U(S_{xi}, S_{-\omega}) > U(S_{yi}, S_{-\omega})$, that is, the heterogeneous population $N$ is evolutionarily stable.

From the above analysis, the form of $R_s$ determines whether the equation has an asymptotically stable evolutionary equilibrium solution. In a game model, an unstable evolutionary equilibrium solution cannot yield a feasible and credible preferred strategy, so the concepts of potential game and potential function are introduced: if the strategy change of every sub-population is monotonic and can be mapped to a global monotonic function, that global function is a potential function and the game has a strict Nash equilibrium. Introducing the potential function into formula (12) therefore allows the heterogeneous population evolutionary game model to obtain an evolutionarily stable solution and enables effective and accurate defense decisions.

Every potential game has a pure strategy evolutionarily stable solution.

Let the heterogeneous population game be $N = (N_1, N_2, \ldots, N_m)$ and let the function $\rho(x)$ be its potential function. Then the stable solution of $N_i$ can be mapped to $N(\rho(i))$ if and only if $U(\rho(i)) > U(-\rho(i))$. Because the potential function is monotonic, $N(\rho(i))$ has a pure strategy evolutionarily stable solution, and hence so does $N_i$.

2. Example analysis

Taking a 2 × 2 attack and defense symmetric game as an example, the evolutionary equilibrium solution is derived as follows. The attacker and the defender each contain two sub-populations, $N_{A1}, N_{A2}$ and $N_{D1}, N_{D2}$, with corresponding pure strategies $S_{A1}, S_{A2}$ and $S_{D1}, S_{D2}$. Taking the defender as an example, the payoff matrix can be expressed as:

$$U_D = \begin{pmatrix} u_1 & 0 \\ 0 & u_2 \end{pmatrix} \quad (15)$$

The matrix $U_D$ is a normalized matrix, which reduces the number of variables that need to be observed. $u_1$ is the relative payoff obtained when the attacker adopts pure strategy $S_{A1}$ and the defender adopts pure strategy $S_{D1}$. $u_2$ is the relative payoff obtained when the attacker adopts pure strategy $S_{A2}$ and the defender adopts pure strategy $S_{D2}$. Substituting into formula (14) yields the replication dynamic equations of the attacker and the defender:

$$\begin{aligned}
P'_{A1} &= b \cdot \left[(u_1 + u_2) \cdot P_{D1} - u_2\right] \cdot P_{A1} \cdot (1 - P_{A1}) \\
P'_{D1} &= b \cdot \left[(u_1 + u_2) \cdot P_{A1} - u_2\right] \cdot P_{D1} \cdot (1 - P_{D1}) \\
P'_{A2} &= -P'_{A1}, \quad P'_{D2} = -P'_{D1}
\end{aligned} \quad (16)$$

The stability of the evolutionarily stable solution of the game is analyzed with the MATLAB experimental tool. From formula (15), the signs of $u_1, u_2$ influence the evolution trend of the game, whereas their magnitudes do not. The value of $b$ affects the rate of evolution of the game. In the experiment, the values of $u_1$, $u_2$, and $b$ were adjusted many times, and the convergence result of the evolutionarily stable solution was found to be unaffected. Setting $|u_1| = 0.4$, $|u_2| = 0.6$, $b = 1$, and drawing the initial game beliefs $P_{A1}, P_{D1}$ as random numbers in (0, 1), FIG. 3 shows the results of 100 Monte Carlo simulation runs.
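The experiment can be approximated outside MATLAB. The following Python sketch integrates equation (16) with explicit Euler steps from 100 random initial beliefs for the two sign combinations with $u_1 \cdot u_2 < 0$. The integrator, step size, horizon, and random seed are assumptions made for this sketch, since the disclosure does not state them, and the function names are hypothetical.

```python
import random
from typing import Tuple


def step(pa: float, pd: float, u1: float, u2: float, b: float, dt: float) -> Tuple[float, float]:
    """One explicit Euler step of equation (16) for (P_A1, P_D1)."""
    dpa = b * ((u1 + u2) * pd - u2) * pa * (1.0 - pa)
    dpd = b * ((u1 + u2) * pa - u2) * pd * (1.0 - pd)
    return pa + dt * dpa, pd + dt * dpd


def simulate(pa: float, pd: float, u1: float, u2: float,
             b: float = 1.0, dt: float = 0.05, steps: int = 4000) -> Tuple[float, float]:
    """Integrate equation (16) and return the final beliefs."""
    for _ in range(steps):
        pa, pd = step(pa, pd, u1, u2, b, dt)
    return pa, pd


rng = random.Random(0)  # fixed seed so the run is repeatable
for u1, u2 in [(0.4, -0.6), (-0.4, 0.6)]:
    ends = set()
    for _ in range(100):
        pa, pd = simulate(rng.random(), rng.random(), u1, u2)
        ends.add((round(pa), round(pd)))
    print(u1, u2, ends)  # set of rounded end states reached
```

With opposite signs of $u_1$ and $u_2$, the bracketed factor in equation (16) keeps one sign over the whole unit square, so every trajectory ends at the same corner regardless of the starting beliefs.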

In FIG. 3, the dots in the middle of the box are the convergence points of pure strategy solutions, and the dots at the corners of the box are the convergence points of mixed strategy solutions. Analysis of FIGS. 3(b) and 3(d) shows that when $u_1 \cdot u_2 < 0$, the sign of the rate of change of the game beliefs does not change anywhere in the state space, and from any initial position in the state space the overall state of both parties converges to a strictly dominant pure strategy. That is, when $u_1 = 0.4$, $u_2 = -0.6$, the attacker adopts pure strategy $S_{A1}$ and the defender adopts pure strategy $S_{D1}$; when $u_1 = -0.4$, $u_2 = 0.6$, the attacker adopts pure strategy $S_{A2}$ and the defender adopts pure strategy $S_{D2}$.

Analysis of FIGS. 3(a) and 3(c) shows that when $u_1 \cdot u_2 > 0$, the game has two strict pure strategy Nash equilibria and one mixed strategy Nash equilibrium. From formula (16), at the mixed strategy Nash equilibrium $P_{A1} = u_2 / (u_1 + u_2)$ and $P_{D1} = u_2 / (u_1 + u_2)$. This mixed strategy Nash equilibrium point is unstable, and its location changes with the values of $u_1, u_2$. Therefore, when $u_1 \cdot u_2 > 0$, the game has only two stable strict pure strategy Nash equilibria. Further analysis of FIG. 3(a) shows that the mixed strategy Nash equilibrium is a saddle point: apart from the curve passing through the saddle point, all other solution trajectories converge to one of the two stable pure strategy Nash equilibria. That is, when $u_1 = 0.4$, $u_2 = 0.6$, either the attacker adopts pure strategy $S_{A1}$ and the defender adopts pure strategy $S_{D1}$, or the attacker adopts pure strategy $S_{A2}$ and the defender adopts pure strategy $S_{D2}$. Further analysis of FIG. 3(c) shows that the strategies of the two parties can converge to a more extreme case. That is, when $u_1 = -0.4$, $u_2 = -0.6$, either the attacker adopts pure strategy $S_{A1}$ and the defender adopts pure strategy $S_{D2}$, or the attacker adopts pure strategy $S_{A2}$ and the defender adopts pure strategy $S_{D1}$.
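The saddle-point claim can be checked by linearizing equation (16) at its rest points. The Python sketch below evaluates the analytic Jacobian with respect to $(P_{A1}, P_{D1})$ and computes its eigenvalues for $u_1 = 0.4$, $u_2 = 0.6$, $b = 1$. The function names are hypothetical, and the eigenvalue helper assumes real eigenvalues, which holds at the three points evaluated here.

```python
import math
from typing import List, Tuple


def jacobian(pa: float, pd: float, u1: float, u2: float, b: float = 1.0) -> List[List[float]]:
    """Analytic Jacobian of equation (16) with respect to (P_A1, P_D1)."""
    s = u1 + u2
    return [
        [b * (s * pd - u2) * (1 - 2 * pa), b * s * pa * (1 - pa)],
        [b * s * pd * (1 - pd), b * (s * pa - u2) * (1 - 2 * pd)],
    ]


def eigenvalues_2x2(m: List[List[float]]) -> Tuple[float, float]:
    """Real eigenvalues of a 2x2 matrix; tiny negative round-off is clamped to 0."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return tr / 2 - root, tr / 2 + root


u1, u2 = 0.4, 0.6
p_star = u2 / (u1 + u2)  # mixed equilibrium belief, the same for both sides
print(eigenvalues_2x2(jacobian(p_star, p_star, u1, u2)))  # opposite signs: saddle
print(eigenvalues_2x2(jacobian(1.0, 1.0, u1, u2)))        # both negative: stable
print(eigenvalues_2x2(jacobian(0.0, 0.0, u1, u2)))        # both negative: stable
```

One positive and one negative eigenvalue at the interior point means trajectories are repelled along one direction, which matches the description that only the curve through the saddle point fails to reach a pure strategy equilibrium.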

In other related literature, the mixed strategy evolutionarily stable solution of the 2 × 2 symmetric game model is stable under the dual homogeneous population evolutionary game model and can serve as a reference for the optimal defense strategy [10,15]. Under the dual heterogeneous population game model, however, the mixed strategy solution of the 2 × 2 symmetric game model is a saddle point and is not strictly stable. This also matches the characteristics of a real game process: when the game takes place between two distinct populations, behavior tends toward the "extreme" and decisions are increasingly biased toward a single strategy.

To further demonstrate the ability of the model and algorithm to overcome deviation from reality, a set of comparison experiments is set up. The replication dynamic equation in the classical model is:

$$
\begin{aligned}
P_{A1}' &= \left[(u_1 + u_2)\,P_{A1} - u_2\right] P_{A1}\,(1 - P_{A1}) \\
P_{D1}' &= \left[(u_1 + u_2)\,P_{D1} - u_2\right] P_{D1}\,(1 - P_{D1}) \\
P_{A2}' &= -P_{A1}', \qquad P_{D2}' = -P_{D1}'
\end{aligned}
\tag{17}
$$

Comparing equations (16) and (17) shows that, in the classical model, each side adjusts its strategy selection based only on changes in its own payoff and does not consider changes in the other side's game strategy. However, the real network attack and defense game is a normal game, and the two sides measure payoff in different ways. If the classical model is used to select the optimal defense strategy, the defender can be misled by an attacker's deceptive strategy into an erroneous strategy reference result. To demonstrate this, |u₁| = 0.4 and |u₂| = 0.6 are held fixed, the initial game beliefs P_A1 and P_D1 are random numbers in (0, 1), and FIG. 4 shows the results of 100 Monte Carlo simulations of the classical model.

Analysis of FIG. 4(a) shows that when u₁ > 0 and u₂ > 0, the game result depends on the values of the initial game beliefs P_A1 and P_D1, and strategy optimization cannot be achieved. Analysis of FIG. 4(c) shows that when u₁ < 0 and u₂ < 0, the game converges to the mixed-strategy Nash equilibrium point (0.6, 0.6); the game result then appears in probabilistic form, which does not meet the need for deterministic decisions in practice. Comparing FIGS. 3(b)(d) with FIGS. 4(b)(d) shows that when u₁·u₂ < 0, the evolutionarily stable solutions of the classical model and of the dual heterogeneous group evolution game model are exactly opposite. It should be noted that the evolution of the defense strategy in the classical model does not take into account changes in the attacker's game beliefs, so an attacker can exploit this weakness to design deceptive strategies that mislead the defender. In contrast, the model and algorithm provided by this embodiment can overcome the deviation from reality caused by the homogeneous group assumption in the classical model and provide a credible decision reference for network security defense.
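The classical dynamics of equation (17) can be explored numerically with a minimal sketch such as the one below. It is not the simulation code behind FIG. 4: the forward-Euler integrator, the step size `dt`, the step count and all function names are assumptions made for illustration. Setting the bracket in equation (17) to zero gives the interior rest point P = u₂/(u₁ + u₂), and because each equation contains only its own belief, the attacker and defender beliefs evolve independently.

```python
import random


def classical_rhs(p, u1, u2):
    """Right-hand side of equation (17) for one belief (P_A1 or P_D1)."""
    return ((u1 + u2) * p - u2) * p * (1.0 - p)


def evolve(p0, u1, u2, dt=0.1, steps=5000):
    """Integrate equation (17) with forward Euler (dt and steps are assumed values)."""
    p = p0
    for _ in range(steps):
        p += dt * classical_rhs(p, u1, u2)
    return p


def monte_carlo(u1, u2, runs=100, seed=0):
    """Draw random initial beliefs and return the final (P_A1, P_D1) pairs."""
    rng = random.Random(seed)
    finals = []
    for _ in range(runs):
        pa0, pd0 = rng.random(), rng.random()
        # In the classical model each belief evolves on its own.
        finals.append((evolve(pa0, u1, u2), evolve(pd0, u1, u2)))
    return finals


if __name__ == "__main__":
    for u1, u2 in [(0.4, 0.6), (-0.4, -0.6)]:
        finals = monte_carlo(u1, u2)
        print(u1, u2, [tuple(round(x, 3) for x in pair) for pair in finals[:5]])
```

Under these assumptions, with u₁, u₂ > 0 each belief runs to 0 or 1 depending on which side of u₂/(u₁ + u₂) it starts, and with u₁, u₂ < 0 both beliefs settle at 0.6, in line with the description of FIGS. 4(a) and (c).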

3. Simulation analysis

Following a classic network information system design, a simple network information system is deployed for simulation experiments to verify the effectiveness of the disclosed model and method. The topology of the network information system is shown in FIG. 5.

The firewall and the gateway divide the network into an external network area where the attacker is located, an isolation zone (DMZ) where the experiments are performed, and an internal network area where the defender (user) is located. The firewall's access control policy is that non-intranet hosts can access only the FTP server, the Web server, the E-mail server and the bastion host H in the DMZ; the three servers in the DMZ are Cisco servers. The experimental network information system is scanned with the Nessus tool. Combining the vulnerability information provided by the China National Vulnerability Database of Information Security (CNNVD) with the definitions of network defense strategies and operation costs given by Jiang Wei et al., the atomic attack strategies used in the experiment are listed in Table 1 and the atomic defense strategies in Table 2.

TABLE 1 atomic attack strategy

TABLE 2 atomic defense strategy

For an attacker, exploiting high-score vulnerabilities yields short-term payoff gains that take effect quickly but does not favor payoff growth from long-term holding (zero-day vulnerabilities are a classic example), whereas targeting low-score vulnerabilities has high cost and low payoff per attack. This embodiment sets the use of high-score vulnerabilities as the risk-type attack strategy S_A1 = (a₁, a₂, a₃) and the use of low-score vulnerabilities as the conservative attack strategy S_A2 = (a₄, a₅).

The strategy payoff of the defender depends mainly on the operation cost O_cost; defense strategies with low operation cost tend to be less effective. Therefore, the use of high-operation-cost strategies is set here as the risk-type defense strategy S_D1 = (b₄, b₅), and the use of low-operation-cost strategies as the conservative defense strategy S_D2 = (b₁, b₂). Setting the resource importance degree C_r = 1 and applying the payoff calculation formulas (4) and (5), the resulting attack and defense strategy payoffs are shown in Table 3.

TABLE 3 attack and defense strategy revenue quantification
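The payoff quantification can be sketched as follows. The formulas U_A = λ·C_r - A_cost and U_D = δ·C_r - O_cost are taken from the claims; the numeric (coefficient, cost) pairs are hypothetical placeholders, not the Table 3 values, and the function names are illustrative only. A strategy's payoff is taken as the mean over its atomic actions, as this embodiment does.

```python
from statistics import mean


def attack_payoff(lam, c_r, a_cost):
    """U_A = lambda * C_r - A_cost for one atomic attack action."""
    return lam * c_r - a_cost


def defense_payoff(delta, c_r, o_cost):
    """U_D = delta * C_r - O_cost for one atomic defense action."""
    return delta * c_r - o_cost


def strategy_payoff(atomic_payoffs):
    """Payoff of a strategy as the mean payoff of its atomic actions."""
    return mean(atomic_payoffs)


C_R = 1.0  # resource importance degree, as set in this embodiment

# Hypothetical (lambda, A_cost) pairs standing in for a1, a2, a3.
risk_attack_actions = [(0.9, 0.2), (0.8, 0.3), (0.7, 0.1)]
# Hypothetical (delta, O_cost) pairs standing in for b4, b5.
risk_defense_actions = [(0.9, 0.5), (0.8, 0.4)]

u_sa1 = strategy_payoff([attack_payoff(l, C_R, c) for l, c in risk_attack_actions])
u_sd1 = strategy_payoff([defense_payoff(d, C_R, c) for d, c in risk_defense_actions])

if __name__ == "__main__":
    print("U(S_A1) =", round(u_sa1, 3), " U(S_D1) =", round(u_sd1, 3))
```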

When calculating strategy payoffs, the payoff of a strategy is taken to be the average payoff of the atomic attack or defense actions it contains. Combined with formula (15), the payoff quantization matrices of the attacker and the defender are given as:

Experiment 1: Trend of attack and defense strategy selection probabilities

In conjunction with equations (17) and (18), with the control variable b = 1, the convergence of the evolutionarily stable strategy under the experimental conditions is investigated first. The initial game beliefs are (P_A1, P_D1) = {(0.5, 0.5), (0.7, 0.3), (0.3, 0.7), (0.6, 0.6)}. These four sets of initial game beliefs respectively represent the cases in which neither side has a strategy preference; the attacker tends to choose strategy S_A1 and the defender tends to choose strategy S_D2; the attacker tends to choose strategy S_A2 and the defender tends to choose strategy S_D1; and the attacker tends to choose strategy S_A1 and the defender tends to choose strategy S_D1. FIG. 6 shows the simulated trend of the strategy selection probabilities of the attacker and the defender.

Analysis of FIGS. 6(a)(b) shows that for the different initial game beliefs (P_A1, P_D1) = (0.5, 0.5), (0.7, 0.3), (0.3, 0.7), (0.6, 0.6), P_A1 always converges to 1 and P_A2 always converges to 0; P_D1 always converges to 1 and P_D2 always converges to 0. Further analysis of the values of U_A and U_D under the experimental conditions shows that when the relative payoff u₁ of the risk-type strategy is far greater than the relative payoff u₂ of the conservative strategy, both the attacker and the defender eventually select the risk-type strategy, regardless of whether they had a strategy preference before the game started.

Experiment 2: Influence of the reflection ability b on attack and defense strategy selection

Keeping u₁ and u₂ unchanged and setting the initial game beliefs (P_A1, P_D1) = (0.7, 0.3), b is set to 0.5, 1 and 1.5 in turn to investigate the effect of parameter b on the game result. FIG. 7 shows the simulated trend of the strategy selection probabilities of the attacker and the defender under the different values of b.

Analysis of FIG. 7 shows that for b = 0.5, 1 and 1.5, the numbers of evolution iterations required for P_A1 to reach evolutionary stability are 36, 15 and 7, respectively, and those required for P_D1 are 117, 59 and 39, respectively. Taking b = 1 as the reference, when b < 1 the game strategy needs more iterations to reach the evolutionarily stable state, and when b > 1 it needs fewer. The reflection ability b therefore affects how quickly the game result is reached. In practical terms, a sub-population with weak reflection ability (b < 1) needs more time to adapt to the environment before it can make a decision, whereas a sub-population with strong reflection ability (b > 1) adapts to the environment better and responds with decisions faster. In theory, adjusting parameter b appropriately to match the time window of each game can improve the time sensitivity of the game result.
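The qualitative effect of b on convergence speed can be illustrated with a toy sketch. This is not equation (16) and does not reproduce the iteration counts of FIG. 7: it assumes, purely for illustration, that b multiplies the right-hand side of a one-dimensional replicator equation with a constant payoff advantage. The function name, the `advantage` value, the step size and the tolerance are all hypothetical.

```python
def steps_to_converge(b, p0=0.3, advantage=0.5, dt=0.1, tol=0.01, max_steps=100_000):
    """Count Euler steps until the belief is within tol of 1.

    Assumed toy dynamics: P' = b * advantage * P * (1 - P).
    """
    p = p0
    for step in range(1, max_steps + 1):
        p += dt * b * advantage * p * (1.0 - p)
        if 1.0 - p < tol:
            return step
    return max_steps


if __name__ == "__main__":
    for b in (0.5, 1.0, 1.5):
        print("b =", b, "steps =", steps_to_converge(b))
```

Under this assumed form, a larger b shortens the number of steps needed to settle on a pure strategy, which mirrors the trend reported for FIG. 7 without claiming the same numbers.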

The network security defense method based on the heterogeneous group evolution game addresses the problem of accurate network security defense decisions through a decision method based on a heterogeneous group evolution game. Drawing on the biological concept of a population, it analyzes the limitations of the traditional homogeneous group game model, constructs a heterogeneous group evolution game model, designs an optimal defense strategy selection algorithm based on a strategy reflection mechanism, and introduces a globally monotone potential function to analyze the feasibility and stability of the model solution. Simulation verifies the feasibility and credibility of the dual heterogeneous group evolution game model and the strategy selection algorithm.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims

1. A network security defense method based on heterogeneous group evolution game is characterized by comprising the following steps:

constructing a heterogeneous group evolution game model according to the game group;

constructing a heterogeneous group replication dynamic equation according to the heterogeneous group evolution game model;

determining an optimal defense strategy through the heterogeneous group replication dynamic equation;

the heterogeneous population evolution game model is a 4-tuple model (N, S, P, U), wherein,

S = (S_A, S_D) is the mixed strategy space of the attack and defense game participant groups, S_A is the total pure strategy space of the attacker participants, S_A = (S_A1, S_A2, …, S_Aj), where S_A1, S_A2, …, S_Aj are the pure strategies selected by the attacker participant sub-populations, S_D is the total pure strategy space of the defender participants, S_D = (S_D1, S_D2, …, S_Di), where S_D1, S_D2, …, S_Di are the pure strategies selected by the defender participant sub-populations;

P = (P_A, P_D) is the game belief set, P_A is the game belief set of the attacker, P_A = (P_A1, P_A2, …, P_Aj), P_Aj is the probability of selecting strategy S_Aj, P_D is the game belief set of the defender, P_D = (P_D1, P_D2, …, P_Di), P_Di is the probability of selecting strategy S_Di;

U = (U_A, U_D) is the game payoff set, U_A is the attacker game payoff set, U_A = (U_A1, U_A2, …, U_Aj), U_Aj is the expected payoff obtained in a one-stage game by sub-population N_Aj adopting pure strategy S_Aj, U_D is the defender game payoff set, U_D = (U_D1, U_D2, …, U_Di), U_Di is the expected payoff obtained in a one-stage game by sub-population N_Di adopting pure strategy S_Di;

the step of constructing the heterogeneous group replication dynamic equation according to the heterogeneous group evolution game model comprises the following steps:

establishing a system dynamic equation according to a preset strategy learning mechanism, and improving the basic replication dynamic equation;

the basic replication dynamic equation is

P_Di′(t) is the game belief of the defender corresponding to time t, U_Di(t) is the expected payoff obtained in a one-stage game by sub-population N_Di adopting pure strategy S_Di at time t,

is the average payoff of the defender participant group space at time t, and P_Di(t) is the probability of selecting strategy S_Di at time t;

the heterogeneous population replication dynamic equation is

wherein b is the reflection ability.

2. The method of claim 1, wherein, in the game payoff set U = (U_A, U_D),

is the average payoff of the attacker participant group space,

is the average payoff of the defender participant group space,

P_Am is the game belief set of the attacker, U_Am is the expected payoff obtained in a one-stage game by sub-population N_Aj adopting pure strategy S_Aj, P_Dn is the probability of selecting strategy S_Di, and U_Dn is the expected payoff obtained in a one-stage game by sub-population N_Di adopting pure strategy S_Di.

3. The method of claim 1, wherein the payoff of the defender is calculated as U_D = δ·C_r - O_cost and the payoff of the attacker is calculated as U_A = λ·C_r - A_cost, wherein,

C_r is the importance degree of the attacker's target resource in a complete attack and defense process;

A_cost is the cost paid by the attacker to carry out the attack;

4. The method of claim 1, wherein the preset strategy learning mechanism is that, after each stage of the game ends, each sub-population of the attacker and of the defender randomly selects one other sub-population from the population as a reflection object for strategy learning.

5. The method of claim 1, wherein the heterogeneous population evolution game model is a dual heterogeneous population evolution game model.