CN110049497A - A user-oriented intelligent attack defense method in mobile fog computing - Google Patents

A user-oriented intelligent attack defense method in mobile fog computing

Info

Publication number
CN110049497A
Authority
CN
China
Prior art keywords
attack
legitimate user
mode
defence
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910287756.1A
Other languages
Chinese (zh)
Other versions
CN110049497B (en)
Inventor
涂山山
孟远
于金亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201910287756.1A
Publication of CN110049497A
Application granted
Publication of CN110049497B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/12 Detection or prevention of fraud
    • H04W12/121 Wireless intrusion detection systems [WIDS]; Wireless intrusion prevention systems [WIPS]
    • H04W12/122 Counter-measures against attacks; Protection against rogue devices
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/009 Security arrangements; Authentication; Protecting privacy or anonymity specially adapted for networks, e.g. wireless sensor networks, ad-hoc networks, RFID networks or cloud networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

A user-oriented intelligent attack defense method in mobile fog computing relates to computer networks and wireless communication and belongs to the field of cyberspace security. The invention uses prospect theory (PT) and the double Q-learning (DQL) algorithm to realize user-centric, subjective defense against intelligent attacks in a mobile fog computing environment. A malicious user exploits the open wireless access platform to launch intelligent attacks in different modes, such as spoofing, jamming, and eavesdropping attacks, while a fog node communicates with a legitimate end user, which compromises secure communication between fog nodes and mobile users in the fog computing network. Defending against intelligent attacks with PT and DQL mitigates the overestimation of Q values in the Q-learning algorithm and generates the optimal defense policy for the legitimate user. The method both increases the legitimate user's detection utility in a dynamic environment and reduces the intelligent attacker's subjective attack probability, while strengthening the security protection capability of the mobile fog computing network.

Description

A user-oriented intelligent attack defense method in mobile fog computing
Technical field
The invention uses prospect theory (PT) and the double Q-learning (DQL) algorithm to realize user-centric, subjective defense against intelligent attacks in a mobile fog computing environment. A malicious user exploits the open wireless access platform to launch intelligent attacks in different modes, such as spoofing, jamming, and eavesdropping attacks, while a fog node communicates with a legitimate end user, which compromises secure communication between fog nodes and mobile users in the fog computing network. Defending against intelligent attacks with PT and DQL mitigates the overestimation of Q values in the Q-learning algorithm and generates the optimal defense policy for the legitimate user. This both increases the legitimate user's detection utility in a dynamic environment and reduces the intelligent attacker's subjective attack probability, while strengthening the security protection capability of the mobile fog computing network. Because it resists intelligent attacks with a reinforcement learning algorithm and game theory, the invention relates to computer networks and wireless communication and belongs to the field of cyberspace security.
Background art
With the continuing spread of Internet of Things (IoT) technology and mobile intelligent terminals, the large volume of interactive data they generate must be processed in real time in a wireless network environment, and traditional cloud computing cannot effectively meet requirements such as heterogeneity and low latency. Fog computing extends cloud computing to the network edge and can use direct links between devices to improve system throughput, which addresses cloud computing's poor mobility support, weak geographic awareness, and high latency. In a mobile fog computing network, fog architectures fall into two broad classes: cloud-fog-device and fog-device. The fog layer contains devices close to the IoT edge, which are called fog nodes. Supported by the wireless network, fog nodes and terminal devices in the mobile fog computing network can exchange data.
However, because wireless networks are vulnerable to security threats, many data and communication security problems remain between the fog layer and the user layer. In a mobile fog computing environment, an illegitimate end user can launch intelligent attacks against legitimate users: it obtains wireless channel state information and defense policy information and then selects a suitable attack mode to undermine wireless security between the fog layer and the user layer. Common attack modes include spoofing, jamming, and eavesdropping attacks. An end user who launches such attacks (an intelligent attacker) chooses among these means subjectively and poses a serious threat to the fog computing network. Game theory is a powerful tool for handling this threat and securing fog computing networks. In fog computing security research, the players of the game are usually assumed to be rational: their utilities are computed with expected utility theory (EUT), and each player chooses every move so as to maximize its expected utility. In a dynamic wireless network, however, no player knows the state of the whole network or the accuracy of the information it receives, so decisions about how to resist intelligent attacks are strongly subjective and do not agree with the results of EUT. PT describes how people make decisions with different risk preferences when facing gains and losses: people are risk-averse when facing gains and risk-seeking when facing losses. PT computes each player's utility from subjective probabilities and can therefore reflect the decision maker's subjectivity.
Therefore, based on prospect theory, the invention proposes a DQL-based intelligent attack defense method for mobile fog computing. The method constructs a static subjective zero-sum game model between the intelligent attacker and the legitimate user and derives the Nash equilibrium of that game. For dynamic environments, it uses the DQL algorithm to suppress the intelligent attacker's subjective motivation to attack and to generate the optimal defense policy for the legitimate user, so that the legitimate user subjectively decides whether to rely on physical-layer security techniques alone to resist intelligent attacks. The method increases the legitimate user's detection utility and effectively reduces the attack rate. Compared with methods that resist intelligent attacks using the Q-learning algorithm, the Sarsa algorithm, or a greedy strategy, it provides better security protection in a mobile fog computing environment.
Summary of the invention
The invention provides a DQL-based intelligent attack defense method for mobile fog computing. It designs a security model covering the intelligent attackers involved in mobile fog computing and, based on prospect theory, constructs a static subjective zero-sum game between the intelligent attacker and the legitimate user together with a dynamic subjective game method based on the DQL algorithm. Defending against attacks with this method makes the legitimate user's defense policy optimal, improves the legitimate user's detection utility, reduces the attack rate, and strengthens the security and protection performance of the mobile fog computing network.
The invention adopts the following technical solution and implementation steps:
1. Intelligent attack security model in mobile fog computing
The security model considers communication between the fog layer and the user layer, that is, between fog nodes and end users. As shown in Fig. 1, any intelligent attacker in the mobile fog computing network is an end user with subjectivity and may launch intelligent attacks on other legitimate end users. The model defines the set of intelligent attackers, the attack mode of each attacker at time t, the set of times, the set of legitimate users, and the defense mode of each legitimate user at time t. Suppose that at some time t, intelligent attacker 1 uses a smart programmable radio device and adopts one of the attack modes to launch an intelligent attack on a legitimate user served by the same fog node. When the attack mode is 0, the attacker stops attacking; when it is 1, the attacker attacks the legitimate user by sending jamming signals, which lowers the SINR of the signal the legitimate user receives from the fog node; when it is 2, the attacker eavesdrops and intercepts the information transmitted between the fog node and the legitimate user; when it is 3, the attacker uses a false media access control address (MAC-A) to impersonate the fog node and send data to the legitimate user, that is, the attacker mounts a spoofing attack; when it is 4, the attacker mounts a replay attack, resending data packets that the legitimate user has already received in order to deceive the legitimate user. When facing these different types of attack, the attacked legitimate user has two defense modes. In the first, called the basic mode, the legitimate user defends against intelligent attacks using physical-layer security (PLS) only. In the second, the legitimate user spends more system overhead: it first uses a PLS technique based on channel parameters for preliminary detection, filtering, and anti-eavesdropping, and then uses a higher-layer security mechanism (HLSM) to examine the data that passed physical-layer verification.
2. A subjective intelligent attack defense method based on the DQL algorithm
The method comprises the following steps:
(1) Establish a static subjective zero-sum game between the intelligent attacker and the legitimate user based on PT. The attack mode is denoted SA_m, the number of attack modes is denoted Num with Num ≥ 1, and the defense mode is denoted EU_n. Under PT, the intelligent attacker and the legitimate user make subjective decisions in the game and reach a Nash equilibrium. The invention computes subjective probabilities with the Prelec probability weighting function:
$$w_{object}(p) = \exp\left(-(-\ln p)^{\sigma_{object}}\right) \quad (1)$$
where p is the objective probability, p ∈ (0, 1], and σ_object is the objective probability weight, σ_object ∈ (0, 1]. The subscript object identifies the player taking part in the game; here object = attacker or object = user. The Prelec probability weighting function describes how each player in the subjective game, under the influence of its weight, distorts the objective probabilities on which its decisions rest. According to PT, a subjective decision maker underestimates the objective probability of a high-probability event and, conversely, overestimates the objective probability of a low-probability event. In the zero-sum game, the legitimate user obtains a gain when it detects intelligent attack SA_m under defense mode EU_n and suffers a security loss when the attack goes undetected. Every defense mode has a false alarm rate and a miss rate: the false alarm rate is the probability that legitimate data sent by a legitimate node is detected as illegitimate data, and the miss rate is the probability that illegitimate data is detected as legitimate data, which is why security losses occur. Combining the two rates gives the error rate with which the legitimate user detects intelligent attack SA_m under defense mode EU_n. According to the system model, the intelligent attacker has 5 attack modes and the legitimate user has 2 defense modes, and the utilities of the intelligent attacker and the legitimate user are given by formula (2):
[formula (2)]
where U_user(SA_m, EU_n) denotes the utility of the legitimate user. One term of formula (2) is quantized into C non-zero levels, and each level has an associated probability that follows a probability distribution over the levels.
According to formula (2), when both players compute their utilities with EUT, the utility is given by formula (3):
[formula (3)]
which defines the utility of the legitimate user computed with EUT. When both players compute their utilities with PT, they decide on the basis of subjective probabilities instead of the objective average detection error rate. According to formulas (1) and (2), the utilities of the two players are then given by formulas (4) and (5):
[formulas (4) and (5)]
During the subjective game, each player changes its subjective probabilities by adjusting its objective probability weight, pursues the maximum of its own utility, and the game reaches a Nash equilibrium. When the intelligent attacker believes that an intelligent attack launched at the current time would be detected by the legitimate user, it chooses to stop attacking. When the legitimate user believes that a higher-layer security mechanism would yield more utility, it sets EU_n = 1. The Nash equilibrium strategy combination is the combination that gives both players their maximum utility, and it must satisfy conditions (6) and (7):
[formulas (6) and (7)]
Therefore, according to formulas (4), (5), (6) and (7), this step summarizes, for SA_m = 0, 1, 2, 3, the Nash equilibrium conditions concerning the spoofing attack in the static subjective zero-sum game. They explain why each of the legitimate user's two defense modes forms a Nash equilibrium when the intelligent attacker either stops attacking or mounts a spoofing attack.
1. When conditions (8) and (9) hold, the Nash equilibrium strategy combination is (0, 1).
2. When conditions (10) and (11) hold, the Nash equilibrium strategy combination is (0, 2).
3. When conditions (12) and (13) hold, the Nash equilibrium strategy combination is (3, 1).
4. When conditions (14) and (15) hold, the Nash equilibrium strategy combination is (3, 2).
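The behaviour attributed to the Prelec probability weighting function in step (1) can be checked numerically. The sketch below assumes the standard one-parameter Prelec form, w(p) = exp(-(-ln p)^σ); the function name `prelec_weight` and the sample probabilities are illustrative choices, not taken from the patent.

```python
import math


def prelec_weight(p: float, sigma: float) -> float:
    """One-parameter Prelec weighting, w(p) = exp(-(-ln p) ** sigma).

    Assumes p in (0, 1] and sigma in (0, 1], the ranges stated in the text.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 < sigma <= 1.0:
        raise ValueError("sigma must lie in (0, 1]")
    # log(1 / p) equals -ln p and is never negative for p in (0, 1].
    return math.exp(-(math.log(1.0 / p) ** sigma))


if __name__ == "__main__":
    # Compare a subjective player (sigma = 0.5) with an objective one (sigma = 1).
    for p in (0.05, 0.5, 0.95):
        print(p, prelec_weight(p, 0.5), prelec_weight(p, 1.0))
```

With σ = 1 the function returns p unchanged, so the player behaves as under EUT. With σ < 1, small probabilities are weighted up and large ones are weighted down, which matches the over- and underestimation described above.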
(2) Construct a dynamic subjective game method and obtain, with the DQL algorithm, the optimal defense policy against intelligent attacks. In the static subjective game, the malicious user and the legitimate user measure the outcome of a decision by utility. In a dynamic mobile fog computing network, however, the players lack knowledge of the overall network environment and the legitimate user cannot determine the error rate of attack detection. The two sides therefore interact repeatedly, so that the legitimate user learns to select a suitable defense mode against the intelligent attacks launched by the malicious user, which increases the legitimate user's utility and reduces the attack rate. Among reinforcement learning methods, Q-learning obtains an optimal policy in a dynamic environment with incomplete information. It originates from behaviorist learning theory and uses Q values to rate how good it is for an agent to take a given action in a given state. It has two important parameters, the learning rate and the reward decay factor: the larger the learning rate, the less of the earlier training is retained; the larger the reward decay factor, the more weight is given to long-term reward. When computing long-term reward, Q-learning uses the max function, which tends to overestimate Q values. This step therefore uses the DQL algorithm to realize the dynamic subjective game between the malicious user and the legitimate user and to obtain the legitimate user's optimal defense policy. The attack mode and the defense mode at each time are denoted separately in this step.
DQL uses two Q-value tables and updates them alternately to track the payoff of executing each action in each state. Here the state is the attack mode that the intelligent attacker selected in the time slot preceding the current time, and the action is the defense mode that the legitimate user selects at time t. The two Q-value tables are updated by formulas (16) and (17):
[formulas (16) and (17)]
where s_t is the system state at time t, μ is the reward decay factor, δ is the learning rate, μ ∈ (0, 1], and δ ∈ [0, 1]. An entry of Q-value table 1 is the payoff value of the legitimate user taking a given defense mode in the state at time t, and the immediate utility is the utility the legitimate user obtains at once for taking that defense mode in that state. The defense modes that maximize the Q value in state s_{t+1} in tables Q_1 and Q_2 are given by formulas (18) and (19):
[formulas (18) and (19)]
V(s_t), given by formula (20), is the maximum over the defense modes available in the current state of the mean of Q_1 and Q_2. The optimal defense policy λ* is therefore given by formula (21):
[formulas (20) and (21)]
In every state, the legitimate user follows the ε-greedy policy when selecting a defense mode and updating the Q-value tables: with probability ε it selects a suboptimal defense mode, and with probability 1 − ε it selects the defense mode that attains V(s_t), where ε ∈ (0, 1).
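The ε-greedy rule described above can be sketched as follows. The function name `epsilon_greedy` and the list-based Q rows are illustrative assumptions; the greedy choice is taken as the defense mode that maximizes the mean of the two Q tables, following the definition of V(s_t). Note that this follows the text in exploring only among non-greedy modes, whereas a common variant explores uniformly over all actions.

```python
import random


def epsilon_greedy(q1_row, q2_row, epsilon, rng=random):
    """Return the index of the defense mode to use in the current state.

    q1_row, q2_row: Q values of each defense mode in the current state,
    taken from the two Q tables (hypothetical list layout).
    """
    mean_q = [(a + b) / 2.0 for a, b in zip(q1_row, q2_row)]
    best = max(range(len(mean_q)), key=mean_q.__getitem__)
    others = [i for i in range(len(mean_q)) if i != best]
    # With probability epsilon explore a suboptimal mode, otherwise exploit.
    if others and rng.random() < epsilon:
        return rng.choice(others)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    picks = [epsilon_greedy([0.2, 0.6], [0.1, 0.5], 0.1, rng) for _ in range(1000)]
    print("share of greedy picks:", picks.count(1) / len(picks))
```

With only two defense modes, "suboptimal" simply means the other mode, so the rule reduces to switching modes with probability ε.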
According to the formulas above, the steps of the DQL algorithm for obtaining the optimal defense policy are summarized as follows:
1. Initialize.
2. For t = 1, 2, 3, ...:
3. Select a defense mode using the ε-greedy policy.
4. Observe the next state s_{t+1}.
5. Compute and obtain the immediate utility.
6. With probability 0.5, update Q_1 by formulas (16) and (18); otherwise update Q_2 by formulas (17) and (19).
7. Update V(s_t) by formula (20).
8. Return to step 2 and continue until the final system state is reached; then obtain the optimal defense policy λ* from the Q_1 and Q_2 tables and formula (21).
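The eight steps above can be sketched as a tabular double Q-learning loop. This is a minimal illustration under stated assumptions, not the patent's implementation: the update uses the standard double Q-learning rule, and the utility function `toy_utility`, the uniformly random attacker, and all parameter values are hypothetical stand-ins for the quantities defined by formulas that are not reproduced in this text.

```python
import random

N_STATES, N_ACTIONS = 5, 2  # states: last attack mode (0-4); actions: 2 defense modes


def toy_utility(attack, defense, rng):
    """Hypothetical immediate utility; NOT the patent's utility formula."""
    detect_prob = (0.6, 0.9)[defense]  # assumed detection probabilities
    cost = (0.1, 0.4)[defense]         # assumed overhead of each defense mode
    if attack == 0:                    # attacker idle: only the overhead is paid
        return -cost
    return (1.0 if rng.random() < detect_prob else -1.0) - cost


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def double_q_learning(slots=300, delta=0.5, mu=0.8, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q1 = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    q2 = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    state = rng.randrange(N_STATES)
    for _ in range(slots):
        # Step 3: epsilon-greedy on the mean of the two tables.
        mean_q = [(a + b) / 2 for a, b in zip(q1[state], q2[state])]
        action = argmax(mean_q)
        if rng.random() < epsilon:
            action = rng.choice([a for a in range(N_ACTIONS) if a != action])
        # Steps 4-5: toy attacker picks a uniformly random mode; observe utility.
        next_state = rng.randrange(N_STATES)
        reward = toy_utility(next_state, action, rng)
        # Step 6: update one table; its argmax is evaluated with the other table.
        learner, evaluator = (q1, q2) if rng.random() < 0.5 else (q2, q1)
        best_next = argmax(learner[next_state])
        target = reward + mu * evaluator[next_state][best_next]
        learner[state][action] += delta * (target - learner[state][action])
        state = next_state
    # Steps 7-8: the policy picks, per state, the mode maximizing the mean Q value.
    policy = [argmax([(a + b) / 2 for a, b in zip(q1[s], q2[s])])
              for s in range(N_STATES)]
    return q1, q2, policy


if __name__ == "__main__":
    print(double_q_learning()[2])
```

Selecting the next action with one table and valuing it with the other is what removes the upward bias of the single-table max in Q-learning. The default of 300 slots mirrors the number of time slots used in the patent's evaluation.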
The inventiveness of the invention lies mainly in the following:
(1) The invention addresses the problem that, in mobile fog computing, the interaction between fog nodes and end users is exposed to intelligent attacks launched by malicious users. Using the way prospect theory describes the subjectivity of game players, it constructs a mobile fog computing security model tied to the attack modes and defense modes of intelligent attacks, together with a static subjective zero-sum game between the intelligent attacker and the legitimate user. Because end users in a mobile fog computing environment are mobile, and reinforcement learning can find an optimal policy in a dynamic environment with incomplete information, the invention designs a DQL-based scheme for obtaining the optimal defense policy in mobile fog computing, which strengthens the security of communication between fog nodes and end users.
(2) The invention shows that a lower objective probability weight suppresses the intelligent attacker's subjective motivation to attack. It defines 4 metrics (the legitimate user's utility, attack rate, maximum Q value, and average action value) and compares the proposed method with methods that resist intelligent attacks using the Q-learning algorithm, the Sarsa algorithm, and a greedy strategy. By adjusting Q values, the proposed method optimizes the decision process that leads to the optimal defense policy, improves the legitimate user's utility, lowers the attack rate, and provides good security protection in a mobile fog environment.
Brief description of the drawings
Fig. 1 shows the intelligent attack security model in the mobile fog computing environment of the invention.
Fig. 2 compares the influence of the objective probability weight on the Nash equilibrium and on the legitimate user's utility in the static subjective game under the initial parameters.
Fig. 3 compares, over time slots 1-300 under the initial parameters, the legitimate user's utility when intelligent attacks are defended against with the invention, the Sarsa algorithm, the greedy strategy, and the Q-learning algorithm.
Fig. 4 compares, over time slots 1-300 under the initial parameters, the attack rate when intelligent attacks are defended against with the invention, the Sarsa algorithm, the greedy strategy, and the Q-learning algorithm.
Fig. 5 compares, over time slots 1-300 under the initial parameters, the maximum Q value when intelligent attacks are defended against with the invention, the Sarsa algorithm, and the Q-learning algorithm.
Fig. 6 compares, over time slots 1-300 under the initial parameters, the average action value when intelligent attacks are defended against with the invention, the Sarsa algorithm, and the Q-learning algorithm.
Detailed description of the embodiments
The invention provides a DQL-based intelligent attack defense method for mobile fog computing. It designs a security model covering the intelligent attackers involved in mobile fog computing and, based on prospect theory, constructs a static subjective zero-sum game between the intelligent attacker and the legitimate user together with a dynamic subjective game method based on the DQL algorithm. Defending against attacks with this method makes the legitimate user's defense policy optimal, improves the legitimate user's detection utility, reduces the attack rate, and strengthens the security and protection performance of the mobile fog computing network.
The invention adopts the following technical solution and implementation steps:
1. Intelligent attack security model in mobile fog computing
The security model considers communication between the fog layer and the user layer, that is, between fog nodes and end users. As shown in Fig. 1, any intelligent attacker in the mobile fog computing network is an end user with subjectivity and may launch intelligent attacks on other legitimate end users. The model defines the set of intelligent attackers, the attack mode of each attacker at time t, the set of times, the set of legitimate users, and the defense mode of each legitimate user at time t. Suppose that at some time t, intelligent attacker 1 uses a smart programmable radio device and adopts one of the attack modes to launch an intelligent attack on a legitimate user served by the same fog node. When the attack mode is 0, the attacker stops attacking; when it is 1, the attacker attacks the legitimate user by sending jamming signals, which lowers the SINR of the signal the legitimate user receives from the fog node; when it is 2, the attacker eavesdrops and intercepts the information transmitted between the fog node and the legitimate user; when it is 3, the attacker uses a false media access control address (MAC-A) to impersonate the fog node and send data to the legitimate user, that is, the attacker mounts a spoofing attack; when it is 4, the attacker mounts a replay attack, resending data packets that the legitimate user has already received in order to deceive the legitimate user. When facing these different types of attack, the attacked legitimate user has two defense modes. In the first, called the basic mode, the legitimate user defends against intelligent attacks using physical-layer security (PLS) only. In the second, the legitimate user spends more system overhead: it first uses a PLS technique based on channel parameters for preliminary detection, filtering, and anti-eavesdropping, and then uses a higher-layer security mechanism (HLSM) to examine the data that passed physical-layer verification.
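The strategy space of the security model above can be written down compactly. The sketch below is illustrative only: the class and member names are invented for this example, the attack values 0 to 4 follow the order in which the modes are described, and the defense values 1 and 2 are an assumption based on the strategy combinations such as (0, 1) and (3, 2) listed later in the text.

```python
from enum import IntEnum


class AttackMode(IntEnum):
    """Attack modes of the intelligent attacker (values assumed from the text order)."""
    STOP = 0           # attacker stops attacking
    JAMMING = 1        # jamming signals lower the received SINR
    EAVESDROPPING = 2  # intercepts fog-node-to-user traffic
    SPOOFING = 3       # impersonates the fog node with a false MAC address
    REPLAY = 4         # resends packets the user has already received


class DefenseMode(IntEnum):
    """Defense modes of the legitimate user (numbering is an assumption)."""
    PLS_ONLY = 1       # basic mode: physical-layer security only
    PLS_PLUS_HLSM = 2  # PLS screening followed by a higher-layer check


def strategy_profiles():
    """All (attack mode, defense mode) combinations of the static game."""
    return [(a, d) for a in AttackMode for d in DefenseMode]


if __name__ == "__main__":
    for attack, defense in strategy_profiles():
        print(attack.name, defense.name)
```

Five attack modes and two defense modes give ten strategy combinations, of which the text analyses the four that involve stopping the attack or spoofing.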
2. A subjective intelligent attack defense method based on the DQL algorithm
The method comprises the following steps:
(1) Establish a static subjective zero-sum game between the intelligent attacker and the legitimate user based on PT. The attack mode is denoted SA_m, the number of attack modes is denoted Num with Num ≥ 1, and the defense mode is denoted EU_n. Under PT, the intelligent attacker and the legitimate user make subjective decisions in the game and reach a Nash equilibrium. The invention computes subjective probabilities with the Prelec probability weighting function:
$$w_{object}(p) = \exp\left(-(-\ln p)^{\sigma_{object}}\right) \quad (22)$$
where p is the objective probability, p ∈ (0, 1], and σ_object is the objective probability weight, σ_object ∈ (0, 1]. The subscript object identifies the player taking part in the game; here object = attacker or object = user. The Prelec probability weighting function describes how each player in the subjective game, under the influence of its weight, distorts the objective probabilities on which its decisions rest. According to PT, a subjective decision maker underestimates the objective probability of a high-probability event and, conversely, overestimates the objective probability of a low-probability event. In the zero-sum game, the legitimate user obtains a gain when it detects intelligent attack SA_m under defense mode EU_n and suffers a security loss when the attack goes undetected. Every defense mode has a false alarm rate and a miss rate: the false alarm rate is the probability that legitimate data sent by a legitimate node is detected as illegitimate data, and the miss rate is the probability that illegitimate data is detected as legitimate data, which is why security losses occur. Combining the two rates gives the error rate with which the legitimate user detects intelligent attack SA_m under defense mode EU_n. According to the system model, the intelligent attacker has 5 attack modes and the legitimate user has 2 defense modes, and the utilities of the intelligent attacker and the legitimate user are given by formula (23):
[formula (23)]
where U_user(SA_m, EU_n) denotes the utility of the legitimate user. One term of formula (23) is quantized into C non-zero levels, and each level has an associated probability that follows a probability distribution over the levels.
According to formula (23), when both players compute their utilities with EUT, the utility is given by formula (24):
[formula (24)]
which defines the utility of the legitimate user computed with EUT. When both players compute their utilities with PT, they decide on the basis of subjective probabilities instead of the objective average detection error rate. According to formulas (22) and (23), the utilities of the two players are then given by formulas (25) and (26):
[formulas (25) and (26)]
During the subjective game, each player changes its subjective probabilities by adjusting its objective probability weight, pursues the maximum of its own utility, and the game reaches a Nash equilibrium. When the intelligent attacker believes that an intelligent attack launched at the current time would be detected by the legitimate user, it chooses to stop attacking. When the legitimate user believes that a higher-layer security mechanism would yield more utility, it sets EU_n = 1. The Nash equilibrium strategy combination is the combination that gives both players their maximum utility, and it must satisfy conditions (27) and (28):
[formulas (27) and (28)]
Therefore, according to formulas (25), (26), (27) and (28), this step summarizes, for SA_m = 0, 1, 2, 3, the Nash equilibrium conditions concerning the spoofing attack in the static subjective zero-sum game. They explain why each of the legitimate user's two defense modes forms a Nash equilibrium when the intelligent attacker either stops attacking or mounts a spoofing attack.
1. When conditions (29) and (30) hold, the Nash equilibrium strategy combination is (0, 1).
2. When conditions (31) and (32) hold, the Nash equilibrium strategy combination is (0, 2).
3. When conditions (33) and (34) hold, the Nash equilibrium strategy combination is (3, 1).
4. When conditions (35) and (36) hold, the Nash equilibrium strategy combination is (3, 2).
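The equilibrium conditions listed above all express the same requirement: neither player can raise its own utility by changing its mode alone. A minimal sketch of that check for pure strategies follows. The function name `pure_nash_equilibria` and the example utility values are hypothetical; the patent's actual utilities come from formulas that are not reproduced in this text.

```python
def pure_nash_equilibria(attacker_u, user_u):
    """Return all pure-strategy Nash equilibria (m, n) as index pairs.

    attacker_u[m][n], user_u[m][n]: utilities of the attacker and the
    legitimate user when attack mode index m meets defense mode index n.
    """
    n_attack, n_defense = len(user_u), len(user_u[0])
    equilibria = []
    for m in range(n_attack):
        for n in range(n_defense):
            # Attacker cannot gain by switching its attack mode alone.
            attacker_best = all(attacker_u[m][n] >= attacker_u[j][n]
                                for j in range(n_attack))
            # Legitimate user cannot gain by switching its defense mode alone.
            user_best = all(user_u[m][n] >= user_u[m][k]
                            for k in range(n_defense))
            if attacker_best and user_best:
                equilibria.append((m, n))
    return equilibria


if __name__ == "__main__":
    # Hypothetical zero-sum example: rows = (stop, spoof), columns = two defense modes.
    user_u = [[-0.1, -0.4], [0.3, 0.5]]
    attacker_u = [[-u for u in row] for row in user_u]
    print(pure_nash_equilibria(attacker_u, user_u))
```

In this made-up example the attacker is always detected often enough that stopping is its best choice, and the user then prefers the cheaper defense mode, so the only equilibrium is the first row and first column.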
(2) dynamic subjective game method is constructed, the optimal defence policies intelligently attacked are resisted based on the acquisition of DQL algorithm.It dislikes Meaning user and legitimate user carry out static subjective game, using effectiveness as the standard of measurement decision-making results, however are dynamically moving Dynamic mist calculates in network, participates in understanding of both sides' shortage to overall network environment of game, and legitimate user not can determine that attack inspection The error rate of survey, therefore they are continually interacted, and enable legitimate user that suitable defence mode to be selected to resist malicious user hair The intelligence attack risen, increases the effectiveness of legitimate user, reduces attack rate.In intensified learning method, Q-learning algorithm is A method of obtaining optimal policy in the insufficient dynamic environment of information, it derives from Studying theory on behaviorism, by Q The superiority and inferiority that value evaluation object takes some to act in a particular state, wherein including two important parameters: learning efficiency and prize The weak coefficient of encouraging property.Learning efficiency is bigger, and the effect of training is fewer before retaining;Incentive weak coefficient is bigger, then more Ground considers income at a specified future date.When calculating income at a specified future date, Q-learning has used max function, is easy excessively to estimate Q value, Therefore, this step realizes the dynamic subjective game of malicious user and legitimate user using DQL algorithm, obtains the optimal of legitimate user Defence policies, wherein attack mode is expressed asDefence mode is expressed as
The DQL algorithm uses two Q-value tables and updates them alternately to estimate the return of executing each action in each state. Here, the attack mode selected by the intelligent attacker in the time slot preceding a given time is taken as the state, and the defense mode selected by the legitimate user at time t is taken as the action. The formulas for updating the two Q-value tables are as follows:
where s_t denotes the system state at time t, μ is the reward decay coefficient, and δ is the learning efficiency, with μ ∈ (0,1] and δ ∈ [0,1]. The entry of Q-value table 1 for the state at time t is the return value of the legitimate user taking the given defense mode in that state, and the immediate utility is the utility the legitimate user obtains at once by taking that defense mode in the state at time t. The two maximizing defense modes are those that maximize the Q value under state s_{t+1} in tables Q1 and Q2 respectively; they are computed as follows:
V(s_t) denotes the maximum, over the defense modes, of the mean of Q1 and Q2 in the current state, and the entry of Q-value table 1 for time t+1 is the return value of the legitimate user taking the given defense mode in the state at time t+1. The optimal defense strategy λ* is therefore given by:
In each state, the legitimate user follows the ε-greedy strategy when selecting a defense mode and updating the Q-value tables: with probability ε it selects a suboptimal defense mode, and with probability 1-ε it selects the defense mode that attains V(s_t), where ε ∈ (0,1).
Based on the above formulas, the steps of the DQL algorithm for obtaining the optimal defense strategy are summarized as follows:
1. Initialize μ, δ, ε, and V(s_t) = 0.
2. For t = 1, 2, 3, ...:
3. Select the defense mode using the ε-greedy strategy.
4. Observe the next state.
5. Compute the immediate utility obtained.
6. With probability 0.5, update Q-value table 1 by formulas (37) and (39); otherwise update Q-value table 2 by formulas (38) and (40).
7. Update V(s_t) by formula (41).
8. Return to step 2 and continue until the final system state is reached; then obtain the optimal defense strategy λ* from the Q1 and Q2 tables and formula (42).
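The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: formulas (37) to (42) are not reproduced in this text, so the update below is the standard double Q-learning rule, which matches the two-table description. The attacker model `toy_attack`, the utility `toy_utility`, and the values chosen for μ, δ, and ε are hypothetical stand-ins for the PT-based utility and the initial parameters of the invention.

```python
import random

ATTACK_MODES = [0, 1, 2, 3, 4]  # state: attack mode seen in the previous slot
DEFENSE_MODES = [1, 2]          # action: defense mode chosen in this slot

def toy_attack(state):
    """Hypothetical attacker: cycles deterministically through the modes."""
    return (state + 1) % len(ATTACK_MODES)

def toy_utility(attack, defense):
    """Hypothetical immediate utility, NOT the patent's PT-based utility."""
    if attack == 0:
        return 1.0 if defense == 1 else 0.5  # no attack: cheap mode pays more
    return 0.2 if defense == 1 else 0.8      # attack: stronger mode pays more

def mean_q(q1, q2, state, action):
    return (q1[state][action] + q2[state][action]) / 2

def select_defense(q1, q2, state, epsilon, rng):
    """Pick the mode with the best Q1/Q2 mean, or another mode with prob. epsilon."""
    greedy = max(DEFENSE_MODES, key=lambda a: mean_q(q1, q2, state, a))
    if rng.random() < epsilon:
        return rng.choice([a for a in DEFENSE_MODES if a != greedy])
    return greedy

def double_q_learning(slots=300, mu=0.7, delta=0.5, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q1 = {s: {a: 0.0 for a in DEFENSE_MODES} for s in ATTACK_MODES}
    q2 = {s: {a: 0.0 for a in DEFENSE_MODES} for s in ATTACK_MODES}
    state = 0
    for _ in range(slots):
        action = select_defense(q1, q2, state, epsilon, rng)
        next_state = toy_attack(state)            # attack observed in this slot
        reward = toy_utility(next_state, action)  # immediate utility
        # Update one table chosen at random; the other table evaluates
        # the action that the updated table considers best.
        learn, other = (q1, q2) if rng.random() < 0.5 else (q2, q1)
        best = max(DEFENSE_MODES, key=lambda a: learn[next_state][a])
        target = reward + mu * other[next_state][best]
        learn[state][action] += delta * (target - learn[state][action])
        state = next_state
    # Greedy policy with respect to the mean of the two tables.
    policy = {s: max(DEFENSE_MODES, key=lambda a: mean_q(q1, q2, s, a))
              for s in ATTACK_MODES}
    return q1, q2, policy

q1, q2, policy = double_q_learning()
print(policy)
```

Because the toy attacker is deterministic, a sufficiently long run should settle on defense mode 1 in the state that precedes a pause in the attack (state 4 here) and on defense mode 2 in the other states.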
The present invention is configured with 300 time slots, each lasting 12500/32 microseconds; the interval between adjacent time instants is measured in microseconds. In the subjective game against intelligent attacks, the indices used to evaluate the 4 methods are as follows:
Index (1), utility of the legitimate user: the average PT-based utility value of the legitimate user in each time slot.
Index (2), attack rate: the proportion of all modes selected by the intelligent attacker in each time slot that are attack modes.
Index (3), maximum Q value: the maximum Q value updated in each time slot during the Q-table update process.
Index (4), average action value: the average of the defense modes selected by the legitimate user in each time slot during the Q-table update process. The action value can only be 1 or 2, and the defense mode with action value 1 incurs less system overhead. A smaller average action value therefore shows that the legitimate user more often selects the defense mode that uses only physical layer security techniques, which improves system performance.
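Indices (2) and (4) can be computed from a per-slot log of the selected modes. The sketch below is illustrative only: the function names and the sample logs are hypothetical, and the attack rate is interpreted here as the share of slots in which the attacker chose a mode other than 0 (stop attacking), which is an assumption about the wording of index (2). Because the action value is either 1 or 2, the share of slots spent in the basic defense mode equals 2 minus the average action value.

```python
def attack_rate(attack_log):
    """Share of slots in which the attacker chose an actual attack (mode != 0)."""
    return sum(1 for mode in attack_log if mode != 0) / len(attack_log)

def average_action_value(defense_log):
    """Mean of the selected defense modes, each of which is 1 or 2."""
    return sum(defense_log) / len(defense_log)

def basic_mode_share(defense_log):
    """Share of slots in defense mode 1: avg = 1*f + 2*(1 - f), so f = 2 - avg."""
    return 2 - average_action_value(defense_log)

# Hypothetical logs for eight time slots.
attack_log = [0, 3, 3, 0, 1, 0, 0, 4]
defense_log = [1, 2, 2, 1, 2, 1, 1, 2]

print(attack_rate(attack_log))            # 4 of the 8 slots are attacks
print(average_action_value(defense_log))  # (4*1 + 4*2) / 8
print(basic_mode_share(defense_log))
```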
The meanings and values of the initialization parameters used by the present invention are shown in the table below.
Fig. 2 shows, under the initial parameter conditions, how the objective weights affect the Nash equilibrium and the legitimate user's utility in the static subjective game. X-axis: objective probability weight of the intelligent attacker, unit "1". Y-axis: utility of the legitimate user, unit "1". The solid line is the legitimate user's utility when the objective probability weight of the legitimate user equals 0.7, and the dashed line is the legitimate user's utility when that weight equals 1.
Fig. 3 compares, under the initial parameter conditions and over time slots 1-300, the legitimate user's utility when intelligent attacks are defended against by the present invention and by the Sarsa algorithm, the Greedy strategy, and the Q-learning algorithm. X-axis: time slot, unit "1". Y-axis: utility of the legitimate user, unit "1". The thick dashed line is the legitimate user's utility with the DQL-based defense, the thin dashed line with the Sarsa-based defense, the thin solid line with the Q-learning-based defense, and the thick solid line with the Greedy-strategy defense.
Fig. 4 compares, under the initial parameter conditions and over time slots 1-300, the attack rate when intelligent attacks are defended against by the present invention and by the Sarsa algorithm, the Greedy strategy, and the Q-learning algorithm. X-axis: time slot, unit "1". Y-axis: attack rate, unit "1". The thick dashed line is the attack rate with the DQL-based defense, the thin dashed line with the Sarsa-based defense, the thin solid line with the Q-learning-based defense, and the other solid line with the Greedy-strategy defense.
Fig. 5 compares, under the initial parameter conditions and over time slots 1-300, the maximum Q value when intelligent attacks are defended against by the present invention and by the Sarsa and Q-learning algorithms. X-axis: time slot, unit "1". Y-axis: maximum Q value, unit "1". The thick dashed line is the maximum Q value with the DQL-based defense, the thin dashed line with the Sarsa-based defense, and the thin solid line with the Q-learning-based defense.
Fig. 6 compares, under the initial parameter conditions and over time slots 1-300, the average action value when intelligent attacks are defended against by the present invention and by the Sarsa and Q-learning algorithms. X-axis: time slot, unit "1". Y-axis: average action value, unit "1". The thick dashed line is the average action value with the DQL-based defense, the thin dashed line with the Sarsa-based defense, and the thin solid line with the Q-learning-based defense.
As Figs. 2-6 show, under the same initial parameter conditions the method proposed by the present invention obtains a more accurate optimal defense strategy, improves the legitimate user's utility, and reduces the attack rate.

Claims (1)

1. A user-oriented intelligent attack defense method in mobile fog computing, characterized in that the security model for intelligent attacks in mobile fog computing is as follows:
The security model considers communication between the fog layer and the user layer, covering fog nodes and end users. Any intelligent attacker in the mobile fog computing network may, as an end user with subjectivity, launch intelligent attacks on other legitimate end users. The set of intelligent attackers is M = {1, 2, ..., m}, and each attacker has an attack mode at time t; the set of time instants is T = {0, 1, 2, ..., t}. In addition, the set of legitimate users is N = {1, 2, ..., n}, and each legitimate user has a defense mode at time t. Assume that at some time t, intelligent attacker 1 uses a smart programmable wireless device to launch an intelligent attack, in one of the following attack modes, on a legitimate user under the same fog node: in attack mode 0, the attacker stops attacking; in attack mode 1, the attacker attacks the legitimate user by sending jamming signals, reducing the SINR of the signal the legitimate user receives from the fog node; in attack mode 2, the attacker adopts the eavesdropping attack mode and intercepts the information transmitted between the fog node and the legitimate user; in attack mode 3, the attacker uses a false media access control address (Media Access Control Address, MAC-A) to impersonate the fog node and send data to the legitimate user, that is, the attacker adopts the spoofing attack mode; in attack mode 4, the attacker adopts the replay attack mode and sends data packets that the legitimate user has already received, in order to deceive the legitimate user; the legitimate user under attack has two defense modes for the different types of attack: in defense mode 1, the legitimate user uses only PLS to defend against intelligent attacks, and this defense mode is called the basic mode; in defense mode 2, the legitimate user spends more system overhead, first using the channel-parameter-based PLS technique for preliminary detection, filtering, and anti-eavesdropping, and then using HLSM to check the data that has passed physical-layer verification;
The method comprises the following steps:
(1) Establish the static subjective zero-sum game between the intelligent attacker and the legitimate user based on PT; the attack mode is denoted SA_m, the number of attack modes is denoted Num, Num ≥ 1, and the defense mode is denoted EU_n; based on PT, the intelligent attacker and the legitimate user make subjective decisions to play the game and reach a Nash equilibrium; the present invention computes the subjective probability using the Prelec probability weighting function, whose formula is:
where p is the objective probability, p ∈ (0,1]; σ_object denotes the objective probability weight, σ_object ∈ (0,1]; object denotes the participant in the game, object=attac or object=user; the Prelec probability weighting function describes how a participant in the subjective game adjusts the objective probability of a decision under the influence of the weight; following PT, when facing a high-probability event, a subjective decision-maker underestimates the corresponding objective probability; conversely, when facing a low-probability event, a subjective decision-maker overestimates the corresponding objective probability; in the zero-sum game, the legitimate user obtains a gain by detecting intelligent attack SA_m in defense mode EU_n, and suffers a security loss if the intelligent attack is not detected; every defense mode has a false alarm rate and a miss rate: the false alarm rate is the probability that valid data sent by a legitimate node is detected as illegitimate data, and the miss rate is the probability that illegitimate data is detected as valid data; the two rates together give the error rate with which the legitimate user detects intelligent attack SA_m in defense mode EU_n; according to the system model, the intelligent attacker has 5 attack modes in total and the legitimate user has 2 defense modes, and the utility values of the intelligent attacker and the legitimate user are given by:
where U_user(SA_m, EU_n) denotes the utility value of the legitimate user; the error rate is quantized into C non-zero levels, each level occurs with its own probability according to a probability distribution, and the probabilities of all quantization levels sum to 1;
According to formula (2), when both players compute utility based on EUT, the formula is:
where the resulting quantity is the legitimate user's utility value computed based on EUT; when both players compute utility using PT, they make decisions based on subjective probabilities rather than on the objective average detection error rate; therefore, according to formulas (1) and (2), the utilities of the two sides are respectively:
During the subjective game, both players change their subjective probabilities by adjusting the objective weights, each pursuing its own maximum utility, until a Nash equilibrium is reached; when the intelligent attacker believes that the intelligent attack he launches at the current time will be detected by the legitimate user, he chooses to stop attacking; when the legitimate user believes that using the higher-layer security mechanism yields more utility, he sets EU_n=1; the Nash equilibrium strategy combination is the combination that gives both players their maximum utility, and it should satisfy the following conditions:
Therefore, according to formulas (4), (5), (6), and (7), this step summarizes the Nash equilibrium conditions concerning the spoofing attack in the static subjective zero-sum game when SA_m = 0, 1, 2, 3; they explain why, when the intelligent attacker adopts either of the two modes of stopping the attack or spoofing, the two defense modes of the legitimate user lead to Nash equilibrium states;
1. When conditions (8) and (9) are met, the Nash equilibrium strategy combination is (0,1);
2. When conditions (10) and (11) are met, the Nash equilibrium strategy combination is (0,2);
3. When conditions (12) and (13) are met, the Nash equilibrium strategy combination is (3,1);
4. When conditions (14) and (15) are met, the Nash equilibrium strategy combination is (3,2);
(2) Construct the dynamic subjective game method, and obtain the optimal defense strategy against intelligent attacks based on the DQL algorithm;
(3) Use the DQL algorithm to realize the dynamic subjective game between malicious users and legitimate users and obtain the optimal defense strategy of the legitimate user, where the attack mode and the defense mode are both indexed by the time t;
The DQL algorithm uses two Q-value tables and updates them alternately to estimate the return of executing each action in each state; the attack mode selected by the intelligent attacker in the time slot preceding a given time is taken as the state, and the defense mode selected by the legitimate user at time t is taken as the action; the formulas for updating the two Q-value tables are as follows:
where s_t denotes the system state at time t, μ is the reward decay coefficient, and δ is the learning efficiency, with μ ∈ (0,1] and δ ∈ [0,1]; the entry of Q-value table 1 for the state at time t is the return value of the legitimate user taking the given defense mode in that state, and the immediate utility is the utility the legitimate user obtains at once by taking that defense mode in the state at time t; the two maximizing defense modes are those that maximize the Q value under state s_{t+1} in tables Q1 and Q2 respectively, and they are computed as follows:
V(s_t) denotes the maximum, over the defense modes, of the mean of Q1 and Q2 in the current state, and the entry of Q-value table 1 for time t+1 is the return value of the legitimate user taking the given defense mode in the state at time t+1; the optimal defense strategy λ* is therefore given by:
In each state, the legitimate user follows the ε-greedy strategy when selecting a defense mode and updating the Q-value tables: with probability ε it selects a suboptimal defense mode, and with probability 1-ε it selects the defense mode that attains V(s_t), where ε ∈ (0,1);
Based on the above formulas, the steps of the DQL algorithm for obtaining the optimal defense strategy are summarized as follows:
1. Initialize μ, δ, ε, and V(s_t) = 0;
2. For t = 1, 2, 3, ...:
3. Select the defense mode using the ε-greedy strategy;
4. Observe the next state;
5. Compute the immediate utility obtained;
6. With probability 0.5, update Q-value table 1 by formulas (16) and (18); otherwise update Q-value table 2 by formulas (17) and (19);
7. Update V(s_t) by formula (20);
8. Return to step 2 and continue until the final system state is reached; then obtain the optimal defense strategy λ* from the Q1 and Q2 tables and formula (21).
CN201910287756.1A 2019-04-11 2019-04-11 User-oriented intelligent attack defense method in mobile fog calculation Active CN110049497B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910287756.1A CN110049497B (en) 2019-04-11 2019-04-11 User-oriented intelligent attack defense method in mobile fog calculation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910287756.1A CN110049497B (en) 2019-04-11 2019-04-11 User-oriented intelligent attack defense method in mobile fog calculation

Publications (2)

Publication Number Publication Date
CN110049497A true CN110049497A (en) 2019-07-23
CN110049497B CN110049497B (en) 2022-09-09

Family

ID=67276801

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910287756.1A Active CN110049497B (en) 2019-04-11 2019-04-11 User-oriented intelligent attack defense method in mobile fog calculation

Country Status (1)

Country Link
CN (1) CN110049497B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110401675A (en) * 2019-08-20 2019-11-01 绍兴文理学院 Uncertain ddos attack defence method under a kind of sensing cloud environment
CN110753383A (en) * 2019-07-24 2020-02-04 北京工业大学 Safe relay node selection method based on reinforcement learning in fog calculation
CN114448660A (en) * 2021-12-16 2022-05-06 国网江苏省电力有限公司电力科学研究院 Internet of things data access method
CN114666107A (en) * 2022-03-04 2022-06-24 北京工业大学 Advanced persistent threat defense method in mobile fog computing
WO2022151579A1 (en) * 2021-01-13 2022-07-21 清华大学 Backdoor attack active defense method and device in edge computing scene

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107147670A (en) * 2017-06-16 2017-09-08 福建中信网安信息科技有限公司 APT defence methods based on game system
CN107871164A (en) * 2017-11-17 2018-04-03 济南浪潮高新科技投资发展有限公司 A kind of mist computing environment personalization deep learning method
CN108512837A (en) * 2018-03-16 2018-09-07 西安电子科技大学 A kind of method and system of the networks security situation assessment based on attacking and defending evolutionary Game
CN108848535A (en) * 2018-05-31 2018-11-20 国网浙江省电力有限公司电力科学研究院 A kind of mist calculating environmental resource distribution method towards shared model
EP3407194A2 (en) * 2018-07-19 2018-11-28 Erle Robotics, S.L. Method for the deployment of distributed fog computing and storage architectures in robotic modular components
CN109194685A (en) * 2018-10-12 2019-01-11 天津大学 Man-in-the-middle attack defence policies based on safe game theory

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CAIXIA XIE et al.: "User-centric view of smart attacks in wireless networks", 《2016 IEEE INTERNATIONAL CONFERENCE ON UBIQUITOUS WIRELESS BROADBAND (ICUWB)》 *
SHANSHAN TU et al.: "Security in Fog Computing: A Novel Technique to Tackle an Impersonation Attack", 《IEEE ACCESS》 *

Also Published As

Publication number Publication date
CN110049497B (en) 2022-09-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant