CN112115469A - Edge intelligent moving target defense method based on Bayes-Stackelberg game - Google Patents

Publication number
CN112115469A
Authority
CN
China
Prior art keywords
model
models
edge
student
training
Prior art date
Legal status
Granted
Application number
CN202010966915.3A
Other languages
Chinese (zh)
Other versions
CN112115469B (en)
Inventor
钱亚冠
关晓惠
王滨
陶祥兴
周武杰
云本胜
陈晓霞
李蔚
楼琼
吴淑慧
Current Assignee
Zhejiang Lover Health Science and Technology Development Co Ltd
Zhejiang University of Water Resources and Electric Power
Original Assignee
Zhejiang Lover Health Science and Technology Development Co Ltd
Zhejiang University of Water Resources and Electric Power
Application filed by Zhejiang Lover Health Science and Technology Development Co Ltd, Zhejiang University of Water Resources and Electric Power
Priority to CN202010966915.3A
Publication of CN112115469A
Application granted
Publication of CN112115469B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The invention discloses an edge intelligence moving target defense method based on the Bayes-Stackelberg game, and provides a dynamic defense mechanism called edge intelligence moving target defense (EI-MTD). First, member models that are small enough to be deployed on edge nodes are obtained from a complex teacher model in a cloud data center through differential knowledge distillation. Then a Bayes-Stackelberg game strategy is used to schedule the member models dynamically, so that an attacker cannot determine which target model executes the classification task. This defense mechanism effectively prevents an attacker from selecting an optimal surrogate model with which to craft adversarial examples, thereby blocking black-box attacks. Experiments on the ILSVRC2012 image data set show that the proposed EI-MTD can effectively protect edge intelligence against malicious black-box attacks.

Description

Edge intelligent moving target defense method based on Bayes-Stackelberg game
Technical Field
The invention relates to security technology for edge intelligence computing, and provides an edge intelligence moving target defense method based on the Bayes-Stackelberg game.
Background
Artificial intelligence based on deep learning has been successfully applied in various fields, from facial recognition, natural language processing, to computer vision. With the vigorous development of intelligent technology, people's life has changed greatly, and people increasingly rely on convenient services provided by intelligent life and hope to enjoy intelligent services anytime and anywhere. Over the past few years, the theory of edge computing has moved towards applications, and various applications have been developed to improve our lives. The maturation of deep learning techniques and edge computing systems and the growing demand for intelligent life has facilitated the development and implementation of Edge Intelligence (EI). Current EI implementations are based on deep learning models, i.e., Deep Neural Networks (DNNs), which are deployed to devices at the edge of the network (e.g., smart cameras of a surveillance system) to achieve real-time performance for applications such as object recognition and anomaly detection.
Currently, the security of edge intelligence is a concern. Past work has mostly focused on data privacy in edge intelligence and has paid too little attention to adversarial example attacks. Existing work has shown that DNNs are extremely vulnerable to adversarial examples. An adversarial example is an input image to which a carefully designed small perturbation has been added in order to fool the deep neural network. Adversarial examples have a special property called transferability: an adversarial example generated for one model can also successfully fool another model. Transferability is higher between models with similar architecture, low model capacity and high test accuracy. In theory, with this property an attacker can craft adversarial examples on a local surrogate model and use them against the target model without knowing anything about the target model; this is called a black-box attack. In practice, an attacker can find a surrogate model that is closer to the target model by repeatedly querying the target model, thereby obtaining a higher attack success rate that approaches the effect of a white-box attack.
Because computing, storage and other resources on edge nodes (edge devices and edge servers) are limited, model compression is considered an effective way to reduce model size. However, the robustness of a model is positively correlated with its size, so compressed models on edge nodes are more vulnerable to adversarial examples. In addition, most currently proposed defenses against adversarial examples require abundant GPU computing resources and are not suitable for edge nodes. Limited resources therefore restrict the application of edge intelligence in sensitive areas.
We summarize the security challenges facing edge intelligence as follows: (1) how to prevent attackers from finding the optimal surrogate model; (2) how to reduce the transferability of adversarial examples without compromising accuracy on normal samples; (3) how to defend against adversarial examples on resource-limited edge nodes.
Disclosure of Invention
The invention provides an edge intelligence moving target defense method to address the problems above. For the first challenge, we change the static target model to a dynamic target model by randomly scheduling the classification service. Since attackers do not know which model actually serves them, they cannot estimate which candidate surrogate model is close to the target model. For the second challenge, we increase the difference between the models deployed on the edge nodes. We use the gradient of the loss function as the basis for the difference metric, since current attacks mainly use the gradient to craft adversarial examples. For the third challenge, we use transfer learning in the cloud data center to distill knowledge from a large-capacity, powerful teacher model into small-capacity student models. The benefit of this approach is that classification knowledge and robustness are transferred while the size of the model is compressed.
The present invention integrates these solutions into one defense framework, referred to as edge intelligence with moving target defense (EI-MTD). EI-MTD is constructed by the following steps: (1) a robust teacher model is obtained through adversarial training using the powerful GPUs of a cloud data center; (2) the robust knowledge of the teacher model is transferred to the student models through differential knowledge distillation, which also yields diversity; and (3) the student models are switched using a Bayes-Stackelberg game strategy to balance accuracy and security.
The invention achieves this purpose through the following technical scheme:
The invention provides an EI-MTD system comprising three key technologies: adversarial training, differential knowledge distillation, and dynamic scheduling of the service model. We use adversarial training to obtain a powerful teacher model in the cloud data center. Transfer learning is then used to distill robust knowledge from the teacher model into small-scale student models suited to limited resources. At the same time, a difference regularization term is added to obtain diversity among the distilled models, which effectively suppresses the transferability of adversarial examples. These student models, also referred to as member models in the moving target setting, are then dynamically scheduled to serve users. Thanks to the diversity obtained from differential knowledge distillation, dynamic scheduling makes it difficult for attackers to find the best surrogate model, as shown on the right side of FIG. 1.
The invention comprises the following steps:
S1: Adversarial training of the teacher model. Suppose that the cloud data center holds a training data set $\mathcal{D} = \{(x_i, y_i)\}$ and a teacher model $F_t(\theta_t)$. The 101-layer neural network ResNet-101 is used as the teacher model. Adversarial training is performed in the cloud data center with FGSM (fast gradient sign method) adversarial examples, combined with the "FAST" adversarial training method to accelerate the process. Prior work has shown that adversarial training gives larger-capacity networks greater robustness.
S2: Differential knowledge distillation of the student models. First, the soft label $\tilde{y}_i$ of each sample $x_i$ is obtained from the teacher model $F_t(\theta_t)$ at a suitable distillation temperature $T$, creating a new training data set $\mathcal{D}' = \{(x_i, \tilde{y}_i)\}$. To obtain diversity among the student models, we define a new loss function with the regularization term $CS_{coherence}$, $L = \frac{1}{K}\sum_{i=1}^{K} T^2 J_i + \lambda \cdot CS_{coherence}$, and train all student models $\{F_s(\theta^{(i)})\}_{i=1}^{K}$ simultaneously to minimize the common loss function $L$. Note that in the present invention the student model, the member model and the target model refer to the same object; it is called the student model in knowledge distillation and the member model in dynamic scheduling.
S3: Dynamic service scheduling of the member models. After differential knowledge distillation, the student models are deployed to edge nodes, one model per node. Here, edge nodes include edge devices and edge servers. One edge server is designated as the service scheduling controller, and all member models and the nodes hosting them are registered with the scheduling controller. When a user (possibly an attacker) submits an image through an edge device (e.g., a smartphone) to request the classification service, the edge device first uploads the service request to the scheduling controller rather than processing it directly on its local model. The scheduling controller then selects one edge node through the Bayes-Stackelberg game to execute the classification task. The whole process is hidden from the attacker, who cannot know which edge node finally provides the service.
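The request flow in S3 can be illustrated with a minimal sketch. It assumes the mixed strategy (the selection probability of each member model) has already been computed by the game; the class name `SchedulingController`, the node identifiers and the probabilities are hypothetical and are not taken from the patent.

```python
import random


class SchedulingController:
    """Hypothetical controller: registers member models and samples one per request."""

    def __init__(self, seed=None):
        self._nodes = []   # identifiers of registered edge nodes (one member model each)
        self._probs = []   # selection probability p_i of each registered node
        self._rng = random.Random(seed)

    def register(self, node_id, prob):
        """Register an edge node together with its selection probability."""
        self._nodes.append(node_id)
        self._probs.append(prob)

    def select_node(self):
        """Sample one node according to the mixed strategy; the requester never sees the choice."""
        return self._rng.choices(self._nodes, weights=self._probs, k=1)[0]


controller = SchedulingController(seed=0)
# Illustrative probabilities only; in EI-MTD they would come from solving the game.
for node, p in [("edge-node-1", 0.5), ("edge-node-2", 0.3), ("edge-node-3", 0.2)]:
    controller.register(node, p)

# Simulate 10000 classification requests and count which node served each one.
counts = {}
for _ in range(10000):
    chosen = controller.select_node()
    counts[chosen] = counts.get(chosen, 0) + 1
```

Because the node is sampled anew for every request, repeated queries from the same user may be answered by different member models, which is what prevents the attacker from fitting a surrogate to a single fixed target.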
Further, as described in step S3, the diversity of the models plays a key role in the effectiveness of dynamic scheduling. Since adversarial attacks take the gradient with respect to the input as the perturbation direction, gradient alignment is used as the measure of diversity.
Suppose there are two member models $F_s(\theta^{(1)}), F_s(\theta^{(2)}) \in \Omega$ and a surrogate model $F_a \in U$ selected by the attacker, and let $\nabla_x J_1$ and $\nabla_x J_2$ denote the gradients of the loss functions of $F_s(\theta^{(1)})$ and $F_s(\theta^{(2)})$ with respect to the sample $x$. If the angle between $\nabla_x J_1$ and $\nabla_x J_2$ is small enough, an adversarial example $x_{adv}$ that makes $F_s(\theta^{(1)})$ misclassify can also make $F_s(\theta^{(2)})$ misclassify. The difference between $F_s(\theta^{(1)})$ and $F_s(\theta^{(2)})$ can therefore be measured by the angle between $\nabla_x J_1$ and $\nabla_x J_2$. We use cosine similarity (CS) to denote the degree of alignment of $\nabla_x J_1$ and $\nabla_x J_2$:
$$CS(\nabla_x J_1, \nabla_x J_2) = \frac{\langle \nabla_x J_1, \nabla_x J_2 \rangle}{\lVert \nabla_x J_1 \rVert_2 \cdot \lVert \nabla_x J_2 \rVert_2}$$
where $\langle \nabla_x J_1, \nabla_x J_2 \rangle$ is the inner product of $\nabla_x J_1$ and $\nabla_x J_2$. If $CS(\nabla_x J_1, \nabla_x J_2) = -1$, the two gradients point in opposite directions, meaning that an $x_{adv}$ that makes $F_s(\theta^{(1)})$ misclassify cannot make $F_s(\theta^{(2)})$ misclassify.
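As a small worked illustration of the alignment measure, the sketch below computes the cosine similarity of two input gradients represented as plain lists. The gradient values are made up for the example; in practice they would be obtained by backpropagating each member model's loss to the input.

```python
import math


def cosine_similarity(g1, g2):
    """Cosine of the angle between two gradient vectors of equal length."""
    dot = sum(a * b for a, b in zip(g1, g2))
    n1 = math.sqrt(sum(a * a for a in g1))
    n2 = math.sqrt(sum(b * b for b in g2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("cosine similarity is undefined for a zero gradient")
    return dot / (n1 * n2)


grad_1 = [0.2, -0.5, 0.1]                                   # illustrative gradient of model 1
aligned = cosine_similarity(grad_1, [0.4, -1.0, 0.2])       # same direction
opposite = cosine_similarity(grad_1, [-0.2, 0.5, -0.1])     # opposite direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # perpendicular gradients
```

A value near 1 indicates that a perturbation built for one model is likely to transfer to the other, while values near 0 or below indicate little shared attack direction.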
Further, the cosine similarity is applied to the training of the student models in step S2 to obtain a set of member models with greater diversity. Since cosine similarity is computed from two gradients, it is generalized to K models by defining the maximum pairwise cosine similarity as the EI-MTD diversity measure:
$$CS_{coherence} = \max_{a, b \in \{1, \dots, K\},\, a \neq b} CS\big(\nabla_x J_a(x, \tilde{y}; \theta^{(a)}),\ \nabla_x J_b(x, \tilde{y}; \theta^{(b)})\big)$$
where $J_a$ and $J_b$ are the loss functions of the student models $F_s(\theta^{(a)})$ and $F_s(\theta^{(b)})$, $\theta^{(a)}$ and $\theta^{(b)}$ are the parameters of those models, and $\tilde{y}$ is the soft label of $x$ obtained from the teacher model. Because $CS_{coherence}$ is a non-smooth function, gradient descent cannot be applied to it directly, so the LogSumExp function is used to approximate it:
$$CS_{coherence} \approx \log \sum_{a=1}^{K-1} \sum_{b=a+1}^{K} \exp\big(CS(\nabla_x J_a, \nabla_x J_b)\big)$$
The student models are distilled from the teacher model in the cloud data center, and diversity among them must be ensured during distillation. The regularization term $CS_{coherence}$ is therefore added to the knowledge distillation process, giving a new distillation loss function:
$$L = \frac{1}{K} \sum_{i=1}^{K} \Big[ \beta T^2 J(x, \tilde{y}; \theta^{(i)}) + (1 - \beta) J(x, y; \theta^{(i)}) \Big] + \lambda \cdot CS_{coherence}$$
where $\lambda$ is the regularization coefficient that controls the importance of $CS_{coherence}$ during training. So that the student models fully learn the robust knowledge of the teacher model, $\beta$ is set to 1, i.e. only the soft labels are used to train the student models. The differential knowledge distillation procedure is given as Algorithm 1 (Figure BDA0002682680440000041 of the original publication).
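The regularizer described above can be sketched as follows. This is an illustration under stated assumptions, not the patent's Algorithm 1: gradients are plain lists with made-up values, the per-model soft-label losses are given as numbers rather than computed from networks, β is fixed to 1 as in the text, and the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(g1, g2):
    """Cosine similarity of two non-zero gradient vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm = math.sqrt(sum(a * a for a in g1)) * math.sqrt(sum(b * b for b in g2))
    return dot / norm


def pairwise_similarities(grads):
    """Cosine similarity of every unordered pair of member-model gradients."""
    return [cosine_similarity(a, b) for a, b in combinations(grads, 2)]


def cs_coherence_max(grads):
    """Non-smooth diversity measure: the largest pairwise similarity."""
    return max(pairwise_similarities(grads))


def cs_coherence_lse(grads):
    """Smooth LogSumExp surrogate of the maximum (computed in a numerically stable way)."""
    sims = pairwise_similarities(grads)
    m = max(sims)
    return m + math.log(sum(math.exp(s - m) for s in sims))


def common_loss(soft_label_losses, grads, temperature, lam):
    """Common loss with beta = 1: mean of T^2 * J_i plus lambda * CS_coherence."""
    k = len(soft_label_losses)
    distill = sum(temperature ** 2 * j for j in soft_label_losses) / k
    return distill + lam * cs_coherence_lse(grads)


# Illustrative input gradients of K = 3 student models on one sample.
grads = [[1.0, 0.0, 0.5], [0.8, 0.2, 0.4], [-0.3, 1.0, 0.1]]
hard_max = cs_coherence_max(grads)
smooth_max = cs_coherence_lse(grads)
loss = common_loss([0.9, 1.1, 1.0], grads, temperature=10.0, lam=0.5)
```

The LogSumExp value always lies between the true maximum and the maximum plus the logarithm of the number of pairs, so minimizing it pushes down the most aligned pair while remaining differentiable in every pairwise similarity.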
Further, in step S3, after the student models have been obtained by differential knowledge distillation, the member models are deployed to the edge nodes. When an edge device receives an image, it does not classify the image on its own model but forwards it to the scheduling controller. The scheduling controller selects a registered service model according to the scheduling policy, specifically:
In an adversarial environment, both the defender and the attacker want to maximize their own payoff through some strategy, which is a typical game problem. In the present invention, a Bayes-Stackelberg game is used to model the scheduling strategy. The defender's strategy is to select a suitable classification service model, and the attacker's strategy is to select an optimal surrogate model with which to generate adversarial examples. The present invention represents the Bayes-Stackelberg game as a seven-tuple
$$G = (L, F, S_L, S_F^{(c)}, R_L^{(c)}, R_F^{(c)}, P^{(c)}), \quad c \in \{1, 2\}$$
where $L$ is the defender and $S_L$ is the set of student models $\{F_s(\theta^{(i)})\}_{i=1}^{K}$ obtained after differential distillation. The follower $F$ has two types, the legitimate user $F^{(1)}$ and the attacker $F^{(2)}$. The action space $S_F^{(1)}$ of the legitimate user $F^{(1)}$ contains only one action, namely requesting the service with a legitimate sample; the action space $S_F^{(2)}$ of the attacker $F^{(2)}$ is the set of different surrogate models $F_a$ that can be selected. The payoff $R_L^{(1)}$ of the defender $L$ and the payoff $R_F^{(1)}$ of the legitimate user $F^{(1)}$ are defined as the classification accuracy of the member model on natural images; the payoff $R_L^{(2)}$ of the defender $L$ is the classification accuracy of the member model on adversarial examples; the payoff $R_F^{(2)}$ of the attacker $F^{(2)}$ is defined as the success rate of the adversarial example attack. $P^{(1)}$ is the probability that a legitimate user $F^{(1)}$ appears and $P^{(2)}$ is the probability that an attacker $F^{(2)}$ appears. The model scheduling problem based on the Bayes-Stackelberg game is converted into the following mixed integer quadratic programming (MIQP) problem:
$$\max_{s, q, v} \sum_{c \in \{1, 2\}} \sum_{n \in [S_L]} \sum_{m \in [S_F^{(c)}]} P^{(c)} R_L^{(c)}(n, m)\, s_n\, q_m^{(c)}$$
$$\text{s.t.} \quad \sum_{n \in [S_L]} s_n = 1$$
$$\sum_{m \in [S_F^{(c)}]} q_m^{(c)} = 1$$
$$0 \le v^{(c)} - \sum_{n \in [S_L]} R_F^{(c)}(n, m)\, s_n \le (1 - q_m^{(c)})\, N$$
$$0 \le s_n \le 1$$
$$q_m^{(c)} \in \{0, 1\}$$
$$v^{(c)} \in \mathbb{R}$$
where $P^{(1)} = 1 - \alpha$, $P^{(2)} = \alpha$, and $N$ is a large positive number. Solving for $s = (p_1, p_2, \dots, p_K)$ gives the scheduling strategy of the member models, where $p_i$ is the probability that member model $F_s(\theta^{(i)})$ is selected. $q^{(c)}$ is the optimal response strategy of user $F^{(c)}$, whose payoff is $v^{(c)}$. The above problem can be solved with the DOBSS algorithm.
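To make the game concrete, the following sketch finds an approximate scheduling strategy by brute-force grid search over the leader's mixed strategies. This is not DOBSS and not the MIQP solver named in the text; it only evaluates the same objective on a coarse grid, which is feasible for a handful of models. All payoff numbers, the attacker probability and the function names are hypothetical, and the attacker's payoff is assumed to be one minus the defender's accuracy.

```python
def simplex_grid(k, steps):
    """Yield integer tuples of length k that sum to `steps` (grid points on the simplex)."""
    if k == 1:
        yield (steps,)
        return
    for first in range(steps + 1):
        for rest in simplex_grid(k - 1, steps - first):
            yield (first,) + rest


def follower_best_response(s, r_follower, r_leader):
    """Pure action maximizing the follower's expected payoff; ties favor the leader."""
    actions = range(len(r_follower[0]))

    def expected(payoff, m):
        return sum(s[n] * payoff[n][m] for n in range(len(s)))

    best = max(expected(r_follower, m) for m in actions)
    ties = [m for m in actions if expected(r_follower, m) >= best - 1e-12]
    return max(ties, key=lambda m: expected(r_leader, m))


def leader_value(s, r_leader, r_follower, type_probs):
    """Expected leader payoff when every follower type best-responds to s."""
    total = 0.0
    for c, p_c in enumerate(type_probs):
        m = follower_best_response(s, r_follower[c], r_leader[c])
        total += p_c * sum(s[n] * r_leader[c][n][m] for n in range(len(s)))
    return total


def solve_by_grid_search(r_leader, r_follower, type_probs, steps=50):
    """Return the best grid strategy and its value (an approximation, not an exact solver)."""
    k = len(r_leader[0])
    best_s, best_v = None, float("-inf")
    for counts in simplex_grid(k, steps):
        s = [cnt / steps for cnt in counts]
        v = leader_value(s, r_leader, r_follower, type_probs)
        if v > best_v:
            best_s, best_v = s, v
    return best_s, best_v


# Hypothetical payoffs for K = 3 member models, indexed as [type][model n][follower action m].
ACC_ADV = [[0.20, 0.60, 0.55], [0.58, 0.25, 0.57], [0.54, 0.56, 0.22]]
R_LEADER = [[[0.70], [0.68], [0.66]], ACC_ADV]               # clean accuracy / adversarial accuracy
R_FOLLOWER = [[[0.70], [0.68], [0.66]],
              [[1.0 - a for a in row] for row in ACC_ADV]]   # assumed attack success rate
ALPHA = 0.5                                                  # assumed attacker probability
TYPE_PROBS = [1.0 - ALPHA, ALPHA]

strategy, value = solve_by_grid_search(R_LEADER, R_FOLLOWER, TYPE_PROBS)
```

With these illustrative numbers, any pure strategy lets the attacker pick the matching surrogate, so the search is expected to prefer a mixed strategy; the resulting `strategy` plays the role of $s = (p_1, \dots, p_K)$ in the scheduling controller.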
The present invention is the first to study a defense against adversarial example attacks for edge intelligence systems. Sailik et al. were the first to propose defending against adversarial examples with a moving target defense (MTDeep). Our approach differs from theirs in two ways: first, they do not consider the edge intelligence scenario and target only applications on cloud platforms; second, they do not consider the differences among member models, so the final defense effect is not significant. The HRS (Hierarchical Random Switching) network proposed by Wang et al. places several parallel network modules inside the network and switches among them randomly during forward propagation, whereas our method randomly switches whole networks, and the switching strategy is also different. Abhishek et al. analyzed the impact of attackers' bounded rationality on the performance of MTD based on the Stackelberg game; the results showed that an MTD game framework designed for rational attackers is sufficient to defend against boundedly rational attackers, so the method of the present invention also assumes that attackers are rational.
Song et al. proposed fMTD to detect and defend against adversarial examples, based on the observation that adversarial examples are distributed differently for different models. fMTD retrains a base model with different adversarial examples to obtain a group of fork models, detects adversarial examples by the consistency of a sample's outputs across the fork models, and classifies adversarial examples correctly by voting. The moving target aspect of fMTD lies mainly in the fact that, after the models are deployed, adversarial examples can still be generated dynamically for adversarial training, so that the system's fork models change over time. Unlike the method of the present invention, fMTD still requires forward inference on multiple models and cannot be deployed on resource-limited edge devices.
The invention has the beneficial effects that:
Compared with the prior art, the edge intelligence moving target defense method based on the Bayes-Stackelberg game has the following advantages:
(1) The invention is the first to propose a defense against adversarial attacks on edge intelligence systems. The proposed EI-MTD fits well with the inference architecture of edge devices and edge servers, in which a deep learning model performs inference independently on an edge node, thereby enabling dynamic execution. The dynamic scheduling mechanism is completely transparent to the user and does not reduce classification accuracy.
(2) To suppress transferability, the invention proposes differential knowledge distillation to increase the diversity of the member models on edge nodes. Unlike knowledge distillation of a single model, the invention distills multiple student models simultaneously with a common loss function. The method also compresses the model size, which overcomes the resource limitations of edge nodes.
(3) An EI simulation platform was built from GPU servers, PCs and Raspberry Pi boards to test EI-MTD. The experiments used the real image data set ILSVRC2012. The experimental results indicate that EI-MTD can defend against 80% of the adversarial examples generated by M-DI2-FGSM.
Drawings
FIG. 1 shows the static target model and the dynamic target model.
In the figure: on the left is a typical device-based static service attack architecture. The attacker attacks node K with an adversarial example of "cat", knowing that the model on node K performs the classification. On the right is the dynamically scheduled target model scheme. Although the attacker tries to attack node K, it does not know which model is actually executed.
FIG. 2 shows the framework of EI-MTD.
In the figure: the black lines represent the deployment of member models to edge nodes, and the red lines represent the process by which EI-MTD classifies adversarial examples.
FIG. 3 shows the top-1 and top-5 accuracy of the teacher model during adversarial training. After each training epoch, the teacher model is tested with two data sets, one containing clean samples and the other containing PGD adversarial examples.
FIG. 4 shows the accuracy of the trained models and of the models obtained by differential knowledge distillation.
FIG. 5 shows that EI-MTD has higher accuracy than a single member model under different attacker occurrence probabilities. Note that the member models are themselves somewhat robust because they are distilled from the teacher model.
FIG. 6 shows the EI-MTD accuracy at different distillation temperatures T.
FIG. 7 shows the values of differential immunity γ at different distillation temperatures T.
FIG. 8 shows the effect of differential immunity γ on EI-MTD.
FIG. 9 shows the EI-MTD accuracy for different regularization coefficients λ.
FIG. 10 shows the values of differential immunity γ for different regularization coefficients λ.
FIG. 11 shows the effect of differential immunity γ on EI-MTD at temperature T = 10.
FIG. 12 shows heat maps of EI-MTD differential immunity γ and accuracy for different combinations of temperature T and regularization coefficient λ. The left column shows differential immunity γ and the right column shows classification accuracy. Panels (a), (b), (c) and (d) correspond to different methods of generating adversarial examples: FGSM, PGD, MI-FGSM and M-DI2-FGSM.
Detailed Description
The invention is further illustrated by the following specific examples:
1. Preliminaries:
1.1 Deep neural networks and adversarial examples
A deep neural network (DNN) can be represented by a mapping function $F(X, \theta): \mathbb{R}^d \to \mathbb{R}^L$, where $X \in \mathbb{R}^d$ is the input sample variable, $\theta$ denotes the parameters of the DNN, and $L$ is the number of classes the DNN predicts. The DNNs used here have a Softmax output layer, where the Softmax function is defined as:
$$\mathrm{Softmax}(z)_i = \frac{\exp(z_i)}{\sum_{j=1}^{L} \exp(z_j)}, \quad i = 1, \dots, L$$
A DNN can then be written as $F(x) = \mathrm{Softmax}(z)$, where $z$ is the output vector of the last hidden layer of the DNN. Given an input sample $x \in X$, the predicted label of the DNN is $y' = \arg\max_{i \in \{1, \dots, L\}} F(x)_i$, and the probability value $F(x)_{y'}$ is called the confidence score of the prediction. The goal of training a DNN is to make the difference between its predicted label $y'$ and the true label $y$ as small as possible. The loss of an input-label pair $(x, y)$ is denoted $J(x, y, \theta)$; the objective function used for training here is the cross-entropy loss, defined as $J(x, y, \theta) = -1_y \cdot \log(\mathrm{Softmax}(z(x, \theta)))$, where $1_y$ is the one-hot encoding of the true label and the logarithm of a vector is taken element-wise.
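The definitions above can be checked numerically with a short sketch. The logits are made-up values standing in for the output $z$ of a network; no actual DNN is involved.

```python
import math


def softmax(z):
    """Softmax of a list of logits, shifted by the maximum for numerical stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(z, label):
    """Cross-entropy loss for a one-hot true label: minus the log-probability of that class."""
    return -math.log(softmax(z)[label])


logits = [2.0, 0.5, -1.0]                                    # illustrative logits for L = 3 classes
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)    # y' = argmax_i F(x)_i
confidence = probs[predicted]                                # confidence score F(x)_{y'}
loss = cross_entropy(logits, label=0)
```

When the true label equals the predicted label, the loss is simply the negative logarithm of the confidence score, so higher confidence in the correct class means lower loss.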
An adversarial example is obtained by adding to the input sample $x$ a perturbation $r$ that cannot be perceived by the human eye, so that a model with good generalization ability misclassifies it. Specifically, $x_{adv} = x + r$ subject to $\lVert r \rVert_p \le \epsilon$, such that the DNN prediction satisfies $\arg\max_i F(x_{adv})_i \neq y$ or $\arg\max_i F(x_{adv})_i = t$, where $t$ is a class specified by the attacker. Here the $l_\infty$ norm is used to measure the size of the perturbation $r$, i.e. $\lVert r \rVert_\infty \le \epsilon$.
1.2 Gradient-based attacks
When the model information of the DNN is known, a white-box attack can construct an adversarial example under the constraint $\lVert x_{adv} - x \rVert_p \le \epsilon$ by solving the optimization problem
$$\max_{x_{adv}} J(x_{adv}, y; \theta) \quad \text{s.t.} \quad \lVert x_{adv} - x \rVert_p \le \epsilon$$
This section introduces attack methods that generate adversarial examples by gradient-based optimization.
FGSM (fast gradient sign method): the first method for generating adversarial examples from model gradient information, proposed by Goodfellow et al. It obtains an adversarial example $x_{adv}$ by maximizing the loss function $J(x, y, \theta)$, seeking the perturbation $r$ in the direction in which the loss changes fastest with respect to $x$: $x_{adv} = x + \epsilon \cdot \mathrm{sign}(\nabla_x J(x, y; \theta))$, where $\mathrm{sign}(\cdot)$ is the sign function, $\nabla_x J(x, y; \theta)$ is the gradient of the loss function with respect to the input $x$, and the perturbation $r = \epsilon \cdot \mathrm{sign}(\nabla_x J(x, y; \theta))$ satisfies $\lVert r \rVert_\infty \le \epsilon$.
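A minimal FGSM sketch is given below. To keep it self-contained it attacks a binary logistic regression model, for which the input gradient of the cross-entropy loss has the closed form $(p - y) \cdot w$, instead of a deep network; the weights, input and $\epsilon$ are illustrative assumptions.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def loss_and_input_grad(x, y, w, b):
    """Cross-entropy loss of logistic regression and its gradient with respect to the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad = [(p - y) * wi for wi in w]   # d loss / d x for this model
    return loss, grad


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, eps):
    """One-step attack: move every input coordinate by eps in the sign of the loss gradient."""
    _, grad = loss_and_input_grad(x, y, w, b)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


w, b = [1.5, -2.0, 0.5], 0.1       # illustrative model parameters
x, y = [0.2, -0.4, 0.3], 1         # illustrative sample with true label 1
x_adv = fgsm(x, y, w, b, eps=0.1)

clean_loss, _ = loss_and_input_grad(x, y, w, b)
adv_loss, _ = loss_and_input_grad(x_adv, y, w, b)
```

The perturbed input stays within an $l_\infty$ ball of radius $\epsilon$ around $x$, and for this model the loss on `x_adv` is higher than on `x`.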
PGD (projected gradient descent): Madry et al. extended FGSM to an iterative method for finding adversarial perturbations, i.e.
$$x_{adv}^{t+1} = \mathrm{clip}_{x, \epsilon}\big(x_{adv}^{t} + \alpha \cdot \mathrm{sign}(\nabla_x J(x_{adv}^{t}, y; \theta))\big)$$
where $t$ is the iteration index, the step size is $\alpha = \epsilon / T$, $T$ is the total number of iterations, and $\mathrm{clip}_{x, \epsilon}(\cdot)$ clips the perturbation to within the constraint $\epsilon$.
MI-FGSM: Dong et al. replaced the iterative step of PGD with a momentum iteration to stabilize the update direction and avoid getting stuck in poor local maxima. The gradient-based momentum iteration is expressed as:
$$x_{adv}^{t+1} = \mathrm{clip}_{x, \epsilon}\big(x_{adv}^{t} + \alpha \cdot \mathrm{sign}(g_{t+1})\big)$$
where
$$g_{t+1} = \mu \cdot g_t + \frac{\nabla_x J(x_{adv}^{t}, y; \theta)}{\lVert \nabla_x J(x_{adv}^{t}, y; \theta) \rVert_1}$$
and $\mu$ is the decay factor of the momentum term.
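The two iterative attacks can be sketched with one loop, since setting the momentum decay factor to zero reduces the momentum update to the plain iterative sign step. As an assumption for self-containment, the sketch again uses a binary logistic regression model with a closed-form input gradient, starts from the clean sample without a random start, and uses illustrative parameters.

```python
import math


def logistic_loss(x, y, w, b):
    """Cross-entropy loss of a logistic regression model on one sample."""
    p = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def input_grad(x, y, w, b):
    """Gradient of the logistic loss with respect to the input x: (p - y) * w."""
    p = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def momentum_iterative_attack(x, y, w, b, eps, steps, mu):
    """Iterative sign attack with momentum; mu = 0 gives the plain iterative update."""
    alpha = eps / steps                # step size alpha = eps / T
    x_adv = list(x)
    g = [0.0] * len(x)
    for _ in range(steps):
        grad = input_grad(x_adv, y, w, b)
        l1 = sum(abs(v) for v in grad) or 1.0           # avoid division by zero
        g = [mu * gi + gr / l1 for gi, gr in zip(g, grad)]
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        # Clip back into the l_inf ball of radius eps around the clean sample.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


w, b = [1.5, -2.0, 0.5], 0.1
x, y = [0.2, -0.4, 0.3], 1
x_pgd = momentum_iterative_attack(x, y, w, b, eps=0.1, steps=10, mu=0.0)
x_mi = momentum_iterative_attack(x, y, w, b, eps=0.1, steps=10, mu=1.0)
```

On this linear model the gradient sign never changes, so both variants end at the same point; the benefit of momentum appears only on non-linear models whose gradient direction fluctuates between steps.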
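M-DI2-FGSM: to improve the black-box success rate of multi-step iterative attacks, Xie et al. proposed applying a random input transformation to the sample at each iteration step, specifically:
$$g_{t+1} = \mu \cdot g_t + \frac{\nabla_x J(\mathcal{T}(x_{adv}^{t}; p), y; \theta)}{\lVert \nabla_x J(\mathcal{T}(x_{adv}^{t}; p), y; \theta) \rVert_1}$$
where $p$ denotes the transformation probability and the random transformation function is
$$\mathcal{T}(x_{adv}^{t}; p) = \begin{cases} \mathcal{T}(x_{adv}^{t}) & \text{with probability } p \\ x_{adv}^{t} & \text{with probability } 1 - p \end{cases}$$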
1.3 Adversarial training
Adversarial training is a method of training DNNs that improves their robustness. It was first proposed by Goodfellow et al., who attributed the ability of adversarial examples to fool DNNs to a lack of such examples in the training data; to defend against adversarial examples they proposed generating a large number of them with FGSM and retraining the DNN with these examples and their correct labels as part of the training data. Madry et al. formulated adversarial training as the following robust optimization problem:
$$\min_{\theta} \mathbb{E}_{(x, y) \sim \mathcal{D}} \Big[ \max_{\lVert r \rVert_\infty \le \epsilon} J(x + r, y; \theta) \Big]$$
They proposed solving the inner maximization problem with PGD; the generated adversarial examples are then used for training to solve the outer minimization problem. However, this form of adversarial training has a gradient computation complexity of O(MN) in a single batch, where M is the data volume and N is the number of PGD iterations, which is N times the O(M) cost of standard training.
1.4 Knowledge distillation
Knowledge distillation was first proposed by Hinton et al., who observed that the prediction vector of a model contains structured information about the relations between classes, which can be used to remove part of the redundancy of a neural network and thus compress the network structure. Specifically, for a trained teacher model $F_t(\theta)$ whose logits-layer output is $Z = (Z_1(x), \dots, Z_L(x))$, the softmax function is redefined as:
$$\tilde{y}_i = \frac{\exp(Z_i(x) / T)}{\sum_{j=1}^{L} \exp(Z_j(x) / T)}$$
where $T$ is the temperature parameter, $\tilde{y} = (\tilde{y}_1, \dots, \tilde{y}_L)$ is called the soft label, and the original label $y$ of sample $x$ is called the hard label. Training with both soft and hard labels can train student models better than training with hard labels only. The student model is trained to minimize the knowledge distillation loss:
$$J_{KD} = \beta T^2 J(x, \tilde{y}; \theta_s) + (1 - \beta) J(x, y; \theta_s)$$
where $\tilde{y}$ is the soft label generated by the teacher model and $\beta$ is the weight that balances the hard-label and soft-label losses during training of the student model.
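The soft label and the distillation loss can be illustrated with a short sketch. The logits are made-up values, and the convention that β weights the soft-label term (so that β = 1 uses soft labels only) follows the statement elsewhere in the description; the function names are hypothetical.

```python
import math


def softmax_t(logits, temperature=1.0):
    """Softmax with temperature; higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs))


def distillation_loss(student_logits, teacher_logits, hard_label, temperature, beta):
    """beta * T^2 * CE(soft label, student at T) + (1 - beta) * CE(hard label, student)."""
    soft_label = softmax_t(teacher_logits, temperature)
    student_soft = softmax_t(student_logits, temperature)
    student_hard = softmax_t(student_logits, 1.0)
    one_hot = [1.0 if i == hard_label else 0.0 for i in range(len(student_logits))]
    soft_term = temperature ** 2 * cross_entropy(soft_label, student_soft)
    hard_term = cross_entropy(one_hot, student_hard)
    return beta * soft_term + (1.0 - beta) * hard_term


teacher_logits = [5.0, 2.0, -1.0]      # illustrative teacher output
student_logits = [3.0, 2.5, 0.0]       # illustrative student output
soft_t1 = softmax_t(teacher_logits, 1.0)
soft_t10 = softmax_t(teacher_logits, 10.0)
loss_soft_only = distillation_loss(student_logits, teacher_logits, 0, temperature=10.0, beta=1.0)
loss_hard_only = distillation_loss(student_logits, teacher_logits, 0, temperature=10.0, beta=0.0)
```

At T = 10 the teacher's distribution is much flatter than at T = 1, which exposes the relative similarity between the non-target classes that the student is meant to learn.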
1.5 Bayes-Stackelberg game
The Stackelberg game is a non-cooperative, sequential decision game whose participants (players) are a leader L, who acts first, and a follower F, who acts afterwards. We represent the Stackelberg game by a six-tuple $G = (L, F, S_L, S_F, R_L, R_F)$, where $S_L$ is the action space of the leader, $S_F$ is the action space of the follower, $R_L$ is the payoff function of the leader, and $R_F$ is the payoff function of the follower. A payoff function is defined on combinations of actions, $R_i: [S_L] \times [S_F] \to \mathbb{R}$, where $i \in \{L, F\}$ and $[S_i]$ denotes the index set of an action space. A pure strategy selects exactly one action, while a mixed strategy selects each action with a probability 0 ≤ p < 1. In the Stackelberg game the leader acts first with a mixed strategy s, and the follower F, knowing the leader's strategy, optimizes its own payoff and responds with a pure strategy q. The problem reduces to solving the leader's mixed integer quadratic programming (MIQP) problem:
$$\max_{s, q, v} \sum_{n \in [S_L]} \sum_{m \in [S_F]} R_L(n, m)\, s_n\, q_m$$
$$\text{s.t.} \quad \sum_{n \in [S_L]} s_n = 1$$
$$\sum_{m \in [S_F]} q_m = 1$$
$$0 \le v - \sum_{n \in [S_L]} R_F(n, m)\, s_n \le (1 - q_m)\, N, \quad s_n \in [0, 1],\ q_m \in \{0, 1\},\ v \in \mathbb{R}$$
Here N is a large positive number, the optimal objective value is the leader's optimal payoff, s is the leader's optimal mixed strategy, q is the follower's optimal response strategy, and v is the follower's payoff.
In information security, the leader is usually taken to be the defender and the follower the attacker. Because there may be several types of attacker, the Stackelberg game is extended to the case of multiple follower types; this extension is called the Bayes-Stackelberg game, denoted
Figure BDA0002682680440000095
where $c \in \{1, \dots, C\}$, i.e. there are $C$ follower types. Each follower type $F^{(c)}$ has its own strategy set
Figure BDA0002682680440000096
and payoff function
Figure BDA0002682680440000097
and $p^{(c)}$ denotes the probability that follower type $F^{(c)}$ occurs. In this game the leader does not know which follower type $F^{(c)}$ it faces, but it knows the probability distribution $p^{(c)}$ over the types, so this is a Stackelberg game of incomplete information. The leader's MIQP for the Bayes-Stackelberg game is:
Figure BDA0002682680440000098
Figure BDA0002682680440000099
Figure BDA00026826804400000910
Figure BDA00026826804400000911
Solving this problem yields the leader's optimal payoff as the objective value; $s$ is the leader's optimal mixed strategy, $q^{(c)}$ is the optimal response strategy of follower type $F^{(c)}$, and $v^{(c)}$ is that follower's payoff.
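For orientation, the leader's problem can be written out in the usual big-$N$ form of a Bayesian Stackelberg MIQP. This is a sketch based on the symbols defined in the text, not a transcription of the patent's figures; the index notation $R^{L,(c)}_{ij}$ and $R^{F,(c)}_{ij}$ for the payoffs when the leader plays action $i$ and follower type $c$ plays action $j$ is an assumption.

```latex
\begin{aligned}
\max_{s,\,q,\,v}\quad
  & \sum_{c=1}^{C} \sum_{i \in [S_L]} \sum_{j \in [S_F^{(c)}]}
    p^{(c)}\, R^{L,(c)}_{ij}\, s_i\, q^{(c)}_j \\
\text{s.t.}\quad
  & \sum_{i \in [S_L]} s_i = 1, \\
  & \sum_{j \in [S_F^{(c)}]} q^{(c)}_j = 1 \qquad \forall c, \\
  & 0 \le v^{(c)} - \sum_{i \in [S_L]} R^{F,(c)}_{ij}\, s_i
      \le \bigl(1 - q^{(c)}_j\bigr) N \qquad \forall c,\ \forall j, \\
  & 0 \le s_i \le 1, \qquad q^{(c)}_j \in \{0, 1\}, \qquad v^{(c)} \in \mathbb{R}.
\end{aligned}
```

The third constraint forces $v^{(c)}$ to equal the payoff of the action with $q^{(c)}_j = 1$ and to be at least the payoff of every other action, so each follower type plays a best response. With $C = 1$ this reduces to the single-follower problem.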
2. Defense method
The invention provides an edge intelligent moving target defense framework built on three key technologies: adversarial training, differential knowledge distillation, and dynamic model scheduling. First, adversarial training is performed in the cloud data center to obtain a robust teacher model. Second, robust knowledge is transferred from the teacher model to small, resource-constrained student models; unlike Hinton's knowledge distillation, a diversity regularization term is added to increase the diversity among the student models and thereby reduce the transferability of adversarial examples. These student models, also called member models, are then dynamically scheduled. Thanks to this diversity, dynamic scheduling makes it harder for an attacker to find the optimal surrogate model, as shown on the right side of Fig. 1.
Adversarial training of the teacher model: suppose we have a training data set
Figure BDA00026826804400001019
and a teacher model. Prior work has shown that higher-capacity networks become more robust under adversarial training. We therefore choose a 101-layer network, ResNet-101, as the teacher model. Adversarial training is then performed in the cloud data center, combined with the "FAST" adversarial training method to speed up the process.
Differential knowledge distillation of the student models: the soft labels of the training set are first obtained from the teacher model at a suitable distillation temperature, and a new training data set is created from them. The essence of knowledge distillation is to train the student models with the teacher model's soft labels. To obtain diverse student models, we define a new loss function with a regularization term and train all student models simultaneously to minimize this common loss function. Note that in the present invention the terms student model, member model, and target model refer to the same object; the name used depends on the context.
Dynamic service scheduling of the member models: after differential knowledge distillation, the student models are deployed to the edge nodes. Each edge node, whether an edge device or an edge server, hosts exactly one student model. One edge server is designated as the scheduling controller, and all student models, i.e. member models, are registered with it. When a user (possibly an attacker) submits an image through an edge device (e.g. a smartphone) to request a classification service, the edge device first uploads the service request to the scheduling controller rather than processing it on its local model. The scheduling controller then selects an edge node, or more precisely the model on it, to perform the classification. The attacker therefore cannot know which edge node will ultimately provide the service. The edge server determines the best target-model selection through the Bayes-Stackelberg game.
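The request flow above can be sketched as a small controller that keeps a registry of member models and draws one per request according to a mixed strategy. This is an illustrative sketch only: the class and method names are hypothetical, and pairing the probabilities reported later in the experiments with the six model names in this order is an assumption.

```python
import random
from collections import Counter
from typing import List, Sequence


class DispatchController:
    """Toy scheduling controller: member models register here, one is drawn per request."""

    def __init__(self, strategy: Sequence[float], seed: int = 0) -> None:
        self.strategy = list(strategy)   # mixed strategy s, one probability per model
        self.registry: List[str] = []    # registered member models, one per edge node
        self._rng = random.Random(seed)  # seeded only so that the demo is repeatable

    def register(self, model_name: str) -> None:
        self.registry.append(model_name)

    def select_model(self) -> str:
        """Draw the member model that will serve the next request."""
        if len(self.registry) != len(self.strategy):
            raise ValueError("strategy needs one probability per registered model")
        return self._rng.choices(self.registry, weights=self.strategy, k=1)[0]


# Probabilities and model names come from the experiments; their pairing is assumed.
controller = DispatchController([0.13, 0.15, 0.16, 0.12, 0.14, 0.30])
for name in ["MobileNetV2-1.0", "MobileNetV2-0.75", "ShuffleNetV2-0.5",
             "ShuffleNetV2-1.0", "SqueezeNet-1.0", "SqueezeNet-1.1"]:
    controller.register(name)

served = Counter(controller.select_model() for _ in range(1000))
print(served.most_common())
```

A fixed seed makes the draw predictable, which would defeat the purpose in a real deployment; an unpredictable source such as `random.SystemRandom` would be the more natural choice there.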
3.1 Diversity measure
As described above, the diversity of the models plays an important role in the effectiveness of dynamic scheduling, so how to measure diversity properly is an important question. Since adversarial attacks use the gradient with respect to the input as the perturbation direction, we use gradient alignment as the diversity measure.
Suppose there are two member models $F_s^{(1)}$ and $F_s^{(2)} \in \Omega$ and a surrogate model $F_a \in U$ selected by the attacker, and let $\nabla_x J_1$ and $\nabla_x J_2$ denote the gradients of the loss functions of $F_s^{(1)}$ and $F_s^{(2)}$ with respect to the sample $x$. If the angle between $\nabla_x J_1$ and $\nabla_x J_2$ is small enough, an adversarial example $x_{adv}$ that makes $F_s^{(1)}$ misclassify can also make $F_s^{(2)}$ misclassify; the difference between $F_s^{(1)}$ and $F_s^{(2)}$ is therefore related to the angle between $\nabla_x J_1$ and $\nabla_x J_2$. We use cosine similarity (CS) to express the degree of alignment of $\nabla_x J_1$ and $\nabla_x J_2$:
$$\mathrm{CS}(\nabla_x J_1, \nabla_x J_2) = \frac{\langle \nabla_x J_1, \nabla_x J_2 \rangle}{\|\nabla_x J_1\| \cdot \|\nabla_x J_2\|}$$
where $\langle \nabla_x J_1, \nabla_x J_2 \rangle$ is the inner product of $\nabla_x J_1$ and $\nabla_x J_2$. If $\mathrm{CS}(\nabla_x J_1, \nabla_x J_2) = -1$, the two gradients point in opposite directions, meaning that an $x_{adv}$ that makes $F_s^{(1)}$ misclassify cannot make $F_s^{(2)}$ misclassify.
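As a small illustration of the gradient-alignment measure, the sketch below computes the cosine similarity of two gradient vectors. The two-dimensional vectors are made-up stand-ins for the input gradients of two member models; in practice these would be obtained by backpropagating each model's loss to the input.

```python
import math
from typing import Sequence


def cosine_similarity(g1: Sequence[float], g2: Sequence[float]) -> float:
    """Cosine of the angle between two gradient vectors of equal length."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("cosine similarity is undefined for a zero gradient")
    return dot / (norm1 * norm2)


# Hypothetical input gradients of two member models on the same sample.
aligned = cosine_similarity([1.0, 0.0], [2.0, 0.0])      # same direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 5.0])   # unrelated directions
opposite = cosine_similarity([1.0, 0.0], [-3.0, 0.0])    # opposite directions
print(aligned, orthogonal, opposite)
```

Values near 1 suggest that a perturbation crafted against one model is likely to transfer to the other, while values near -1 suggest that it is unlikely to.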
3.2 Differential knowledge distillation
This section applies cosine similarity to the training of the member models in order to obtain member models with greater diversity. Since cosine similarity is computed between two gradients and our EI-MTD includes K models, the maximum of the pairwise cosine similarities is defined as the EI-MTD diversity measure:
Figure BDA0002682680440000111
where $J_a$ and $J_b$ denote the loss functions of member models $F_s^{(a)}$ and $F_s^{(b)}$, $\theta^{(a)}$ and $\theta^{(b)}$ denote the parameters of member models $F_s^{(a)}$ and $F_s^{(b)}$, and
Figure BDA0002682680440000116
is the soft label of $x$ obtained from the teacher model. Because $\mathrm{CS}_{coherence}$ is a non-smooth function, first-order optimization methods such as gradient descent cannot be applied to it directly, so the LogSumExp function is used here to approximate $\mathrm{CS}_{coherence}$ smoothly:
Figure BDA0002682680440000117
A smaller $\mathrm{CS}_{coherence}$ means greater diversity among the member models. Note that the member models are distilled from the teacher model in the cloud data center and that their diversity must be ensured at the same time. A regularization term is therefore added to the knowledge distillation process, and a new distillation loss function is defined as follows:
Figure BDA0002682680440000118
where $\lambda$ is the regularization coefficient, which controls the importance of $\mathrm{CS}_{coherence}$ during training. To let the student models fully learn the adversarial knowledge of the teacher model, $\beta$ is set to 1; that is, the student models are trained using only the soft labels. The differential knowledge distillation procedure is given as Algorithm 1 below.
Figure BDA0002682680440000119
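The diversity regularizer can be sketched numerically as follows. The hard maximum over pairwise cosine similarities follows the definition in the text; the smooth version uses the plain form log(sum(exp(.))) over all pairs, which is an assumption about the exact LogSumExp expression. The combined loss follows the form stated in claim 1, the sum of T^2 * J over the K student models divided by K plus lambda times the coherence term. All numeric inputs are hypothetical.

```python
import itertools
import math
from typing import List, Sequence


def cosine_similarity(g1: Sequence[float], g2: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(g1, g2))
    return dot / (math.sqrt(sum(a * a for a in g1)) * math.sqrt(sum(b * b for b in g2)))


def pairwise_cs(grads: List[Sequence[float]]) -> List[float]:
    """Cosine similarity for every unordered pair of member-model gradients."""
    return [cosine_similarity(ga, gb) for ga, gb in itertools.combinations(grads, 2)]


def cs_coherence(grads: List[Sequence[float]]) -> float:
    """Hard maximum over pairwise similarities (non-smooth)."""
    return max(pairwise_cs(grads))


def cs_coherence_lse(grads: List[Sequence[float]]) -> float:
    """Smooth LogSumExp surrogate for the hard maximum (assumed form)."""
    return math.log(sum(math.exp(c) for c in pairwise_cs(grads)))


def distillation_loss(soft_label_losses: Sequence[float],
                      grads: List[Sequence[float]],
                      T: float, lam: float) -> float:
    """Average of T^2 * J over the K students plus lam times the coherence term."""
    K = len(soft_label_losses)
    return sum(T ** 2 * J for J in soft_label_losses) / K + lam * cs_coherence_lse(grads)


# Hypothetical input gradients of K = 3 student models on one sample.
grads = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
hard = cs_coherence(grads)
soft = cs_coherence_lse(grads)
loss = distillation_loss([1.0, 1.0, 1.0], grads, T=2.0, lam=0.5)
print(hard, soft, loss)
```

The LogSumExp value always lies between the hard maximum and the hard maximum plus the logarithm of the number of pairs, so minimizing it also pushes the largest pairwise similarity down while remaining differentiable.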
3.3 Model scheduling policy
After the student models are obtained by differential knowledge distillation as described in Section 3.2, these member models are deployed to the edge nodes, as shown in Fig. 2. When an edge device receives an image, it does not classify it on its own model but forwards the image to the scheduling controller. The scheduling controller then selects one of the registered service models according to the scheduling policy. This section describes the scheduling policy in detail.
In an adversarial environment, both the defender and the attacker want to maximize their own payoff through some strategy, which is a typical game problem. In the present invention, a Bayes-Stackelberg game is used to model the scheduling strategy. The defender's strategy is to select a suitable member model to serve the classification request, and the attacker's strategy is to select an optimal surrogate model to generate adversarial examples. The invention represents the Bayes-Stackelberg game as the seven-tuple
Figure BDA00026826804400001214
where L is the defender and $S_L$ is the set of student models obtained by differential distillation,
Figure BDA0002682680440000121
The follower F has two types, the legitimate user $F^{(1)}$ and the attacker $F^{(2)}$. The action space of the legitimate user $F^{(1)}$,
Figure BDA0002682680440000122
contains only one action, namely requesting the service with a legitimate sample; the action space of the attacker $F^{(2)}$,
Figure BDA0002682680440000123
consists of selecting among different surrogate models
Figure BDA0002682680440000124
The defender's payoff against the legitimate user,
Figure BDA0002682680440000125
and the payoff of the legitimate user $F^{(1)}$,
Figure BDA0002682680440000126
are both defined as the classification accuracy of the member model on natural images; the defender's payoff against the attacker,
Figure BDA0002682680440000127
is the classification accuracy of the member model on adversarial examples; and the payoff of the attacker $F^{(2)}$,
Figure BDA0002682680440000128
is defined as the attack success rate of the adversarial examples. $P^{(1)}$ denotes the probability that the legitimate user $F^{(1)}$ occurs, and $P^{(2)}$ the probability that the attacker $F^{(2)}$ occurs. The model scheduling problem based on the Bayes-Stackelberg game is converted into the following mixed-integer quadratic program (MIQP):
Figure BDA0002682680440000129
Figure BDA00026826804400001210
Figure BDA00026826804400001211
Figure BDA00026826804400001212
0≤s≤1
Figure BDA00026826804400001213
v(c)∈R
where $P^{(1)} = 1 - \alpha$ and $P^{(2)} = \alpha$. Solving the problem yields the scheduling strategy $s = (p_1, p_2, \dots, p_K)$ of the member models, where $p_i$ is the probability that member model $F_s^{(i)}$ is selected. $q^{(c)}$ is the optimal response strategy of user type $F^{(c)}$, and $v^{(c)}$ is that user type's payoff.
The MIQP is NP-hard, and the invention solves it with DOBSS, the Decomposed Optimal Bayesian Stackelberg Solver. DOBSS has three key advantages over other solution methods. First, it allows the Bayes-Stackelberg game to be expressed compactly, without converting the game to normal form through the Harsanyi transformation. Second, it requires solving only a single mixed-integer linear program rather than a set of linear programs, which speeds up the solution. Finally, it directly finds the optimal leader strategy instead of a Nash equilibrium, so it can find a high-payoff Stackelberg equilibrium strategy that exploits the leader's first-mover advantage. Given the leader's optimal strategy $s$ obtained in this way, the edge server schedules the models deployed on the edge devices according to this strategy to serve each user request.
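To make the two-type game concrete, the sketch below evaluates a candidate scheduling strategy rather than solving for the optimal one; it is not DOBSS. The legitimate user has a single action, the attacker best-responds by choosing the surrogate with the highest expected attack success rate, and the defender's payoff is the mixture weighted by 1 - alpha and alpha. All accuracy values are hypothetical placeholders, and the attack success rate is taken as 100 minus the accuracy under attack.

```python
from typing import List, Sequence


def attacker_best_response(s: Sequence[float], adv_acc: List[List[float]]) -> int:
    """Surrogate index j that maximizes the attacker's expected success rate (%)."""
    n_surrogates = len(adv_acc[0])
    expected_asr = [
        sum(s[i] * (100.0 - adv_acc[i][j]) for i in range(len(s)))
        for j in range(n_surrogates)
    ]
    return max(range(n_surrogates), key=lambda j: expected_asr[j])


def defender_expected_payoff(s: Sequence[float], clean_acc: Sequence[float],
                             adv_acc: List[List[float]], alpha: float) -> float:
    """(1 - alpha) * accuracy for legitimate users + alpha * accuracy under attack."""
    j = attacker_best_response(s, adv_acc)
    payoff_vs_user = sum(p * a for p, a in zip(s, clean_acc))
    payoff_vs_attacker = sum(s[i] * adv_acc[i][j] for i in range(len(s)))
    return (1.0 - alpha) * payoff_vs_user + alpha * payoff_vs_attacker


# Hypothetical numbers: 2 member models (rows) and 2 surrogate models (columns).
clean_acc = [70.0, 60.0]                 # accuracy on natural images (%)
adv_acc = [[20.0, 50.0], [50.0, 20.0]]   # accuracy under attack (%)

pure = defender_expected_payoff([1.0, 0.0], clean_acc, adv_acc, alpha=0.5)
mixed = defender_expected_payoff([0.5, 0.5], clean_acc, adv_acc, alpha=0.5)
print(pure, mixed)
```

In this toy instance the mixed strategy gives up some clean accuracy but leaves the attacker without a clearly best surrogate, so the defender's overall expected payoff is higher than with the pure strategy.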
4. Experiments
4.1 Experimental setup
In the experimental verification of the invention, a GPU cluster, a PC, and Raspberry Pi boards are used to simulate the cloud data center, the edge server, and the edge devices, respectively.
Cloud computing center: in the embodiment of the invention, a Sugon X745-G30 server is used to simulate the cloud computing center. Its operating system is Ubuntu 16.04.6 LTS, it has four NVIDIA GeForce RTX 2080 Ti GPUs, and it uses Python 3.7.3 and PyTorch 1.2. Adversarial training of the teacher model and differential knowledge distillation of the student models are performed on this server.
Edge server: a HUAWEI MateBook 14 (2020) notebook is used to emulate the edge server. It runs 64-bit Windows 10 on an Intel Core i5-10210U CPU at 2.11 GHz with 16 GB of RAM. The DOBSS algorithm is solved using Python 3.6 and PuLP 2.1.
Edge devices: we use a set of six Raspberry Pi 3 Model B+ boards as edge devices. Each has a Broadcom BCM2837B0 processor (a 64-bit quad-core ARM Cortex-A53) and 1 GB of LPDDR2 SDRAM. In addition to the student models on these edge devices, we developed a test program that sends images to the edge server at arbitrary times to simulate image classification requests.
Teacher model: the teacher model is ResNet-101, which has 101 layers and 33 residual blocks. It was trained on the GPUs with 1.2 million clean images and their corresponding adversarial examples, which were generated by the FGSM method.
Student/member models: three mainstream lightweight architectures, MobileNetV2, ShuffleNetV2 and SqueezeNet, are adopted as student/member models. Six models are obtained from these three architectures with different hyperparameters: MobileNetV2-1.0, MobileNetV2-0.75, ShuffleNetV2-0.5, ShuffleNetV2-1.0, SqueezeNet-1.0, and SqueezeNet-1.1.
Surrogate models: to simulate the attacker's strategy, five surrogate models are selected: MobileNetV2-1.0, ShuffleNetV2-1.0, SqueezeNet-1.0, ResNet-18, and VGG-13. The first three have the same architectures as some of the member models, which simulates a white-box attack. Given a pre-trained surrogate model, we generate adversarial examples with FGSM, PGD, MI-FGSM and M-DI2-FGSM.
Data set: the experiments in this embodiment use the ILSVRC2012 data set, which contains 1000 classes, with 1.2 million images as the training set and 150,000 images as the test set. Each image is 224 × 224 in size with three color channels. It is currently a benchmark data set in the field of image classification.
4.2 Adversarial training and differential knowledge distillation
Accuracy of the teacher model: to ensure that knowledge transfers well from the teacher model to the student models, the teacher model itself must be sufficiently accurate and robust. We adversarially trained the teacher model $F_t$ with 1.2 million clean images and their corresponding adversarial examples. During training, we selected 10,000 images from the test set and used PGD to generate adversarial examples with which to test the teacher model $F_t$ at each training stage. For the PGD method, the perturbation size is $\epsilon = 5$, the step size is $\epsilon/5$, and the number of iterations is 20. Fig. 3 shows the effect of adversarial training on the teacher model $F_t$: its accuracy improves gradually as the number of training epochs increases. Initially, on clean examples, the teacher model has a top-1 accuracy of 11.83% and a top-5 accuracy of 15.31%. After 15 epochs of adversarial training, the top-1 accuracy rises to 64.03% and the top-5 accuracy to 82.8%. Similarly, on PGD adversarial examples the top-1 accuracy of $F_t$ increases from 3.37% to 52.35% and the top-5 accuracy from 13.55% to 73.71%. In summary, adversarial training gives the teacher model high accuracy and robustness, which in turn allows the student models to achieve good performance.
Accuracy of the student/member models: we evaluated two groups of models, obtained by normal training and by differential distillation respectively, each containing six models, as shown in Fig. 4. Fig. 4 gives the top-1 and top-5 accuracy of both groups on clean examples and on adversarial examples. For example, the normally trained ShuffleNetV2-1.0 model has a top-1 accuracy of 6.12% and a top-5 accuracy of 20.49% on adversarial examples. In contrast, the same model obtained by differential distillation achieves 39.15% top-1 accuracy and 67.43% top-5 accuracy, with only a slight decrease in accuracy on clean examples. These results indicate that student models distilled from the robust teacher model defend better against adversarial examples while having lower model capacity. The student models obtained by differential distillation are therefore suitable for the edge intelligence computing environment.
4.3 Payoff matrix for moving target defense
The payoff matrix of the game records the players' payoffs under each combination of strategies. Each element of the payoff matrix is a pair (a, b), where a is the classification accuracy under attack by adversarial examples and b is the attack success rate. We obtain a by testing the member models on the ILSVRC2012 test set, and we obtain b by generating adversarial examples from the test set on the surrogate models and testing the member models with them. For legitimate users, the payoff is the accuracy of the classifier. Table 2 shows the game matrix between the defender and legitimate users in the EI-MTD framework. Tables 3, 4, 5 and 6 give the payoff matrices between the defender and the attacker (PGD, FGSM, MI-FGSM and M-DI2-FGSM, respectively). For example, (56.73, 43.27) in Table 3 means that when the attacker generates adversarial examples with PGD on the surrogate model ResNet-18 and attacks the member model MobileNetV2-1.0, the defender's payoff is the 56.73% classification accuracy on the adversarial examples and the attacker's payoff is the 43.27% attack success rate.
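The relationship between the two entries of each payoff pair can be shown with a short sketch. It assumes, as in the table captions, that the attack success rate is the complement of the classification accuracy under attack; the helper names are hypothetical, and only the (56.73, 43.27) entry comes from the text.

```python
from typing import List, Tuple


def payoff_entry(accuracy_under_attack: float) -> Tuple[float, float]:
    """Return (defender payoff a, attacker payoff b) in percent, with b = 100 - a."""
    return accuracy_under_attack, round(100.0 - accuracy_under_attack, 2)


def payoff_matrix(acc: List[List[float]]) -> List[List[Tuple[float, float]]]:
    """Turn an accuracy matrix (member model x surrogate model) into payoff pairs."""
    return [[payoff_entry(a) for a in row] for row in acc]


# Entry quoted in the text: MobileNetV2-1.0 attacked via PGD on ResNet-18.
a, b = payoff_entry(56.73)
print(a, b)
```

The same helper applied row by row turns a table of measured accuracies into the matrix of pairs used by the game.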
Table 3: Game payoffs of the defender and the PGD attacker. The attacker's payoff is the attack success rate (%), and the defender's payoff is the classification accuracy under attack (1 − attack success rate).
Figure BDA0002682680440000141
Table 4: Game payoffs of the defender and the FGSM attacker. The attacker's payoff is the attack success rate (%), and the defender's payoff is the classification accuracy under attack (1 − attack success rate).
Figure BDA0002682680440000151
Table 5: Game payoffs of the defender and the MI-FGSM attacker. The attacker's payoff is the attack success rate (%), and the defender's payoff is the classification accuracy under attack (1 − attack success rate).
Figure BDA0002682680440000152
Table 6: Game payoffs of the defender and the M-DI2-FGSM attacker. The attacker's payoff is the attack success rate (%), and the defender's payoff is the classification accuracy under attack (1 − attack success rate).
Figure BDA0002682680440000153
4.4 EI-MTD effectiveness
Given the payoff matrices in Section 4.3, we can obtain the probability vector for selecting member models by solving the scheduling problem. Since the defender's optimal strategy depends on the probability α that an attacker occurs, we verified the effectiveness of EI-MTD for different values of α, in comparison with single member models without dynamic scheduling. These member models are themselves somewhat robust because they are distilled from a robust teacher model. The results are shown in Fig. 5, where (a), (b), (c) and (d) correspond to FGSM, PGD, MI-FGSM and M-DI2-FGSM, respectively. Next, we discuss the effectiveness of the EI-MTD defense as a function of the attacker probability α, using PGD (Fig. 4 (adversarial examples)) as an example:
(1) Assume that all users are legitimate, i.e. all requests are clean samples, so that α = 0. EI-MTD then selects MobileNetV2-1.0, the member model with the highest classification accuracy on clean samples. This is equivalent to EI-MTD playing the pure strategy MobileNetV2-1.0, with no model switching.
(2) Assume that all users are attackers, i.e. all requests are adversarial examples, so that α = 1. The optimal scheduling strategy solved by EI-MTD is s = (0.13, 0.15, 0.16, 0.12, 0.14, 0.3), and EI-MTD randomly selects a member model according to the probability vector s. Under this strategy, the expected classification accuracy of EI-MTD is 64.57% on normal samples and 40.86% on adversarial examples, whereas the accuracy of any single DNN on adversarial examples is below 32%. EI-MTD therefore provides a more effective defense.
(3) In practice, legitimate users and attackers occur with certain prior probabilities. Let the probability that an attacker occurs be α (0 < α < 1), so that the probability that a legitimate user occurs is 1 − α. In the experiment, α takes the values 0.1, 0.2, …, 0.9 to simulate the possible situations. Fig. 5 shows that as α increases, i.e. as the proportion of adversarial examples among the requests grows, the accuracy of all models tends to decrease, because the adversarial examples do more damage. The classification accuracy of EI-MTD nevertheless remains higher than that of any single member model.
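The trend in case (3) can be illustrated with simple arithmetic: the overall accuracy seen by the service is a mixture of the accuracy on clean requests and on adversarial requests, weighted by 1 - alpha and alpha. The sketch below uses the two expected accuracies reported in case (2) and holds the scheduling strategy fixed for every alpha; that is a simplifying assumption, since in the method itself the strategy is re-solved for each alpha.

```python
def blended_accuracy(clean_acc: float, adv_acc: float, alpha: float) -> float:
    """Overall accuracy (%) when a fraction alpha of requests are adversarial."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * clean_acc + alpha * adv_acc


# Expected accuracies of EI-MTD under the strategy reported for alpha = 1.
CLEAN_ACC, ADV_ACC = 64.57, 40.86

curve = [blended_accuracy(CLEAN_ACC, ADV_ACC, k / 10) for k in range(11)]
for k, acc in enumerate(curve):
    print(f"alpha={k / 10:.1f}  overall accuracy={acc:.2f}%")
```

Under this fixed-strategy assumption the overall accuracy falls linearly from the clean figure to the adversarial figure as alpha grows, which matches the downward trend described for Fig. 5.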
The same analysis for the FGSM, MI-FGSM and M-DI2-FGSM attacks shows that the classification accuracy of EI-MTD is again higher than that of any single member model. In particular, for M-DI2-FGSM, which has the strongest black-box attack capability, EI-MTD raises the accuracy from the 15.77% of SqueezeNet-1.0, the single member model with the weakest defense, to 41.09%. The EI-MTD method proposed here thus improves the robustness of the whole image classification system.
4.5 Transferability on EI-MTD
The transferability of adversarial examples can be measured by the transfer rate, i.e. the ratio of the number of adversarial examples that transfer to the total number of adversarial examples constructed on the original model. In essence, the transfer rate equals 100% minus the classification accuracy of the target model. Fig. 5 shows that the transfer rate of adversarial examples on EI-MTD is lower than on the individual member models. For example, in Fig. 5(d) the transfer rate on EI-MTD is (100% − 41.09%), while the transfer rate on the member model MobileNetV2-1.0 is (100% − 28.74%). The transfer rates on the other member models are likewise higher than on EI-MTD, indicating that EI-MTD reduces the transferability of adversarial examples.
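The two transfer rates quoted above follow directly from the stated relation, as this short sketch shows. The function name is hypothetical; the accuracy figures are the ones given in the text for Fig. 5(d).

```python
def transfer_rate(target_accuracy: float) -> float:
    """Transfer rate (%) as 100 minus the target model's accuracy under attack (%)."""
    if not 0.0 <= target_accuracy <= 100.0:
        raise ValueError("accuracy must be a percentage in [0, 100]")
    return 100.0 - target_accuracy


ei_mtd_rate = transfer_rate(41.09)       # EI-MTD under M-DI2-FGSM
mobilenet_rate = transfer_rate(28.74)    # MobileNetV2-1.0 alone
print(f"EI-MTD: {ei_mtd_rate:.2f}%  MobileNetV2-1.0: {mobilenet_rate:.2f}%")
```

The gap of roughly twelve percentage points between the two rates is the reduction in transferability attributed to dynamic scheduling in this example.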
4.6 Effect of T and λ on EI-MTD
To analyze in depth the effect of differential knowledge distillation on EI-MTD, we further study two important parameters, T and λ, which denote the distillation temperature and the regularization coefficient. Sailik et al. proposed differential immunity as a measure of MTD effectiveness: for an ideal MTD, a given attack should have different effects on different model configurations. They therefore defined differential immunity using the attack success rate:
Figure BDA0002682680440000161
where $F_a \in U$ denotes the surrogate model chosen by the attacker to generate adversarial examples, $F_s \in \Omega$ denotes the target model selected by the defender for the classification service, and $\mathrm{ASR}(F_a, F_s)$ denotes the attack success rate on the target model of adversarial examples generated from the surrogate model. A larger γ indicates better MTD performance. In this section we use differential immunity γ to study the effect of T and λ on EI-MTD.
Influence of T: for ease of analysis, λ is fixed at 0.3 and all requests are assumed to be adversarial examples. The relationship between the accuracy of EI-MTD and the distillation temperature T is shown in Fig. 6. As the distillation temperature T increases, the classification accuracy of EI-MTD increases correspondingly for adversarial examples generated by FGSM, PGD, MI-FGSM and M-DI2-FGSM.
The differential immunity γ can easily be computed from the classification accuracy of all member models. Fig. 7 shows the differential immunity γ as a function of the distillation temperature T. γ increases with T, which means that a higher distillation temperature increases the diversity of the member models. The reason is that higher temperatures bring the decision boundaries of the member models closer to that of the robust teacher model, which reduces $\max_{F_s} \mathrm{ASR}(F_a, F_s)$. However, the growth of γ levels off once T exceeds 12, indicating that the distillation temperature may no longer be the main factor affecting the differences between member models at that point. This result is a good demonstration of the effectiveness of EI-MTD.
Based on the above observations, we further analyze the correlation between the accuracy of EI-MTD and the differential immunity γ. Fig. 8 shows the experimentally measured classification accuracy of EI-MTD at different values of γ. The results show that increasing γ improves the performance of EI-MTD, which again confirms the idea described in Section 3.1 that the diversity of the member models determines the effectiveness of EI-MTD. For example, when γ is 0.15 the accuracy of EI-MTD is only 27.34%, whereas when γ is raised to 0.38 by increasing the temperature T to 20, the accuracy of EI-MTD reaches 47.86%. The effect of the distillation temperature T can therefore be explained as follows: (1) increasing T increases the differential immunity γ; (2) increasing γ improves the accuracy of EI-MTD; therefore (3) increasing T improves the effectiveness of EI-MTD.
Influence of λ: λ is the regularization coefficient that controls the importance of $\mathrm{CS}_{coherence}$ during training. To analyze the effect of λ on EI-MTD performance, we fix the distillation temperature at T = 10. As shown in Fig. 9, increasing λ improves the accuracy of EI-MTD considerably. These results alone, however, do not reveal the underlying relationship between the two. We therefore first show in Fig. 10 how the regularization coefficient λ affects the differential immunity γ. In particular, reducing λ to 0 means that all member models are the same, i.e. EI-MTD effectively has no dynamic scheduling, and the accuracy on adversarial examples generated by PGD is only 27.34%. In contrast, if the differential immunity is increased to 1, EI-MTD reaches an accuracy of 55.68%. In fact, increasing λ raises the weight given to member-model diversity in the differential distillation process, which correspondingly increases the differential immunity; a larger γ in turn improves the accuracy of EI-MTD. Fig. 11 shows that increasing γ increases the accuracy of EI-MTD at temperature T = 10. We therefore summarize the above analysis as follows: (1) a larger λ increases the diversity of the member models and thus the differential immunity γ; (2) a higher differential immunity γ yields higher accuracy; therefore (3) a larger λ helps improve the accuracy of EI-MTD.
Optimal combination of T and λ: although we analyzed the effects of T and λ separately, the effect of their combination on EI-MTD accuracy is not yet clear. Fig. 12 shows, as heat maps, the differential immunity γ and the accuracy of EI-MTD under different combinations of T and λ. It can be seen that T and λ do not cancel each other's effect, because increasing either T or λ increases the differential immunity γ. In the experiments of this embodiment, T = 18 and λ = 0.9 achieve the best performance, and larger values do not appear to bring further significant improvement.
The foregoing shows and describes the general principles and features of the present invention, together with its advantages. Those skilled in the art will understand that the invention is not limited to the embodiments described above; the embodiments and the description merely illustrate the principle of the invention, and various changes and modifications may be made without departing from its spirit and scope, all of which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.

Claims (4)

1. An edge intelligent moving target defense method based on a Bayes-Stackelberg game, characterized by comprising the following steps:
S1: adversarial training of the teacher model: the cloud data center holds a training data set
Figure FDA00026826804300000119
and a teacher model $F_t(\theta_t)$; a 101-layer ResNet-101 neural network is adopted as the teacher model, adversarial training is carried out in the cloud data center using FGSM adversarial examples, and the "FAST" adversarial training method is used in combination to accelerate the process;
S2: differential knowledge distillation of the student models: first, at a suitable distillation temperature T, the soft label
Figure FDA0002682680430000011
of each sample $x_i$ is obtained from the teacher model $F_t(\theta_t)$, and a new training data set
Figure FDA0002682680430000012
is created; a new loss function with the regularization term $\mathrm{CS}_{coherence}$ is defined as $L = \sum T^2 J / K + \lambda \cdot \mathrm{CS}_{coherence}$, and all student models
Figure FDA0002682680430000013
are trained simultaneously to minimize the common loss function L;
S3: dynamic service scheduling of the member models: after differential knowledge distillation, the student models are deployed to the edge nodes, one model per node, where the edge nodes comprise edge devices and edge servers; one edge server is designated as the service scheduling controller, and all member models and the nodes on which they reside are registered with the scheduling controller; when a user submits an image through an edge device to request a classification service, the edge device first uploads the service request to the scheduling controller, and the scheduling controller then selects an edge node through the Bayes-Stackelberg game to execute the classification task.
2. The edge intelligent moving target defense method based on the Bayes-Stackelberg game according to claim 1, characterized in that: gradient alignment is used as the diversity measure, as described in step S3;
given two member models $F_s^{(1)}$ and $F_s^{(2)} \in \Omega$ and a surrogate model $F_a \in U$ selected by the attacker, let $\nabla_x J_1$ and $\nabla_x J_2$ denote the gradients of the loss functions of $F_s^{(1)}$ and $F_s^{(2)}$ with respect to the sample $x$; if the angle between $\nabla_x J_1$ and $\nabla_x J_2$ is sufficiently small, an adversarial example $x_{adv}$ that makes $F_s^{(1)}$ misclassify can also make $F_s^{(2)}$ misclassify, so the difference between $F_s^{(1)}$ and $F_s^{(2)}$ is related to the angle between $\nabla_x J_1$ and $\nabla_x J_2$; cosine similarity (CS) is used to express the degree of alignment of $\nabla_x J_1$ and $\nabla_x J_2$:
$$\mathrm{CS}(\nabla_x J_1, \nabla_x J_2) = \frac{\langle \nabla_x J_1, \nabla_x J_2 \rangle}{\|\nabla_x J_1\| \cdot \|\nabla_x J_2\|}$$
where $\langle \nabla_x J_1, \nabla_x J_2 \rangle$ is the inner product of $\nabla_x J_1$ and $\nabla_x J_2$; if $\mathrm{CS}(\nabla_x J_1, \nabla_x J_2) = -1$, then $\nabla_x J_1$ and $\nabla_x J_2$ point in opposite directions, meaning that an $x_{adv}$ that makes $F_s^{(1)}$ misclassify cannot make $F_s^{(2)}$ misclassify.
3. The Bayes-Stackelberg game-based edge intelligent moving target defense method according to claim 1, characterized in that: in step S2, the cosine similarity is further applied to the training process of the student models to obtain a set of member models with greater diversity. Since the cosine similarity is computed between two gradients, it is generalized to K models by defining the maximum pairwise cosine similarity as the EI-MTD diversity measure:
$$CS_{coherence} = \max_{a, b \in \{1, \dots, K\},\ a \neq b} CS\big(\nabla_x J_a(x, \tilde{y}; \theta^{(a)}),\ \nabla_x J_b(x, \tilde{y}; \theta^{(b)})\big)$$
where $J_a$ and $J_b$ are the loss functions of the student models $F_s^{(a)}$ and $F_s^{(b)}$, $\theta^{(a)}$ and $\theta^{(b)}$ are the parameters of $F_s^{(a)}$ and $F_s^{(b)}$, and $\tilde{y}$ is the soft label of $x$ obtained from the teacher model. Because $CS_{coherence}$ is a non-smooth function, gradient-descent optimization cannot be applied to it directly, so the LogSumExp function is used to approximate $CS_{coherence}$:
$$CS_{coherence} \approx \log \sum_{1 \le a < b \le K} \exp\big(CS(\nabla_x J_a, \nabla_x J_b)\big)$$
The student models are distilled from the teacher model in the cloud data center, and diversity among the student models must be ensured during distillation. The regularization term $CS_{coherence}$ is therefore added to the knowledge distillation process, giving a new distillation loss function:
[Equation image FDA0002682680430000023: distillation loss with the regularization term $CS_{coherence}$ weighted by $\lambda$]
where $\lambda$ is the regularization coefficient that controls the importance of $CS_{coherence}$ during training. So that the student models fully learn the adversarial knowledge of the teacher model, $\beta$ is set to 1, i.e., the student models are trained using only the soft labels. The differential knowledge distillation algorithm (Algorithm 1) is as follows:
[Algorithm image FDA0002682680430000024: Algorithm 1, differential knowledge distillation]
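The relationship between the maximum pairwise cosine similarity and its LogSumExp approximation can be sketched as follows. This is an illustrative sketch, not the patent's Algorithm 1: the gradients are hypothetical flat lists, all function names are invented for the example, and the additive form `distill_loss + lam * coherence` is an assumption about how the regularization term enters the loss. In real training the gradients would come from an autograd framework so that the term stays differentiable.

```python
import math
from itertools import combinations

def cosine_similarity(g1, g2):
    """Cosine similarity between two non-zero gradient vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    return dot / (math.sqrt(sum(a * a for a in g1)) * math.sqrt(sum(b * b for b in g2)))

def pairwise_similarities(grads):
    """Cosine similarity for every unordered pair among the K gradients."""
    return [cosine_similarity(ga, gb) for ga, gb in combinations(grads, 2)]

def max_coherence(grads):
    """Non-smooth diversity measure: the largest pairwise cosine similarity."""
    return max(pairwise_similarities(grads))

def logsumexp_coherence(grads):
    """Smooth approximation: log(sum(exp(cs))) over all pairs, computed stably."""
    sims = pairwise_similarities(grads)
    m = max(sims)
    return m + math.log(sum(math.exp(s - m) for s in sims))

def regularized_distillation_loss(distill_loss, grads, lam):
    """Assumed additive form: distillation loss plus lam times the smooth coherence."""
    return distill_loss + lam * logsumexp_coherence(grads)

# Hypothetical input gradients of K = 3 student models at one sample.
grads = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
hard = max_coherence(grads)
soft = logsumexp_coherence(grads)
loss = regularized_distillation_loss(1.0, grads, lam=0.5)
```

The LogSumExp value always lies between the true maximum and the maximum plus the logarithm of the number of pairs, which is why minimizing it also pushes the largest pairwise similarity down.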
4. The Bayes-Stackelberg game-based edge intelligent moving target defense method according to claim 1, characterized in that: the student models obtained through differential knowledge distillation in step S3 serve as the member models and are deployed to the edge nodes. When an edge device receives an image, it does not classify the image on its own model but forwards the image to the scheduling controller, which selects a registered service model according to the scheduling policy, specifically:
The Bayes-Stackelberg game is represented as a seven-tuple $(L, F, S_L, S_F, R_L, R_F, P)$, where $L$ is the defender and $S_L = \{F_s^{(1)}, \dots, F_s^{(K)}\}$ is the set of student models obtained after differential distillation. The follower $F$ has two types: the legitimate user $F^{(1)}$ and the attacker $F^{(2)}$. The action space $S_F^{(1)}$ of the legitimate user $F^{(1)}$ contains only one action, namely requesting service with a legitimate sample; the action space $S_F^{(2)}$ of the attacker $F^{(2)}$ consists of selecting different surrogate models $F_a \in U$. The payoff $R_L^{(1)}$ of the defender $L$ and the payoff $R_F^{(1)}$ of the legitimate user $F^{(1)}$ are defined as the classification accuracy of the member model on natural images; the payoff $R_L^{(2)}$ of the defender $L$ is the classification accuracy of the member model on adversarial samples; the payoff $R_F^{(2)}$ of the attacker $F^{(2)}$ is defined as the success rate of the adversarial-sample attack. $P^{(1)}$ denotes the probability that the legitimate user $F^{(1)}$ appears and $P^{(2)}$ the probability that the attacker $F^{(2)}$ appears. The model scheduling problem based on the Bayes-Stackelberg game is converted into a mixed integer quadratic programming (MIQP) problem as follows:
$$\max_{s, q, v} \sum_{c \in \{1, 2\}} \sum_{n \in S_L} \sum_{u \in S_F^{(c)}} P^{(c)} R_L^{(c)}(n, u)\, s_n\, q_u^{(c)}$$
subject to
$$\sum_{n \in S_L} s_n = 1$$
$$\sum_{u \in S_F^{(c)}} q_u^{(c)} = 1$$
$$0 \le v^{(c)} - \sum_{n \in S_L} R_F^{(c)}(n, u)\, s_n \le \big(1 - q_u^{(c)}\big) M$$
$$0 \le s_n \le 1$$
$$q_u^{(c)} \in \{0, 1\}$$
$$v^{(c)} \in \mathbb{R}$$
where $P^{(1)} = 1 - \alpha$ and $P^{(2)} = \alpha$; $M$ is a large positive constant; $s = (p_1, p_2, \dots, p_K)$ is the scheduling strategy of the member models obtained by solving the problem, with $p_i$ the probability that member model $F_s^{(i)}$ is selected; $q^{(c)}$ is the optimal response strategy of user $F^{(c)}$, whose payoff is $v^{(c)}$. The problem is solved with the DOBSS algorithm.
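To make the scheduling game concrete, the sketch below finds a defender mixed strategy for a toy instance. It deliberately swaps in a brute-force grid search over the probability simplex rather than the DOBSS algorithm named in the claim, which is only practical for a handful of member models. All payoff numbers, function names, and the assumption that attack success equals one minus adversarial accuracy are hypothetical.

```python
import random
from itertools import product

def simplex_grid(k, steps):
    """Yield length-k probability vectors whose entries are multiples of 1/steps."""
    for combo in product(range(steps + 1), repeat=k - 1):
        if sum(combo) <= steps:
            yield [c / steps for c in combo] + [(steps - sum(combo)) / steps]

def solve_schedule(acc_clean, acc_adv, atk_success, alpha, steps=100):
    """Grid-search the defender's mixed strategy s (a stand-in for DOBSS).

    acc_clean[n]      : accuracy of member model n on natural images
    acc_adv[n][u]     : accuracy of model n on adversarial samples from surrogate u
    atk_success[n][u] : attacker payoff when model n serves and surrogate u is used
    alpha             : probability that a request comes from the attacker
    """
    k, m = len(acc_clean), len(acc_adv[0])
    best_s, best_value = None, float("-inf")
    for s in simplex_grid(k, steps):
        # The attacker observes s and best-responds with one surrogate choice.
        attacker_payoff = [sum(s[n] * atk_success[n][u] for n in range(k)) for u in range(m)]
        top = max(attacker_payoff)
        # Ties are broken in the defender's favour (strong Stackelberg convention).
        adv_value = max(
            sum(s[n] * acc_adv[n][u] for n in range(k))
            for u in range(m) if attacker_payoff[u] >= top - 1e-12
        )
        # The legitimate user has a single action, so no best response is needed.
        clean_value = sum(s[n] * acc_clean[n] for n in range(k))
        value = (1.0 - alpha) * clean_value + alpha * adv_value
        if value > best_value:
            best_s, best_value = s, value
    return best_s, best_value

def pick_model(s, rng):
    """Sample the index of the member model that serves one request."""
    return rng.choices(range(len(s)), weights=s, k=1)[0]

# Hypothetical payoffs: each model is weak against a different surrogate.
acc_clean = [0.9, 0.9]
acc_adv = [[0.2, 0.8], [0.8, 0.2]]
atk_success = [[1.0 - a for a in row] for row in acc_adv]
strategy, value = solve_schedule(acc_clean, acc_adv, atk_success, alpha=0.5)
chosen = pick_model(strategy, random.Random(0))
```

In this toy instance the search settles on serving each model with probability 0.5, since any skew lets the attacker pick the surrogate that targets the more frequently served model. The scheduling controller would then call something like `pick_model` once per incoming request.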
CN202010966915.3A 2020-09-15 2020-09-15 Edge intelligent mobile target defense method based on Bayes-Stackelberg game Active CN112115469B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010966915.3A CN112115469B (en) 2020-09-15 2020-09-15 Edge intelligent mobile target defense method based on Bayes-Stackelberg game

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010966915.3A CN112115469B (en) 2020-09-15 2020-09-15 Edge intelligent mobile target defense method based on Bayes-Stackelberg game

Publications (2)

Publication Number Publication Date
CN112115469A true CN112115469A (en) 2020-12-22
CN112115469B CN112115469B (en) 2024-03-01

Family

ID=73802745

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010966915.3A Active CN112115469B (en) 2020-09-15 2020-09-15 Edge intelligent mobile target defense method based on Bayes-Stackelberg game

Country Status (1)

Country Link
CN (1) CN112115469B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180268292A1 (en) * 2017-03-17 2018-09-20 Nec Laboratories America, Inc. Learning efficient object detection models with knowledge distillation
CN109218440A (en) * 2018-10-12 2019-01-15 上海拟态数据技术有限公司 A kind of mimicry web server isomery execution body dynamic dispatching method of displaying
CN110768971A (en) * 2019-10-16 2020-02-07 伍军 Confrontation sample rapid early warning method and system suitable for artificial intelligence system
CN111047054A (en) * 2019-12-13 2020-04-21 浙江科技学院 Two-stage countermeasure knowledge migration-based countermeasure sample defense method
CN111027060A (en) * 2019-12-17 2020-04-17 电子科技大学 Knowledge distillation-based neural network black box attack type defense method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王晋东; 余定坤; 张恒巍; 王娜: "Active defense strategy selection method based on static Bayesian game", Journal of Xidian University (西安电子科技大学学报), vol. 43, no. 1, 14 April 2015 (2015-04-14), pages 144-150 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860932A (en) * 2021-02-19 2021-05-28 电子科技大学 Image retrieval method, device, equipment and storage medium for resisting malicious sample attack
CN112860932B (en) * 2021-02-19 2022-08-12 电子科技大学 Image retrieval method, device, equipment and storage medium for resisting malicious sample attack
WO2022178652A1 (en) * 2021-02-23 2022-09-01 华为技术有限公司 Method for model distillation training and communication apparatus
CN113163006A (en) * 2021-04-16 2021-07-23 三峡大学 Task unloading method and system based on cloud-edge collaborative computing
CN113449865A (en) * 2021-08-30 2021-09-28 算筹(深圳)信息科技有限公司 Optimization method for enhancing training artificial intelligence model
CN114299313A (en) * 2021-12-24 2022-04-08 北京瑞莱智慧科技有限公司 Method and device for generating anti-disturbance and storage medium
CN114299313B (en) * 2021-12-24 2022-09-09 北京瑞莱智慧科技有限公司 Method and device for generating anti-disturbance and storage medium
CN114978654A (en) * 2022-05-12 2022-08-30 北京大学 End-to-end communication system attack defense method based on deep learning
CN114978654B (en) * 2022-05-12 2023-03-10 北京大学 End-to-end communication system attack defense method based on deep learning
CN115022067A (en) * 2022-06-17 2022-09-06 中国人民解放军国防科技大学 Network security defense method and device under game-based asymmetric information
CN115022067B (en) * 2022-06-17 2024-04-19 中国人民解放军国防科技大学 Network security defense method and device under game-based asymmetric information
CN115170919A (en) * 2022-06-29 2022-10-11 北京百度网讯科技有限公司 Image processing model training method, image processing device, image processing equipment and storage medium
CN115170919B (en) * 2022-06-29 2023-09-12 北京百度网讯科技有限公司 Image processing model training and image processing method, device, equipment and storage medium
CN117040809A (en) * 2023-07-20 2023-11-10 浙江大学 Method for generating defense strategy of industrial information physical system based on Bayesian random game
CN117040809B (en) * 2023-07-20 2024-04-05 浙江大学 Method for generating defense strategy of industrial information physical system based on Bayesian random game

Also Published As

Publication number Publication date
CN112115469B (en) 2024-03-01

Similar Documents

Publication Publication Date Title
CN112115469A (en) Edge intelligent moving target defense method based on Bayes-Stackelberg game
Kurakin et al. Adversarial attacks and defences competition
CN108764453B (en) Modeling method and action prediction system for multi-agent synchronous game
CN113505855B (en) Training method for challenge model
Chen et al. Backdoor attacks and defenses for deep neural networks in outsourced cloud environments
Liu et al. Adversaries or allies? Privacy and deep learning in big data era
Rashid et al. Adversarial training for deep learning-based cyberattack detection in IoT-based smart city applications
Yang et al. Intrusion detection: A model based on the improved vision transformer
Sheikh et al. Untargeted white-box adversarial attack to break into deep learning based COVID-19 monitoring face mask detection system
Kamran et al. Semi-supervised conditional GAN for simultaneous generation and detection of phishing URLs: A game theoretic perspective
CN113255526B (en) Momentum-based confrontation sample generation method and system for crowd counting model
Li et al. Defensive few-shot learning
Veerasamy et al. Rising above misinformation and deepfakes
CN113435264A (en) Face recognition attack resisting method and device based on black box substitution model searching
Wang et al. Latent coreset sampling based data-free continual learning
Lin et al. PSO-BPNN-based prediction of network security situation
Yin et al. Adversarial attack, defense, and applications with deep learning frameworks
Benegui et al. Adversarial attacks on deep learning systems for user identification based on motion sensors
Qiu et al. MT-MTD: Muti-training based moving target defense trojaning attack in edged-AI network
Fan et al. An intrusion detection framework for IoT using partial domain adaptation
Cao et al. Towards Black-box Attacks on Deep Learning Apps
Peng et al. Fedgm: Heterogeneous federated learning via generative learning and mutual distillation
Liu et al. 3D action recognition using multi-temporal skeleton visualization
CN115758337A (en) Back door real-time monitoring method based on timing diagram convolutional network, electronic equipment and medium
Cao et al. Cheating your apps: Black‐box adversarial attacks on deep learning apps

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant