CN114821227B - Deep neural network countermeasures sample scoring method - Google Patents
Deep neural network countermeasures sample scoring method
- Publication number
- CN114821227B CN114821227B CN202210378464.0A CN202210378464A CN114821227B CN 114821227 B CN114821227 B CN 114821227B CN 202210378464 A CN202210378464 A CN 202210378464A CN 114821227 B CN114821227 B CN 114821227B
- Authority
- CN
- China
- Prior art keywords
- sample
- challenge
- model
- calculating
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a deep neural network challenge sample scoring method and provides a novel way to evaluate challenge sample (adversarial example) attack effects in a black-box setting: the attack effects are evaluated and quantified with a fuzzy comprehensive evaluation method and an index named the Adversarial Examples Score (AES). The method comprises the steps of calculating the mobility, imperceptibility, attack success rate and label offset of a challenge sample, determining a membership subset table, determining the evaluation weight A of each aspect by the analytic hierarchy process, and applying fuzzy comprehensive evaluation to the matrix to obtain the score index of the challenge sample. The output of the AES index is a score that measures the effect of a challenge sample attack and can be used to evaluate the hazard that the challenge sample poses to a deep neural network.
Description
Technical Field
The invention relates to the field of deep neural networks, and in particular to a deep neural network challenge sample scoring method.
Background
Deep neural network technology has made major breakthroughs in solving complex tasks. However, deep neural networks (especially artificial neural networks and other data-driven artificial intelligence) are very vulnerable to challenge sample attacks during training or testing, and such attacks can easily subvert the original output of a machine learning system. For example, for an image classification deep neural network model, a challenge sample can be generated by adding a small disturbance to a given image; the human eye sees no difference from the original image, yet a deep neural network model known to perform well misclassifies it. As adversarial machine learning techniques become more advanced, more complicated and faster to update, deep neural networks exhibit strong vulnerability to such attacks. It is therefore necessary to evaluate the attack effect of challenge samples, the performance and defensive ability of the deep neural network model, and so on, in order to discover the potential safety hazards that challenge samples pose to the model, and to recommend a defense strategy for improving model safety according to the evaluation result of the challenge sample, thereby improving the safety of the deep neural network model.
Existing work evaluates the attack effect of a challenge sample on a target neural network in a white-box manner, according to whether the given neural network can correctly classify the challenge sample. This approach is unstable and highly random, and in many confidentiality scenarios the assessment becomes impractical because it is difficult for an evaluator to access the internal structure of the deep learning model.
Thus, a new method for evaluating the effect of challenge sample attacks is needed. Currently there is no systematic, intuitive index that reflects the effect of challenge samples on a deep neural network, nor a standard system for remotely evaluating the harmfulness of challenge samples in a black-box manner. The invention therefore provides a deep neural network challenge sample scoring method for evaluating and quantifying the attack effect of challenge samples.
Disclosure of Invention
In order to overcome the problems of the prior art, the invention provides a deep neural network challenge sample scoring method. The invention comprises a challenge sample mobility calculation module, a challenge sample imperceptibility calculation module, a challenge sample attack success rate calculation module, a challenge sample label offset calculation module and a challenge sample score calculation module. The challenge sample score calculation module calculates the final total destructive power score of the challenge sample to evaluate and quantify the vulnerability of the deep neural network and the harmfulness of the challenge sample.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows: a deep neural network challenge sample scoring method comprising the steps of:
step one, calculating the mobility, imperceptibility, attack success rate and label offset degree of the challenge sample, wherein the challenge sample is an image challenge sample and/or a text challenge sample.
And step two, determining a membership degree subset table.
And thirdly, determining the evaluation weight A of each aspect by using an analytic hierarchy process.
And step four, performing fuzzy comprehensive evaluation on the matrix to obtain the score index of the challenge sample.
The advantages and beneficial effects of the invention are as follows:
The invention proposes a challenge sample score, the AES (Adversarial Examples Score) index, to evaluate the effect of challenge samples on image and text deep learning networks. The advantages are as follows:
The AES index provides an evaluation score for the impact of challenge samples. In computer vision, application scenarios of image challenge samples include image classification, face recognition, image semantic segmentation, target detection, automatic driving and the like; in natural language processing, application scenarios of text challenge samples include text classification, machine translation, text summarization and the like. Because the AES index integrates different factors (such as the sample, the challenge sample generation algorithm and the deep neural network model) and is designed for the characteristics of both image and text samples, it can be used universally to evaluate the harmfulness of image-type and text-type challenge samples to a deep neural network, and it can also serve as a reference index for evaluating the quality of a certain type of sample for a target model and for measuring the vulnerability of the model.
First, the AES index may be used to quantify the quality of the challenge samples generated by different challenge sample generation algorithms and the effect of their attacks on a neural network. With the AES index, after obtaining the characteristics and attack effects of different challenge sample generation algorithms, a practitioner can select the most appropriate and most efficient challenge sample generation algorithm according to the actual conditions of the neural network model and the samples in the attack-and-defense scenario. For example, in application scenarios such as image classification, face recognition and machine translation, a practitioner can attack and test the neural network model more effectively with the help of the AES index, and can recommend a defense strategy for improving model safety according to the evaluation result of the challenge sample, thereby improving the safety of the deep neural network model.
Second, the AES index may be used as a reference for the quality of the selected training samples. For a target neural network, given the current training sample, if the model can correctly classify the original training sample but cannot correctly classify the challenge sample, this may indicate that the model needs to be further trained on more or better quality training samples to make the model sufficiently robust.
Finally, the AES index may be used to measure and evaluate the security and vulnerability of a model. Traditionally, deep learning researchers and practitioners have focused mainly on the performance of deep neural network models while ignoring security and vulnerability. In fields such as image recognition, target detection, automatic driving and text classification there are a large number of deep neural network models but no safety evaluation scheme for them; with the AES index, a model can be tested and its safety measured at the same time. This enables practitioners to determine the best deep neural network model to use, and even to improve a model when new vulnerability issues are found.
Drawings
FIG. 1 is a diagram illustrating generation of a deep learning model challenge sample in accordance with the present invention;
FIG. 2 is a flowchart of the challenge sample mobility calculation algorithm according to the present invention;
FIG. 3 is a flowchart of an algorithm for calculating the LO index according to the present invention;
fig. 4 is a flowchart of the challenge sample scoring AES calculation of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and specifically described below with reference to the drawings in the embodiments of the present invention. The described embodiments are only a few embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
referring to fig. 1-4, in an embodiment of the present invention, the system includes a challenge sample mobility calculation module, a challenge sample unaware calculation module, a challenge sample attack success rate calculation module, a challenge sample label offset calculation module, and a challenge sample score calculation module. The challenge sample score calculation module calculates the total destructive power score of the final challenge sample to evaluate and quantify the vulnerability of the deep neural network and the harmfulness of the challenge sample.
1. Calculating mobility
Mobility, as shown in FIG. 2, represents the ability of a challenge sample generated by one method to retain a certain attack effect under different deep learning models; it reflects the applicability of the challenge sample. Challenge samples possess some mobility mainly because deep learning classifiers are discriminant models. When a discriminant model is used to solve a classification problem, the goal is to separate the data as well as possible, so the model maximizes the distance between samples and the decision boundary and expands the region of each class. The advantage is that classification becomes easier; the disadvantage is that each region contains redundant space that does not truly belong to the class, and challenge samples exist in that space. Mobility is thus the ability of an adversarial perturbation computed on one model to migrate to another independently trained model: since any two models may learn similar non-robust features, a perturbation that manipulates such features can be applied to both. The calculation process of challenge sample mobility is as follows:
Step 1: M_1, M_2, ..., M_N is a group of neural network models used for evaluation. The challenge sample a_c is generated on the target neural network model M_1 by the challenge sample generation algorithm a to be evaluated. For example, M_1 is a BiLSTM model, M_2 is a FastText model, M_3 is a BERT model, and the challenge sample generation algorithm a is the WordHandling algorithm, which is used to generate the challenge sample a_c.
Step 2: Retrain the target neural network model M_1 and test it with the challenge sample a_c to obtain the recognition accuracy AR_1;
Step 3: Train the neural network model M_i (i = 2, 3, ..., N) and test it with the challenge sample a_c to obtain AR_i, repeating until i > N, where N is the number of test neural network models (N = 3 in this embodiment);
Step 4: Calculate the mobility Tf of the challenge sample from the recognition accuracies AR_1, AR_2, ..., AR_N.
2. Calculating imperceptibility
Referring to FIG. 1, the fine disturbance that a challenge sample adds to the original sample is difficult for the human eye to perceive, yet it can cause the deep learning model to misclassify with high confidence. If the disturbance could be perceived by a human once the challenge sample is generated, the attack could be avoided; imperceptibility therefore also represents the attack capability of the challenge sample, in the sense that a challenge sample that is hard to detect by human senses alone allows the attack to be camouflaged. An adversarial attack deliberately adds a small, imperceptible disturbance to the input sample so that the model produces a false output with high confidence. Imperceptibility is thus an important indicator of a challenge sample.
For image samples, considering that it is difficult to define a metric of human visual ability, the p-norm is most commonly used to measure the magnitude and number of disturbances added to an image. The p-norm $L_p$ computes the input-space distance $\|x - x'\|_p$ between the clean image $x$ and the generated challenge sample $x'$, where $p \in \{0, 1, 2, \infty\}$. The specific distance calculation formula is shown below (the p-norm is the Manhattan distance when p = 1 and the Euclidean distance when p = 2):

$$\|x - x'\|_p = \Big(\sum_{i} |x_i - x'_i|^p\Big)^{1/p}$$
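A minimal NumPy sketch of this p-norm distance between a clean image x and its challenge sample x', covering the p ∈ {0, 1, 2, ∞} cases listed above; the function name is illustrative.

```python
import numpy as np

def lp_distance(x, x_adv, p=2):
    """p-norm ||x - x'||_p between a clean image and its challenge sample."""
    diff = (np.asarray(x, dtype=np.float64) - np.asarray(x_adv, dtype=np.float64)).ravel()
    if p == 0:
        return float(np.count_nonzero(diff))           # number of changed pixels
    if p == np.inf:
        return float(np.max(np.abs(diff)))              # largest single change
    return float(np.sum(np.abs(diff) ** p) ** (1.0 / p))  # Manhattan (p=1), Euclidean (p=2)
```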
in the aspect of text samples, the invention adopts the score of language model confusion (perplexity) to evaluate the fluency of sentences, so as to judge the disturbance size and the semantic authenticity of the sentences. The basic idea of confusion is: the language model giving the sentence of the test set with the higher probability value is better, and after the language model is trained, the sentence of the test set is a normal sentence, so that the trained model is better when the probability on the test set is higher, and the calculation formula is as follows:
wherein w is i Representing word sequence w 1 ,w 2 ,…,w i-1 The i-th word in (a), N represents the total number of words, p (w i |w 1 ,w 2 ,…,w i-1 ) Representing the first i-1 words in a given sentence, the language model can predict the probability distribution that the ith word may appear, the greater the sentence probability, the better the language model, the less confusion, and the higher the imperceptibility of the text against the sample.
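A short sketch of the perplexity computation from the conditional word probabilities p(w_i | w_1, ..., w_{i-1}); how those probabilities are obtained from a concrete language model is outside the sketch, and the example numbers are illustrative only.

```python
import math

def perplexity(token_probs):
    """PP(W) = (prod_i p(w_i | w_1..w_{i-1}))^(-1/N), computed in log space for stability."""
    n = len(token_probs)
    log_sum = sum(math.log(p) for p in token_probs)  # log of the sentence probability
    return math.exp(-log_sum / n)

# A fluent sentence (higher word probabilities) gets lower perplexity,
# hence higher imperceptibility of the text challenge sample.
print(perplexity([0.2, 0.3, 0.25, 0.4]))     # ≈ 3.6
print(perplexity([0.02, 0.03, 0.01, 0.05]))  # ≈ 42.7
```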
3. Calculating attack success rate
The attack success rate is the percentage of samples that are misclassified by the target model after the attack; a model may output erroneous results once attacked, and if the attack effect is good, the classification accuracy of the attacked target model drops greatly. The attack success rate is therefore an important aspect of measuring the attack effect. For a directional (targeted) attack, the attack success rate is calculated as:

$$ASR = \frac{1}{N}\sum_{i=1}^{N} I\big(f(A(x_i)) = y_i^{*}\big)$$

where A is the challenge sample generation algorithm, f is the classification algorithm of the target model, $y_i^{*}$ is the target class of the directional attack, N is the number of samples, $x_i$ is the i-th original sample, and $A(x_i)$ is the challenge sample generated from $x_i$ under algorithm A. For a non-directional attack, it is only necessary to count the cases in which the classification result differs from the original label $y_i$:

$$ASR = \frac{1}{N}\sum_{i=1}^{N} I\big(f(A(x_i)) \neq y_i\big)$$
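Both success-rate formulas can be sketched as below; `generate` (the challenge sample generation algorithm A) and `classify` (the target model f) are placeholders for whatever implementations are being evaluated.

```python
import numpy as np

def attack_success_rate(classify, generate, samples, labels, target_labels=None):
    """Targeted: fraction with f(A(x_i)) == y_i*; untargeted: fraction with f(A(x_i)) != y_i."""
    adv = [generate(x) for x in samples]             # A(x_i)
    preds = np.asarray([classify(x) for x in adv])   # f(A(x_i))
    if target_labels is not None:                    # directional attack
        return float(np.mean(preds == np.asarray(target_labels)))
    return float(np.mean(preds != np.asarray(labels)))  # non-directional attack
```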
4. Calculating the label offset
The label offset refers to how far the model's classification of the challenge sample shifts away from the correct label; specifically, it is the difference between the probability of classifying the challenge sample as the correct label and the probability of classifying the original sample as correct, i.e., the confidence offset of the correct label. Recall that the output layer of a deep learning model determines the final class from the probabilities assigned to each class, so the predicted probability of each category provides a basis for how the model classifies the challenge sample. A robust deep learning model should assign the highest probability to the correct class. For a given original sample and its challenge sample, the model produces two different probability distributions over the categories, and the difference between the probability of predicting the challenge sample as the correct class and the probability of predicting the original sample as the correct class reflects how far the prediction has moved. For the original sample the probability of the correct class is necessarily the largest over the whole class space; for the challenge sample this probability may be reduced to varying degrees depending on the effectiveness of the attack, although if it remains the largest the final classification result is still correct. The more the predicted outcome of the challenge sample deviates from the correct category, the more destructive the challenge sample is. As shown in FIG. 3, the detailed procedure for calculating the label offset is as follows:
Step 1: Input the target neural network model M, the original sample set x_c and the challenge sample generation algorithm a.
Step 2: For the i-th original sample x_c^i (i = 1, 2, ..., n, where n is the number of samples), calculate the category ŷ_i predicted by the target neural network model M, the set of probabilities P_i that M predicts for each class of x_c^i, and the probability p_i with which M predicts the category result ŷ_i, where p_i = max(P_i). If ŷ_i is not the correct label of x_c^i, skip this sample, return to step 2 and calculate the next sample.
Step 3: Generate the challenge sample x_a^i of the original sample x_c^i according to the challenge sample generation algorithm a, then calculate the set of probabilities P_i' that the model M predicts for each class of the challenge sample x_a^i, and the probability p_i' with which M predicts x_a^i as the category result ŷ_i.
Step 4: Calculate the deviation degree ΔP_i = p_i - p_i' with which the challenge sample x_a^i moves away from the prediction category ŷ_i in model M; let i = i + 1 and repeat from step 2 until i > n.
Step 5: Calculate the label offset LO of the challenge sample from the deviation degrees ΔP_1, ..., ΔP_n.
The inputs of the algorithm are the target neural network model M, the original sample set x_c and the challenge sample generation algorithm a; the output is the label offset of the challenge samples.
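The following sketch follows the steps above; because the LO formula figure is not reproduced in this text, it assumes LO is the average of the deviation degrees ΔP_i over the samples that the model classifies correctly in the first place, and it assumes a `predict_proba`-style interface that returns a per-class probability vector.

```python
import numpy as np

def label_offset(model, originals, labels, generate):
    """Assumed LO: mean drop in the correct-class probability caused by the challenge samples."""
    deviations = []
    for x, y in zip(originals, labels):
        probs_clean = model.predict_proba([x])[0]        # P_i over classes for x_c^i
        if int(np.argmax(probs_clean)) != y:
            continue                                      # skip samples M already misclassifies
        x_adv = generate(x)                               # challenge sample from algorithm a
        probs_adv = model.predict_proba([x_adv])[0]       # P_i' over classes for the challenge sample
        deviations.append(probs_clean[y] - probs_adv[y])  # deviation degree ΔP_i
    return float(np.mean(deviations)) if deviations else 0.0
```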
5. Calculating challenge sample score (AES index)
As shown in FIG. 4, the AES index is calculated by a fuzzy comprehensive evaluation method and is intended to provide a measure of the ability of a given challenge sample to destroy the target deep learning model. The steps for calculating the AES index are as follows:
Step 1: Determine the membership degree subset tables of the mobility, imperceptibility, attack success rate and label offset of the challenge sample, and from them determine the membership degree matrix.
Step 2: Construct the pairwise comparison matrix and determine the weights of mobility, imperceptibility, attack success rate and label offset, together with the maximum eigenvalue of the comparison matrix.
Step 3: Perform the consistency test.
Step 4: Compute the evaluation result vector by the formula $B = A \circ R$, where A is the weight vector of the four indexes and R is the membership matrix, and then perform defuzzification to obtain the AES index.
The membership subset tables for the mobility, imperceptibility, attack success rate and label offset of the challenge sample are as follows:
table 1 mobility membership subset table
Table 2 text challenge sample imperceptible membership subset table
Table 3 image challenge sample imperceptible membership subset table
Table 4 attack success rate membership subset table
Table 5 tag offset membership subset table
A pairwise comparison matrix is created, the weight vector is calculated, and a consistency check is made. The weight of each index is calculated as follows:
Normalize each column of the judgment matrix B:

$$\bar{b}_{ij} = \frac{b_{ij}}{\sum_{k=1}^{n} b_{kj}}$$

Average the column-normalized values across each row:

$$W_i = \frac{1}{n}\sum_{j=1}^{n} \bar{b}_{ij}$$

Then $W = (W_1, W_2, \ldots, W_n)^{T}$ is the feature (weight) vector that is sought.
Calculate the maximum eigenvalue of the judgment matrix:

$$\lambda_{\max} = \frac{1}{n}\sum_{i=1}^{n} \frac{(BW)_i}{W_i}$$

where $W_i$ is the i-th element of the normalized feature vector and $(BW)_i$ is the i-th element of the vector $BW$.
The four indexes of the invention are weighted to obtain the following results:
TABLE 6 index weight calculation
According to the above formula, λ_max = 4.048 is calculated; the consistency index is therefore CI = 0.01598, and looking up the random consistency index value RI = 0.90 gives CR = 0.01796 < 0.1, meeting the consistency requirement.
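A sketch of the analytic hierarchy process computation described above (column normalization, row averaging, λ_max, CI and CR). The 4x4 pairwise comparison matrix shown is an illustrative placeholder, since Table 6 itself is not reproduced here; with four indexes the random consistency index RI = 0.90 matches the value used in the text.

```python
import numpy as np

RI_TABLE = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}  # standard random consistency indexes

def ahp_weights(B):
    """Weights W, max eigenvalue lambda_max, CI and CR for a pairwise comparison matrix B."""
    B = np.asarray(B, dtype=np.float64)
    n = B.shape[0]
    col_norm = B / B.sum(axis=0)        # normalize each column
    W = col_norm.mean(axis=1)           # average across each row -> weight vector
    BW = B @ W
    lam_max = float(np.mean(BW / W))    # lambda_max = (1/n) * sum_i (BW)_i / W_i
    CI = (lam_max - n) / (n - 1)
    CR = CI / RI_TABLE[n]
    return W, lam_max, CI, CR

# Illustrative 4x4 comparison matrix (not the one from Table 6):
B = [[1,   2,   1/2, 1],
     [1/2, 1,   1/3, 1/2],
     [2,   3,   1,   2],
     [1,   2,   1/2, 1]]
W, lam_max, CI, CR = ahp_weights(B)
print(W, lam_max, CI, CR < 0.1)  # CR < 0.1 means the consistency requirement is met
```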
Using the membership degree subset tables constructed by the invention for mobility, imperceptibility, attack success rate and label offset, the membership degree matrix of the indexes can be obtained as follows. The element r_ij of the matrix belongs to the membership vector of mobility when i = 1, of imperceptibility when i = 2, of attack success rate when i = 3, and of label offset when i = 4.
The weights of the four indexes, mobility, imperceptibility, attack success rate and label offset, are $A = (A_1, A_2, A_3, A_4)$. The fuzzy comprehensive evaluation formula is as follows:

$$B = A \circ R$$

where A is the weight vector of the four indexes of mobility, imperceptibility, attack success rate and label offset, R is the membership matrix obtained from the index calculation results, and B is the final evaluation result vector. Since this result is only a fuzzy vector, the hazard of the challenge sample cannot be seen intuitively, so the membership vector must be defuzzified to obtain the final AES index that scores the challenge sample.
Taking, for each grade of the evaluation set $v_j$, the judgment value $b_j$ and using their weighted average as the judgment result gives the defuzzified value $b^{*}$, as shown below, where m is the number of elements of the evaluation result vector B:

$$b^{*} = \frac{\sum_{j=1}^{m} b_j v_j}{\sum_{j=1}^{m} b_j}$$

If the judgment indexes $b_j$ are normalized, this reduces to:

$$b^{*} = \sum_{j=1}^{m} b_j v_j$$

With the evaluation set of the invention, $v_j = (1, 2, 3, 4)$, the final formula for calculating the AES index is:

$$AES = \frac{\sum_{j=1}^{4} b_j v_j}{\sum_{j=1}^{4} b_j}$$
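A sketch of the final scoring step: the weight vector A is composed with the membership matrix R and the resulting fuzzy vector B is defuzzified by the weighted average over the evaluation set v = (1, 2, 3, 4). Treating the composition as a plain matrix product, and the numbers used below, are assumptions for illustration; the membership subset tables themselves are not reproduced in this text.

```python
import numpy as np

def aes_score(A, R, v=(1, 2, 3, 4)):
    """Fuzzy comprehensive evaluation B = A . R followed by weighted-average defuzzification."""
    A = np.asarray(A, dtype=np.float64)
    R = np.asarray(R, dtype=np.float64)
    B = A @ R                                # evaluation result vector (one value per grade)
    v = np.asarray(v, dtype=np.float64)
    return float(np.sum(B * v) / np.sum(B))  # defuzzified AES index

# Illustrative numbers: A from the AHP step; rows of R are the membership vectors of
# mobility, imperceptibility, attack success rate and label offset over the four grades.
A = [0.22, 0.12, 0.44, 0.22]
R = [[0.1, 0.2, 0.4, 0.3],
     [0.3, 0.4, 0.2, 0.1],
     [0.0, 0.1, 0.3, 0.6],
     [0.2, 0.3, 0.3, 0.2]]
print(aes_score(A, R))  # a score between 1 and 4; higher means a more harmful challenge sample
```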
the above examples should be understood as illustrative only and not limiting the scope of the invention. Various changes and modifications to the present invention may be made by one skilled in the art after reading the teachings herein, and such equivalent changes and modifications are intended to fall within the scope of the invention as defined in the appended claims.
Claims (1)
1. A method for scoring a challenge sample by a deep neural network, comprising the steps of:
step one, calculating the mobility, imperceptibility, attack success rate and label offset of a challenge sample, wherein the challenge sample is an image challenge sample and/or a text challenge sample;
the step of calculating the mobility of the challenge sample comprises:
step 1: M_1, M_2, ..., M_N being a group of neural network models used for evaluation, generating the challenge sample a_c on the target neural network model M_1 by the challenge sample generation algorithm a to be evaluated;
step 2: retraining the target neural network model M_1 and testing it with the challenge sample a_c to obtain the recognition accuracy AR_1;
step 3: training the neural network model M_i, i = 2, 3, ..., N, and testing it with the challenge sample a_c to obtain AR_i, until i > N, where N is the number of test neural network models;
step 4: calculating the mobility Tf of the challenge sample from the recognition accuracies AR_1, AR_2, ..., AR_N;
the calculating of imperceptibility comprises calculating the imperceptibility of an image challenge sample and calculating the imperceptibility of a text challenge sample;
the imperceptibility of the image challenge sample is calculated as follows: the p-norm $L_p$ computes the input-space distance $\|x - x'\|_p$ between the clean image x and the generated image challenge sample x', where $p \in \{0, 1, 2, \infty\}$, the specific distance calculation formula being:

$$\|x - x'\|_p = \Big(\sum_{i} |x_i - x'_i|^p\Big)^{1/p}$$

the imperceptibility of the text challenge sample is calculated as follows: the language model perplexity score is used to judge the size of the disturbance and the authenticity of the semantics, a smaller perplexity meaning a higher imperceptibility of the text challenge sample, the perplexity PP(W) of the text challenge sample being calculated as:

$$PP(W) = \Big(\prod_{i=1}^{N} p(w_i \mid w_1, w_2, \ldots, w_{i-1})\Big)^{-\frac{1}{N}}$$

where $w_i$ is the i-th word of the word sequence $w_1, w_2, \ldots, w_N$, N is the total number of words, and $p(w_i \mid w_1, w_2, \ldots, w_{i-1})$ is the probability distribution that the language model predicts for the i-th word given the first i-1 words of the sentence; the greater the sentence probability, the better the language model and the lower the perplexity;
the calculating of the attack success rate comprises:
for a directional attack, calculating the attack success rate as:

$$ASR = \frac{1}{N}\sum_{i=1}^{N} I\big(f(A(x_i)) = y_i^{*}\big)$$

where A is the challenge sample generation algorithm, f is the classification algorithm of the target model, $y_i^{*}$ is the target class of the directional attack, N is the number of samples, $x_i$ is the i-th original sample, $A(x_i)$ is the challenge sample generated from $x_i$ under algorithm A, and $I(f(A(x_i)) = y_i^{*})$ indicates the cases in which the attack succeeds, reflecting the decrease in the recognition accuracy of the model;
for a non-directional attack, only the cases in which the classification result differs from the original label $y_i$ need to be counted:

$$ASR = \frac{1}{N}\sum_{i=1}^{N} I\big(f(A(x_i)) \neq y_i\big)$$

where N is the number of samples, $x_i$ is the i-th original sample, $A(x_i)$ is the challenge sample generated from $x_i$ under algorithm A, f is the classification algorithm of the target model, and $I(f(A(x_i)) \neq y_i)$ indicates the cases of $f(A(x_i)) \neq y_i$, reflecting the decrease in the recognition accuracy of the model;
the step of calculating the label offset specifically comprises:
step 1: inputting the target neural network model M, the original sample set x_c and the challenge sample generation algorithm a;
step 2: calculating, for the i-th original sample x_c^i, where i = 1, 2, ..., n and n is the number of samples, the category ŷ_i predicted by the target neural network model M, the set of probabilities P_i that M predicts for each class of x_c^i, and the probability p_i with which M predicts the category result ŷ_i, where p_i = max(P_i); if ŷ_i is not the correct label of x_c^i, skipping this sample, returning to step 2 and calculating the next sample;
step 3: generating the challenge sample x_a^i of the original sample x_c^i according to the challenge sample generation algorithm a, and calculating the set of probabilities P_i' that the model M predicts for each class of the challenge sample x_a^i, and the probability p_i' with which M predicts x_a^i as the category result ŷ_i;
step 4: calculating the deviation degree ΔP_i = p_i - p_i' with which the challenge sample x_a^i moves away from the prediction category ŷ_i in model M, and letting i = i + 1 until i > n;
step 5: calculating the label offset LO of the challenge sample from the deviation degrees ΔP_1, ..., ΔP_n;
step two, determining the membership degree subset tables, and obtaining from them the membership degree matrix R of the indexes, wherein the element r_ij of the matrix belongs to the membership vector of mobility when i = 1, of imperceptibility when i = 2, of attack success rate when i = 3, and of label offset when i = 4;
step three, determining the evaluation weight A of each aspect by the analytic hierarchy process, wherein the evaluation weight A comprises the weights of mobility, imperceptibility, attack success rate and label offset, $A = (A_1, A_2, A_3, A_4)$;
step four, obtaining the score index of the challenge sample by fuzzy comprehensive evaluation of the matrix, wherein the fuzzy comprehensive evaluation formula is:

$$B = A \circ R$$

where A is the weight vector of the four indexes of mobility, imperceptibility, attack success rate and label offset, R is the membership matrix obtained from the index calculation results, and B is the final evaluation result vector; since this result is only a fuzzy vector, the harmfulness of the challenge sample cannot be seen intuitively, so the membership vector is defuzzified to obtain the final AES index that scores the challenge sample, the final formula for calculating the AES index being:

$$AES = \frac{\sum_{j} b_j v_j}{\sum_{j} b_j}$$

where $b_j$ is the judgment value and $v_j$ is the evaluation set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210378464.0A CN114821227B (en) | 2022-04-12 | 2022-04-12 | Deep neural network countermeasures sample scoring method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210378464.0A CN114821227B (en) | 2022-04-12 | 2022-04-12 | Deep neural network countermeasures sample scoring method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114821227A (en) | 2022-07-29
CN114821227B (en) | 2024-03-22
Family
ID=82534421
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210378464.0A Active CN114821227B (en) | 2022-04-12 | 2022-04-12 | Deep neural network countermeasures sample scoring method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114821227B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111858343A (en) * | 2020-07-23 | 2020-10-30 | 深圳慕智科技有限公司 | Countermeasure sample generation method based on attack capability |
CN112465015A (en) * | 2020-11-26 | 2021-03-09 | 重庆邮电大学 | Adaptive gradient integration adversity attack method oriented to generalized nonnegative matrix factorization algorithm |
CN112464245A (en) * | 2020-11-26 | 2021-03-09 | 重庆邮电大学 | Generalized security evaluation method for deep learning image classification model |
CN112882382A (en) * | 2021-01-11 | 2021-06-01 | 大连理工大学 | Geometric method for evaluating robustness of classified deep neural network |
CN113947016A (en) * | 2021-09-28 | 2022-01-18 | 浙江大学 | Vulnerability assessment method for deep reinforcement learning model in power grid emergency control system |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220100867A1 (en) * | 2020-09-30 | 2022-03-31 | International Business Machines Corporation | Automated evaluation of machine learning models |
CN115438337A (en) * | 2022-08-23 | 2022-12-06 | 中国电子科技网络信息安全有限公司 | Method for evaluating safety of deep learning confrontation sample |
Non-Patent Citations (3)
Title |
---|
Adversarial Examples: Attacks and Defenses for Deep Learning; Xiaoyong Yuan et al.; Machine Learning; 2018-07-07; pp. 1-20 *
Analysis and Evaluation of the Harm Degree of Adversarial Examples; Ai Rui; Wanfang Data; 2023-07-06; pp. 1-58 *
Word-Level Adversarial Example Generation Method for Chinese Text Classification; Tong Xin; Wang Luona; Wang Runzheng; Wang Jingya; Netinfo Security; 2020-09-10 (No. 09); pp. 12-16 *
Also Published As
Publication number | Publication date |
---|---|
CN114821227A (en) | 2022-07-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111971698B (en) | Detection of back door using gradient in neural network | |
Zhong et al. | Backdoor embedding in convolutional neural network models via invisible perturbation | |
JP7059368B2 (en) | Protecting the cognitive system from gradient-based attacks through the use of deceptive gradients | |
CN110941794B (en) | Challenge attack defense method based on general inverse disturbance defense matrix | |
Nesti et al. | Detecting adversarial examples by input transformations, defense perturbations, and voting | |
CN115186816B (en) | Back door detection method based on decision shortcut search | |
Bountakas et al. | Defense strategies for adversarial machine learning: A survey | |
US20220374606A1 (en) | Systems and methods for utility-preserving deep reinforcement learning-based text anonymization | |
He et al. | Semi-leak: Membership inference attacks against semi-supervised learning | |
Xiao et al. | Latent imitator: Generating natural individual discriminatory instances for black-box fairness testing | |
CN113988293A (en) | Method for generating network by antagonism of different hierarchy function combination | |
Hu et al. | EAR: an enhanced adversarial regularization approach against membership inference attacks | |
Tuna et al. | Closeness and uncertainty aware adversarial examples detection in adversarial machine learning | |
CN118350436A (en) | Multimode invisible back door attack method, system and medium based on disturbance countermeasure | |
Bharath Kumar et al. | Analysis of the impact of white box adversarial attacks in resnet while classifying retinal fundus images | |
Moskal et al. | Translating intrusion alerts to cyberattack stages using pseudo-active transfer learning (PATRL) | |
CN112613032B (en) | Host intrusion detection method and device based on system call sequence | |
CN114821227B (en) | Deep neural network countermeasures sample scoring method | |
Stock et al. | Lessons learned: How (not) to defend against property inference attacks | |
CN113378985A (en) | Countermeasure sample detection method and device based on layer-by-layer correlation propagation | |
Liu et al. | GAN-based classifier protection against adversarial attacks | |
Diepgrond | Can prediction explanations be trusted? On the evaluation of interpretable machine learning methods | |
Kumar | Adversarial attacks and defenses for large language models (LLMs): methods, frameworks & challenges | |
Gunasekaran | Evasion and Poison attacks on Logistic Regression-based Machine Learning Classification Model | |
Shi et al. | Learning-Based Difficulty Calibration for Enhanced Membership Inference Attacks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |