CN114511018A - Adversarial sample detection method and device based on intra-class adjusted cosine similarity - Google Patents
- Publication number: CN114511018A
- Application number: CN202210082385.5A
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/217 — Pattern recognition; design or setup of recognition systems: validation, performance evaluation, active pattern learning techniques
- G06F18/214 — Pattern recognition; generating training patterns: bootstrap methods, e.g. bagging or boosting
- G06F18/22 — Pattern recognition; matching criteria, e.g. proximity measures
- G06F18/24 — Pattern recognition; classification techniques
- G06N3/08 — Computing arrangements based on biological models; neural networks: learning methods
Abstract
The application relates to an adversarial sample detection method and apparatus based on intra-class adjusted cosine similarity, together with a computer device and a storage medium. The method comprises: inputting training samples into a trained deep neural network and extracting the output of each layer as sample features; calculating, at each layer, the intra-class adjusted cosine similarity between each training sample and the natural samples; training a linear regression classifier on the resulting intra-class adjusted cosine similarities and the label information output by the deep neural network to obtain a classification threshold; and detecting adversarial samples according to that threshold. By using the adjusted cosine similarity as the measure of the difference between adversarial and natural samples and introducing the label information predicted by the neural network, the method effectively improves the detection accuracy for adversarial samples, with a particularly marked effect on complex data sets.
Description
Technical Field
The present application relates to the field of adversarial sample detection, and in particular to an adversarial sample detection method and apparatus based on intra-class adjusted cosine similarity, a computer device, and a storage medium.
Background
Deep neural networks are vulnerable to adversarial samples: by adding a small perturbation to the input, an attacker can cause the network to make a wrong decision. This severely limits the deployment of deep neural networks, especially in areas with high safety requirements such as finance, medicine, and autonomous driving. Alongside the emergence of adversarial samples, many countermeasures have appeared, which fall into three broad classes. The first enhances model robustness, e.g. defensive distillation, adversarial training, and parameter regularization. The second preprocesses the input sample, mainly by sample reconstruction, Gaussian blurring, or sample compression. The third detects adversarial samples directly; the currently most effective detectors are based on Kernel Density Estimates, Bayesian Uncertainty, and Local Intrinsic Dimensionality. Although some detection methods work well against particular attacks, they perform poorly against other adversarial samples, so the prior art lacks both accuracy and universality.
Disclosure of Invention
In view of the foregoing, it is desirable to provide an adversarial sample detection method, apparatus, computer device, and storage medium based on intra-class adjusted cosine similarity that improve both the universality and the effectiveness of adversarial sample detection.
A method for adversarial sample detection based on intra-class adjusted cosine similarity, the method comprising:
acquiring a training sample set, inputting it into a trained deep neural network, extracting the output of each training sample at each layer of the deep neural network, and thereby obtaining the sample features of each training sample at each layer; the training sample set comprises natural samples and adversarial samples; each layer yields one or more sample features;
calculating, from the sample features of the natural samples at each layer, the mean of each feature over all natural samples;
obtaining the adjusted cosine similarity between each training sample and each natural sample from the sample features of the training sample and the feature means of all natural samples;
obtaining the label information predicted for each training sample by the deep neural network, selecting from the natural samples, according to that label information, the several same-class samples that share the training sample's label and have the largest adjusted cosine similarity, and obtaining the intra-class adjusted cosine similarity of the training sample from the adjusted cosine similarities of those same-class samples;
training a linear regression classifier on the intra-class adjusted cosine similarities of the training samples to obtain a classification threshold separating natural samples from adversarial samples;
inputting a sample to be detected from the test sample set into the deep neural network, calculating its intra-class adjusted cosine similarity, and outputting, from that similarity and the classification threshold, a detection result indicating whether the sample is adversarial.
In one embodiment, the method further comprises: acquiring a training sample set;
inputting the training sample set into the trained deep neural network and extracting the output of each training sample at each layer of the deep neural network;
and reducing each layer's output for each training sample to one dimension to obtain the sample features of that training sample at that layer.
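The text above does not fix the dimensionality-reduction operator. A minimal sketch of this step, assuming global average pooling over spatial dimensions (one scalar feature per channel) as the reduction; the function name and input layout are illustrative, not taken from the patent:

```python
import numpy as np

def layer_features(layer_outputs):
    """Reduce each layer's raw output to a 1-D feature vector.

    layer_outputs: list of arrays, one per network layer; each has shape
    (channels, height, width) for convolutional layers or (units,) for
    dense layers. Spatial axes are averaged away (global average pooling),
    leaving one scalar feature per channel.
    """
    feats = []
    for out in layer_outputs:
        if out.ndim > 1:
            # collapse spatial dimensions, keeping one value per channel
            feats.append(out.reshape(out.shape[0], -1).mean(axis=1))
        else:
            feats.append(out.astype(float))
    return feats
```

Any reduction that yields a fixed-length vector per layer would serve the same role.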
In one embodiment, the method further comprises: obtaining, from the sample features of the training samples and the feature means of all natural samples, the adjusted cosine similarity between each training sample $x_i$ and each natural sample $y_j$ at the $n$-th layer as:
$$\mathrm{ACS}_n(x_i,y_j)=\frac{\sum_{k=1}^{m}\bigl(f_k^n(x_i)-\bar f_k^{\,n}\bigr)\bigl(f_k^n(y_j)-\bar f_k^{\,n}\bigr)}{\sqrt{\sum_{k=1}^{m}\bigl(f_k^n(x_i)-\bar f_k^{\,n}\bigr)^2}\,\sqrt{\sum_{k=1}^{m}\bigl(f_k^n(y_j)-\bar f_k^{\,n}\bigr)^2}},\qquad n=1,\dots,N$$
wherein $m$ is the number of features at the $n$-th layer of the deep neural network, $N$ is the total number of layers, $f^n(x_i)$ denotes the sample features of training sample $x_i$ at the $n$-th layer, $f^n(y_j)$ denotes the sample features of natural sample $y_j$ at the $n$-th layer, $f_k^n(x_i)$ and $f_k^n(y_j)$ denote their $k$-th sample features at the $n$-th layer, and $\bar f_k^{\,n}$ denotes the mean of the $k$-th sample feature over all natural samples at the $n$-th layer.
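The center-then-cosine computation described above can be sketched in a few lines, assuming each sample's layer-$n$ features are plain vectors:

```python
import numpy as np

def adjusted_cosine(fx, fy, mu):
    """Adjusted cosine similarity between two layer-n feature vectors.

    fx, fy : feature vectors of the two samples at layer n
    mu     : mean feature vector of all natural samples at layer n
    Both vectors are centred on the natural-sample mean before the
    ordinary cosine similarity is taken.
    """
    dx, dy = fx - mu, fy - mu
    denom = np.linalg.norm(dx) * np.linalg.norm(dy)
    return float(np.dot(dx, dy) / denom) if denom > 0 else 0.0
```

Centering on the natural-sample mean is what distinguishes the adjusted cosine similarity from the plain cosine similarity of the raw feature vectors.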
In one embodiment, the method further comprises: obtaining the label information predicted for training sample $x_i$ by the deep neural network;
selecting from the natural samples, according to that label information, the several same-class samples that share $x_i$'s label and have the largest adjusted cosine similarity;
and obtaining the intra-class adjusted cosine similarity of $x_i$ at the $n$-th layer as the mean of the adjusted cosine similarities over those same-class samples $y_j$:
$$\mathrm{Iclass}(x_i)_n=\frac{1}{\lvert\Omega_i^n\rvert}\sum_{y_j\in\Omega_i^n}\mathrm{ACS}_n(x_i,y_j)$$
wherein $\Omega_i^n$ denotes the same-class sample set formed, at the $n$-th layer of the deep neural network, by the several same-class samples that share $x_i$'s label information and have the largest adjusted cosine similarity.
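The selection of the most similar same-class natural samples and the averaging can be sketched as follows; the `top` parameter (how many same-class samples are kept) is left open by the text above and is an assumption here:

```python
import numpy as np

def intra_class_acs(sims, labels, pred_label, top=5):
    """Intra-class adjusted cosine similarity at one layer.

    sims       : adjusted cosine similarity of the query sample to every
                 natural sample at this layer
    labels     : class labels of the natural samples
    pred_label : label the network predicts for the query sample
    top        : how many of the most similar same-class samples to average
    """
    sims = np.asarray(sims, float)
    labels = np.asarray(labels)
    same = np.sort(sims[labels == pred_label])  # same-class sims, ascending
    return float(same[-top:].mean())            # mean of the `top` largest
```

Repeating this at every layer yields the N-dimensional feature vector used by the classifier in the next step.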
In one embodiment, the method further comprises: obtaining the intra-class adjusted cosine similarity vector of training sample $x_i$ as:
$$\mathrm{Iclass}(x_i)=[\mathrm{Iclass}(x_i)_1,\mathrm{Iclass}(x_i)_2,\cdots,\mathrm{Iclass}(x_i)_N]^T,$$
where $T$ denotes matrix transposition;
obtaining the sample information indicating whether training sample $x_i$ is an adversarial sample, and labeling its intra-class adjusted cosine similarity feature as 0 or 1 accordingly, where 1 denotes that $x_i$ is an adversarial sample and 0 denotes that it is not;
and training a linear regression classifier on the intra-class adjusted cosine similarities and feature labels of all training samples to obtain the classification threshold separating natural samples from adversarial samples.
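The patent specifies a linear regression classifier but not its fitting procedure; a minimal least-squares sketch with a 0.5 decision threshold (both assumptions) might be:

```python
import numpy as np

def train_linear_detector(features, is_adv):
    """Least-squares fit of a linear model mapping each training sample's
    N-dimensional intra-class ACS vector to its 0/1 adversarial label."""
    X = np.column_stack([np.asarray(features, float),
                         np.ones(len(features))])      # append a bias column
    w, *_ = np.linalg.lstsq(X, np.asarray(is_adv, float), rcond=None)
    return w, 0.5                                      # weights, threshold

def is_adversarial(w, threshold, feat):
    """Score a sample's intra-class ACS vector against the threshold."""
    return float(np.dot(np.append(feat, 1.0), w)) >= threshold
```

In the toy usage below, samples with low intra-class similarity are labeled adversarial, matching the intuition that adversarial samples sit farther from the natural samples of their predicted class.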
In one embodiment, when training the linear regression classifier on the intra-class adjusted cosine similarities of the training samples, the method further includes: verifying the training effect of each training round against a verification sample set.
In one embodiment, the method further comprises the following steps: the challenge sample in the set of test samples is generated from a natural sample in the set of test samples.
An adversarial sample detection apparatus based on intra-class adjusted cosine similarity, the apparatus comprising:
a sample feature determination module, configured to acquire a training sample set, input it into a trained deep neural network, extract the output of each training sample at each layer of the deep neural network, and thereby obtain the sample features of each training sample at each layer; the training sample set comprises natural samples and adversarial samples; each layer yields one or more sample features;
an adjusted cosine similarity determination module, configured to calculate, from the sample features of the natural samples at each layer, the mean of each feature over all natural samples, and to obtain the adjusted cosine similarity between each training sample and each natural sample from the sample features of the training sample and the feature means of all natural samples;
an intra-class adjusted cosine similarity determination module, configured to obtain the label information predicted for each training sample by the deep neural network, select from the natural samples, according to that label information, the several same-class samples that share the training sample's label and have the largest adjusted cosine similarity, and obtain the intra-class adjusted cosine similarity of the training sample from the adjusted cosine similarities of those same-class samples;
a linear regression classifier training module, configured to train a linear regression classifier on the intra-class adjusted cosine similarities of the training samples to obtain a classification threshold separating natural samples from adversarial samples; and
an adversarial sample identification module, configured to input a sample to be detected from the test sample set into the deep neural network, calculate its intra-class adjusted cosine similarity, and output, from that similarity and the classification threshold, a detection result indicating whether the sample is adversarial.
A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the following steps:
acquiring a training sample set, inputting it into a trained deep neural network, extracting the output of each training sample at each layer of the deep neural network, and thereby obtaining the sample features of each training sample at each layer; the training sample set comprises natural samples and adversarial samples; each layer yields one or more sample features;
calculating, from the sample features of the natural samples at each layer, the mean of each feature over all natural samples, and obtaining the adjusted cosine similarity between each training sample and each natural sample from the sample features of the training sample and the feature means of all natural samples;
obtaining the label information predicted for each training sample by the deep neural network, selecting from the natural samples, according to that label information, the several same-class samples that share the training sample's label and have the largest adjusted cosine similarity, and obtaining the intra-class adjusted cosine similarity of the training sample from the adjusted cosine similarities of those same-class samples;
training a linear regression classifier on the intra-class adjusted cosine similarities of the training samples to obtain a classification threshold separating natural samples from adversarial samples;
inputting a sample to be detected from the test sample set into the deep neural network, calculating its intra-class adjusted cosine similarity, and outputting, from that similarity and the classification threshold, a detection result indicating whether the sample is adversarial.
A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps:
acquiring a training sample set, inputting it into a trained deep neural network, extracting the output of each training sample at each layer of the deep neural network, and thereby obtaining the sample features of each training sample at each layer; the training sample set comprises natural samples and adversarial samples; each layer yields one or more sample features;
calculating, from the sample features of the natural samples at each layer, the mean of each feature over all natural samples, and obtaining the adjusted cosine similarity between each training sample and each natural sample from the sample features of the training sample and the feature means of all natural samples;
obtaining the label information predicted for each training sample by the deep neural network, selecting from the natural samples, according to that label information, the several same-class samples that share the training sample's label and have the largest adjusted cosine similarity, and obtaining the intra-class adjusted cosine similarity of the training sample from the adjusted cosine similarities of those same-class samples;
training a linear regression classifier on the intra-class adjusted cosine similarities of the training samples to obtain a classification threshold separating natural samples from adversarial samples;
inputting a sample to be detected from the test sample set into the deep neural network, calculating its intra-class adjusted cosine similarity, and outputting, from that similarity and the classification threshold, a detection result indicating whether the sample is adversarial.
According to the above method, apparatus, computer device, and storage medium for adversarial sample detection based on intra-class adjusted cosine similarity, training samples are input into a trained deep neural network and the output of each layer is extracted as sample features; the intra-class adjusted cosine similarity between each training sample and the natural samples is then calculated at each layer; a linear regression classifier is trained on the resulting intra-class adjusted cosine similarities and the label information output by the deep neural network to obtain a classification threshold; and adversarial samples are detected according to that threshold. By using the adjusted cosine similarity as the measure of the difference between adversarial and natural samples and introducing the label information predicted by the neural network, the method and apparatus effectively improve the detection accuracy for adversarial samples, with a particularly marked effect on complex data sets.
Drawings
FIG. 1 is a flow diagram of an adversarial sample detection method based on intra-class adjusted cosine similarity in one embodiment;
FIG. 2 shows ROC-AUC comparison graphs in one embodiment, where (a) is the ROC-AUC comparison for the fast gradient sign attack method, (b) for the projected gradient descent attack method, (c) for the Jacobian-based saliency map attack method, (d) for the DeepFool attack method, and (e) for the CW2 attack method;
FIG. 3 is a block diagram of an adversarial sample detection apparatus based on intra-class adjusted cosine similarity in one embodiment;
FIG. 4 is a diagram of the internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in FIG. 1, an adversarial sample detection method based on intra-class adjusted cosine similarity is provided, comprising the following steps:
Step 102: acquire a training sample set, input it into a trained deep neural network, extract the output of each training sample at each layer of the deep neural network, and thereby obtain the sample features of each training sample at each layer.
The training sample set comprises natural samples and adversarial samples; each layer yields one or more sample features.
A natural sample is an unperturbed sample. An adversarial sample is an input formed by deliberately adding a subtle perturbation to a data-set sample so that the model gives a wrong output with high confidence. Through an optimization process, data points x can be deliberately constructed on which a neural network with human-level accuracy has an error rate close to 100%: the model's output at such an input point x differs sharply from its output at nearby data points x'. In many cases x and x' are so similar that a human observer cannot perceive the difference between the natural and the adversarial sample, yet the network makes very different predictions.
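As a concrete illustration of how such a perturbation can be formed, the fast gradient sign method (one of the attacks evaluated later in this document) shifts every input component by a small step ε in the sign direction of the loss gradient; the gradient values in the usage below are illustrative, not from the patent:

```python
import numpy as np

def fgsm_perturb(x, grad, eps=0.1):
    """Fast gradient sign method: move each input component eps in the
    direction that locally increases the model's loss."""
    return x + eps * np.sign(grad)
```

The resulting perturbation is bounded by ε per component, which is why x and x' can be visually indistinguishable.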
Step 104: calculate, from the sample features of the natural samples at each layer, the mean of each feature over all natural samples, and obtain the adjusted cosine similarity between each training sample and each natural sample from the sample features of the training sample and the feature means of all natural samples.
Steps 104 and 106 together compute the intra-class adjusted cosine similarity (Inner-class Adjusted Cosine Similarity) between the training samples and the natural samples from the outputs of each layer of the deep neural network.
Step 106: obtain the label information predicted for each training sample by the deep neural network, select from the natural samples, according to that label information, the several same-class samples that share the training sample's label and have the largest adjusted cosine similarity, and obtain the intra-class adjusted cosine similarity of the training sample from the adjusted cosine similarities of those same-class samples.
Step 108: train a linear regression classifier on the intra-class adjusted cosine similarities of the training samples to obtain the classification threshold separating natural samples from adversarial samples.
Step 110: input a sample to be detected from the test sample set into the deep neural network, calculate its intra-class adjusted cosine similarity, and output, from that similarity and the classification threshold, a detection result indicating whether the sample is adversarial.
In this adversarial sample detection method based on intra-class adjusted cosine similarity, training samples are input into a trained deep neural network and the output of each layer is extracted as sample features; the intra-class adjusted cosine similarity between each training sample and the natural samples is then calculated at each layer; a linear regression classifier is trained on the resulting intra-class adjusted cosine similarities and the label information output by the deep neural network to obtain a classification threshold; and adversarial samples are detected according to that threshold. By using the adjusted cosine similarity as the measure of the difference between adversarial and natural samples and introducing the label information predicted by the neural network, the method effectively improves the detection accuracy for adversarial samples, with a particularly marked effect on complex data sets.
In one embodiment, the method further comprises: acquiring a training sample set; inputting it into the trained deep neural network and extracting the output of each training sample at each layer of the deep neural network; and reducing each layer's output for each training sample to one dimension to obtain the sample features of that training sample at that layer.
In one embodiment, the method further comprises: obtaining, from the sample features of the training samples and the feature means of all natural samples, the adjusted cosine similarity between each training sample $x_i$ and each natural sample $y_j$ at the $n$-th layer as:
$$\mathrm{ACS}_n(x_i,y_j)=\frac{\sum_{k=1}^{m}\bigl(f_k^n(x_i)-\bar f_k^{\,n}\bigr)\bigl(f_k^n(y_j)-\bar f_k^{\,n}\bigr)}{\sqrt{\sum_{k=1}^{m}\bigl(f_k^n(x_i)-\bar f_k^{\,n}\bigr)^2}\,\sqrt{\sum_{k=1}^{m}\bigl(f_k^n(y_j)-\bar f_k^{\,n}\bigr)^2}},\qquad n=1,\dots,N$$
wherein $m$ is the number of features at the $n$-th layer of the deep neural network, $N$ is the total number of layers, $f^n(x_i)$ denotes the sample features of training sample $x_i$ at the $n$-th layer, $f^n(y_j)$ denotes the sample features of natural sample $y_j$ at the $n$-th layer, $f_k^n(x_i)$ and $f_k^n(y_j)$ denote their $k$-th sample features at the $n$-th layer, and $\bar f_k^{\,n}$ denotes the mean of the $k$-th sample feature over all natural samples at the $n$-th layer.
In one embodiment, the method further comprises: obtaining the label information predicted for training sample $x_i$ by the deep neural network;
selecting from the natural samples, according to that label information, the several same-class samples that share $x_i$'s label and have the largest adjusted cosine similarity;
and obtaining the intra-class adjusted cosine similarity of $x_i$ at the $n$-th layer as the mean of the adjusted cosine similarities over those same-class samples $y_j$:
$$\mathrm{Iclass}(x_i)_n=\frac{1}{\lvert\Omega_i^n\rvert}\sum_{y_j\in\Omega_i^n}\mathrm{ACS}_n(x_i,y_j)$$
wherein $\Omega_i^n$ denotes the same-class sample set formed, at the $n$-th layer of the deep neural network, by the several same-class samples that share $x_i$'s label information and have the largest adjusted cosine similarity.
In one embodiment, the method further comprises: obtaining the intra-class adjusted cosine similarity vector of training sample $x_i$ as $\mathrm{Iclass}(x_i)=[\mathrm{Iclass}(x_i)_1,\mathrm{Iclass}(x_i)_2,\cdots,\mathrm{Iclass}(x_i)_N]^T$, where $T$ denotes matrix transposition; obtaining the sample information indicating whether training sample $x_i$ is an adversarial sample, and labeling its intra-class adjusted cosine similarity feature as 0 or 1 accordingly, where 1 denotes that $x_i$ is an adversarial sample and 0 denotes that it is not; and training a linear regression classifier on the intra-class adjusted cosine similarities and feature labels of all training samples to obtain the classification threshold separating natural samples from adversarial samples.
In one embodiment, when the linear regression classifier is trained on the intra-class adjusted cosine similarities of the training samples, the method further includes: verifying the training effect of each training round against a verification sample set.
In one embodiment, the method further comprises the following steps: the challenge sample in the test sample set is generated from a natural sample in the test sample set.
In the experiments, the original data set is divided, according to purpose, into a training sample set, a verification sample set, and a test sample set, so that adversarial sample detection performance can be evaluated stably. As shown in Table 1, the classifier is trained on the training sample set, and at the end of each training round the training effect is checked against the verification sample set. The test sample set serves two purposes: it verifies the performance of the classifier, and it is used both to generate adversarial samples and to supply the natural samples for the later detection stage. Generating the adversarial samples from the test-set data further verifies the robustness of the method: an adversarial sample is most similar to the natural sample it was generated from, so if the detector can still capture the difference in this situation, its performance is all the better demonstrated. Moreover, in practice the generation of adversarial samples is uncontrollable, and it is difficult to ensure that the natural samples from which they are generated do not coincide with the natural samples used for comparison; the experiment therefore allows the two to coincide.
TABLE 1
In one embodiment, experiments were performed on three data sets (MNIST, SVHN, and CIFAR10). The network structures are shown in Table 2, the model training parameters in Table 3, and the parameters used to generate adversarial samples in Table 4. The comparison methods use exactly the same network models and adversarial samples as the experiment.
TABLE 2
Note: [·] denotes a residual network block or structure (ResNet Block).
TABLE 3
TABLE 4
Let ε denote the perturbation amplitude, c the confidence constant, θ the control on the maximum number of iterations, γ the perturbation coefficient, search_step the number of binary-search steps, iteration the number of iterations, max_iters the maximum number of iterations, overshoot the termination-condition constant used to prevent stalled class updates, and lr the gradient update rate.
This embodiment uses the AUC (Area Under Curve) value to measure how separable the extracted intra-class adjusted cosine similarity features are, i.e., how well they separate adversarial samples from natural samples. Fig. 2 shows the ROC-AUC comparison between cosine-similarity-based adversarial sample detection baselines and the intra-class adjusted cosine similarity based detection method provided by the present invention, wherein (a) compares AUC scores under the fast gradient sign method (FGSM), (b) under the projected gradient descent (PGD) attack, (c) under the Jacobian-based saliency map attack (JSMA), (d) under the DeepFool attack, and (e) under the CW2 (Carlini-Wagner L2) attack. It can be seen that the method of the present invention has significant advantages over the other two methods, especially on the more complex datasets. The experimental results show that the adversarial sample detection method based on intra-class adjusted cosine similarity is significantly superior to the other adversarial sample detection methods.
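The AUC score used above can be computed without plotting an ROC curve, since it equals the normalized Mann-Whitney U statistic. The sketch below assumes the detector assigns higher scores to adversarial samples; the function name is illustrative.

```python
def auc_score(scores_adv, scores_nat):
    """Area under the ROC curve via the Mann-Whitney U statistic: the
    probability that a randomly chosen adversarial sample scores higher
    than a randomly chosen natural sample (ties count as 0.5)."""
    wins = 0.0
    for a in scores_adv:
        for n in scores_nat:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(scores_adv) * len(scores_nat))
```

An AUC of 1.0 means the two score distributions are perfectly separable; 0.5 means the feature carries no separating information.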
It should be understood that, although the steps in the flowchart of fig. 1 are shown sequentially as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly ordered and may be performed in other sequences. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages that are not necessarily performed at the same moment; they may be performed at different times, and their order of execution is not necessarily sequential: they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 3, there is provided an adversarial sample detection apparatus based on intra-class adjusted cosine similarity, including: a sample feature determination module 302, an adjusted cosine similarity determination module 304, an intra-class adjusted cosine similarity determination module 306, a linear regression classifier training module 308, and an adversarial sample identification module 310, wherein:
a sample characteristic determining module 302, configured to obtain a training sample set, input the training sample set into a trained deep neural network, extract an output of each training sample in the training sample set at each layer of the deep neural network, and further obtain a sample characteristic corresponding to each training sample at each layer; the training sample set comprises a natural sample and a confrontation sample; the sample characteristics of each layer are one or more;
an adjusted cosine similarity determination module 304, configured to calculate a feature mean value of all the natural samples on each layer of each feature according to the sample features of the natural samples on each layer, and obtain an adjusted cosine similarity between each training sample and each natural sample according to the sample features of the training samples and the feature mean values of all the natural samples;
an intra-class adjusted cosine similarity determining module 306, configured to obtain label information of the training samples predicted by the deep neural network, select, according to the label information, a plurality of similar samples that are the same as the label information of the training samples and have the largest adjusted cosine similarity among the natural samples, and obtain intra-class adjusted cosine similarities of the training samples according to the adjusted cosine similarities of the similar samples;
the linear regression classifier training module 308 is configured to train a linear regression classifier according to the intra-class adjustment cosine similarity of the training samples to obtain classification thresholds of the natural samples and the countermeasure samples;
the confrontation sample identification module 310 is configured to input a to-be-detected sample in the test sample set into the deep neural network, calculate an intra-class adjusted cosine similarity of the to-be-detected sample, and output a detection result of whether the to-be-detected sample is the confrontation sample according to the intra-class adjusted cosine similarity and the classification threshold.
The sample feature determination module 302 is further configured to obtain a training sample set; inputting the training sample set into a trained deep neural network, and extracting the output of each training sample in the training sample set on each layer of the deep neural network; and reducing the output dimension of each training sample on each layer of the deep neural network to one dimension to obtain the corresponding sample characteristic of each training sample on each layer.
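The patent does not specify how each layer's output is reduced to the per-layer feature vector; a common choice, shown here purely as an assumption, is global average pooling over the spatial axes of a convolutional output.

```python
import numpy as np

def layer_features(activation):
    """Reduce one layer's output for a single sample to a 1-D feature
    vector.  For a convolutional output of shape (channels, height,
    width) this is global average pooling over the spatial axes; an
    already 1-D output is returned unchanged."""
    act = np.asarray(activation, dtype=float)
    if act.ndim == 1:
        return act
    return act.reshape(act.shape[0], -1).mean(axis=1)  # one value per channel
```

Applied to every layer of the trained network, this yields the per-layer sample features used in the similarity computations below.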
The adjusted cosine similarity determination module 304 is further configured to obtain, from the sample features of the training samples and the feature means of all the natural samples, the adjusted cosine similarity between each training sample x_i and each natural sample x_j at the n-th layer:

$$ACS^{n}(x_i,x_j)=\frac{\sum_{k=1}^{m}\left(f_i^{n,k}-\mu^{n,k}\right)\left(f_j^{n,k}-\mu^{n,k}\right)}{\sqrt{\sum_{k=1}^{m}\left(f_i^{n,k}-\mu^{n,k}\right)^{2}}\sqrt{\sum_{k=1}^{m}\left(f_j^{n,k}-\mu^{n,k}\right)^{2}}}$$

wherein m is the number of features in the n-th layer of the deep neural network, N is the total number of layers of the deep neural network, f_i^{n,k} denotes the k-th sample feature of training sample x_i at the n-th layer, f_j^{n,k} denotes the k-th sample feature of natural sample x_j at the n-th layer, and μ^{n,k} denotes the mean of the k-th sample feature of all natural samples at the n-th layer.
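The per-layer adjusted cosine similarity, i.e., cosine similarity after centering both feature vectors on the natural-sample feature mean, can be sketched as follows; the function and argument names are illustrative.

```python
import numpy as np

def adjusted_cosine(f_i, f_j, mu):
    """Adjusted cosine similarity between two layer-n feature vectors
    f_i and f_j (each of shape (m,)), centered on mu, the mean of the
    natural samples' layer-n features."""
    a, b = f_i - mu, f_j - mu
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0
```

Centering on the natural-sample mean makes the measure sensitive to how a sample deviates from typical natural activations, which ordinary cosine similarity ignores.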
The intra-class adjusted cosine similarity determination module 306 is further configured to obtain the label information of training sample x_i as predicted by the deep neural network; to select, from the natural samples according to the label information, a plurality of same-class samples that have the same label information as x_i and the largest adjusted cosine similarity; and to obtain the intra-class adjusted cosine similarity of x_i at the n-th layer as the mean of the adjusted cosine similarities of those same-class samples:

$$Iclass(x_i)_n=\frac{1}{\left|S_i^{n}\right|}\sum_{x_j\in S_i^{n}}ACS^{n}(x_i,x_j)$$

wherein S_i^n denotes the same-class sample set formed, at the n-th layer of the deep neural network, by the same-class samples that have the same label information as x_i and the largest adjusted cosine similarity.
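The selection-and-averaging step at one layer can be sketched as follows; the function name, the value of k, and the array layout are illustrative assumptions (the patent does not fix the number of same-class neighbors).

```python
import numpy as np

def intra_class_acs(sims, labels, target_label, k=10):
    """Layer-wise intra-class adjusted cosine similarity for one
    training sample.  sims: shape (num_natural,) adjusted cosine
    similarities to each natural sample at this layer; labels: the
    natural samples' predicted labels; target_label: the label the
    network predicts for the training sample.  Returns the mean of the
    k largest similarities among same-label natural samples."""
    same = sims[labels == target_label]   # keep only same-class samples
    top_k = np.sort(same)[-k:]            # k most similar of them
    return float(top_k.mean())
```

Natural samples tend to lie close to their class's natural neighbors, while adversarial samples do not, which is the signal the detector exploits.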
The linear regression classifier training module 308 is further configured to obtain the intra-class adjusted cosine similarity vector of training sample x_i as Iclass(x_i) = [Iclass(x_i)_1, Iclass(x_i)_2, ··· , Iclass(x_i)_N]^T, where T denotes matrix transposition; to obtain the sample information indicating whether x_i is an adversarial sample, and to label the intra-class adjusted cosine similarity feature as 0 or 1 accordingly, where 1 indicates that x_i is an adversarial sample and 0 indicates that it is not; and to train a linear regression classifier on the intra-class adjusted cosine similarities and feature labels of all training samples to obtain the classification threshold separating natural samples from adversarial samples.
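The classifier step can be sketched as follows. This minimal version fits ordinary least squares with a bias term over the N-dimensional Iclass vectors and thresholds the regression output at 0.5; the 0.5 cut and the function names are assumptions, since the patent learns its own classification threshold.

```python
import numpy as np

def fit_linear_detector(features, labels):
    """features: shape (num_samples, N) intra-class adjusted cosine
    similarity vectors; labels: 0/1 array (1 = adversarial).  Fits
    ordinary least squares and returns a predict function that flags a
    sample as adversarial when the regression output reaches 0.5."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # add bias column
    w, *_ = np.linalg.lstsq(X, labels.astype(float), rcond=None)
    def predict(f):
        return float(np.append(f, 1.0) @ w) >= 0.5
    return predict
```

Because adversarial samples have lower intra-class adjusted cosine similarity, the learned weights are negative on the similarity features and low-similarity samples land above the threshold.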
The linear regression classifier training module 308 is further configured to verify the training effect of each training round through the verification sample set when the linear regression classifier is trained according to the intra-class cosine similarity of the training samples.
For specific limitations of the adversarial sample detection apparatus based on intra-class adjusted cosine similarity, reference may be made to the limitations of the corresponding detection method above, which are not repeated here. The modules in the apparatus may be implemented wholly or partially in software, hardware, or a combination of the two. The modules may be embedded in hardware form in, or independent of, a processor in the computer device, or may be stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 4. The computer device comprises a processor, a memory, a network interface, a display screen and an input device which are connected through a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a countersample detection method based on intra-class adjusted cosine similarity. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the configuration shown in fig. 4 is a block diagram of only a portion of the configuration associated with the present application, and is not intended to limit the computing device to which the present application may be applied, and that a particular computing device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods in the embodiments described above can be implemented by a computer program instructing relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A confrontation sample detection method based on intra-class adjustment cosine similarity is characterized by comprising the following steps:
acquiring a training sample set, inputting the training sample set into a trained deep neural network, extracting the output of each training sample in the training sample set at each layer of the deep neural network, and further acquiring the corresponding sample characteristic of each training sample at each layer; the training sample set comprises a natural sample and a confrontation sample; the sample characteristics of each layer are one or more;
calculating the characteristic mean value of all the natural samples on each characteristic of each layer according to the sample characteristics of the natural samples on each layer, and obtaining the adjusted cosine similarity between each training sample and each natural sample according to the sample characteristics of the training samples and the characteristic mean values of all the natural samples;
obtaining label information of the training samples predicted by the deep neural network, selecting a plurality of similar samples which are the same as the label information of the training samples and have the largest adjusted cosine similarity in the natural samples according to the label information, and obtaining the intra-class adjusted cosine similarity of the training samples according to the adjusted cosine similarity of the similar samples;
training a linear regression classifier according to the intra-class adjustment cosine similarity of the training samples to obtain classification threshold values of natural samples and confrontation samples;
inputting samples to be detected from the test sample set into the deep neural network, calculating the intra-class adjusted cosine similarity of the samples to be detected, and outputting, according to the intra-class adjusted cosine similarity and the classification threshold, a detection result of whether the samples to be detected are adversarial samples.
2. The method of claim 1, wherein obtaining a training sample set, inputting the training sample set into a trained deep neural network, extracting an output of each training sample in the training sample set at each layer of the deep neural network, and further obtaining a sample feature corresponding to each training sample at each layer comprises:
acquiring a training sample set;
inputting the training sample set into a trained deep neural network, and extracting the output of each training sample in the training sample set on each layer of the deep neural network;
and reducing the output dimension of each training sample on each layer of the deep neural network to one dimension to obtain the corresponding sample characteristic of each training sample on each layer.
3. The method of claim 2, wherein obtaining the adjusted cosine similarity between each training sample and each natural sample according to the sample features of the training samples and the feature mean of all the natural samples comprises:
obtaining, according to the sample features of the training samples and the feature means of all the natural samples, the adjusted cosine similarity between each training sample x_i and each natural sample x_j at the n-th layer as:

$$ACS^{n}(x_i,x_j)=\frac{\sum_{k=1}^{m}\left(f_i^{n,k}-\mu^{n,k}\right)\left(f_j^{n,k}-\mu^{n,k}\right)}{\sqrt{\sum_{k=1}^{m}\left(f_i^{n,k}-\mu^{n,k}\right)^{2}}\sqrt{\sum_{k=1}^{m}\left(f_j^{n,k}-\mu^{n,k}\right)^{2}}}$$

wherein m is the number of features in the n-th layer of the deep neural network, N is the total number of layers of the deep neural network, f_i^{n,k} denotes the k-th sample feature of training sample x_i at the n-th layer, f_j^{n,k} denotes the k-th sample feature of natural sample x_j at the n-th layer, and μ^{n,k} denotes the mean of the k-th sample feature of all natural samples at the n-th layer.
4. The method according to claim 3, wherein obtaining label information of the training samples predicted by the deep neural network, selecting a plurality of similar samples which are the same as the label information of the training samples and have the largest adjusted cosine similarity among the natural samples according to the label information, and obtaining the intra-class adjusted cosine similarity of the training samples according to the adjusted cosine similarity of the similar samples comprises:
obtaining label information of the training sample x_i as predicted by the deep neural network;

selecting, from the natural samples according to the label information, a plurality of same-class samples that have the same label information as the training sample x_i and the largest adjusted cosine similarity;

obtaining, as the mean of the adjusted cosine similarities of all the same-class samples, the intra-class adjusted cosine similarity of the training sample x_i at the n-th layer:

$$Iclass(x_i)_n=\frac{1}{\left|S_i^{n}\right|}\sum_{x_j\in S_i^{n}}ACS^{n}(x_i,x_j)$$

wherein S_i^n denotes the same-class sample set formed, at the n-th layer of the deep neural network, by the same-class samples that have the same label information as x_i and the largest adjusted cosine similarity.
5. The method of claim 4, wherein training a linear regression classifier based on the intra-class adjusted cosine similarity of the training samples to obtain classification thresholds for natural samples and challenge samples comprises:
obtaining the intra-class adjusted cosine similarity vector Iclass(x_i) of the training sample x_i as:

Iclass(x_i) = [Iclass(x_i)_1, Iclass(x_i)_2, ··· , Iclass(x_i)_N]^T, wherein T denotes matrix transposition;

obtaining sample information indicating whether the training sample x_i is an adversarial sample, and labeling the intra-class adjusted cosine similarity feature as 0 or 1 according to the sample information, wherein 1 indicates that the training sample x_i is an adversarial sample and 0 indicates that it is not;
and training a linear regression classifier according to the intra-class adjusted cosine similarity and the feature labels of all the training samples to obtain the classification threshold values of the natural samples and the confrontation samples.
6. The method of claim 5, wherein in training a linear regression classifier based on the intra-class adjusted cosine similarity of the training samples, further comprising:
and verifying the training effect of each round of training through a verification sample set.
7. The method of any one of claims 1 to 6, wherein the challenge sample in the set of test samples is generated from a natural sample in the set of test samples.
8. A confrontation sample detection device based on intra-class cosine similarity adjustment, the device comprising:
the sample characteristic determining module is used for acquiring a training sample set, inputting the training sample set into a trained deep neural network, extracting the output of each training sample in the training sample set on each layer of the deep neural network, and further obtaining the sample characteristic corresponding to each training sample on each layer; the training sample set comprises a natural sample and a confrontation sample; the sample characteristics of each layer are one or more;
the adjusted cosine similarity determining module is used for calculating the characteristic mean value of all the natural samples on each layer of each characteristic according to the sample characteristics of the natural samples on each layer, and obtaining the adjusted cosine similarity between each training sample and each natural sample according to the sample characteristics of the training samples and the characteristic mean values of all the natural samples;
an intra-class adjusted cosine similarity determining module, configured to obtain label information of the training samples predicted by the deep neural network, select, according to the label information, a plurality of similar samples that are the same as the label information of the training samples and have the largest adjusted cosine similarity among the natural samples, and obtain intra-class adjusted cosine similarity of the training samples according to the adjusted cosine similarity of the similar samples;
the linear regression classifier training module is used for training a linear regression classifier according to the intra-class adjustment cosine similarity of the training samples to obtain classification threshold values of natural samples and confrontation samples;
and the confrontation sample identification module is used for inputting the samples to be detected in the test sample set into the deep neural network, calculating the intra-class adjustment cosine similarity of the samples to be detected, and outputting the detection result of whether the samples to be detected are the confrontation samples or not according to the intra-class adjustment cosine similarity and the classification threshold.
9. The apparatus of claim 8, wherein the sample characteristic determination module is further configured to:
acquiring a training sample set;
inputting the training sample set into a trained deep neural network, and extracting the output of each training sample in the training sample set on each layer of the deep neural network;
and reducing the output dimension of each training sample on each layer of the deep neural network to one dimension to obtain the corresponding sample characteristic of each training sample on each layer.
10. The apparatus of claim 9, wherein the adjusted cosine similarity determination module is further configured to:
obtaining, according to the sample features of the training samples and the feature means of all the natural samples, the adjusted cosine similarity between each training sample x_i and each natural sample x_j at the n-th layer as:

$$ACS^{n}(x_i,x_j)=\frac{\sum_{k=1}^{m}\left(f_i^{n,k}-\mu^{n,k}\right)\left(f_j^{n,k}-\mu^{n,k}\right)}{\sqrt{\sum_{k=1}^{m}\left(f_i^{n,k}-\mu^{n,k}\right)^{2}}\sqrt{\sum_{k=1}^{m}\left(f_j^{n,k}-\mu^{n,k}\right)^{2}}}$$

wherein m is the number of features in the n-th layer of the deep neural network, N is the total number of layers of the deep neural network, f_i^{n,k} denotes the k-th sample feature of training sample x_i at the n-th layer, f_j^{n,k} denotes the k-th sample feature of natural sample x_j at the n-th layer, and μ^{n,k} denotes the mean of the k-th sample feature of all natural samples at the n-th layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210082385.5A CN114511018A (en) | 2022-01-24 | 2022-01-24 | Countermeasure sample detection method and device based on intra-class adjustment cosine similarity |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210082385.5A CN114511018A (en) | 2022-01-24 | 2022-01-24 | Countermeasure sample detection method and device based on intra-class adjustment cosine similarity |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114511018A true CN114511018A (en) | 2022-05-17 |
Family
ID=81549825
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210082385.5A Pending CN114511018A (en) | 2022-01-24 | 2022-01-24 | Countermeasure sample detection method and device based on intra-class adjustment cosine similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114511018A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115188384A (en) * | 2022-06-09 | 2022-10-14 | 浙江工业大学 | Voiceprint recognition countermeasure sample defense method based on cosine similarity and voice denoising |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||