CN115935179A - Model Stealing Detection Method Combining Training Set Data Distribution and W Distance - Google Patents


Info

Publication number
CN115935179A
Authority
CN
China
Prior art keywords
data
model
distance
sample
dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211346069.0A
Other languages
Chinese (zh)
Inventor
罗森林
张辰龙
潘丽敏
陆永鑫
张笈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT filed Critical Beijing Institute of Technology BIT
Priority to CN202211346069.0A priority Critical patent/CN115935179A/en
Publication of CN115935179A publication Critical patent/CN115935179A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a model stealing detection method that combines the training set data distribution with the Wasserstein (W) distance, and belongs to the technical field of computer and information science. First, a VAE is used to reduce the dimensionality of the training set and the query set. Second, the probability distribution of the query set is estimated by maximum likelihood estimation, and several groups of samples to be detected are drawn from that distribution. Then, for each group of samples to be detected, the same number of reference samples is randomly drawn from the training set, and the W distance between the group and its reference samples is calculated. Finally, all W distances are combined in a weighted sum, using the ratio of the number of categories in each reference sample to the total number of categories as the weight, and model stealing is judged to be detected when the weighted result exceeds a detection threshold. By tying detection to the training set data distribution, considering the distribution characteristics of both the query set and the training set samples, and improving the W distance calculation, the method effectively improves the accuracy of model stealing detection.

Description

Model stealing detection method combining training set data distribution and W distance
Technical Field
The invention relates to a model stealing detection method combining training set data distribution and W distance, and belongs to the technical field of computers and information science.
Background
Model stealing attacks are a class of malicious behavior that steals a model's functionality or imitates its decision boundary. The stolen models have usually cost their owners substantial time and money to train and have great commercial value. Once a model is stolen, the rights and interests of its owner are damaged, and the stolen model also provides a springboard for adversarial attacks. Research on high-accuracy model stealing detection therefore has important theoretical significance and practical value.
Existing model stealing detection methods are mainly based on abnormal sample detection. To improve attack efficiency, an attacker may synthesize samples from a small number of samples in the training set so that the synthesized samples lie closer to the classification boundary of the target model, or may use the prediction vectors of the target model as feedback and train a substitute model on a related data set or on randomly generated vectors. Existing research focuses on mining the changes in query sample distribution caused by such abnormal samples. For example, on the premise that the distance between two randomly drawn points in a finite space follows a normal distribution, one approach tests whether the distances within a group of query samples follow a normal distribution; another uses the K-nearest-neighbor algorithm to judge whether a query sample has too many neighboring samples. However, these detection methods do not change with the model: once an attacker has successfully stolen one model, the same approach can be used to steal many others. The detection criterion is also simple, so an attacker can infer it after repeated attempts and then bypass the detector. Subsequent attacks cannot be countered, and detection accuracy suffers greatly.
In summary, existing model stealing detection methods use simple, fixed detection criteria that are easily bypassed once an attacker has inferred them. The invention therefore provides a model stealing detection method combining training set data distribution and W distance.
Disclosure of Invention
The invention aims to provide a model stealing detection method combining training set data distribution and W distance, addressing the problem that the detection criterion in model stealing detection is fixed and easy to infer.
The design principle of the invention is as follows. First, a VAE model is trained on the training data set, and the VAE encodings of the training set form the dimension-reduced data set S. Second, a group of query samples is fed to the VAE model to obtain the dimension-reduced data set S' of the query samples; the probability distribution of S' is calculated by maximum likelihood estimation, and k groups of samples to be detected, each of capacity D, are sampled from that distribution. Third, for each group of data sampled from S', a reference sample group of capacity D is randomly sampled from S to form a W distance calculation pair, and the W distance (Wasserstein distance) between the two groups is calculated. Finally, the W distances are combined in a weighted calculation, using the ratio of the number of classes in the reference sample to the total number of classes as the weight, to judge the query behavior.
The technical scheme of the invention is realized by the following steps:
Step 1, train a VAE model and a target model using the training data set, and obtain the dimension-reduced data set S of the training data set using the VAE model.
Step 1.1, construct the VAE model framework.
Step 1.2, determine the VAE model loss function.
Step 1.3, encode the training set data using the VAE model to obtain the dimension-reduced data, which form the data set S.
Step 2, reduce the dimension of the query data using the VAE model, calculate the probability distribution of each dimension based on maximum likelihood estimation, and sample multiple groups of data from the probability distributions.
Step 2.1, maintain a queue m of length D for the input query samples.
Step 2.2, reduce each input sample to dimension h using the VAE model and add it to queue m; when m is full, remove the sample at the head of the queue and add the new sample at the tail.
Step 2.3, the VAE model extracts the feature information of the query samples to the greatest extent, and the dimensions of the reduced data are mutually independent. The probability density and probability distribution of each dimension of the data in queue m are therefore calculated based on maximum likelihood estimation; one group of data is obtained by sampling from the h per-dimension probability distributions, and this is repeated k times.
Step 3, for each group of data obtained in step 2.3, randomly sample a data group of the same capacity from S to form a Wasserstein distance calculation pair.
Step 4, calculate the Wasserstein distance of each pair of data sets from step 3, and weight and sum the results according to the ratio of the number of sample classes in the step 3 data set to the total number of classes to obtain the final distance W.
Step 5, judge the query behavior using the final distance W.
Advantageous effects
Compared with existing model stealing detection methods, the method combines the training set data distribution with the W statistical distribution distance. First, compared with the commonly used KL divergence and JS divergence, the W distance can still measure the distribution distance between two sample sets when they overlap only slightly, which improves detection stability. Second, the method takes the training set distribution as the detection criterion; a model's training set contains many samples, has a complex distribution, is not released to the public, and differs from model to model, so the criterion is difficult to bypass. Third, the method takes the characteristics of model stealing into account and assumes that the samples used for normal queries follow approximately the same distribution as the training set samples, so it can cope effectively with a variety of model stealing attacks. Fourth, after the probability distribution of the query sample data is calculated, multiple groups of samples are drawn from it, which fully reflects the distribution characteristics of the query samples and facilitates the distribution judgment. Finally, each group's result is weighted, which reduces the influence of large W distance values caused by too few categories and improves detection accuracy.
Drawings
FIG. 1 is a schematic diagram of a model stealing detection method based on W distance according to the present invention.
Detailed Description
In order to better illustrate the objects and advantages of the present invention, embodiments of the method of the present invention are described in further detail below with reference to examples.
The experimental data come from several image classification data sets: the 10-class clothing data set Fashion-MNIST, the 10-class small-object data set CIFAR10, the Google Street View house number data set SVHN, and the traffic sign data set GTSRB. The training set and the test set are split according to the proportion of 9.
Three attacks are used in the experiments: JBDA, Knockoff Nets, and MAZE, each with a different emphasis. JBDA uses a small number of training set samples to craft synthetic samples and thereby approach the classification boundary of the target model. Knockoff Nets uses a larger data set that may be related to the target model and completes model stealing using the output feedback. MAZE uses no samples related to the training set; it trains a data generator with the objective of maximizing the difference between the output vectors of the clone model and the target model, and completes the stealing with the generated data.
The experiments use Accuracy to evaluate the model stealing detection results. Accuracy is calculated as shown in formula (1):
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$
where TP is the number of stealing behaviors judged to be stealing, TN is the number of normal behaviors judged to be normal, FP is the number of normal behaviors judged to be stealing, and FN is the number of stealing behaviors judged to be normal.
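As a small illustration of the accuracy metric, the sketch below computes it from the four decision counts. It assumes the standard convention in which TN counts normal query behavior correctly judged normal and FN counts stealing behavior missed by the detector; the function name and the example counts are hypothetical and are not results from the experiments.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of query behaviors that the detector labels correctly.

    tp: stealing judged as stealing, tn: normal judged as normal,
    fp: normal judged as stealing,   fn: stealing judged as normal.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one decision is required")
    return (tp + tn) / total


# Hypothetical counts, for illustration only.
example_accuracy = accuracy(tp=45, tn=50, fp=2, fn=3)
```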
The experiments were carried out on a computer and a server. The computer configuration is: Intel i7-8750H CPU at 2.20 GHz, 8 GB memory, 64-bit Windows 10 operating system. The server configuration is: E7-4820 v4, 256 GB RAM, 64-bit Linux Ubuntu operating system.
The specific process of the experiment is as follows:
Step 1, train a VAE model and a target model using the training data set, and obtain the dimension-reduced data of the training data set using the VAE model to form the data set S.
Step 1.1, constructing a VAE model framework.
Step 1.2, determining a VAE model loss function, as shown in formula (2) and formula (3).
$$\mathcal{L}_{rec} = \mathbb{E}_{x \in D_{train}}\left[\lVert x - \hat{x} \rVert^{2}\right] \quad (2)$$
$$\mathcal{L}_{KL} = \mathbb{E}_{x \in D_{train}}\left[D_{KL}\left(q(z \mid x)\,\Vert\,p(z)\right)\right] \quad (3)$$
In the formulas, $D_{train}$ is the training data set. $\mathcal{L}_{rec}$ is the standard autoencoder reconstruction loss, calculated as the expectation over $D_{train}$ of the squared difference between a sample $x$ and its output $\hat{x}$ after passing through the VAE model. $\mathcal{L}_{KL}$ uses the KL divergence to constrain the encoded data to follow the given distribution $p(z)$, and is calculated as the expectation over the samples in $D_{train}$ of the KL divergence between the encoded distribution $q(z \mid x)$ and the given distribution.
Step 1.3, encode the training set data using the VAE model to form the data set S.
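The two loss terms of Step 1.2 can be sketched in plain Python as follows. This is a minimal sketch under assumptions the source does not state: the encoder is taken to output a diagonal Gaussian (mean `mu`, log-variance `log_var`), the given distribution is taken to be the standard normal, and the two terms are added with equal weight. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def reconstruction_loss(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Squared difference between a sample and its VAE reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def kl_to_standard_normal(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ).

    Assumption: Gaussian encoder and standard normal prior.
    """
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


Sample = Tuple[Sequence[float], Sequence[float], Sequence[float], Sequence[float]]


def vae_loss(batch: List[Sample]) -> float:
    """Mean over the batch of (reconstruction + KL); equal weighting is an assumption."""
    total = 0.0
    for x, x_hat, mu, log_var in batch:
        total += reconstruction_loss(x, x_hat) + kl_to_standard_normal(mu, log_var)
    return total / len(batch)
```

In a real implementation these terms would be computed on tensors inside a deep learning framework so that gradients can flow to the encoder and decoder; the list-based version only shows the arithmetic.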
Step 2, reduce the dimension of the query data using the VAE model, calculate the probability distribution of the query samples based on maximum likelihood estimation, and sample multiple groups of data from the probability distribution.
Step 2.1, maintain a queue m of length D for the input query samples.
Step 2.2, when a query sample is input, reduce it to a vector of dimension h through the VAE model and add it to queue m. When queue m is full, the sample at the head is removed and the new sample is added at the tail. Once the queue is full, the query behavior is judged each time a sample is input, which realizes real-time detection of model stealing.
Step 2.3, the VAE model fully extracts the sample features, and the dimensions of the reduced sample are mutually independent, so the probability density can be estimated separately for each dimension. Assume that the data in each dimension follow a parametric distribution $f(x;\theta)$ and that the sample data are $x_1, x_2, \ldots, x_D$. The probability density of the query samples is obtained by finding the parameter $\theta$ that maximizes formula (4), and from it the probability distributions $P_1, P_2, \ldots, P_h$ of the query samples in each dimension are calculated.
$$L(\theta) = \prod_{j=1}^{D} f(x_j;\theta) \quad (4)$$
Step 2.4, sample once from each $P_i$ to obtain one group of data, and repeat the sampling k times to obtain k groups of data, denoted $A = \{A_1, A_2, \ldots, A_k\}$.
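Step 2 as a whole can be sketched with the Python standard library. The sketch makes several assumptions that the source leaves open: each latent dimension is modeled as a Gaussian (for which the maximum likelihood estimates are the sample mean and the population standard deviation), each sampled group contains D vectors, and the encoded queries are replaced by random stand-in vectors because no VAE is available here. The constants and function names are hypothetical.

```python
import random
import statistics
from collections import deque

D = 8  # queue length (hypothetical value)
H = 2  # latent dimension h (hypothetical value)
K = 3  # number of sampled groups k (hypothetical value)


def fit_gaussian_mle(values):
    """MLE of a 1-D Gaussian: sample mean and population (biased) standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values, mu)
    return mu, sigma


def sample_groups(queue, k, rng):
    """Fit one Gaussian per latent dimension, then draw k groups of len(queue) vectors."""
    h = len(queue[0])
    params = [fit_gaussian_mle([z[d] for z in queue]) for d in range(h)]
    size = len(queue)
    return [
        [[rng.gauss(mu, sigma) for mu, sigma in params] for _ in range(size)]
        for _ in range(k)
    ]


rng = random.Random(0)
queue = deque(maxlen=D)  # a full deque drops its head when a new item is appended
for _ in range(D + 5):
    # Stand-in for an encoded query; a real system would call the VAE encoder here.
    queue.append([rng.gauss(0.0, 1.0) for _ in range(H)])

groups = sample_groups(list(queue), K, rng)
```

Using `deque(maxlen=D)` gives the head-removal behavior of Step 2.2 without explicit bookkeeping. A different parametric family would only change `fit_gaussian_mle` and the sampling call.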
Step 3, for each $A_i$, randomly sample the data set S once with sample capacity D, and denote the resulting data set as $B_i$.
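Drawing a reference group in Step 3 also yields the class count that is later used as a weight. The sketch below assumes that each encoded training sample in S has a known class label and that sampling is done without replacement; neither detail is specified in the source, and the function name and toy data are hypothetical.

```python
import random


def sample_reference(S, labels, size, rng):
    """Draw `size` encoded training samples and count the distinct classes among them.

    Assumption: sampling without replacement from S, with `labels[i]` the class of `S[i]`.
    """
    indices = rng.sample(range(len(S)), size)
    group = [S[i] for i in indices]
    class_count = len({labels[i] for i in indices})
    return group, class_count


# Toy encoded training set with 5 classes, for illustration only.
S = [[float(i), float(i % 3)] for i in range(30)]
labels = [i % 5 for i in range(30)]
B, t = sample_reference(S, labels, size=10, rng=random.Random(1))
```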
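Step 4, calculate the Wasserstein distance $W_i$ between $A_i$ and $B_i$ according to the Wasserstein distance formula, and compute a weighted sum of the $W_i$.
Step 4.1, calculate the Wasserstein distance $W_i$ between $A_i$ and $B_i$ according to formula (5),
$$W_i = W(A_i, B_i) = \inf_{\gamma \in \Pi(A_i, B_i)} \mathbb{E}_{(x,y)\sim\gamma}\left[\lVert x - y \rVert\right] \quad (5)$$
where $\Pi(A_i, B_i)$ is the set of joint distributions whose marginals are the distributions of $A_i$ and $B_i$.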
Step 4.2, denote the total number of classes in the training data set as T and the number of classes contained in $B_i$ as $t_i$, and calculate the weighted Wasserstein distance W according to formula (6).
$$W = \sum_{i=1}^{k} \frac{t_i}{T} W_i \quad (6)$$
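The distance and weighting of Step 4 can be sketched as follows. Note the simplification: instead of the full multi-dimensional Wasserstein distance, the sketch computes the exact 1-D Wasserstein-1 distance in each latent dimension (for two equal-size samples this is the mean absolute difference of the sorted values) and averages over dimensions, relying on the stated independence of the dimensions. This per-dimension averaging is an assumption made for illustration, not the method of the source, and all function names are hypothetical.

```python
def wasserstein_1d(a, b):
    """Exact Wasserstein-1 distance between two equal-size 1-D empirical samples."""
    if len(a) != len(b) or not a:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def group_distance(A, B):
    """Average of the per-dimension 1-D distances (simplification, see lead-in)."""
    h = len(A[0])
    per_dim = (
        wasserstein_1d([v[d] for v in A], [v[d] for v in B]) for d in range(h)
    )
    return sum(per_dim) / h


def weighted_distance(pairs, class_counts, total_classes):
    """Weighted sum of group distances, each weighted by t_i / T."""
    return sum(
        (t / total_classes) * group_distance(A, B)
        for (A, B), t in zip(pairs, class_counts)
    )
```

The weight `t / total_classes` shrinks the contribution of a reference group that happens to cover only a few classes, which is the effect the weighting is meant to achieve.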
Step 5, set a detection threshold δ and judge the query behavior.
Step 5.1, select the threshold δ so that the detector's false positive rate on normal query behavior equals 0.5%.
Step 5.2, compare W with δ. When W is greater than δ, the distribution of the query samples differs too much from the distribution of the training set samples, the query behavior does not match the characteristics of normal query behavior, and the query behavior is judged to be model stealing.
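One way to pick the threshold in Step 5.1 is to score a collection of known-normal query sequences and take an empirical quantile of those scores. The sketch below follows that idea; the calibration procedure, the function names, and the toy scores are assumptions, since the source only states the target false positive rate.

```python
def select_threshold(benign_scores, false_positive_rate=0.005):
    """Return delta such that at most the given fraction of benign scores exceed it."""
    if not benign_scores:
        raise ValueError("benign_scores must not be empty")
    if not 0.0 <= false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be in [0, 1)")
    ordered = sorted(benign_scores)
    n = len(ordered)
    # Number of benign scores allowed above delta; the epsilon guards against float round-off.
    allowed = int(false_positive_rate * n + 1e-9)
    return ordered[n - allowed - 1]


def is_model_stealing(w, delta):
    """Decision rule of Step 5.2: flag the query behavior when W exceeds delta."""
    return w > delta


# Toy benign scores 0..999, for illustration only.
benign = [float(i) for i in range(1000)]
delta = select_threshold(benign)
```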
Test results: the method accurately detects the three attack behaviors and normal query behavior, achieves a model stealing detection accuracy of 97.3%, and has a good detection effect.
The above detailed description is intended to illustrate the objects, aspects and advantages of the present invention, and it should be understood that the above detailed description is only exemplary of the present invention and is not intended to limit the scope of the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (3)

1. A model stealing detection method combining training set data distribution and W distance, characterized by comprising the following steps:
Step 1, train a VAE model and a target model using the training data set, and obtain the dimension-reduced data set S of the training data set using the VAE model,
Step 1.1, construct the VAE model framework,
Step 1.2, determine the VAE model loss function,
Step 1.3, encode the training set data using the VAE model to obtain the dimension-reduced data, which form the data set S,
Step 2, reduce the dimension of the query data using the VAE model, calculate the probability distribution of each dimension based on maximum likelihood estimation, and sample multiple groups of data from the probability distributions,
Step 2.1, maintain a queue m of length D for the input query samples,
Step 2.2, reduce each input sample to dimension h using the VAE model and add it to queue m; when m is full, remove the sample at the head of the queue and add the new sample at the tail,
Step 2.3, the VAE model extracts the feature information of the query samples to the greatest extent, and the dimensions of the reduced data are mutually independent; the probability density and probability distribution of each dimension of the data in queue m are therefore calculated based on maximum likelihood estimation, one group of data is obtained by sampling from the h per-dimension probability distributions, and this is repeated k times to obtain k groups of samples to be detected $A = \{A_1, A_2, \ldots, A_k\}$,
Step 3, for each group of data $A_i$ obtained in step 2.3, randomly sample from S a reference sample group $B_i$ of the same capacity to form a W distance (Wasserstein distance) calculation pair,
Step 4, calculate the Wasserstein distance of each pair of data sets from step 3, and weight and sum the results according to the ratio of the number of classes of the samples in $B_i$ to the total number of classes to obtain the final distance W,
Step 5, judge the query behavior using the final distance W.
2. The model stealing detection method combining training set data distribution and W distance according to claim 1, characterized in that: in step 2.3, the probability distribution of each dimension of the dimension-reduced query samples is calculated by maximum likelihood estimation. Each dimension of the data is assumed to follow a parametric distribution $f(x;\theta)$, the sample data are $x_1, x_2, \ldots, x_D$, and the parameter $\theta$ is obtained by maximizing the following expression,
$$L(\theta) = \prod_{j=1}^{D} f(x_j;\theta)$$
from which the probability density of the query samples is calculated and the probability distributions $P_1, P_2, \ldots, P_h$ of the query samples in each dimension are obtained. One sample is drawn from each $P_i$ to obtain a group of data, and the sampling is repeated k times to obtain k groups of data.
3. The model stealing detection method combining training set data distribution and W distance according to claim 1, characterized in that: in step 4, the Wasserstein distance is used to calculate the distribution distance between the two groups of data, and the weighted calculation is performed according to the ratio of the number of classes of the reference sample in each pair of data sets to the total number of classes, as shown in the following formula,
$$W = \sum_{i=1}^{k} \frac{t_i}{T}\, W(A_i, B_i)$$
where $A_i$ and $B_i$ are the sampled data sets, T is the total number of classes in the data set, and $t_i$ is the number of data classes contained in $B_i$.
CN202211346069.0A 2022-10-31 2022-10-31 Model Stealing Detection Method Combining Training Set Data Distribution and W Distance Pending CN115935179A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211346069.0A CN115935179A (en) 2022-10-31 2022-10-31 Model Stealing Detection Method Combining Training Set Data Distribution and W Distance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211346069.0A CN115935179A (en) 2022-10-31 2022-10-31 Model Stealing Detection Method Combining Training Set Data Distribution and W Distance

Publications (1)

Publication Number Publication Date
CN115935179A true CN115935179A (en) 2023-04-07

Family

ID=86696677

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211346069.0A Pending CN115935179A (en) 2022-10-31 2022-10-31 Model Stealing Detection Method Combining Training Set Data Distribution and W Distance

Country Status (1)

Country Link
CN (1) CN115935179A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117173565A (en) * 2023-09-01 2023-12-05 中国科学院空天信息创新研究院 Method and system for estimating vegetation coverage of large-area wheat by deep migration learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination