CN115860112A - Countermeasure sample defense method and equipment based on model inversion method - Google Patents

Countermeasure sample defense method and equipment based on model inversion method

Info

Publication number
CN115860112A
Authority
CN
China
Prior art keywords
model
vector
space
sample
potential
Prior art date
Legal status
Granted
Application number
CN202310059601.9A
Other languages
Chinese (zh)
Other versions
CN115860112B (en)
Inventor
田博为
曹雨欣
王骞
龚雪鸾
沈超
李琦
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202310059601.9A priority Critical patent/CN115860112B/en
Publication of CN115860112A publication Critical patent/CN115860112A/en
Application granted granted Critical
Publication of CN115860112B publication Critical patent/CN115860112B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an adversarial sample defense method and device based on a model inversion method. To address the lack of a low-cost, efficient adversarial sample defense in the field of deep neural network security, it realizes defense through a model inversion mechanism built on a StyleGAN generator. Through an in-depth analysis of the StyleGAN generator, the method introduces reinforced information training and an improved proAdaIN operation, and applies proAdaIN to the feature-generation stage of the adversarial sample defense system. By adding noise, decoupling features, and distinguishing real samples from adversarial samples through conflicting semantics, it overcomes the high cost, low efficiency, and poor defense effect of traditional defense schemes.

Description

Countermeasure sample defense method and equipment based on model inversion method
Technical Field
The invention belongs to the field of artificial intelligence security and mainly relates to an adversarial sample defense method and device based on a model inversion method.
Background
Deep neural networks have achieved great success in many fields and are widely deployed in mission-critical applications such as autonomous driving, medical diagnosis, and trusted computing, so their security and trustworthiness have become a public concern. In these applications, erroneous decisions or predictions may cause catastrophic economic losses or even endanger lives.
An adversarial sample is a maliciously crafted input and one of the main threats faced by current DNNs: it introduces subtle malicious perturbations into the input in order to fool the DNN model. Such perturbations are usually imperceptible to humans and deviate only slightly from the original samples, yet they can greatly alter the features extracted by the target DNN model and lead to erroneous inference results. Depending on the attacker's knowledge, adversarial attacks fall into three categories: black-box attacks, white-box attacks, and adaptive attacks. Today almost all defense schemes are seriously affected by adaptive attacks, because an attacker can use knowledge of the defense scheme to craft new adversarial samples.
Latent space: in a neural network some variables are observable while others are not, typically the variables of intermediate or hidden layers, or variables passed between networks whose concrete values the user never needs to obtain. The distribution of these variables constitutes the latent space.
Model inversion is an application direction of deep learning whose idea is to recover characteristics of the original training data from the classification vectors of a classifier, which poses a serious security threat. Among the newer approaches is GAN-based model inversion. A GAN consists of two modules, a generative model G and a discriminative model D, trained adversarially: the generative model tries to produce pictures realistic enough to deceive the discriminative model, while the discriminative model tries to distinguish generated pictures from real ones. The net effect is that a high-dimensional output G(z), such as sound or an image, can be generated from a simple input vector z, which may be random noise. The technique is widely used in super-resolution, semantic segmentation, and similar tasks.
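As an illustration of the adversarial game described above, the following is a minimal sketch of one GAN training step in PyTorch. It uses tiny fully connected networks and hypothetical dimensions for brevity; it is not the StyleGAN architecture used by the invention.

```python
import torch
import torch.nn as nn

# Toy generator and discriminator (sizes are illustrative assumptions).
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def gan_step(real):                      # real: (B, 784) batch of flattened images
    z = torch.randn(real.size(0), 64)    # simple input vector: random noise z
    fake = G(z)                          # higher-dimensional output G(z)

    # Discriminative model: tell real pictures apart from generated ones.
    d_loss = bce(D(real), torch.ones(real.size(0), 1)) + \
             bce(D(fake.detach()), torch.zeros(real.size(0), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generative model: fool the discriminator into scoring fakes as real.
    g_loss = bce(D(fake), torch.ones(real.size(0), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```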
Existing defense research mitigates the influence of adversarial samples through input transformation, adversarial training of the model, or anomaly detection based on specific criteria. For example, gradient masking builds a robust model by hiding information an attacker would exploit, but it is easily bypassed by attacks with gradient-approximation capability; adversarial training improves robustness by adding adversarial samples during the training phase. Other methods change the loss function or the activation function, or rely on ensemble learning, self-supervised learning, artificially generated training samples, or re-weighting of misclassified samples. Although methods such as adversarial training and function modification are easy to implement, they inevitably reduce accuracy on legitimate inputs.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an adversarial sample defense method and device based on a model inversion method, and improves accuracy on legitimate inputs.
The model inversion method is based on StyleGAN and specifically comprises the following processes:
Step 1, performing preprocessing and spatial transformation on the latent space vector to be inverted to obtain a decoupled vector.
Because some characteristics of the original sample are easier to transform and control through hyper-parameters, the sample is preprocessed: the initial vector is decoupled to obtain a decoupled vector.
Step 2, verifying, via the perceptual path length, that the distribution of the w vector features has been successfully decoupled.
The verification relies on metrics such as the perceptual path length, which subdivides the interpolation path between two latent codes into small segments and is defined as the sum of the perceptual differences of those segments. In other words, the more continuous the feature gradient along a line segment in a feature map, the shorter the perceptual path length between its two end points. This metric indicates whether the distribution of sample features has become clearer, that is, whether decoupling has succeeded. Experimental results show that the perceptual path length between two end points is effectively reduced after eight MLP transforms.
Step 3, transferring the decoupled vector onto an initial vector obtained through reinforced information training, and performing proAdaIN processing after the trained affine transformation A.
A trained affine transformation is applied to the latent vector space W, mapping the decoupled vector to styles y = (y_s, y_b), which control an improved picture style transfer operation (proAdaIN) after each convolutional layer of the synthesis network g. In this way various styles are obtained simply through different affine transformations and applied to the finally generated picture.
Reinforced information training means that, instead of a single fixed, carefully constructed initialization vector, information from a public data set is used wherever possible: if no public data set exists, a default picture is used as the starting point; otherwise the public data set of the objects to be classified is combined, first fed into a classifier with the same input vector and classified, and then the cross-entropy loss between the classification result and the input vector is computed. The data with the smallest loss, after a multi-layer perceptron (MLP) operation, is taken as the vector fed into the subsequent synthesis network.
It is worth noting that proAdaIN is a clear improvement over the ordinary AdaIN function, which only normalizes the data and applies an affine change without any Gaussian-distribution processing. With the added Gaussian-distribution processing, the data are more robust than with AdaIN when facing more complex scenarios, and inversion can be achieved not only for human faces but also for objects and scenes.
Step 4, introducing a noise input.
The noise inputs are single-channel vectors of independent Gaussian noise; as long as the whole picture follows the correct distribution, the fine details can be randomized without affecting the perception of the image. The noise also helps reduce overfitting during training, a problem that can otherwise be addressed by regularization, enlarging the sample set, and similar measures.
The invention further designs an adversarial sample defense method based on the model inversion method, applied to a trained model that defends against adversarial samples through semantic conflict.
The model to be trained comprises a classifier, an encoder, a model inverter, and a similarity discriminator, and is trained by gradient descent in a supervised manner.
The classifier provides the model inverter with the target label to be inverted;
the encoder extracts the latent space vector of an input image;
the model inverter reconstructs, from the feature vector produced by the encoder, a picture with semantic information;
the similarity discriminator determines whether the image generated by the model inverter is sufficiently close to the corresponding input image.
Based on the same inventive concept, the invention also designs an electronic device comprising one or more processors and storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the operations performed by the adversarial sample defense method based on the model inversion method.
Based on the same inventive concept, the present invention also provides a computer-readable medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the operations performed by the adversarial sample defense method based on the model inversion method.
The invention has the advantages that:
according to the method, a model inversion mechanism is introduced into countermeasure sample defense, so that the problems that the safety and efficiency of the original countermeasure sample defense mechanism cannot be obtained at the same time and normal input is influenced are solved;
the invention improves the original model inversion mechanism, including the technologies of strengthening information training, proADAIN and the like, and respectively solves the problems of input data type limitation and data distribution limitation of the original model inversion technology in application.
Drawings
FIG. 1 is a diagram of an implementation of a model inverter in an embodiment of the invention.
FIG. 2 is a general flow chart of the model inverter applied to adversarial sample defense.
FIG. 3 is a flow chart of the specific steps of the present invention.
Detailed Description
The invention introduces a model inversion mechanism into adversarial sample defense. The implementation of the model inverter is shown in FIG. 1. The method covers both targeted and untargeted attack scenarios and also considers white-box attacks, which are challenging for the defender. A deep learning model protected by this method is safer and more reliable, defends better against various advanced adversarial attack methods, and incurs lower cost and time overhead during defense.
The method of the invention can be implemented in computer software. FIG. 1 shows the main flow of model inverter construction, illustrated with a two-layer synthesis network; FIG. 3 shows the general flow of the overall defense scheme. The process of the invention is described below using an experiment on an adversarial sample data set as an example.
The invention designs a model inversion method based on an improved StyleGAN network, which specifically comprises the following steps:
step 1, preprocessing a picture, and mapping a traditional potential vector space Z to a decoupled potential vector space W.
Picture preprocessing decouples the Z space of the original GAN into the W space. Decoupling has many different definitions, but a common feature is the existence of multiple linear subspaces in the latent space, each containing one distinct feature. In the original Z space these subspaces are entangled; after several successive mappings, which can accommodate the "de-warping" condition, the linear subspaces are transformed into regions that are more distinct and independent rather than entangled. It can therefore be expected that the mapped W space yields a decoupled distribution after such unsupervised training. Current research offers other ways to decouple entangled distributions; this scheme empirically adopts one of them.
To better achieve style uniqueness, mixing regularization is adopted instead of training on single pictures serially. Specifically, when generating the W-space vector, two latent vectors z_1 and z_2 are processed simultaneously in parallel, and their features w_1 and w_2 are mixed to different degrees before finally generating the required feature vector spaces. This regularization technique increases the flexibility of the generated pictures. The mixing is performed as w = θ_1·w_1 + θ_2·w_2, where θ_1 and θ_2 are adjustable hyper-parameters; adjusting them achieves different effects.
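The following is a minimal sketch of the Z-to-W mapping network and the mixing regularization w = θ_1·w_1 + θ_2·w_2 described above. The latent dimension, layer width, and class/function names are illustrative assumptions, not the patent's exact implementation.

```python
import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    """Non-linear mapping f: Z -> W, sketched as an 8-layer MLP."""
    def __init__(self, dim=512, layers=8):
        super().__init__()
        blocks = []
        for _ in range(layers):                      # eight MLP transforms
            blocks += [nn.Linear(dim, dim), nn.LeakyReLU(0.2)]
        self.f = nn.Sequential(*blocks)

    def forward(self, z):
        return self.f(z)                             # decoupled vector w = f(z)

def mixed_w(f, z1, z2, theta1=0.5, theta2=0.5):
    """Process two latent vectors in parallel and mix their W-space features."""
    w1, w2 = f(z1), f(z2)
    return theta1 * w1 + theta2 * w2                 # w = θ1·w1 + θ2·w2

f = MappingNetwork()
z1, z2 = torch.randn(4, 512), torch.randn(4, 512)
w = mixed_w(f, z1, z2, theta1=0.9, theta2=0.1)       # e.g. "mixing 90% W"
```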
Step 2, verifying the decoupling property of the W space using the perceptual path length metric.
Interpolation in the latent vector space Z can produce surprisingly non-linear changes; for example, features absent at both end points may appear in the middle of a linear interpolation path. This indicates that the Z space is entangled and the factors of variation are not properly separated. To quantify this effect, the invention uses the perceptual path length to measure how sharply the image changes during interpolation in the latent space. Simply put, a decoupled latent space yields noticeably smoother transitions than a highly twisted one.
The perceptual path length subdivides the interpolation path between two latent codes into small segments and is defined as the sum of the perceptual differences of those segments:
l_Z = E[ (1/ε²) · d( G(slerp(z_1, z_2; t)), G(slerp(z_1, z_2; t + ε)) ) ]
where the latent vectors satisfy z_1, z_2 ~ P(Z), with P(Z) the probability distribution of the Z-space vectors; G is the model inverter; d(·) evaluates the perceptual distance between target pictures; E denotes expectation; ε is the step size; t denotes an intermediate point between z_1 and z_2 with t ~ U(0, 1), the uniform distribution on [0, 1]; and slerp denotes spherical interpolation, the most suitable interpolation method in the normalized input latent space. To focus on core features rather than details and background, the pictures are cropped, and the expectation is taken over many samples. The W latent space is evaluated with a similar method:
l_W = E[ (1/ε²) · d( g(lerp(f(z_1), f(z_2); t)), g(lerp(f(z_1), f(z_2); t + ε)) ) ]
where g is the synthesis-network part of the model inverter, lerp denotes linear interpolation, and f is the spatial-transformation function used in preprocessing. According to the corresponding experimental results, after the Z space is mapped to the W space the perceptual distance of the target picture drops markedly, showing that the features of the original picture have been successfully decoupled. The experimental results are given in a later section.
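A minimal sketch of the two perceptual path length estimates above. The perceptual distance d(·) is assumed to be supplied by the caller (for example an LPIPS-style metric returning per-sample distances); the slerp helper and all function names are illustrative assumptions.

```python
import torch

def slerp(a, b, t, eps=1e-7):
    """Spherical interpolation between latent vectors a and b (batch, dim)."""
    a_n = a / (a.norm(dim=-1, keepdim=True) + eps)
    b_n = b / (b.norm(dim=-1, keepdim=True) + eps)
    omega = torch.acos((a_n * b_n).sum(-1, keepdim=True).clamp(-1 + eps, 1 - eps))
    return (torch.sin((1 - t) * omega) * a + torch.sin(t * omega) * b) / torch.sin(omega)

def ppl_z(G, d, z1, z2, eps=1e-4):
    """l_Z = E[(1/eps^2) d(G(slerp(z1,z2;t)), G(slerp(z1,z2;t+eps)))], t ~ U(0,1)."""
    t = torch.rand(z1.size(0), 1)
    x0 = G(slerp(z1, z2, t))
    x1 = G(slerp(z1, z2, t + eps))
    return (d(x0, x1) / eps ** 2).mean()

def ppl_w(g, f, d, z1, z2, eps=1e-4):
    """Same estimate in W space, using linear interpolation of w = f(z)."""
    t = torch.rand(z1.size(0), 1)
    w1, w2 = f(z1), f(z2)
    x0 = g(torch.lerp(w1, w2, t))
    x1 = g(torch.lerp(w1, w2, t + eps))
    return (d(x0, x1) / eps ** 2).mean()
```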
Step 3, transferring the vector corresponding to the processed picture onto an initial vector obtained through reinforced information training; at the same time, performing proAdaIN processing after the different affine transformations A.
Reinforced information training means that a single fixed, carefully constructed initialization vector is not the only option; with only one fixed vector, portability suffers in practice. To adapt the model of the invention to the recognition of more objects, a public data set similar to the classification task is sought, with two cases: if no public data set exists, a default picture is used as the starting point; otherwise the public data set of the objects to be classified is combined, first fed into a classifier with the same input vector and classified, and then the cross-entropy loss between the classification result and the input vector is computed. The least-lossy data is taken as the initial vector. The cross-entropy loss is:
L = −(1/N) · Σ_{i=1}^{N} Σ_{c=1}^{M} y_ic · log(p_ic)
where N is the number of samples, M is the total number of categories, y_ic is the indicator (sign) function, i.e. the true vector value of sample i for category c, and p_ic is the predicted value, i.e. the value in the Z space.
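The sketch below shows one possible reading of this selection step: each public-dataset candidate is classified, the cross-entropy against the target label to be inverted is computed, and the least-lossy candidate's latent vector is used as the initial vector. The function name, the use of the target label as the cross-entropy reference, and the component interfaces are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def select_initial_vector(classifier, encoder, public_images, target_label, default_w):
    """Pick the public-dataset item whose classification is closest (lowest
    cross-entropy) to the target label; fall back to a default picture's
    vector when no public data set is available."""
    if public_images is None:
        return default_w                              # start from a default picture
    best_loss, best_w = float("inf"), default_w
    for img in public_images:
        logits = classifier(img.unsqueeze(0))         # classify the candidate
        loss = F.cross_entropy(logits, torch.tensor([target_label]))
        if loss.item() < best_loss:                   # keep the least-lossy data
            best_loss = loss.item()
            best_w = encoder(img.unsqueeze(0))        # its vector feeds the MLP / synthesis net
    return best_w
```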
A trained affine transformation is applied to the latent vector space W, mapping the decoupled vector to styles y = (y_s, y_b) that control the improved picture style transfer operation (proAdaIN) after each convolutional layer of the synthesis network g. The proAdaIN operation is defined as:
ProAdaIN(t_i, y) = y_{s,i} · erfc( (t_i − μ(t_i)) / (√2 · σ(t_i)) ) + y_{b,i}
where erfc is the Gaussian (complementary) error function, which pushes the overall data distribution toward a normal distribution and thus improves the stability of the model; y_s and y_b are the different styles obtained by mapping the decoupled vector; the subscript i indexes the feature maps; μ is the mean and σ the standard deviation of the samples. Each feature map t_i is normalized separately and then scaled and biased using the style y, so the dimensionality of y is twice the number of feature maps of that layer. This markedly improves the inversion effect.
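A minimal sketch of a proAdaIN-style operation following the reconstruction above: per-feature-map normalization, an erfc squashing step, then style-wise scale and bias. The exact placement of erfc (and the √2 factor) is an assumption reconstructed from the text, not the patent's verified formula; shapes are illustrative.

```python
import torch

def pro_adain(t, y_s, y_b):
    """Assumed proAdaIN sketch: normalize each feature map, apply the Gaussian
    error function, then scale by y_s and shift by y_b from the style y."""
    mu = t.mean(dim=(2, 3), keepdim=True)             # per-map mean  μ(t_i)
    sigma = t.std(dim=(2, 3), keepdim=True) + 1e-8    # per-map std   σ(t_i)
    squashed = torch.erfc((t - mu) / (sigma * 2 ** 0.5))
    return y_s * squashed + y_b                       # style-wise scale and bias

# y is twice the number of feature maps of the layer: one scale and one bias per map.
x = torch.randn(2, 64, 32, 32)                        # feature maps t_i
style = torch.randn(2, 128)                           # 2 * 64 style parameters
y_s, y_b = style[:, :64].view(2, 64, 1, 1), style[:, 64:].view(2, 64, 1, 1)
out = pro_adain(x, y_s, y_b)
```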
Step 4, providing a direct method for generating stochastic detail by introducing explicit noise inputs.
These are single-channel vectors of independent Gaussian noise; a dedicated noise input is provided at each layer of the synthesis network and broadcast into all feature maps using trained feature scaling factors.
Many aspects of an image can be randomized, such as a person's hair or beard, shadow patterns, or the texture of asphalt on a road. As long as the entire picture follows the correct distribution, these fine details can be randomized without affecting the perception of the image. The noise also helps reduce overfitting during training.
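A minimal sketch of the per-layer noise injection described above, with a single-channel Gaussian noise input broadcast across all feature maps through a trained per-channel scaling factor. Class name and shapes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class NoiseInjection(nn.Module):
    """Explicit noise input for one synthesis-network layer."""
    def __init__(self, channels):
        super().__init__()
        self.scale = nn.Parameter(torch.zeros(1, channels, 1, 1))   # trained scaling factor

    def forward(self, feats):
        b, _, h, w = feats.shape
        noise = torch.randn(b, 1, h, w)               # single-channel Gaussian noise
        return feats + self.scale * noise             # broadcast into all feature maps

feats = torch.randn(2, 64, 32, 32)
out = NoiseInjection(64)(feats)
```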
Overfitting means that the hypothesis becomes overly strict in order to fit the training data consistently. As network structures have grown more complex and deeper over the history of neural networks, overfitting has gradually become a main factor limiting training performance. Its cause is mainly that during training the machine's attention shifts to places where it should not, i.e. to minutiae. For example, with too little training data or too deep a network, a deep neural network may mistake the background color of a traffic sign, say the blue of the sky, for the actual meaning of the sign, such as a speed limit of 80 km/h. Introducing Gaussian noise helps eliminate this phenomenon; overfitting can also be mitigated by regularization, increasing the number of training samples, and similar techniques.
The constructed model inverter is then connected into the overall design of the adversarial sample detection system. That is, an adversarial sample defense method based on the model inversion method proceeds as follows:
a confrontational sample defense model is designed, as shown in fig. 2, which includes a classifier, an encoder, a model inverter, and a similarity discriminator. In the process of model training, network parameters are continuously corrected mainly in a back propagation mode, and finally the network performance achieves the expected effect. So-called back propagation, namely, the parameters are corrected by calculating the gradient value of the loss function in the current parameter environment and then by means of gradient descent, and the gradient descent formula is as follows:
θ_j := θ_j − α · ∂J(θ)/∂θ_j
where θ_j is the parameter to be corrected, J(θ) is the loss function, and α is the learning rate, a small positive value between 0 and 1. It controls the step size of each iteration toward the minimum of the loss function, so it determines how fast the parameters are adjusted. If the learning rate is set too small, convergence becomes very slow or the optimization easily falls into a local optimum; if it is set too large, the gradient may oscillate around the minimum and may even fail to converge. Selecting an appropriate learning rate is therefore crucial for training; it varies with the training target, needs tuning, and is typically chosen as 0.001, 0.01, or 0.1.
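A minimal sketch of the plain gradient-descent update θ_j := θ_j − α · ∂J/∂θ_j and the effect of the learning rate. The toy loss J(θ) = (θ − 3)² is an assumption used only to make the example runnable.

```python
import torch

def sgd_step(params, loss_fn, lr=0.01):
    """One gradient-descent correction of every parameter."""
    loss = loss_fn(params)
    grads = torch.autograd.grad(loss, params)         # ∂J/∂θ_j at the current parameters
    with torch.no_grad():
        for p, g in zip(params, grads):
            p -= lr * g                               # α controls the step size
    return loss.item()

# Toy usage: a too-large lr oscillates around the minimum, a tiny lr converges slowly.
theta = [torch.tensor([0.0], requires_grad=True)]
for _ in range(100):
    sgd_step(theta, lambda p: ((p[0] - 3.0) ** 2).sum(), lr=0.1)
```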
FIG. 2 records the static structure of the adversarial sample detection system; how the entire defense system operates dynamically is described below.
The adversarial sample defense system consists of four parts: a classifier, an encoder, a model inverter, and a similarity discriminator. Specifically, the classifier provides the model inverter with the target label to be inverted; the encoder extracts the latent features of the input image x as the default initial vector for the model inverter; and the model inverter finally outputs a picture with semantic information. A well-trained model inverter generates images that closely match the labels given by the classifier.
Meanwhile, the similarity discriminator measures the similarity between the input image x and its inverted image x′. The sample x′ generated by the model inverter and the suspect sample x are fed into similarity detection together. The premise is that, given a label, the image x′ generated by the model inverter should resemble a normal image x of that label but show very poor similarity to an adversarial image of that label. The similarity discriminator D_aux therefore determines whether the generated image x′ is close enough to the corresponding input x, and thereby judges whether the sample x is adversarial.
To compute the difference between the generated image and the original image, the similarity discriminator proceeds as follows:
first, the difference x − x′ is used directly as the input. A positive error x_pos is then defined as x minus the composite image x′_y for the label y that matches x; this difference is the ordinary error produced by the picture itself. A negative error x_neg is defined as x minus the result x′_{y′} that the model inverter produces from the actually predicted label y′. The formulas are:
x_pos = x − x′_y,    x_neg = x − x′_{y′}
The hinge loss is used as the final loss: it pushes the discriminator's score for positive inputs above 1 and its score for negative inputs below −1.
L_{D_aux} = ReLU( 1 − D_aux(x_pos) ) + ReLU( 1 + D_aux(x_neg) )
where ReLU is the rectified linear unit. Based on this loss, the similarity between x and x′ can be judged: if x and x′ are considered similar, x is a benign sample; otherwise x is regarded as an adversarial sample.
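A minimal sketch of the similarity discriminator's training loss (matching the hinge loss reconstructed above) and of a test-time decision rule. The discriminator interface d_aux, the reconstruction tensors, and the zero decision threshold are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def discriminator_hinge_loss(d_aux, x, x_rec_true, x_rec_pred):
    """x_pos = x - x'_y  (inverse of the matching label)  -> score pushed above +1
       x_neg = x - x'_y' (inverse of the predicted label) -> score pushed below -1"""
    x_pos = x - x_rec_true
    x_neg = x - x_rec_pred
    return F.relu(1.0 - d_aux(x_pos)).mean() + F.relu(1.0 + d_aux(x_neg)).mean()

def is_adversarial(d_aux, x, x_rec, threshold=0.0):
    """At test time: a low similarity score flags x as an adversarial sample."""
    score = d_aux(x - x_rec).mean()
    return bool(score < threshold)
```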
The entire adversarial sample defense system operates as follows (see FIG. 2; a code sketch follows the three steps):
1. Generation: an unknown sample is input; the classifier predicts its label and the encoder outputs its latent vector z. The label and the latent vector are fed into the model inverter together.
2. Inference: the input and output of the target DNN model are provided to the model inverter, which is run to obtain the generated result. For correctly inferred legitimate inputs, the synthesized output attempts to reconstruct the input. For adversarial samples, because the purpose of the model inverter is to generate a picture from the feature vector, it creates a composite result that fits the wrong label as closely as possible rather than reconstructing the input.
3. Similarity detection: the similarity between the generated result and the suspect sample is measured. For a genuine sample the distance is small and similarity detection passes; for an adversarial sample the distance is large and similarity detection fails. The two kinds of samples can thus be distinguished, achieving the goal.
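A minimal sketch of this three-step flow end to end. All component interfaces (classifier, encoder, inverter, d_aux) and the decision threshold are assumptions used only to make the pipeline concrete.

```python
import torch

def detect(classifier, encoder, inverter, d_aux, x, threshold=0.0):
    """1) generation: predict the label and extract the latent vector,
       2) inference:  run the model inverter to reconstruct a picture,
       3) similarity: compare the reconstruction with the input via D_aux."""
    with torch.no_grad():
        label = classifier(x).argmax(dim=1)           # predicted label of the sample
        z = encoder(x)                                # latent vector of the sample
        x_rec = inverter(z, label)                    # picture with semantic information
        score = d_aux(x - x_rec).mean()               # similarity score
    return "benign" if score >= threshold else "adversarial"
```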
The specific contents of the experiment are as follows:
through the realization of reference of related documents and a computer software technology, the reasonability and the effectiveness of the decoupling effect of entanglement distribution of the model inverter from the Z space to the W space are verified, and meanwhile, the scheme is verified to be capable of well solving the defense problem of resisting sample attack.
The experiment uses two metrics, the perceptual path length and the degree of separation; for both, smaller values are better.
The resulting experimental data are shown in the following table:
Method                                      Perceptual path length   Degree of separation
Traditional model inverter (Z space)        412.0                    10.78
Model inverter of the invention (W space)   426.5                    3.52
+ W with added noise input                  193.7                    3.54
+ 50% mixing (W)                            226.7                    3.52
+ 90% mixing (W)                            240.5                    3.76
The experimental data show that the perceptual path length drops sharply once noise is added, indicating that the W space is cleaner and clearer. The same conclusion follows if only the path end points are measured (end-point perceptual path length) or if the degree of separation is considered: the W space in the model inverter is more decoupled than the original Z space.
For adversarial sample detection, white-box attacks were used to test the defense on the MNIST data set; against the PGD-L2 and BIM-L2 attack methods, the invention achieves a 100% defense rate. This shows that the function of defending against adversarial samples is realized.
The specific embodiments described herein are merely illustrative of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (10)

1. A model inversion method, characterized in that: the method is based on an improved StyleGAN network and specifically comprises the following steps:
step 1, performing preprocessing and spatial transformation on a latent space vector to be inverted to obtain a decoupled vector;
step 2, verifying that the feature distribution of the transformed vector is decoupled;
step 3, transferring the decoupled vector onto an initial vector obtained through reinforced information training, and simultaneously performing proAdaIN processing after different affine transformations;
the specific process of the reinforced information training is as follows: if no public data set exists, a default picture is used as the starting point; otherwise the public data set of the objects to be classified is combined, first fed into a classifier with the same input vector and classified; then the cross-entropy loss between the classification result and the input vector is computed, the data with the smallest loss is taken, and the output is obtained after a multi-layer perceptron (MLP) operation;
the proAdaIN operation is defined as:

ProAdaIN(t_i, y) = y_{s,i} · erfc( (t_i − μ(t_i)) / (√2 · σ(t_i)) ) + y_{b,i}

where erfc is the Gaussian (complementary) error function; y_s and y_b are the different styles obtained by mapping the decoupled vector; the subscript i indexes the feature maps; μ is the mean and σ the standard deviation of the samples; each feature map t_i is normalized separately and then scaled and biased using the style y, so the dimensionality of y is twice the number of feature maps of that layer;
step 4, introducing a noise input.
2. The model inversion method of claim 1, characterized in that:
the specific process of the step 1 is as follows:
the conventional latent vector space Z is mapped to the decoupled latent vector space W: given a latent vector z in the latent vector space Z, a non-linear mapping network f: Z → W produces

w = f(z)

and the mapping f is implemented with an 8-layer multi-layer perceptron.
3. The model inversion method of claim 1, characterized in that:
in step 2, mixing regularization is adopted: when the decoupled latent vector space W is generated, two given latent vectors z_1 and z_2 are processed simultaneously in parallel, and their features w_1 and w_2 are mixed to different degrees, finally generating the required feature spaces respectively.
4. A model inversion method according to claim 3, characterized in that:
in step 2, whether the decoupling succeeded is verified by measuring, with the perceptual path length, how the image changes during interpolation in the latent space; the perceptual path length l_Z subdivides the interpolation path between two latent codes into small segments and is defined as the sum of the perceptual differences of those segments:
l_Z = E[ (1/ε²) · d( G(slerp(z_1, z_2; t)), G(slerp(z_1, z_2; t + ε)) ) ]
where the latent vectors satisfy z_1, z_2 ~ P(Z), with P(Z) the probability distribution of the Z-space vectors; G is the model inverter; d(·) evaluates the perceptual distance between target pictures; E denotes expectation; ε is the step size; t denotes an intermediate point between z_1 and z_2 with t ~ U(0, 1), the uniform distribution on [0, 1]; and slerp denotes spherical interpolation; to focus on core features rather than details and background, the pictures are cropped and the expectation is taken over many samples; the spatially transformed vector space W is evaluated with a similar method:
l_W = E[ (1/ε²) · d( g(lerp(f(z_1), f(z_2); t)), g(lerp(f(z_1), f(z_2); t + ε)) ) ]
where g is the synthesis-network part of the model inverter, lerp denotes linear interpolation, and f is the spatial-transformation function used in preprocessing; according to the corresponding experimental results, after the Z space is mapped to the W space the perceptual distance of the target picture is markedly reduced, indicating that the features of the original picture have been successfully decoupled.
5. The model inversion method of claim 1, characterized in that:
the formula of the cross entropy loss in step 3 is as follows:
L = −(1/N) · Σ_{i=1}^{N} Σ_{c=1}^{M} y_ic · log(p_ic)
where N is the number of samples, M is the total number of categories, y_ic is the indicator (sign) function, i.e. the true vector value of sample i for category c, and p_ic is the predicted value, i.e. the value in the latent vector space Z.
6. The model inversion method of claim 1, characterized in that:
the step 4 introduces gaussian noise, which is provided as an input at each layer of the synthesis network, and is broadcast into all feature maps using trained feature scaling factors.
7. An adversarial sample defense method based on a model inversion method, characterized in that:
an adversarial sample defense model is designed and trained by gradient descent in a supervised learning mode; the adversarial sample defense model comprises a classifier, an encoder, a model inverter and a similarity discriminator, wherein the model inverter is generated by the model inversion method according to any one of claims 1 to 6;
the classifier provides the model inverter with the target label to be inverted;
the encoder extracts the latent space vector of an input image;
the model inverter reconstructs, from the feature vector produced by the encoder, a picture with semantic information;
the similarity discriminator determines whether the image generated by the model inverter is sufficiently close to the corresponding input image.
8. The adversarial sample defense method of claim 7, characterized in that:
the similarity discriminator adopts the following formula:
x_pos = x − x′_y,    x_neg = x − x′_{y′}
where the positive error x_pos is the difference between the input image x and the composite image x′_y for the label y that matches x, and the negative error x_neg is the input image x minus the result x′_{y′} that the model inverter produces from the actually predicted label y′;
the hinge loss function is used as the final loss, pushing the discriminator's score for positive inputs above 1 and its score for negative inputs below −1:
L_{D_aux} = ReLU( 1 − D_aux(x_pos) ) + ReLU( 1 + D_aux(x_neg) )
where D_aux is the similarity discriminator and L_{D_aux} its loss function; from this the similarity between x and x′ can be judged: if x and x′ are considered similar, x is a benign sample; otherwise x is regarded as an adversarial sample.
9. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the operations performed by the method of any one of claims 7-8.
10. A computer-readable medium having a computer program stored thereon, characterized in that: the program when executed by a processor implements the operations performed by the method of any of claims 7-8.
CN202310059601.9A 2023-01-17 2023-01-17 Model inversion method-based countermeasure sample defense method and equipment Active CN115860112B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310059601.9A CN115860112B (en) 2023-01-17 2023-01-17 Model inversion method-based countermeasure sample defense method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310059601.9A CN115860112B (en) 2023-01-17 2023-01-17 Model inversion method-based countermeasure sample defense method and equipment

Publications (2)

Publication Number Publication Date
CN115860112A true CN115860112A (en) 2023-03-28
CN115860112B CN115860112B (en) 2023-06-30

Family

ID=85657543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310059601.9A Active CN115860112B (en) 2023-01-17 2023-01-17 Model inversion method-based countermeasure sample defense method and equipment

Country Status (1)

Country Link
CN (1) CN115860112B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116631043A (en) * 2023-07-25 2023-08-22 南京信息工程大学 Natural countermeasure patch generation method, training method and device of target detection model
CN117095136A (en) * 2023-10-19 2023-11-21 中国科学技术大学 Multi-object and multi-attribute image reconstruction and editing method based on 3D GAN
CN117197589A (en) * 2023-11-03 2023-12-08 武汉大学 Target classification model countermeasure training method and system


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220172000A1 (en) * 2020-02-25 2022-06-02 Zhejiang University Of Technology Defense method and an application against adversarial examples based on feature remapping
CN111598805A (en) * 2020-05-13 2020-08-28 华中科技大学 Confrontation sample defense method and system based on VAE-GAN
CN113111945A (en) * 2021-04-15 2021-07-13 东南大学 Confrontation sample defense method based on transform self-encoder
CN113378644A (en) * 2021-05-14 2021-09-10 浙江工业大学 Signal modulation type recognition attack defense method based on generative countermeasure network
CN113222960A (en) * 2021-05-27 2021-08-06 哈尔滨工程大学 Deep neural network confrontation defense method, system, storage medium and equipment based on feature denoising
CN113449786A (en) * 2021-06-22 2021-09-28 华东师范大学 Reinforced learning confrontation defense method based on style migration
CN113554089A (en) * 2021-07-22 2021-10-26 西安电子科技大学 Image classification countermeasure sample defense method and system and data processing terminal
CN114757351A (en) * 2022-04-24 2022-07-15 北京理工大学 Defense method for resisting attack by deep reinforcement learning model
CN115048983A (en) * 2022-05-17 2022-09-13 北京理工大学 Counterforce sample defense method of artificial intelligence system based on data manifold topology perception
CN114724189A (en) * 2022-06-08 2022-07-08 南京信息工程大学 Method, system and application for training confrontation sample defense model for target recognition

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HUY H NGUYEN et al.: "Generating Master Faces for Use in Performing Wolf Attacks on Face Recognition Systems" *
张嘉楠 et al.: "A Survey of Defense Methods Against Adversarial Examples in Deep Learning" *
李明慧 et al.: "Adversarial Attacks and Defenses Against Deep Learning Models" *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116631043A (en) * 2023-07-25 2023-08-22 南京信息工程大学 Natural countermeasure patch generation method, training method and device of target detection model
CN116631043B (en) * 2023-07-25 2023-09-22 南京信息工程大学 Natural countermeasure patch generation method, training method and device of target detection model
CN117095136A (en) * 2023-10-19 2023-11-21 中国科学技术大学 Multi-object and multi-attribute image reconstruction and editing method based on 3D GAN
CN117095136B (en) * 2023-10-19 2024-03-29 中国科学技术大学 Multi-object and multi-attribute image reconstruction and editing method based on 3D GAN
CN117197589A (en) * 2023-11-03 2023-12-08 武汉大学 Target classification model countermeasure training method and system
CN117197589B (en) * 2023-11-03 2024-01-30 武汉大学 Target classification model countermeasure training method and system

Also Published As

Publication number Publication date
CN115860112B (en) 2023-06-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant