CN110334806A - A method of adversarial sample generation based on generative adversarial network - Google Patents

A method of adversarial sample generation based on generative adversarial network Download PDF

Info

Publication number
CN110334806A
CN110334806A (application CN201910459852.XA)
Authority
CN
China
Prior art keywords
sample
loss function
generator
adversarial
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910459852.XA
Other languages
Chinese (zh)
Inventor
贾西平
陈桂君
方刚
陈道鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Polytechnic Normal University
Original Assignee
Guangdong Polytechnic Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Polytechnic Normal University filed Critical Guangdong Polytechnic Normal University
Priority to CN201910459852.XA priority Critical patent/CN110334806A/en
Publication of CN110334806A publication Critical patent/CN110334806A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for generating adversarial examples based on a generative adversarial network, involving a generator G, a discriminator D, a spatial transformation module ST and a target classification network F. The generator G produces a perturbation, which is superimposed on the original sample to yield an adversarial example; the generator G is then trained according to the loss functions of the discriminator D and the target classification network F, finally yielding a trained generator G that produces adaptive adversarial examples for different input samples. By using a generative adversarial network, embedding an enhancement module based on spatial transformation, and performing adversarial training in an unsupervised manner, the invention improves the generalization ability and robustness of the attack model, and in turn enhances the transferability and robustness of the adversarial examples.

Description

A Method of Adversarial Example Generation Based on a Generative Adversarial Network

Technical Field

The present invention relates to the field of machine learning, and more specifically, to a method for generating adversarial examples based on a generative adversarial network.

Background Art

Adversarial attacks are a hot research topic in machine learning. An adversarial attack deceives a deep neural network into making wrong decisions by means of adversarial examples, i.e., new samples obtained by adding carefully trained perturbations, imperceptible to the human eye, to the original data samples.

Most current attack algorithms against deep neural networks (such as gradient-based and optimization-based methods) target the test process or the test dataset and require continuous white-box access to the model's architecture and parameters (for example, obtaining the gradient with respect to the input requires knowing the weights of the target network). However, current deep learning systems usually do not allow white-box access to the model for security reasons; they only allow query access, treating the model as a black box. Attacks under these conditions are called black-box attacks, but the success rate of most current black-box attacks is low, because most black-box attack methods rely on the transferability of adversarial examples. Transferability is a common property of adversarial examples: adversarial examples generated from a limited set of samples also attack other variable domains effectively.

Transferability is crucial in black-box attacks, where the target network structure and training dataset are unavailable. How to efficiently generate adversarial examples with strong transferability and stable attack performance is therefore a meaningful and challenging problem.

In summary, existing studies have demonstrated that current attack methods transfer to some extent between neural networks of the same structure trained on different data, and between neural networks of different structures trained on the same task; see [1] Goodfellow I J, Shlens J, Szegedy C, et al. Explaining and Harnessing Adversarial Examples [J]. International Conference on Learning Representations, 2015; [2] Kurakin A, Goodfellow I J, Bengio S, et al. Adversarial examples in the physical world [J]. arXiv: Computer Vision and Pattern Recognition, 2017; [3] Moosavi-Dezfooli S, Fawzi A, Frossard P, et al. DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks [J]. Computer Vision and Pattern Recognition, 2016: 2574-2582; and [4] Xiao C, Li B, Zhu J Y, et al. Generating Adversarial Examples with Adversarial Networks [J]. 2018. Nevertheless, adversarial examples still depend too heavily on the target model, which leads to poor transferability, low attack success rates and low attack efficiency.

Summary of the Invention

The present invention provides a method for generating adversarial examples based on a generative adversarial network, which adaptively generates transferable and robust adversarial examples for different input samples.

To solve the above technical problem, the technical solution of the present invention is as follows:

A method for generating adversarial examples based on a generative adversarial network, comprising the following steps:

S1: The original sample x is input into a generator G, which outputs a perturbation G(x); the loss function of the generator G is L_G. The perturbation G(x) is superimposed on the original sample x to obtain the adversarial example x′ = x + G(x). Unlike a general GAN, the goal of the generator is to produce a perturbation rather than the final image; that is, the output image equals the input image plus the output of the generator G, so the details and texture of the generated adversarial example are copied from the input image, largely preserving the details of the original image (a code sketch of the forward pass of steps S1 to S4 is given after step S6);

S2: The adversarial example x′ obtained in S1 is input into a discriminator D, which distinguishes the adversarial example x′ from the original sample x, yielding the loss function L_D of the discriminator D;

S3: The adversarial example x′ obtained in S1 is input into the enhancement module ST, which is based on spatial transformation and applies a spatial transformation operation to the adversarial example x′; the enhancement module ST outputs the affine-transformed adversarial example x′_st = T_θ(x + G(x)), where T_θ is the transformation function;

S4: The affine-transformed adversarial example x′_st is input into the target classification model F, yielding the loss function L_F of the target classification model F;

S5: An objective function L_GAN is constructed from the loss function L_G of the generator G, the loss function L_D of the discriminator D and the loss function L_F of the target classification model F, and is used to train the attack model GAN, yielding the trained generator G;

S6: The trained generator G is used to generate adaptive adversarial examples for different input samples.
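
For concreteness, the following is a minimal sketch of the forward pass of steps S1 to S4, written in PyTorch. The network architectures, the clamping of the adversarial example to the valid pixel range, and the random sampling of the affine parameters of T_θ are illustrative assumptions; the patent does not fix these details.

```python
# Sketch of the forward pass of steps S1-S4 (assumed PyTorch implementation).
import torch
import torch.nn.functional as F

def forward_pass(G, D, target_model, x):
    # S1: the generator outputs a perturbation, not the final image
    perturbation = G(x)
    x_adv = torch.clamp(x + perturbation, 0.0, 1.0)   # x' = x + G(x), kept in pixel range

    # S2: discriminator scores for the original and the adversarial sample
    d_real, d_fake = D(x), D(x_adv)

    # S3: enhancement module ST, here a small random affine transform T_theta
    n = x.size(0)
    theta = torch.eye(2, 3, device=x.device).repeat(n, 1, 1)
    theta = theta + 0.05 * torch.randn_like(theta)    # jitter scale is an assumption
    grid = F.affine_grid(theta, x_adv.size(), align_corners=False)
    x_adv_st = F.grid_sample(x_adv, grid, align_corners=False)

    # S4: the transformed adversarial sample is fed to the target classifier F
    logits = target_model(x_adv_st)
    return x_adv, d_real, d_fake, logits
```

Because F.affine_grid and F.grid_sample are differentiable, gradients from the classifier loss flow back through the spatial transformation into the generator, which is what allows the ST module to act as an augmentation during training.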

Preferably, the loss function of the generator G uses the L2 norm as a distance metric loss, expressed as follows:

L_G = max(0, ||G(x)||_2 - c)

where c is a user-defined constant that allows the user to specify the amount of perturbation added. Varying c produces a wide variety of adversarial examples, which helps in understanding the feature space of adversarial examples; this loss also stabilizes GAN training.
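
As an illustration, this hinge-style bound on the perturbation norm can be computed as follows (a sketch; the per-sample flattening and batch averaging are assumptions):

```python
import torch

def generator_loss(perturbation: torch.Tensor, c: float) -> torch.Tensor:
    # L_G = max(0, ||G(x)||_2 - c): penalize perturbation norms exceeding the budget c
    norms = perturbation.flatten(1).norm(p=2, dim=1)  # per-sample L2 norm
    return torch.clamp(norms - c, min=0.0).mean()     # hinge, averaged over the batch
```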

Preferably, the discriminator D is a binary neural network classifier.

Preferably, the loss function of the discriminator D is specifically:

L_D = log D(x) + log(1 - D(x + G(x))).
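
The following sketch computes this quantity; since the discriminator maximizes L_D, the value is returned negated so that a standard minimizing optimizer can be used. The numerical epsilon is an implementation detail not in the patent.

```python
import torch

def discriminator_loss(d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
    # L_D = log D(x) + log(1 - D(x + G(x))); the discriminator maximizes this,
    # so the negation is returned for use with a minimizing optimizer.
    eps = 1e-8  # numerical safety only
    return -(torch.log(d_real + eps) + torch.log(1.0 - d_fake + eps)).mean()
```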

Preferably, in a targeted attack, the loss function of the target classification model F is:

L_F = L(F(T_θ(x + G(x))), y′)

which represents the distance between the predicted class and the target class y′, where L is the cross-entropy loss function;

In an untargeted attack, the loss function of the target classification model F is:

L_F = -L(F(T_θ(x + G(x))), y)

which represents the negative distance between the predicted class and the original label class y, where L is the cross-entropy loss function.
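
Both cases reduce to a signed cross-entropy term, as in this sketch (the targeted/untargeted switch and the argument names are assumptions):

```python
import torch
import torch.nn.functional as F

def target_model_loss(logits: torch.Tensor, labels: torch.Tensor,
                      targeted: bool) -> torch.Tensor:
    if targeted:
        # L_F = L(F(T_theta(x + G(x))), y'): pull the prediction toward target class y'
        return F.cross_entropy(logits, labels)
    # L_F = -L(F(T_theta(x + G(x))), y): push the prediction away from true class y
    return -F.cross_entropy(logits, labels)
```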

Preferably, the objective function L_GAN is expressed as:

L_GAN = L_F + α·L_D + β·L_G

where α and β are constants controlling the relative importance of each term: L_G generates small perturbations, L_D encourages the generated adversarial examples to look similar to the original samples, and L_F optimizes the adversarial examples to raise the attack success rate. The generator G and the discriminator D are obtained by solving the min-max problem arg min_G max_D L_GAN, minimizing the generator loss function while maximizing the discriminator loss function.
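
A single alternating update of this min-max problem might look as follows, reusing the helper functions sketched above; the optimizer choice and the values of α, β and c are assumptions, as the patent leaves them as hyperparameters.

```python
def training_step(G, D, target_model, x, y, opt_G, opt_D,
                  alpha, beta, c, targeted=False, y_target=None):
    # Discriminator step: maximize L_D (minimize its negation)
    x_adv, d_real, d_fake, _ = forward_pass(G, D, target_model, x)
    loss_D = discriminator_loss(d_real, d_fake)
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # Generator step: minimize L_GAN = L_F + alpha * L_D + beta * L_G,
    # where the generator takes the opposite sign on the discriminator term
    x_adv, d_real, d_fake, logits = forward_pass(G, D, target_model, x)
    loss_F = target_model_loss(logits, y_target if targeted else y, targeted)
    loss_D_for_G = -discriminator_loss(d_real, d_fake)
    loss_G_term = generator_loss(x_adv - x, c)  # effective perturbation after clamping
    loss_GAN = loss_F + alpha * loss_D_for_G + beta * loss_G_term
    opt_G.zero_grad()
    loss_GAN.backward()
    opt_G.step()
    return loss_D.item(), loss_GAN.item()
```

Note that the generator's grad buffers are zeroed before its own backward pass, so any gradients that leaked into G during the discriminator step do not affect the generator update.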

Preferably, step S5 further comprises testing the trained generator G, specifically including the following steps (a code sketch follows step S5.2):

S5.1: The trained generator G is used to generate perturbations, thereby producing test adversarial examples, which are input into target classification networks of different structures to cause misclassification;

S5.2: A spatial transformation is applied to the test adversarial examples of S5.1 to produce new test adversarial examples, which are input into the target classification network to cause misclassification.
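
A testing loop along the lines of S5.1 and S5.2 could be sketched as follows, where `victim` stands in for a target classification network (possibly of a different structure than the one used in training) and `apply_st=True` re-applies a random affine transform; the helper interface is an assumption.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def attack_success_rate(G, victim, loader, apply_st=False):
    fooled, total = 0, 0
    for x, y in loader:
        x_adv = torch.clamp(x + G(x), 0.0, 1.0)       # S5.1: adversarial test samples
        if apply_st:                                   # S5.2: spatially transformed copies
            theta = torch.eye(2, 3).repeat(x.size(0), 1, 1)
            theta = theta + 0.05 * torch.randn_like(theta)
            grid = F.affine_grid(theta, x_adv.size(), align_corners=False)
            x_adv = F.grid_sample(x_adv, grid, align_corners=False)
        preds = victim(x_adv).argmax(dim=1)
        fooled += (preds != y).sum().item()
        total += y.numel()
    return fooled / total  # fraction of samples the victim network misclassifies
```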

Compared with the prior art, the beneficial effects of the technical solution of the present invention are:

Compared with existing attack algorithms, the method proposed by the present invention can efficiently generate attack samples for different input samples without accessing the original target classification model; query and generation are efficient, the attack model generalizes well and is robust, and the transferability and robustness of the adversarial examples are effectively improved, which in turn raises the black-box attack success rate. The method is widely applicable and highly general, achieving high attack success rates on datasets of different types and on models of different structures.

Description of Drawings

Fig. 1 is a flow chart of the method for generating adversarial examples based on a generative adversarial network.

Fig. 2 is a schematic diagram of the model of the method for generating adversarial examples based on a generative adversarial network; the dotted lines represent the training process and the solid lines represent the testing process.

Fig. 3 is a flow chart of the black-box attack robustness test.

Detailed Description of Embodiments

The accompanying drawings are for illustrative purposes only and shall not be construed as limiting this patent;

To better illustrate this embodiment, some parts in the drawings are omitted, enlarged or reduced; they do not represent the dimensions of the actual product;

Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.

The technical solution of the present invention is further described below with reference to the drawings and embodiments.

Example 1

A method for generating adversarial examples based on a generative adversarial network, as shown in Figs. 1 and 2, comprising the following steps:

S1: The original sample x is input into a generator G, which outputs a perturbation G(x); the loss function of the generator G is L_G. The perturbation G(x) is superimposed on the original sample x to obtain the adversarial example x′ = x + G(x). Unlike a general GAN, the goal of the generator is to produce a perturbation rather than the final image; that is, the output image equals the input image plus the output of the generator G, so the details and texture of the generated adversarial example are copied from the input image, largely preserving the details of the original image. The loss function of the generator G uses the L2 norm as a distance metric loss, expressed as follows:

L_G = max(0, ||G(x)||_2 - c)

where c is a user-defined constant;

S2: The adversarial example x′ obtained in S1 is input into a discriminator D, which distinguishes the adversarial example x′ from the original sample x; the discriminator D is a binary neural network classifier, and its loss function is L_D = log D(x) + log(1 - D(x + G(x)));

S3: The adversarial example x′ obtained in S1 is input into the enhancement module ST, which is based on spatial transformation and applies a spatial transformation operation to the adversarial example x′; the enhancement module ST outputs the affine-transformed adversarial example x′_st = T_θ(x + G(x)), where T_θ is the transformation function;

S4: The affine-transformed adversarial example x′_st is input into the target classification model F, yielding the loss function L_F of the target classification model F. In a targeted attack, the loss function of the target classification model F is:

L_F = L(F(T_θ(x + G(x))), y′)

which represents the distance between the predicted class and the target class y′, where L is the cross-entropy loss function;

In an untargeted attack, the loss function of the target classification model F is:

L_F = -L(F(T_θ(x + G(x))), y)

which represents the negative distance between the predicted class and the original label class y, where L is the cross-entropy loss function;

S5: An objective function L_GAN is constructed from the loss function L_G of the generator G, the loss function L_D of the discriminator D and the loss function L_F of the target classification model F, and is used to train the attack model GAN, yielding the trained generator G and discriminator D; the objective function L_GAN is expressed as:

L_GAN = L_F + α·L_D + β·L_G

where α and β are constants controlling the relative importance of each term: L_G generates small perturbations, L_D encourages the generated adversarial examples to look similar to the original samples, and L_F optimizes the adversarial examples to raise the attack success rate;

The method further includes testing the trained generator G, specifically comprising the following steps:

S5.1: The trained generator G is used to generate perturbations, thereby producing test adversarial examples, which are input into target classification networks of different structures to cause misclassification;

S5.2: A spatial transformation is applied to the test adversarial examples of S5.1 to produce new test adversarial examples, which are input into the target classification network to cause misclassification.

S6: The trained generator G is used to generate adaptive adversarial examples for different input samples.

In a specific implementation, a black-box attack is taken as an example for the robustness test; the specific flow is shown in Fig. 3.

1) Select the attack target F. ResNet-18, ResNet-34 and VGG-16 are trained on the CIFAR-10 dataset, and VGG-16 and a Multi-Scale CNN are trained on the GTSRB dataset, yielding two groups of five target models F = {F1, F2, F3, F4, F5}. Among them, ResNet-34 and VGG-16 serve as the gray-box and black-box models, respectively, for testing the adversarial examples.

2) Data preprocessing. To exclude the influence of classification errors caused by the performance of the network itself, the samples that the target classification network can classify correctly are selected as the original samples for generating adversarial examples.

3) Generate adversarial examples. The attack model is built according to the training process of Fig. 2 and is used to generate adversarial examples.

4) Test the effectiveness of the adversarial examples. If the generated adversarial examples can successfully deceive the target classification network F into misclassifying, the attack method of this embodiment is effective.

5) Test the transferability of the adversarial examples. If the generated adversarial examples can simultaneously deceive target classification networks F1 and F2 of different structures into misclassifying, the adversarial examples transfer well; otherwise, they transfer poorly. A higher attack success rate than the adversarial examples generated by the FGSM, BIM, DeepFool and advGAN methods indicates that the method of this embodiment effectively improves the transferability of adversarial examples.

6) Test the robustness of the adversarial examples. A spatial transformation is applied to the adversarial examples generated in step 3) to produce new adversarial examples; if these still successfully deceive the target classification network F2 of step 1), the generated adversarial examples are robust. A higher attack success rate than the adversarial examples generated by the FGSM, BIM, DeepFool and advGAN methods indicates that this implementation effectively improves the robustness of adversarial examples.

The experimental results on the CIFAR-10 dataset are shown in Table 1:

Table 1

The experimental results on the GTSRB dataset are shown in Table 2:

Table 2

The same or similar reference numerals correspond to the same or similar components;

Terms describing positional relationships in the drawings are for illustrative purposes only and shall not be construed as limiting this patent;

Obviously, the above embodiments of the present invention are merely examples for clearly illustrating the present invention, and do not limit the implementations of the present invention. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to exhaust all implementations here. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the claims of the present invention.

Claims (7)

1. A method for generating adversarial examples based on a generative adversarial network, characterized by comprising the following steps:
S1: inputting an original sample x into a generator G, the generator G outputting a perturbation G(x), the loss function of the generator G being L_G, and superimposing the perturbation G(x) on the original sample x to obtain an adversarial example x′ = x + G(x);
S2: inputting the adversarial example x′ obtained in S1 into a discriminator D, the discriminator D distinguishing the adversarial example x′ from the original sample x, to obtain the loss function L_D of the discriminator D;
S3: inputting the adversarial example x′ obtained in S1 into an enhancement module ST, the enhancement module ST applying a spatial transformation operation to the adversarial example x′, and outputting the affine-transformed adversarial example x′_st = T_θ(x + G(x)), where T_θ is the transformation function;
S4: inputting the affine-transformed adversarial example x′_st into a target classification model F, to obtain the loss function L_F of the target classification model F;
S5: constructing an objective function L_GAN from the loss function L_G of the generator G, the loss function L_D of the discriminator D and the loss function L_F of the target classification model F, for training the attack model GAN, to obtain a trained generator G;
S6: using the trained generator G to generate adaptive adversarial examples for different input samples.
2. The method for generating adversarial examples based on a generative adversarial network according to claim 1, characterized in that the loss function of the generator G uses the L2 norm as a distance metric loss, expressed as follows:
L_G = max(0, ||G(x)||_2 - c)
where c is a user-defined constant.
3. The method for generating adversarial examples based on a generative adversarial network according to claim 2, characterized in that the discriminator D is a binary neural network classifier.
4. The method for generating adversarial examples based on a generative adversarial network according to claim 3, characterized in that the loss function of the discriminator D is specifically:
L_D = log D(x) + log(1 - D(x + G(x))).
5. The method for generating adversarial examples based on a generative adversarial network according to claim 4, characterized in that, in a targeted attack, the loss function of the target classification model F is:
L_F = L(F(T_θ(x + G(x))), y′)
representing the distance between the predicted class and the target class y′, where L is the cross-entropy loss function;
and in an untargeted attack, the loss function of the target classification model F is:
L_F = -L(F(T_θ(x + G(x))), y)
representing the negative distance between the predicted class and the original label class y, where L is the cross-entropy loss function.
6. The method for generating adversarial examples based on a generative adversarial network according to claim 5, characterized in that the objective function L_GAN is expressed as:
L_GAN = L_F + α·L_D + β·L_G
where α and β are constants, and the generator G and the discriminator D are obtained by solving the min-max problem arg min_G max_D L_GAN, minimizing the generator loss function while maximizing the discriminator loss function.
7. The method for generating adversarial examples based on a generative adversarial network according to claim 6, characterized in that step S5 further comprises testing the trained generator G, specifically comprising the following steps:
S5.1: using the trained generator G to generate perturbations, thereby producing test adversarial examples, and inputting the test adversarial examples into target classification networks of different structures to cause misclassification;
S5.2: applying a spatial transformation to the test adversarial examples of S5.1 to produce new test adversarial examples, and inputting the new test adversarial examples into the target classification network to cause misclassification.
CN201910459852.XA 2019-05-29 2019-05-29 A method of adversarial sample generation based on generative adversarial network Pending CN110334806A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910459852.XA CN110334806A (en) 2019-05-29 2019-05-29 A method of adversarial sample generation based on generative adversarial network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910459852.XA CN110334806A (en) 2019-05-29 2019-05-29 A method of adversarial sample generation based on generative adversarial network

Publications (1)

Publication Number Publication Date
CN110334806A true CN110334806A (en) 2019-10-15

Family

ID=68140522

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910459852.XA Pending CN110334806A (en) 2019-05-29 2019-05-29 A method of adversarial sample generation based on generative adversarial network

Country Status (1)

Country Link
CN (1) CN110334806A (en)

Cited By (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110768971A (en) * 2019-10-16 2020-02-07 伍军 Confrontation sample rapid early warning method and system suitable for artificial intelligence system
CN112884143B (en) * 2019-11-29 2024-05-14 北京四维图新科技股份有限公司 Method for training robust deep neural network model
CN112884143A (en) * 2019-11-29 2021-06-01 北京四维图新科技股份有限公司 Method for training robust deep neural network model
CN114548300A (en) * 2019-12-20 2022-05-27 支付宝(杭州)信息技术有限公司 Method and device for explaining service processing result of service processing model
CN114548300B (en) * 2019-12-20 2024-05-28 支付宝(杭州)信息技术有限公司 Method and device for explaining service processing result of service processing model
CN111160217A (en) * 2019-12-25 2020-05-15 中山大学 Method and system for generating confrontation sample of pedestrian re-identification system
CN111160217B (en) * 2019-12-25 2023-06-23 中山大学 A method and system for generating adversarial samples for a pedestrian re-identification system
CN111210002A (en) * 2019-12-30 2020-05-29 北京航空航天大学 A method and system for multi-layer academic network community discovery based on generative adversarial network model
CN111163472A (en) * 2019-12-30 2020-05-15 浙江工业大学 Signal identification attack defense method based on generative countermeasure network
CN111163472B (en) * 2019-12-30 2022-10-04 浙江工业大学 Signal identification attack defense method based on generative countermeasure network
CN111210002B (en) * 2019-12-30 2022-01-28 北京航空航天大学 Multi-layer academic network community discovery method and system based on generation of confrontation network model
CN111241287A (en) * 2020-01-16 2020-06-05 支付宝(杭州)信息技术有限公司 Training method and device for generating generation model of confrontation text
CN111310802A (en) * 2020-01-20 2020-06-19 星汉智能科技股份有限公司 An Adversarial Attack Defense Training Method Based on Generative Adversarial Networks
CN111275115A (en) * 2020-01-20 2020-06-12 星汉智能科技股份有限公司 A Generative Adversarial Network-Based Adversarial Attack Sample Generation Method
CN111340066B (en) * 2020-02-10 2022-05-31 电子科技大学 An Adversarial Sample Generation Method Based on Geometric Vectors
CN111340066A (en) * 2020-02-10 2020-06-26 电子科技大学 An Adversarial Sample Generation Method Based on Geometric Vectors
US20220198790A1 (en) * 2020-02-21 2022-06-23 Tencent Technology (Shenzhen) Company Limited Training method and apparatus of adversarial attack model, generating method and apparatus of adversarial image, electronic device, and storage medium
CN111539184A (en) * 2020-04-29 2020-08-14 上海眼控科技股份有限公司 Text data manufacturing method and device based on deep learning, terminal and storage medium
CN111898645A (en) * 2020-07-03 2020-11-06 贵州大学 A transferable adversarial example attack method based on attention mechanism
CN111967592B (en) * 2020-07-09 2023-12-05 中国电子科技集团公司第三十六研究所 A method for generating adversarial image machine recognition based on separating positive and negative perturbations
CN111967592A (en) * 2020-07-09 2020-11-20 中国电子科技集团公司第三十六研究所 Method for generating counterimage machine recognition based on positive and negative disturbance separation
CN111967584A (en) * 2020-08-19 2020-11-20 北京字节跳动网络技术有限公司 Method, device, electronic equipment and computer storage medium for generating countermeasure sample
CN111738374A (en) * 2020-08-28 2020-10-02 北京智源人工智能研究院 Multi-sample adversarial perturbation generation method, device, storage medium and computing device
CN111738374B (en) * 2020-08-28 2020-11-24 北京智源人工智能研究院 Multi-sample adversarial perturbation generation method, device, storage medium and computing device
CN111818101A (en) * 2020-09-09 2020-10-23 平安国际智慧城市科技股份有限公司 Network security detection method and device, computer equipment and storage medium
CN111818101B (en) * 2020-09-09 2020-12-11 平安国际智慧城市科技股份有限公司 Network security detection method and device, computer equipment and storage medium
CN112162515B (en) * 2020-10-10 2021-08-03 浙江大学 An Adversarial Attack Method for Process Monitoring System
CN112162515A (en) * 2020-10-10 2021-01-01 浙江大学 An Adversarial Attack Method for Process Monitoring System
US20220180203A1 (en) * 2020-12-03 2022-06-09 International Business Machines Corporation Generating data based on pre-trained models using generative adversarial models
US12307377B2 (en) * 2020-12-03 2025-05-20 International Business Machines Corporation Generating data based on pre-trained models using generative adversarial models
WO2022116743A1 (en) * 2020-12-03 2022-06-09 International Business Machines Corporation Generating data based on pre-trained models using generative adversarial models
GB2617722A (en) * 2020-12-03 2023-10-18 Ibm Generating data based on pre-trained models using generative adversarial models
CN112818407B (en) * 2021-04-16 2021-06-22 中国工程物理研究院计算机应用研究所 Video privacy protection method based on generation countermeasure network
CN112818407A (en) * 2021-04-16 2021-05-18 中国工程物理研究院计算机应用研究所 Video privacy protection method based on generation countermeasure network
CN113158190A (en) * 2021-04-30 2021-07-23 河北师范大学 Malicious code countermeasure sample automatic generation method based on generation type countermeasure network
CN113177599B (en) * 2021-05-10 2023-11-21 南京信息工程大学 An enhanced sample generation method based on GAN
CN113177599A (en) * 2021-05-10 2021-07-27 南京信息工程大学 Enhanced sample generation method based on GAN
CN113361594A (en) * 2021-06-03 2021-09-07 安徽理工大学 Countermeasure sample generation method based on generation model
CN113361594B (en) * 2021-06-03 2023-10-20 安徽理工大学 Countermeasure sample generation method based on generation model
CN113222480A (en) * 2021-06-11 2021-08-06 支付宝(杭州)信息技术有限公司 Training method and device for confrontation sample generation model
CN113222480B (en) * 2021-06-11 2023-05-12 支付宝(杭州)信息技术有限公司 Training method and device for challenge sample generation model
CN113395280B (en) * 2021-06-11 2022-07-26 成都为辰信息科技有限公司 Anti-confusion network intrusion detection method based on generation countermeasure network
CN113395280A (en) * 2021-06-11 2021-09-14 成都为辰信息科技有限公司 Anti-confusion network intrusion detection method based on generation of countermeasure network
CN113505886A (en) * 2021-07-08 2021-10-15 深圳市网联安瑞网络科技有限公司 Countermeasure sample generation method, system, terminal and medium based on fuzzy test
CN113642772A (en) * 2021-07-13 2021-11-12 重庆科技学院 Reservoir identification and prediction method based on machine learning
CN113537381B (en) * 2021-07-29 2024-05-10 大连海事大学 Human rehabilitation exercise data enhancement method based on adversarial samples
CN113537381A (en) * 2021-07-29 2021-10-22 大连海事大学 Human Rehabilitation Motion Data Enhancement Method Based on Adversarial Samples
CN114332623A (en) * 2021-12-30 2022-04-12 广东工业大学 Method and system for generating countermeasure sample by utilizing spatial transformation
CN114663946A (en) * 2022-03-21 2022-06-24 中国电信股份有限公司 Countermeasure sample generation method, apparatus, device and medium
CN114708136A (en) * 2022-03-24 2022-07-05 南京信息工程大学 Black box reversible countermeasure sample generation method for model authorization access control
CN115115899A (en) * 2022-03-29 2022-09-27 浙大城市学院 Method, device and system for enhancing robustness of deep neural network and electronic equipment

Similar Documents

Publication Publication Date Title
CN110334806A (en) A method of adversarial sample generation based on generative adversarial network
Zhang et al. Defense against adversarial attacks using feature scattering-based adversarial training
Rozsa et al. Are accuracy and robustness correlated
CN110348475B (en) Confrontation sample enhancement method and model based on spatial transformation
Rozsa et al. LOTS about attacking deep features
He et al. Transferable sparse adversarial attack
CN114387449A (en) An image processing method and system for dealing with neural network adversarial attacks
CN113240080A (en) Prior class enhancement based confrontation training method
Wang et al. Adversarial detection by latent style transformations
CN113988293A (en) A method of adversarial generative network combining different levels of functions
Xu et al. ASQ-FastBM3D: An adaptive denoising framework for defending adversarial attacks in machine learning enabled systems
Wang et al. Understanding universal adversarial attack and defense on graph
CN113935396A (en) Adversarial sample attack method and related device based on manifold theory
CN117171762A (en) Single-step countermeasure training method and system based on data enhancement and step adjustment
Wang et al. Generating semantic adversarial examples via feature manipulation
Chen et al. Diffilter: Defending against adversarial perturbations with diffusion filter
Dong et al. Generalizable and discriminative representations for adversarially robust few-shot learning
Zhuang et al. PCAD: Towards ASR-robust spoken language understanding via prototype calibration and asymmetric decoupling
Yin et al. Adversarial attack, defense, and applications with deep learning frameworks
CN114970809A (en) Picture countermeasure sample generation method based on generation type countermeasure network
Zhang et al. Boosting Deepfake Detection Generalizability via Expansive Learning and Confidence Judgement
CN117011508A (en) Countermeasure training method based on visual transformation and feature robustness
Noroozi et al. Virtual adversarial training for semi-supervised verification tasks
Mukeri et al. Towards query efficient and derivative free black box adversarial machine learning attack
CN113487506A (en) Countermeasure sample defense method, device and system based on attention denoising

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20191015