CN112949678A

CN112949678A - Method, system, equipment and storage medium for generating confrontation sample of deep learning model

Info

Publication number: CN112949678A
Application number: CN202110049467.5A
Authority: CN
Inventors: 蔺琛皓; 朱炯历; 沈超; 管晓宏
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2021-01-14
Filing date: 2021-01-14
Publication date: 2021-06-11
Anticipated expiration: 2041-01-14
Also published as: CN112949678B

Abstract

The invention belongs to the field of deep learning models, and discloses a method, a system, equipment and a storage medium for generating a confrontation sample of a deep learning model. By acquiring the sensitive matrix, the disturbance is realized based on the sensitive matrix, the distribution of disturbed pixel points becomes sparse, the disturbance is less easily perceived from the angle of human eye observation, and the two norms of the confrontation sample are greatly reduced from the angle of quantization.

Description

Deep learning model adversarial sample generation method, system, device and storage medium

技术领域technical field

本发明属于深度学习模型领域，涉及一种深度学习模型对抗样本生成方法、系统、设备及存储介质。The invention belongs to the field of deep learning models, and relates to a method, system, equipment and storage medium for generating a deep learning model confrontation sample.

背景技术Background technique

深度学习在许多任务中有比传统机器学习更优秀的表现，如图像分类、目标检测、语音识别、自然语言处理等。随着深度神经网络在各个领域的广泛运用与发展，其安全问题也愈发引人关注。对抗攻击，是指攻击者通过构造针对性的深度学习模型输入从而让深度学习模型做出误判，即与人判断结果不一致，在一般情况下，攻击者通过在良性样本基础上添加难以察觉的扰动，使得深度学习模型做出误判，得出与输入良性样本不一致的结果，这样“恶性”的样本被称作对抗样本。Deep learning has outperformed traditional machine learning in many tasks, such as image classification, object detection, speech recognition, natural language processing, etc. With the extensive application and development of deep neural networks in various fields, its security issues have also attracted more and more attention. Adversarial attack means that the attacker constructs targeted deep learning model input to make the deep learning model make a misjudgment, that is, the result is inconsistent with human judgment. The perturbation causes the deep learning model to make misjudgments and obtain results that are inconsistent with the input benign samples. Such "malignant" samples are called adversarial samples.

针对图像分类领域，对抗攻击的扰动即为轻微改变图像中各像素的值，从而使得深度学习模型在对该图像进行分类时出错。针对攻击者能获取信息的多少，对抗攻击分为黑盒和白盒两种情形。其中，白盒指攻击者能获得深度学习模型的内部参数，包括深度学习模型结构和权重，而黑盒攻击更贴近实际应用情况，攻击者只能获得深度学习模型的输出向量。在黑盒对抗攻击中，因为深度学习模型的梯度无法直接获得，因而一些利用进化策略进行优化的方法可以有相对不错的表现。进化策略，是一种模仿生物进化原理的求解参数优化问题的方法，通过不断产生新的个体(individual)，并通过个体之间的适应度(fitness)比较来进行淘汰，最终获得具有较高适应度的个体。由于进化算法的优化过程不需要梯度信息，因而并不受黑盒条件的约束。在图像分类中，根据使深度学习模型误判情况的不同，对抗攻击还分为有目标对抗攻击(targeted)和无目标对抗攻击(non-targeted)，若使得模型的分类结果与原本的类不同，则是无目标攻击，而若使模型将图像误判为任意选定的某个目标类，则该攻击为有目标攻击。In the field of image classification, the perturbation of an adversarial attack is to slightly change the value of each pixel in the image, so that the deep learning model makes an error in classifying the image. Adversarial attacks are divided into two types, black box and white box, according to how much information the attacker can obtain. Among them, the white box means that the attacker can obtain the internal parameters of the deep learning model, including the structure and weight of the deep learning model, while the black box attack is closer to the actual application situation, and the attacker can only obtain the output vector of the deep learning model. In the black-box adversarial attack, because the gradient of the deep learning model cannot be obtained directly, some optimization methods using evolutionary strategies can have relatively good performance. Evolution strategy is a method of solving parameter optimization problems that imitates the principle of biological evolution. It generates new individuals (individual) continuously, and eliminates them by comparing the fitness between individuals, and finally obtains a higher fitness. degree individual. Since the optimization process of the evolutionary algorithm does not require gradient information, it is not constrained by the black box condition. In image classification, the adversarial attack is also divided into targeted adversarial attack (targeted) and non-targeted adversarial attack (non-targeted) according to the different situations that make the deep learning model misjudgment. If the classification result of the model is different from the original class , it is an untargeted attack, and if the model misjudges the image as an arbitrarily selected target class, the attack is a targeted attack.

在衡量黑盒情况下对抗样本产生效率时，常采用深度学习模型查询次数作为指标，而在衡量一个对抗样本的“难察觉”程度时，常常用添加扰动的零范数(扰动像素个数)、二范数(各像素扰动值的平方和开根号)和无穷范数(各像素中最大的扰动值)作为指标，在有限无穷范数的情形下，进化算法往往拥有相对较高的效率，然而由于一些冗余的扰动存在，即一些像素点本不需要加扰动，导致整体而言，用进化算法生成的对抗样本的二范数普遍较高，扰动点的分布比较广泛，因而容易被人眼察觉。针对这一问题，如果简单地将二范数的大小加入到适应度中，会导致原适应度与二范数的平衡难以把控，从而导致深度学习模型的优化速度降低，或者二范数的大小几乎没有束缚力，使得生成的对抗样本二范数仍然较高，而对抗样本较高的二范数又导致，在使用该类对抗样本训练后的深度学习模型的稳定性及安全性较低。When measuring the generation efficiency of adversarial samples in the case of black boxes, the number of queries of the deep learning model is often used as an indicator, and when measuring the "indiscernibility" of an adversarial sample, the zero norm (number of disturbed pixels) with added disturbance is often used. , the second norm (the square and root of the perturbation value of each pixel) and the infinite norm (the largest perturbation value in each pixel) as indicators, in the case of finite infinite norm, evolutionary algorithms often have relatively high efficiency However, due to the existence of some redundant perturbations, that is, some pixels do not need to be perturbed, as a whole, the second norm of the adversarial samples generated by the evolutionary algorithm is generally high, and the distribution of perturbation points is relatively wide, so it is easy to be The human eye perceives. In response to this problem, if the size of the second norm is simply added to the fitness, the balance between the original fitness and the second norm will be difficult to control, resulting in a decrease in the optimization speed of the deep learning model, or the loss of the second norm. The size has almost no binding force, so that the second norm of the generated adversarial samples is still high, and the higher second norm of the adversarial samples leads to the low stability and security of the deep learning model after training with this type of adversarial samples. .

发明内容SUMMARY OF THE INVENTION

本发明的目的在于克服上述现有技术中，现有黑盒攻击方法产生的对抗样本的二范数普遍较高或者二范数的大小几乎没有束缚力的缺点，提供一种深度学习模型对抗样本生成方法、系统、设备及存储介质。The purpose of the present invention is to overcome the shortcomings of the above-mentioned prior art that the two-norm of the adversarial samples generated by the existing black-box attack methods is generally high or the size of the two-norm has almost no binding force, and provides a deep learning model adversarial sample. Generation method, system, device and storage medium.

为达到上述目的，本发明采用以下技术方案予以实现：To achieve the above object, the present invention adopts the following technical solutions to realize:

本发明第一方面，一种深度学习模型对抗样本生成方法，包括以下步骤：In a first aspect of the present invention, a method for generating an adversarial sample for a deep learning model includes the following steps:

S1：获取原始图像基于目标深度学习模型的敏感矩阵；S1: Obtain the sensitivity matrix of the original image based on the target deep learning model;

S2：根据预设的用于扰动的若干零范数及若干无穷范数，构建若干范数组；根据敏感矩阵及若干范数组，得到各范数组对应的扰动图；S2: Construct several norm groups according to a number of preset zero norms and a number of infinite norms for perturbation; obtain a perturbation map corresponding to each norm group according to the sensitivity matrix and several norm groups;

S3：将原始图像及各范数组对应的扰动图均查询目标深度学习模型，得到并根据各范数组对应的扰动图相对于原始图像的原始类预测概率的下降值，获得对抗零范数及对抗无穷范数；S3: Query the target deep learning model for both the original image and the perturbation map corresponding to each norm group, and obtain and obtain the anti-zero norm and the anti-zero norm according to the drop value of the predicted probability of the original class of the perturbation map corresponding to each norm group relative to the original image. infinity norm;

S4：根据对抗无穷范数构建预设数量对抗扰动矩阵，将攻击目标类预测概率为适应度，以适应度最大为优化目标，根据原始图像、敏感矩阵及对抗零范数，通过进化算法迭代优化各对抗扰动矩阵，每次迭代后均进行S5；S4: Construct a preset number of adversarial disturbance matrices according to the adversarial infinity norm, take the prediction probability of the attack target class as the fitness, and take the maximum fitness as the optimization goal, based on the original image, the sensitive matrix and the adversarial zero norm, iteratively optimize through the evolutionary algorithm For each adversarial perturbation matrix, S5 is performed after each iteration;

S5：当当前迭代优化后的对抗扰动矩阵中，至少存在一个目标对抗扰动矩阵时，将原始图像通过目标对抗扰动矩阵扰动，得到对抗样本并输出，迭代结束；否则，返回S4。S5: When there is at least one target adversarial disturbance matrix in the adversarial disturbance matrix after the current iteration optimization, the original image is perturbed by the target adversarial disturbance matrix to obtain adversarial samples and output, and the iteration ends; otherwise, return to S4.

本发明深度学习模型对抗样本生成方法进一步的改进在于：The further improvement of the adversarial sample generation method of the deep learning model of the present invention is:

所述S1的具体方法为：The specific method of the S1 is:

获取与目标深度学习模型具有同一分类目标的深度学习模型为参考模型；Obtain a deep learning model with the same classification target as the target deep learning model as a reference model;

逐一选取原始图像中各像素点进行如下步骤：将当前像素点进行预设大小的扰动，得到敏感图像，将敏感图像输入参考模型，得到当前像素点扰动后相对于扰动前参考模型输出的原始类预测概率的变化值，作为当前像素点的敏感值；Select each pixel point in the original image one by one and perform the following steps: perturb the current pixel point with a preset size to obtain a sensitive image, input the sensitive image into the reference model, and obtain the original class output of the current pixel point after perturbation relative to the reference model before perturbation The change value of the predicted probability is used as the sensitive value of the current pixel point;

将原始图像中各像素点的敏感值，按照原始图像中各像素点的位置排列，得到原始图像基于目标深度学习模型的敏感矩阵。Arrange the sensitivity values of each pixel in the original image according to the position of each pixel in the original image to obtain the sensitivity matrix of the original image based on the target deep learning model.

所述S2的具体方法为：The specific method of the S2 is:

根据预设的用于扰动的若干零范数及若干无穷范数，逐一选取若干零范数中的零范数至遍历若干零范数，将选取的零范数与若干无穷范数中的各无穷范数一一组合，得到若干范数组；According to a number of zero norms and a number of infinite norms preset for perturbation, select the zero norm of the number of zero norms one by one to traverse the number of zero norms, and compare the selected zero norm and the number of infinite norms. Infinite norms are combined one by one to obtain several norm arrays;

预设初始扰动矩阵，初始扰动矩阵内参数数量与原始图像中像素点个数相同，且按照原始图像中各像素点的位置排列，各参数的绝对值为1，初始扰动矩阵与敏感矩阵中同一位置的参数的符号一致；逐一选取范数组，至遍历若干范数组，将当前范数组中的无穷范数的一半与初始扰动矩阵中的各参数相乘，得到各范数组对应的扰动矩阵，获取当前范数组中的零范数，按照敏感值从大到小的顺序，将敏感矩阵中前零范数个敏感值置为1，其余敏感值置为0，得到各扰动矩阵对应的掩膜矩阵；The initial perturbation matrix is preset. The number of parameters in the initial perturbation matrix is the same as the number of pixels in the original image, and is arranged according to the position of each pixel in the original image. The absolute value of each parameter is 1, and the initial perturbation matrix is the same as the sensitivity matrix. The symbols of the parameters of the position are the same; select the norm groups one by one, and traverse several norm groups, multiply the half of the infinite norm in the current norm group with each parameter in the initial perturbation matrix, and obtain the perturbation matrix corresponding to each norm group, get For the zero norm in the current norm array, according to the order of sensitivity values from large to small, set the first zero norm sensitive values in the sensitive matrix to 1, and set the remaining sensitive values to 0 to obtain the mask matrix corresponding to each disturbance matrix. ;

将原始图像中的各像素点分别叠加各范数组对应的扰动矩阵，然后与各扰动矩阵对应的掩膜矩阵相乘，得到各范数组对应的扰动图。The perturbation matrix corresponding to each norm group is superimposed on each pixel in the original image, and then multiplied by the mask matrix corresponding to each perturbation matrix to obtain the perturbation map corresponding to each norm group.

所述S3的具体方法为：The specific method of the S3 is:

将原始图像及各范数组对应的扰动图均输入目标深度学习模型，得到原始图像的原始类预测概率以及各范数组对应的扰动图的原始类预测概率，获取各范数组对应的扰动图相对于原始图像的原始类预测概率的下降值；The original image and the perturbation map corresponding to each norm group are input into the target deep learning model, and the original class prediction probability of the original image and the original class prediction probability of the perturbation map corresponding to each norm group are obtained, and the perturbation map corresponding to each norm group is obtained. The drop value of the original class prediction probability of the original image;

将所有范数组中，同一零范数的范数组对应的扰动图相对于原始图像的原始类预测概率的下降值叠加，得到若干第一叠加值，选取若干第一叠加值中斜率变化最大的第一叠加值对应的零范数为对抗零范数；In all norm groups, the perturbation map corresponding to the norm group of the same zero norm is superimposed relative to the drop value of the original class prediction probability of the original image to obtain several first superimposed values. The zero norm corresponding to the first superposition value is the anti-zero norm;

将所有范数组中，同一无穷范数的范数组对应的扰动图相对于原始图像的原始类预测概率的下降值叠加，得到若干第二叠加值，选取若干第二叠加值中斜率变化最大的第二叠加值对应的无穷范数为对抗无穷范数。In all norm groups, the perturbation map corresponding to the norm group of the same infinite norm is superimposed relative to the drop value of the original class prediction probability of the original image to obtain a number of second superposition values, and select the number of the second superposition values with the largest slope change. The infinity norm corresponding to the two superposition values is the anti-infinity norm.

所述S4中的进化算法为：自适应差分进化策略或线性张成的协方差矩阵自适应进化策略。The evolution algorithm in S4 is: an adaptive differential evolution strategy or a linearly stretched covariance matrix adaptive evolution strategy.

当进化算法为自适应差分进化策略时，所述S4的具体方法为：When the evolutionary algorithm is an adaptive differential evolution strategy, the specific method of S4 is:

S401：在[-对抗无穷范数，对抗无穷范数]中随机取值作为对抗扰动矩阵中的参数，构建预设数量的对抗扰动矩阵；对抗扰动矩阵中的各参数与原始图像的各像素点一一对应；S401: Randomly select values in [- against infinity norm, against infinity norm] as parameters in the adversarial disturbance matrix, and construct a preset number of adversarial disturbance matrices; each parameter in the adversarial disturbance matrix and each pixel of the original image one-to-one correspondence;

S402：根据自适应比例因子及变异公式，将预设数量的对抗扰动矩阵进行变异，得到预设数量的变异对抗扰动矩阵，将对抗扰动矩阵及变异对抗扰动矩阵均作为个体；根据对抗零范数及敏感矩阵，获取各个体对应的掩膜矩阵；将原始图像中的各像素点分别叠加各个体，然后与各个体对应的掩膜矩阵相乘，得到各个体对应的扰动图；将各个体对应的扰动图输入目标深度学习模型，得到各个体对应的扰动图的适应度，按照适应度从大到小的顺序，选取预设数量各个体，作为迭代优化后的对抗扰动矩阵；S402: According to the adaptive scale factor and the mutation formula, mutate a preset number of anti-disturbance matrices to obtain a preset number of mutant anti-disturbance matrices, and use both the anti-disturbance matrix and the mutant anti-disturbance matrix as individuals; according to the anti-zero norm and the sensitivity matrix to obtain the mask matrix corresponding to each individual; each pixel in the original image is superimposed on each individual, and then multiplied with the mask matrix corresponding to each individual to obtain the perturbation map corresponding to each individual; each individual corresponds to The perturbation map is input into the target deep learning model, and the fitness of the perturbation map corresponding to each individual is obtained. According to the order of fitness from large to small, a preset number of each individual is selected as the iteratively optimized confrontation perturbation matrix;

S403：以迭代优化后的对抗扰动矩阵更新对抗扰动矩阵，迭代进行S402，每次迭代后均进行S5。S403: Update the adversarial disturbance matrix with the iteratively optimized adversarial disturbance matrix, perform S402 iteratively, and perform S5 after each iteration.

当进化算法为线性张成的协方差矩阵自适应进化策略时，所述S4的具体方法为：When the evolutionary algorithm is a linearly stretched covariance matrix adaptive evolution strategy, the specific method of S4 is:

S411：随机生成k个n维向量作为张成空间的基底；通过预设的k维高斯分布，随机生成预设数量个k维向量，以k维向量为权重乘以基底，得到预设数量个n维样本向量，将n维样本向量以对抗无穷范数限幅后得到对抗扰动矩阵；S411: Randomly generate k n-dimensional vectors as the basis of the stretched space; randomly generate a preset number of k-dimensional vectors through a preset k-dimensional Gaussian distribution, multiply the basis by the k-dimensional vector as a weight, and obtain a preset number of The n-dimensional sample vector, the anti-perturbation matrix is obtained by limiting the n-dimensional sample vector with the anti-infinite norm;

S412：根据对抗零范数及敏感矩阵，获取各抗扰动矩阵对应的掩膜矩阵；将原始图像中的各像素点分别叠加各对抗扰动矩阵，然后与各抗扰动矩阵对应的掩膜矩阵相乘，得到各抗扰动矩阵对应的扰动图；将各抗扰动矩阵对应的扰动图输入目标深度学习模型，得到各抗扰动矩阵对应的扰动图的适应度，按照适应度从大到小的顺序，选取设定数量个抗扰动矩阵，根据设定数量个抗扰动矩阵更新k维高斯分布，根据更新后的k维高斯分布随机生成预设数量个优化k维向量，以优化k维向量为权重乘以基底，得到预设数量个优化n维样本向量，将优化n维样本向量以对抗无穷范数限幅后得到迭代优化后的对抗扰动矩阵；S412: According to the anti-zero norm and the sensitive matrix, obtain a mask matrix corresponding to each anti-disturbance matrix; superimpose each pixel point in the original image with each anti-disturbance matrix, and then multiply the mask matrix corresponding to each anti-disturbance matrix , obtain the disturbance graph corresponding to each anti-disturbance matrix; input the disturbance graph corresponding to each anti-disturbance matrix into the target deep learning model to obtain the fitness of the disturbance graph corresponding to each anti-disturbance matrix. Set a number of anti-disturbance matrices, update the k-dimensional Gaussian distribution according to the set number of anti-disturbance matrices, randomly generate a preset number of optimized k-dimensional vectors according to the updated k-dimensional Gaussian distribution, and multiply the optimized k-dimensional vector by the weight base, obtain a preset number of optimized n-dimensional sample vectors, and optimize the n-dimensional sample vectors to resist infinite norm limiting to obtain an iteratively optimized anti-disturbance matrix;

S413：以迭代优化后的对抗扰动矩阵更新对抗扰动矩阵，迭代进行S412，每次迭代后均进行S5。S413: Update the adversarial disturbance matrix with the iteratively optimized adversarial disturbance matrix, perform S412 iteratively, and perform S5 after each iteration.

本发明第二方面，一种深度学习模型对抗样本生成系统，包括敏感矩阵获取模块、扰动图获取模块、优化模块、参数确定模块以及输出模块；In a second aspect of the present invention, a deep learning model adversarial sample generation system includes a sensitivity matrix acquisition module, a disturbance map acquisition module, an optimization module, a parameter determination module, and an output module;

敏感矩阵获取模块用于获取原始图像基于目标深度学习模型的敏感矩阵；The sensitivity matrix acquisition module is used to acquire the sensitivity matrix of the original image based on the target deep learning model;

扰动图获取模块用于根据预设的用于扰动的若干零范数及若干无穷范数，构建若干范数组；根据敏感矩阵及若干范数组，得到各范数组对应的扰动图；The perturbation map acquisition module is used to construct several norm groups according to a number of zero norms and a number of infinite norms preset for perturbation; according to the sensitivity matrix and several norm groups, obtain a perturbation map corresponding to each norm group;

参数确定模块用于将原始图像及各范数组对应的扰动图均查询目标深度学习模型，得到并根据各范数组对应的扰动图相对于原始图像的原始类预测概率的下降值，获得对抗零范数及对抗无穷范数；The parameter determination module is used to query the target deep learning model for both the original image and the perturbation map corresponding to each norm group, and obtain and predict the drop value of the original class prediction probability of the perturbation map corresponding to each norm group relative to the original image to obtain the anti-zero norm numbers and against the infinity norm;

优化模块用于根据对抗无穷范数构建预设数量对抗扰动矩阵，将攻击目标类预测概率为适应度，以适应度最大为优化目标，根据原始图像、敏感矩阵及对抗零范数，通过进化算法迭代优化各对抗扰动矩阵，每次迭代后均触发输出模块；The optimization module is used to construct a preset number of adversarial disturbance matrices according to the adversarial infinity norm, take the prediction probability of the attack target class as the fitness, and take the maximum fitness as the optimization goal. Iteratively optimize each anti-disturbance matrix, and trigger the output module after each iteration;

输出模块用于当当前迭代优化后的对抗扰动矩阵中，至少存在一个目标对抗扰动矩阵时，将原始图像通过目标对抗扰动矩阵扰动，得到对抗样本并输出，迭代结束；否则，触发优化模块。The output module is used to perturb the original image through the target adversarial perturbation matrix when there is at least one target adversarial perturbation matrix in the adversarial perturbation matrix after the current iterative optimization, to obtain adversarial samples and output them, and the iteration ends; otherwise, the optimization module is triggered.

本发明第三方面，一种计算机设备，包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序，所述处理器执行所述计算机程序时实现上述深度学习模型对抗样本生成方法的步骤。In a third aspect of the present invention, a computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the above-mentioned deep learning when executing the computer program The steps of the model adversarial example generation method.

本发明第四方面，一种计算机可读存储介质，所述计算机可读存储介质存储有计算机程序，其特征在于，所述计算机程序被处理器执行时实现上述深度学习模型对抗样本生成方法的步骤。A fourth aspect of the present invention provides a computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the above-mentioned deep learning model adversarial sample generation method are implemented .

与现有技术相比，本发明具有以下有益效果：Compared with the prior art, the present invention has the following beneficial effects:

本发明深度学习模型对抗样本生成方法，通过原始图像基于目标深度学习模型的敏感矩阵限制原始图像的扰动区域，使得扰动点的分布变得稀疏，从而使对抗扰动更不易被察觉，从量化的角度，对抗扰动的二范数得到了较大的下降。同时，通过结合敏感矩阵进行合理的“试探”，尝试不同零范数及无穷范数的组合，自动地、动态地为每次对抗攻击调节零范数及无穷范数，提高了对抗攻击效率与限定查询次数时的成功率。整个流程是插件式的，其中迭代优化策略部分可以结合实际应用情形替换成不同的进化算法用于优化，具有方便、快捷的特点。继而，通过生成的低二范数的对抗样本训练目标深度学习模型，能够进一步提高目标深度学习模型预测的准确性，提升目标深度学习模型的鲁棒性。The deep learning model adversarial sample generation method of the present invention limits the disturbance area of the original image based on the sensitive matrix of the target deep learning model, so that the distribution of disturbance points becomes sparse, thereby making the adversarial disturbance more difficult to detect. From the perspective of quantification , the two-norm against perturbation has been greatly reduced. At the same time, by combining the sensitive matrix to conduct reasonable "trials", trying different combinations of zero norm and infinite norm, the zero norm and infinite norm are automatically and dynamically adjusted for each adversarial attack, which improves the efficiency of adversarial attacks and the The success rate when the number of queries is limited. The whole process is plug-in, in which the iterative optimization strategy part can be replaced by different evolutionary algorithms for optimization in combination with the actual application situation, which is convenient and fast. Then, training the target deep learning model through the generated adversarial samples with low two-norm can further improve the prediction accuracy of the target deep learning model and improve the robustness of the target deep learning model.

进一步的，通过不同深度学习模型之间敏感矩阵高度相似的特点，不直接对目标深度学习模型进行查询，而通过参考模型来获得敏感矩阵，减少了查询次数。Further, by virtue of the highly similar sensitivity matrix between different deep learning models, the target deep learning model is not directly queried, but the sensitivity matrix is obtained through the reference model, which reduces the number of queries.

附图说明Description of drawings

图1为本发明的深度学习模型对抗样本生成方法流程框图；Fig. 1 is a flow chart of a method for generating an adversarial sample of a deep learning model of the present invention;

图2为本发明的深度学习模型对抗样本生成方法原理图；2 is a schematic diagram of a method for generating an adversarial sample of a deep learning model of the present invention;

图3为本发明的进化算法原理框图。FIG. 3 is a schematic block diagram of the evolutionary algorithm of the present invention.

具体实施方式Detailed ways

为了使本技术领域的人员更好地理解本发明方案，下面将结合本发明实施例中的附图，对本发明实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅仅是本发明一部分的实施例，而不是全部的实施例。基于本发明中的实施例，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例，都应当属于本发明保护的范围。In order to make those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only Embodiments are part of the present invention, but not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

需要说明的是，本发明的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象，而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换，以便这里描述的本发明的实施例能够以除了在这里图示或描述的那些以外的顺序实施。此外，术语“包括”和“具有”以及他们的任何变形，意图在于覆盖不排他的包含，例如，包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元，而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。It should be noted that the terms "first", "second" and the like in the description and claims of the present invention and the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific sequence or sequence. It is to be understood that the data so used may be interchanged under appropriate circumstances such that the embodiments of the invention described herein can be practiced in sequences other than those illustrated or described herein. Furthermore, the terms "comprising" and "having" and any variations thereof, are intended to cover non-exclusive inclusion, for example, a process, method, system, product or device comprising a series of steps or units is not necessarily limited to those expressly listed Rather, those steps or units may include other steps or units not expressly listed or inherent to these processes, methods, products or devices.

下面结合附图对本发明做进一步详细描述：Below in conjunction with accompanying drawing, the present invention is described in further detail:

参见图1和2，本发明提供一种深度学习模型对抗样本生成方法，基于敏感矩阵，设计了一个低扰动的进化算法黑盒对抗攻击插件式框架，来实现深度学习模型对抗样本生成，具体的，包括以下步骤。Referring to Figures 1 and 2, the present invention provides a method for generating adversarial samples for deep learning models. Based on the sensitivity matrix, a low-disturbance evolutionary algorithm black-box adversarial attack plug-in framework is designed to realize the generation of adversarial samples for deep learning models. , including the following steps.

S1：获取原始图像基于目标深度学习模型的敏感矩阵。S1: Obtain the sensitivity matrix of the original image based on the target deep learning model.

具体的，获取与目标深度学习模型具有同一分类目标的深度学习模型为参考模型M_ref；逐一选取原始图像中各像素点进行如下步骤：Specifically, obtaining a deep learning model with the same classification target as the target deep learning model is the reference model M _ref ; each pixel in the original image is selected one by one to perform the following steps:

将当前像素点进行预设大小的扰动，得到敏感图像，将敏感图像输入参考模型，得到当前像素点扰动后相对于扰动前参考模型输出的原始类预测概率的变化值，作为当前像素点的敏感值；将原始图像中各像素点的敏感值，按照原始图像中各像素点的位置排列，得到原始图像基于目标深度学习模型的敏感矩阵，敏感矩阵由每个像素点的敏感度的值组成。The current pixel is perturbed by a preset size to obtain a sensitive image, and the sensitive image is input into the reference model to obtain the change value of the original class prediction probability output by the current pixel after perturbation relative to the reference model before perturbation, as the sensitivity of the current pixel. value; arrange the sensitivity values of each pixel in the original image according to the position of each pixel in the original image to obtain the sensitivity matrix of the original image based on the target deep learning model, and the sensitivity matrix is composed of the sensitivity value of each pixel.

具体地，参考模型M_ref对原始图像中每个像素点的敏感度，定义为对该像素点进行某个固定大小的扰动导致的参考模型M_ref的输出对模型输出向量中原类别预测概率的变化值，这一过程用于估计该维度，即该像素点的偏导数的值，以此作为当前像素点扰动后相对于扰动前参考模型输出的原始类预测概率的变化值，偏导数的绝对值越大，则越敏感。Specifically, the sensitivity of the reference model M _ref to each pixel in the original image is defined as the change in the predicted probability of the original category in the model output vector caused by the output of the reference model M _ref caused by a certain fixed-size perturbation to the pixel. value, this process is used to estimate the dimension, that is, the value of the partial derivative of the pixel point, as the change value of the original class prediction probability output by the reference model before the disturbance of the current pixel point after the disturbance, the absolute value of the partial derivative The bigger it is, the more sensitive it is.

S2：根据预设的用于扰动的若干零范数及若干无穷范数，构建若干范数组；根据敏感矩阵及若干范数组，得到各范数组对应的扰动图。S2: Construct several norm groups according to a number of zero norms and a number of infinite norms preset for perturbation; obtain a perturbation map corresponding to each norm group according to the sensitivity matrix and several norm groups.

将S1生成的敏感矩阵用于目标深度学习模型的攻击，由于不同深度学习模型的敏感矩阵具有高度相似性，因而参考模型的敏感矩阵可直接用于对目标深度学习模型的攻击，具体的，根据预设的用于扰动的若干零范数

及若干无穷范数

逐一选取若干零范数中的零范数至遍历若干零范数，将选取的零范数与若干无穷范数中的各无穷范数一一组合，得到若干范数组

The sensitivity matrix generated by S1 is used to attack the target deep learning model. Since the sensitivity matrices of different deep learning models are highly similar, the sensitivity matrix of the reference model can be directly used to attack the target deep learning model. Specifically, according to preset zero-norms for perturbation

and some infinity norm

Select the zero-norms of the zero-norms one by one to traverse the zero-norms, and combine the selected zero-norms with the infinite-norms of the infinite-norms one by one to obtain a number of norm groups

预设初始扰动矩阵，初始扰动矩阵内参数数量与原始图像中像素点个数相同，且按照原始图像中各像素点的位置排列，各参数的绝对值为1，初始扰动矩阵与敏感矩阵中同一位置的参数的符号一致；逐一选取范数组，至遍历若干范数组，将当前范数组中的无穷范数

的一半即

与初始扰动矩阵中的各参数相乘，得到各范数组对应的扰动矩阵，获取当前范数组中的零范数，按照敏感值从大到小的顺序，将敏感矩阵中前零范数

个敏感值置为1，其余敏感值置为0，得到各扰动矩阵对应的掩膜矩阵；将原始图像中的各像素点分别叠加各范数组对应的扰动矩阵，然后与各扰动矩阵对应的掩膜矩阵相乘，得到各范数组对应的扰动图。The initial perturbation matrix is preset. The number of parameters in the initial perturbation matrix is the same as the number of pixels in the original image, and is arranged according to the position of each pixel in the original image. The absolute value of each parameter is 1, and the initial perturbation matrix is the same as the sensitivity matrix. The symbols of the positional parameters are the same; select the norm arrays one by one, until traversing several norm arrays, the infinite norm in the current norm array

half of

Multiply each parameter in the initial perturbation matrix to obtain the perturbation matrix corresponding to each norm group, obtain the zero norm in the current norm group, and put the first zero norm in the sensitivity matrix according to the order of sensitivity values from large to small.

Each sensitive value is set to 1, and the remaining sensitive values are set to 0, and the mask matrix corresponding to each perturbation matrix is obtained; the perturbation matrix corresponding to each norm group is superimposed on each pixel in the original image, and then the mask matrix corresponding to each perturbation matrix is superimposed on each pixel in the original image. The membrane matrix is multiplied to obtain the perturbation map corresponding to each norm array.

S3：将原始图像及各范数组对应的扰动图均查询目标深度学习模型，得到并根据各范数组对应的扰动图相对于原始图像的原始类预测概率的下降值，获得对抗零范数l₀及对抗无穷范数l_∞。S3: Query the target deep learning model for both the original image and the perturbation map corresponding to each norm group, and obtain the anti-zero norm l ₀ according to the drop value of the predicted probability of the original class of the perturbation map corresponding to each norm group relative to the original image. and against the infinity norm l _∞ .

具体的，将原始图像及各范数组对应的扰动图均输入目标深度学习模型，得到原始图像的原始类预测概率以及各范数组对应的扰动图的原始类预测概率，获取各范数组对应的扰动图相对于原始图像的原始类预测概率的下降值p_i,j；将所有范数组中，同一零范数的范数组对应的扰动图相对于原始图像的原始类预测概率的下降值叠加，得到若干第一叠加值

运用肘部法则，选取若干第一叠加值中斜率变化最大的第一叠加值对应的零范数为对抗零范数；将所有范数组中，同一无穷范数的范数组对应的扰动图相对于原始图像的原始类预测概率的下降值叠加，得到若干第二叠加值

运用肘部法则，选取若干第二叠加值中斜率变化最大的第二叠加值对应的无穷范数为对抗无穷范数。Specifically, the original image and the perturbation map corresponding to each norm group are input into the target deep learning model, the original class prediction probability of the original image and the original class prediction probability of the perturbation map corresponding to each norm group are obtained, and the perturbation map corresponding to each norm group is obtained. The drop value p _i,j of the original class prediction probability of the graph relative to the original image; in all norm groups, the perturbation graph corresponding to the norm group of the same zero norm is superimposed relative to the drop value of the original class prediction probability of the original image, get several first superposition values

Using the elbow rule, select the zero norm corresponding to the first superposition value with the largest slope change among the first superposition values as the anti-zero norm; in all norm groups, the perturbation map corresponding to the norm group of the same infinite norm is relative to The descending values of the original class prediction probability of the original image are superimposed to obtain several second superimposed values

Using the elbow rule, the infinity norm corresponding to the second superposition value with the largest slope change among several second superposition values is selected as the anti-infinity norm.

S4：根据对抗无穷范数构建预设数量对抗扰动矩阵，将攻击目标类预测概率为适应度，以适应度最大为优化目标，根据原始图像、敏感矩阵及对抗零范数，通过进化算法迭代优化各对抗扰动矩阵，每次迭代后均进行S5。S4: Construct a preset number of adversarial disturbance matrices according to the adversarial infinity norm, take the prediction probability of the attack target class as the fitness, and take the maximum fitness as the optimization goal, based on the original image, the sensitive matrix and the adversarial zero norm, iteratively optimize through the evolutionary algorithm For each adversarial perturbation matrix, S5 is performed after each iteration.

其中，当无目标攻击时，适应度为目标深度学习模型输出向量中原本类的概率的负值，当有目标攻击时，适应度为目标深度学习模型输出向量中目标类的概率。参见图3，进化算法的原理示意，根据人为经验初始化种群，然后进行变异、交叉，以适应度最大为优化目标进行选择，根据选择结结果判断是否达到攻击目标，达到攻击目标的话就输出对应的对抗样本，否则，在迭代次数限制内继续迭代。其中，进化算法为可以选择自适应差分进化策略或线性张成的协方差矩阵自适应进化策略。Among them, when there is no target attack, the fitness is the negative value of the probability of the original class in the output vector of the target deep learning model, and when there is a target attack, the fitness is the probability of the target class in the output vector of the target deep learning model. See Figure 3, the principle of the evolutionary algorithm, initialize the population according to human experience, then mutate and cross, select with the maximum fitness as the optimization goal, and judge whether the attack target is reached according to the result of the selection knot. If the attack target is reached, output the corresponding Adversarial examples, otherwise, continue to iterate within the iteration limit. The evolutionary algorithm is an adaptive differential evolution strategy or a linearly stretched covariance matrix adaptive evolution strategy.

当进化算法为自适应差分进化策略时，S4的具体方法为：When the evolutionary algorithm is an adaptive differential evolution strategy, the specific method of S4 is:

S401：在[-对抗无穷范数，对抗无穷范数]中随机取值作为对抗扰动矩阵中的参数，构建预设数量的对抗扰动矩阵，一般为20个；对抗扰动矩阵中的各参数与原始图像的各像素点一一对应。S401: In [- against infinity norm, against infinity norm], random values are taken as parameters in the adversarial disturbance matrix, and a preset number of adversarial disturbance matrices are constructed, generally 20; the parameters in the adversarial disturbance matrix are the same as the original Each pixel of the image corresponds one-to-one.

S402：根据自适应比例因子及变异公式，将预设数量的对抗扰动矩阵进行变异，得到预设数量的变异对抗扰动矩阵，将对抗扰动矩阵及变异对抗扰动矩阵均作为个体；根据对抗零范数及敏感矩阵，获取各个体对应的掩膜矩阵；将原始图像中的各像素点分别叠加各个体，然后与各个体对应的掩膜矩阵相乘，得到各个体对应的扰动图；将各个体对应的扰动图输入目标深度学习模型，得到各个体对应的扰动图的适应度，按照适应度从大到小的顺序，选取预设数量各个体，作为迭代优化后的对抗扰动矩阵。S402: According to the adaptive scale factor and the mutation formula, mutate a preset number of anti-disturbance matrices to obtain a preset number of mutant anti-disturbance matrices, and use both the anti-disturbance matrix and the mutant anti-disturbance matrix as individuals; according to the anti-zero norm and the sensitivity matrix to obtain the mask matrix corresponding to each individual; each pixel in the original image is superimposed on each individual, and then multiplied with the mask matrix corresponding to each individual to obtain the perturbation map corresponding to each individual; each individual corresponds to Input the perturbation map of the target deep learning model to obtain the fitness of the perturbation map corresponding to each individual. According to the order of fitness from large to small, a preset number of each individual is selected as the iteratively optimized confrontation perturbation matrix.

具体的，原本的差分进化策略的变异部分为：Specifically, the variation part of the original differential evolution strategy is:

其中，

表示第i代群体中的某个个体，

与

均为随机选取的索引值，F为固定的比例因子。本实施例中，在原本的变异的基础形式上，考虑到在进化(优化)初期可以适当增加比例因子F(变异步长)，将F改为随i变化的形式：in,

represents an individual in the i-th generation population,

and

Both are randomly selected index values, and F is a fixed scale factor. In this embodiment, on the basis of the original variation, considering that the scaling factor F (variable asynchronous length) can be appropriately increased in the early stage of evolution (optimization), F is changed to a form that varies with i:

其中，AF代表自适应比例因子(adaptive factor)，F₁与F₂分别为AF的下限和初始值，α为调节AF变化速率的因子，s为迭代数i的缩放因子。将改进的比例因子用于差分进化策略，能加快其收敛速度。Among them, AF stands for adaptive factor (adaptive factor), F ₁ and F ₂ are the lower limit and initial value of AF, respectively, α is a factor for adjusting the rate of change of AF, and s is a scaling factor for the iteration number i. Using the improved scale factor for the differential evolution strategy can speed up its convergence.

当进化算法为线性张成的协方差矩阵自适应进化策略(Linear SpannedCovariance Matrix Adaptation Evolutionary Strategies)时，为解决协方差矩阵自适应进化策略在求解极高维优化问题时，由于频繁计算协方差矩阵而导致时间复杂度高的问题，将n维(n＝w×h)的优化空间降维为k(k远小于n)个n维向量的线性张成的空间，即设输出扰动为k个n维向量的带权和，用协方差矩阵自适应进化策略对各权重进行优化，由于k远小于n，计算协方差矩阵的时间复杂度由O(n²)降为O(k²)，大大下降。在此基础上，使用协方差矩阵自适应进化策略，也能达到较好的效果，S4的具体方法为：When the evolutionary algorithm is the Linear Spanned Covariance Matrix Adaptation Evolutionary Strategies, in order to solve the covariance matrix adaptive evolutionary strategy in solving extremely high-dimensional optimization problems, due to the frequent calculation of the covariance matrix, The problem of high time complexity is caused by reducing the dimension of the n-dimensional (n=w×h) optimization space to a linearly stretched space of k (k is much smaller than n) n-dimensional vectors, that is, setting the output disturbance to k n The weighted sum of the dimensional vectors is optimized by the adaptive evolution strategy of the covariance matrix. Since k is much smaller than n, the time complexity of calculating the covariance matrix is reduced from O(n ² ) to O(k ² ), which greatly reduces the time complexity of the covariance matrix. decline. On this basis, using the covariance matrix adaptive evolution strategy can also achieve better results. The specific method of S4 is:

S411：随机生成k个n维向量作为张成空间的基底；通过预设的k维高斯分布，随机生成预设数量个k维向量，以k维向量为权重乘以基底，得到预设数量个n维样本向量，将n维样本向量以对抗无穷范数限幅后得到对抗扰动矩阵。S411: Randomly generate k n-dimensional vectors as the basis of the stretched space; randomly generate a preset number of k-dimensional vectors through a preset k-dimensional Gaussian distribution, multiply the basis by the k-dimensional vector as a weight, and obtain a preset number of The n-dimensional sample vector is sliced with the anti-infinity norm to obtain the adversarial perturbation matrix.

S412：根据对抗零范数及敏感矩阵，获取各抗扰动矩阵对应的掩膜矩阵；将原始图像中的各像素点分别叠加各对抗扰动矩阵，然后与各抗扰动矩阵对应的掩膜矩阵相乘，得到各抗扰动矩阵对应的扰动图；将各抗扰动矩阵对应的扰动图输入目标深度学习模型，得到各抗扰动矩阵对应的扰动图的适应度，按照适应度从大到小的顺序，选取设定数量个抗扰动矩阵，根据设定数量个抗扰动矩阵更新k维高斯分布，根据更新后的k维高斯分布随机生成预设数量个优化k维向量，以优化k维向量为权重乘以基底，得到预设数量个优化n维样本向量，将优化n维样本向量以对抗无穷范数限幅后得到迭代优化后的对抗扰动矩阵。S412: According to the anti-zero norm and the sensitive matrix, obtain a mask matrix corresponding to each anti-disturbance matrix; superimpose each pixel point in the original image with each anti-disturbance matrix, and then multiply the mask matrix corresponding to each anti-disturbance matrix , obtain the disturbance graph corresponding to each anti-disturbance matrix; input the disturbance graph corresponding to each anti-disturbance matrix into the target deep learning model to obtain the fitness of the disturbance graph corresponding to each anti-disturbance matrix. Set a number of anti-disturbance matrices, update the k-dimensional Gaussian distribution according to the set number of anti-disturbance matrices, randomly generate a preset number of optimized k-dimensional vectors according to the updated k-dimensional Gaussian distribution, and multiply the optimized k-dimensional vector by the weight A preset number of optimized n-dimensional sample vectors are obtained, and the n-dimensional sample vectors are optimized to resist infinite norm limiting to obtain an iteratively optimized adversarial perturbation matrix.

其中，目标对抗扰动矩阵为采用该目标对抗扰动矩阵扰动后的原始图像，能够使得目标深度学习模型的输出达到攻击目标，比如：使得扰动后的原始图像输入目标深度学习模型的输出与原始图像输入目标深度学习模型的输出不同，或者使得扰动后的原始图像输入目标深度学习模型的输出为预先设置的输出结果。The target adversarial perturbation matrix is the original image perturbed by the target adversarial perturbation matrix, which can make the output of the target deep learning model reach the attack target, for example: make the perturbed original image input into the target deep learning model output and the original image input The output of the target deep learning model is different, or the output of the perturbed original image input to the target deep learning model is a preset output result.

综上所述，本发明深度学习模型对抗样本生成方法，通过迁移参考模型的敏感图，施加阈值来限制扰动区域为敏感度高的像素点处，使得被扰动的像素点的分布变得稀疏，从而使对抗扰动更不易被察觉，从量化的角度来看，对抗扰动的二范数(各像素扰动值的平方和开根号)可以因此得到较大的下降。同时通过结合敏感图进行合理的“试探”，尝试不同的无穷范数l_∞和零范数l₀组合，自动地、动态地为每次对抗攻击调节l_∞和l₀，提高了对抗攻击效率与限定查询次数时的成功率。整个流程是插件式的，其中进化策略部分可以结合实际应用情形替换成不同的进化策略(如自适应差分进化算法和线性张成的协方差矩阵自适应进化策略)用于优化，具有方便、快捷的特点。由于进化算法具有可并行的特点，在同次迭代中不同个体的计算可以并行处理，因而在迭代过程中可以具有较高的效率。通过生成的低二范数的对抗样本训练目标深度学习模型，能够进一步提高目标深度学习模型预测的准确性，提升目标深度学习模型的鲁棒性。To sum up, the method for generating an adversarial sample of the deep learning model of the present invention, by migrating the sensitivity map of the reference model, imposes a threshold to limit the disturbed area to pixels with high sensitivity, so that the distribution of disturbed pixels becomes sparse, Therefore, the adversarial disturbance is more difficult to be perceived. From the perspective of quantization, the two-norm of the adversarial disturbance (the square and the root of the disturbance value of each pixel) can thus be greatly reduced. At the same time, by combining the sensitive graphs for reasonable "trials", trying different combinations of infinite norm l _∞ and zero norm l ₀ , automatically and dynamically adjusting l _∞ and l ₀ for each adversarial attack, improving the efficiency of adversarial attacks The success rate when the number of queries is limited. The whole process is plug-in, in which the evolution strategy part can be replaced by different evolution strategies (such as adaptive differential evolution algorithm and linear stretched covariance matrix adaptive evolution strategy) for optimization, which is convenient and fast. specialty. Because the evolutionary algorithm is parallelizable, the computation of different individuals in the same iteration can be processed in parallel, so it can have higher efficiency in the iterative process. Training the target deep learning model through the generated adversarial samples with low two-norm can further improve the prediction accuracy of the target deep learning model and improve the robustness of the target deep learning model.

经过试验论证，在深度学习模型对抗样本生成方法，即使使用不同的参考模型，比如VGG16或DenseNet121，均能得到相较原算法更小的二范数扰动，并只用了不到20％的像素点，表明本发明深度学习模型对抗样本生成方法，生成的对抗样本较现有方法生成的对抗样本的二范数得到了较大的下降。After experimental demonstration, in the deep learning model against the sample generation method, even using different reference models, such as VGG16 or DenseNet121, can obtain a smaller two-norm disturbance than the original algorithm, and only use less than 20% of the pixels point, indicating that the method for generating the adversarial samples of the deep learning model of the present invention, the two-norm of the adversarial samples generated by the existing methods has been greatly reduced.

下述为本发明的装置实施例，可以用于执行本发明方法实施例。对于装置实施例中未纰漏的细节，请参照本发明方法实施例。The following are apparatus embodiments of the present invention, which can be used to execute method embodiments of the present invention. For details that are not omitted in the device embodiments, please refer to the method embodiments of the present invention.

本发明再一个实施例中，提供了一种深度学习模型对抗样本生成系统，能够用于实现上述的深度学习模型对抗样本生成方法，具体的，该深度学习模型对抗样本生成系统包括敏感矩阵获取模块、扰动图获取模块、优化模块、参数确定模块以及输出模块。In yet another embodiment of the present invention, a system for generating adversarial samples for deep learning models is provided, which can be used to implement the above-mentioned method for generating adversarial samples for deep learning models. Specifically, the system for generating adversarial samples for deep learning models includes a sensitivity matrix acquisition module , a perturbation graph acquisition module, an optimization module, a parameter determination module and an output module.

其中，敏感矩阵获取模块用于获取原始图像基于目标深度学习模型的敏感矩阵；扰动图获取模块用于根据预设的用于扰动的若干零范数及若干无穷范数，构建若干范数组；根据敏感矩阵及若干范数组，得到各范数组对应的扰动图；参数确定模块用于将原始图像及各范数组对应的扰动图均查询目标深度学习模型，得到并根据各范数组对应的扰动图相对于原始图像的原始类预测概率的下降值，获得对抗零范数及对抗无穷范数；优化模块用于根据对抗无穷范数构建预设数量对抗扰动矩阵，将攻击目标类预测概率为适应度，以适应度最大为优化目标，根据原始图像、敏感矩阵及对抗零范数，通过进化算法迭代优化各对抗扰动矩阵，每次迭代后均触发输出模块；输出模块用于当当前迭代优化后的对抗扰动矩阵中，至少存在一个目标对抗扰动矩阵时，将原始图像通过目标对抗扰动矩阵扰动，得到对抗样本并输出，迭代结束；否则，触发优化模块。Among them, the sensitivity matrix acquisition module is used to acquire the sensitivity matrix of the original image based on the target deep learning model; the perturbation map acquisition module is used to construct several norm groups according to preset zero norm and several infinite norm for perturbation; The sensitivity matrix and several norm groups are used to obtain the perturbation map corresponding to each norm group; the parameter determination module is used to query the target deep learning model for both the original image and the perturbation map corresponding to each norm group, and obtain and compare the perturbation map corresponding to each norm group. The anti-zero norm and the anti-infinity norm are obtained from the drop value of the original class prediction probability of the original image; the optimization module is used to construct a preset number of adversarial disturbance matrices according to the anti-infinity norm, and the predicted probability of the attack target class is the fitness, Taking the maximum fitness as the optimization goal, according to the original image, the sensitive matrix and the confrontation zero norm, each confrontation disturbance matrix is iteratively optimized through the evolutionary algorithm, and the output module is triggered after each iteration; the output module is used for the confrontation after the current iteration optimization. In the perturbation matrix, when there is at least one target confrontation perturbation matrix, the original image is perturbed by the target confrontation perturbation matrix, and the confrontation sample is obtained and output, and the iteration ends; otherwise, the optimization module is triggered.

本发明再一个实施例中，提供了一种计算机设备，该计算机设备包括处理器以及存储器，所述存储器用于存储计算机程序，所述计算机程序包括程序指令，所述处理器用于执行所述计算机存储介质存储的程序指令。处理器可能是中央处理单元(CentralProcessing Unit，CPU)，还可以是其他通用处理器、数字信号处理器(Digital SignalProcessor、DSP)、专用集成电路(Application Specific Integrated Circuit，ASIC)、现成可编程门阵列(Field-Programmable GateArray，FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等，其是终端的计算核心以及控制核心，其适于实现一条或一条以上指令，具体适于加载并执行计算机存储介质内一条或一条以上指令从而实现相应方法流程或相应功能；本发明实施例所述的处理器可以用于深度学习模型对抗样本生成方法的操作。In yet another embodiment of the present invention, a computer device is provided, the computer device includes a processor and a memory, the memory is used for storing a computer program, the computer program includes program instructions, and the processor is used for executing the computer Program instructions stored in the storage medium. The processor may be a central processing unit (CPU), other general-purpose processors, digital signal processors (Digital Signal Processors, DSP), application specific integrated circuits (ASICs), and off-the-shelf programmable gate arrays. (Field-Programmable GateArray, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc., which are the computing core and control core of the terminal, which are suitable for implementing one or more instructions, specifically suitable for One or more instructions in the computer storage medium are loaded and executed to implement the corresponding method process or corresponding function; the processor according to the embodiment of the present invention can be used for the operation of the deep learning model adversarial sample generation method.

本发明再一个实施例中，本发明还提供了一种存储介质，具体为计算机可读存储介质(Memory)，所述计算机可读存储介质是计算机设备中的记忆设备，用于存放程序和数据。可以理解的是，此处的计算机可读存储介质既可以包括计算机设备中的内置存储介质，当然也可以包括计算机设备所支持的扩展存储介质。计算机可读存储介质提供存储空间，该存储空间存储了终端的操作系统。并且，在该存储空间中还存放了适于被处理器加载并执行的一条或一条以上的指令，这些指令可以是一个或一个以上的计算机程序(包括程序代码)。需要说明的是，此处的计算机可读存储介质可以是高速RAM存储器，也可以是非不稳定的存储器(non-volatile memory)，例如至少一个磁盘存储器。可由处理器加载并执行计算机可读存储介质中存放的一条或一条以上指令，以实现上述实施例中有关深度学习模型对抗样本生成方法的相应步骤。In yet another embodiment of the present invention, the present invention further provides a storage medium, specifically a computer-readable storage medium (Memory), where the computer-readable storage medium is a memory device in a computer device for storing programs and data . It can be understood that, the computer-readable storage medium here may include both a built-in storage medium in a computer device, and certainly also an extended storage medium supported by the computer device. The computer-readable storage medium provides storage space in which the operating system of the terminal is stored. In addition, one or more instructions suitable for being loaded and executed by the processor are also stored in the storage space, and these instructions may be one or more computer programs (including program codes). It should be noted that the computer-readable storage medium here can be a high-speed RAM memory, or a non-volatile memory (non-volatile memory), such as at least one disk memory. One or more instructions stored in the computer-readable storage medium can be loaded and executed by the processor, so as to implement the corresponding steps of the method for generating an adversarial sample of a deep learning model in the foregoing embodiment.

本领域内的技术人员应明白，本申请的实施例可提供为方法、系统、或计算机程序产品。因此，本申请可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且，本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。As will be appreciated by those skilled in the art, the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

本申请是参照根据本申请实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器，使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block in the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing device to produce a machine such that the instructions executed by the processor of the computer or other programmable data processing device produce Means for implementing the functions specified in a flow or flow of a flowchart and/or a block or blocks of a block diagram.

这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中，使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品，该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory result in an article of manufacture comprising instruction means, the instructions The apparatus implements the functions specified in the flow or flow of the flowcharts and/or the block or blocks of the block diagrams.

这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上，使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理，从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions can also be loaded on a computer or other programmable data processing device to cause a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process such that The instructions provide steps for implementing the functions specified in the flow or blocks of the flowcharts and/or the block or blocks of the block diagrams.

最后应当说明的是：以上实施例仅用以说明本发明的技术方案而非对其限制，尽管参照上述实施例对本发明进行了详细的说明，所属领域的普通技术人员应当理解：依然可以对本发明的具体实施方式进行修改或者等同替换，而未脱离本发明精神和范围的任何修改或者等同替换，其均应涵盖在本发明的权利要求保护范围之内。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention rather than to limit them. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: the present invention can still be Modifications or equivalent replacements are made to the specific embodiments of the present invention, and any modifications or equivalent replacements that do not depart from the spirit and scope of the present invention shall be included within the protection scope of the claims of the present invention.

Claims

1. A method for generating confrontation samples of a deep learning model is characterized by comprising the following steps:

s1: acquiring a sensitive matrix of an original image based on a target deep learning model;

s2: constructing a plurality of norm groups according to a plurality of preset zero norms and a plurality of infinite norms for disturbance; obtaining a disturbance diagram corresponding to each norm group according to the sensitive matrix and the plurality of norm groups;

s3: inquiring a target deep learning model for the original image and the disturbance graphs corresponding to the range groups to obtain and obtain a zero norm countermeasure and an infinite norm countermeasure according to a reduction value of the original class prediction probability of the disturbance graphs corresponding to the range groups relative to the original image;

s4: constructing a preset number of confrontation disturbance matrixes according to the confrontation infinite norm, taking the prediction probability of the attack target class as the fitness, taking the fitness as the maximum optimization target, iteratively optimizing each pair of confrontation disturbance matrixes through an evolutionary algorithm according to the original image, the sensitivity matrix and the confrontation zero norm, and performing S5 after each iteration;

s5: when at least one target anti-disturbance matrix exists in the anti-disturbance matrix after current iterative optimization, disturbing the original image through the target anti-disturbance matrix to obtain an anti-sample and output the anti-sample, and finishing iteration; otherwise, return to S4.

2. The method for generating confrontation samples of deep learning model according to claim 1, wherein the specific method of S1 is:

acquiring a deep learning model with the same classification target as the target deep learning model as a reference model;

selecting pixel points in the original image one by one to carry out the following steps: disturbing the current pixel point by a preset size to obtain a sensitive image, inputting the sensitive image into a reference model to obtain a change value of the original class prediction probability output relative to the reference model before disturbance after the current pixel point is disturbed, and taking the change value as the sensitive value of the current pixel point;

and arranging the sensitive values of all pixel points in the original image according to the positions of all the pixel points in the original image to obtain a sensitive matrix of the original image based on the target deep learning model.

3. The method for generating confrontation samples of deep learning model according to claim 2, wherein the specific method of S2 is:

selecting zero norms in the plurality of zero norms one by one to traverse the plurality of zero norms according to a plurality of preset zero norms and a plurality of infinite norms for disturbance, and combining the selected zero norms and the infinite norms in the plurality of infinite norms one by one to obtain a plurality of norm groups;

presetting an initial disturbance matrix, wherein the number of parameters in the initial disturbance matrix is the same as the number of pixel points in the original image, the parameters are arranged according to the positions of the pixel points in the original image, the absolute value of each parameter is 1, and the symbols of the parameters at the same position in the initial disturbance matrix and the sensitive matrix are consistent; selecting a norm group one by one, traversing a plurality of norm groups, multiplying half of infinite norms in the current norm group by each parameter in the initial disturbance matrix to obtain disturbance matrixes corresponding to the norm groups, obtaining zero norms in the current norm group, setting the front zero norm sensitive values in the sensitive matrixes to be 1 and setting the other sensitive values to be 0 according to the descending order of the sensitive values, and obtaining mask matrixes corresponding to the disturbance matrixes;

and superposing the pixel points in the original image with the disturbance matrixes corresponding to the norm groups respectively, and multiplying the pixel points by the mask matrixes corresponding to the disturbance matrixes to obtain the disturbance graphs corresponding to the norm groups.

4. The method for generating confrontation samples of deep learning model according to claim 1, wherein the specific method of S3 is:

inputting the original image and the disturbance images corresponding to the norm groups into a target deep learning model to obtain the original class prediction probability of the original image and the original class prediction probability of the disturbance images corresponding to the norm groups, and obtaining a reduction value of the original class prediction probability of the disturbance images corresponding to the norm groups relative to the original image;

superposing disturbance graphs corresponding to norm groups of the same zero norm relative to the descending values of the original type prediction probability of the original image in all the norm groups to obtain a plurality of first superposed values, and selecting the zero norm corresponding to the first superposed value with the largest slope change in the plurality of first superposed values as the anti-zero norm;

and superposing disturbance graphs corresponding to norm groups of the same infinite norm with respect to the descending values of the original type prediction probability of the original image in all the norm groups to obtain a plurality of second superposed values, and selecting the infinite norm corresponding to the second superposed value with the largest slope change in the plurality of second superposed values as the confrontation infinite norm.

5. The method for generating the confrontation sample of the deep learning model as claimed in claim 1, wherein the evolutionary algorithm in S4 is: an adaptive differential evolution strategy or a linear-stretched covariance matrix adaptive evolution strategy.

6. The method for generating the confrontation sample of the deep learning model as claimed in claim 5, wherein when the evolutionary algorithm is the adaptive differential evolution strategy, the specific method of S4 is as follows:

s401: randomly taking values in [ -confrontation infinite norm, confrontation infinite norm ] as parameters in the confrontation disturbance matrix, and constructing a preset number of confrontation disturbance matrices; each parameter in the anti-disturbance matrix corresponds to each pixel point of the original image one by one;

s402: according to the adaptive scale factor and the variation formula, a preset number of the antagonistic disturbance matrixes are varied to obtain a preset number of the variable antagonistic disturbance matrixes, and the antagonistic disturbance matrixes and the variable antagonistic disturbance matrixes are used as individuals; acquiring a mask matrix corresponding to each body according to the anti-zero norm and the sensitive matrix; respectively superposing each individual on each pixel point in the original image, and multiplying the individual with the mask matrix corresponding to each individual to obtain a disturbance diagram corresponding to each individual; inputting the disturbance graphs corresponding to the individuals into a target deep learning model to obtain the fitness of the disturbance graphs corresponding to the individuals, and selecting a preset number of the individuals as an anti-disturbance matrix after iterative optimization according to the sequence of the fitness from large to small;

s403: and updating the anti-disturbance matrix with the iteratively optimized anti-disturbance matrix, and iterating S402 and after each iteration, performing S5.

7. The method for generating the confrontation sample of the deep learning model according to claim 5, wherein when the evolutionary algorithm is a linear-stretched covariance matrix adaptive evolution strategy, the specific method of S4 is as follows:

s411: randomly generating k n-dimensional vectors as a basis of a span space; randomly generating a preset number of k-dimensional vectors through preset k-dimensional Gaussian distribution, multiplying the k-dimensional vectors by a substrate by taking the k-dimensional vectors as weights to obtain a preset number of n-dimensional sample vectors, and limiting the n-dimensional sample vectors by an anti-infinite norm to obtain an anti-disturbance matrix;

s412: acquiring a mask matrix corresponding to each anti-disturbance matrix according to the anti-zero norm and the sensitive matrix; superposing each anti-disturbance matrix on each pixel point in the original image respectively, and multiplying the anti-disturbance matrixes by the corresponding mask matrixes of the anti-disturbance matrixes to obtain disturbance graphs corresponding to the anti-disturbance matrixes; inputting a disturbance graph corresponding to each anti-disturbance matrix into a target deep learning model to obtain the fitness of the disturbance graph corresponding to each anti-disturbance matrix, selecting a set number of anti-disturbance matrices according to the sequence of the fitness from large to small, updating k-dimensional Gaussian distribution according to the set number of anti-disturbance matrices, randomly generating a preset number of optimized k-dimensional vectors according to the updated k-dimensional Gaussian distribution, multiplying the optimized k-dimensional vectors by a weight and a substrate to obtain a preset number of optimized n-dimensional sample vectors, and limiting the optimized n-dimensional sample vectors by an infinite norm to obtain an iteratively optimized anti-disturbance matrix;

s413: and updating the anti-disturbance matrix with the iteratively optimized anti-disturbance matrix, and iterating S412, and performing S5 after each iteration.

8. A system for generating confrontation samples of a deep learning model is characterized by comprising a sensitive matrix acquisition module, a disturbance map acquisition module, an optimization module, a parameter determination module and an output module;

the sensitive matrix acquisition module is used for acquiring a sensitive matrix of the original image based on the target deep learning model;

the disturbance diagram acquisition module is used for constructing a plurality of norm groups according to a plurality of preset zero norms and a plurality of infinite norms for disturbance; obtaining a disturbance diagram corresponding to each norm group according to the sensitive matrix and the plurality of norm groups;

the parameter determination module is used for inquiring the target deep learning model from the original image and the disturbance graphs corresponding to the range groups to obtain and obtain a zero-norm countermeasure and an infinite norm countermeasure according to a reduction value of the original class prediction probability of the disturbance graphs corresponding to the range groups relative to the original image;

the optimization module is used for constructing a preset number of resistance disturbance matrixes according to resistance infinite norms, taking the prediction probability of an attack target class as the fitness, taking the fitness as the maximum optimization target, iteratively optimizing each pair of resistance disturbance matrixes through an evolutionary algorithm according to the original image, the sensitivity matrix and the resistance zero norm, and triggering the output module after each iteration;

the output module is used for disturbing the original image through the target disturbance resisting matrix when at least one target disturbance resisting matrix exists in the disturbance resisting matrix after current iteration optimization to obtain a resisting sample and output the resisting sample, and the iteration is finished; otherwise, triggering the optimization module.

9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the deep learning model confrontation sample generation method according to any one of claims 1 to 7.

10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for generating confrontation samples of a deep learning model according to any one of claims 1 to 7.