CN113780468A - Robust model training method based on small number of neuron connections


Info

Publication number
CN113780468A
Authority
CN
China
Prior art keywords
neuron
model
attack
connections
results
Prior art date
Legal status
Granted
Application number
CN202111140405.1A
Other languages
Chinese (zh)
Other versions
CN113780468B (en)
Inventor
郭延明
李建
老松杨
阮逸润
赵翔
魏迎梅
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202111140405.1A
Publication of CN113780468A
Application granted
Publication of CN113780468B
Legal status: Active
Anticipated expiration

Classifications

    • G06F18/2135: Pattern recognition; feature extraction by transforming the feature space, based on approximation criteria, e.g. principal component analysis
    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/24: Pattern recognition; classification techniques
    • G06N3/04: Neural networks; architecture, e.g. interconnection topology
    • G06N3/08: Neural networks; learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a robust model training method based on a small number of neuron connections, used for training an image classification robust model and comprising a backbone network and a decision module. The backbone network extracts features of the input image and extracts the latent features L of the input image at the last convolutional layer and the global average pooling layer. The decision module includes four processes: product operation, sorting, clipping, and summation. The product operation computes the product of L and W to obtain the calculation results of the neuron connections, and the sorting orders these calculation results; the clipping sets the first α and last β values of each sorted group of results to 0; the summation obtains a prediction score by summing the remaining non-zero results in each neuron, and the classification result is obtained from the prediction scores. The method does not add trainable parameters to the model and can obtain a highly robust model without using adversarial training.

Description

Robust model training method based on small number of neuron connections
Technical Field
The invention belongs to the technical field of image classification, and particularly relates to a robust model training method based on a small number of neuron connections.
Background
Deep Neural Networks (DNNs) are increasingly being deployed in the real world and have enjoyed dramatic success in many research areas such as image classification, image segmentation, and object detection. However, many studies indicate that DNNs are vulnerable to adversarial examples. In the image classification task, an adversarial attack adds carefully designed perturbations to a clean image and then uses the resulting adversarial example to fool the model. An adversarial example makes the attacked model output a wrong prediction with high probability. Research on adversarial attacks is extensive and can be divided into white-box attacks, black-box attacks, targeted attacks, and untargeted attacks.
Adversarial attacks severely limit the application of artificial intelligence in security-critical scenarios, because they are easy to implement and can cause significant real-world losses. Defense against adversarial attacks is therefore receiving increasing attention, and many defense methods have been proposed. These existing defenses typically either use adversarial training, which is considered a simple and effective way to improve model robustness, or adjust the network structure to resist adversarial attacks. It is worth noting, however, that adversarial training is a very slow process. For example, adversarial training on the CIFAR-10 dataset generates 50,000 adversarial examples per epoch, so the network learns from twice the training data (50,000 adversarial examples and 50,000 clean images) in each epoch, which greatly increases training time. There is therefore much room for improvement in model complexity and training speed.
Consider a classification model f_w with a predefined loss function l_f (e.g., cross-entropy loss, which is widely used in image classification tasks). In the training phase, l_f is used to compute the loss of f_w, and the goal is to find parameters w that minimize l_f. In contrast, the goal of an adversarial attack is to maximize the value of l_f. Adding the gradient to the input image is the most straightforward and efficient way to fool a model. Well-known white-box attack methods include FGSM, PGD, and DDN, which are commonly used to assess defense capability.
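For reference, the training objective and the attack objective described here can be written in the following standard form (a conventional formulation added for clarity; the patent states it only in prose):

```latex
% Training: find parameters w that minimize the expected loss over the data distribution D
\min_{w}\; \mathbb{E}_{(x,y)\sim \mathcal{D}}\; l_f\bigl(f_w(x),\, y\bigr)

% Untargeted attack: find a bounded perturbation \delta that maximizes the loss for a given (x, y)
\max_{\|\delta\|_{\infty} \le \epsilon}\; l_f\bigl(f_w(x+\delta),\, y\bigr)
```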
The Fast Gradient Sign Method (FGSM) is an efficient single-step attack algorithm that uses the signed gradient with respect to the input image to generate adversarial examples. For a given clean image x and its label y, FGSM generates an adversarial example x' using equation (1).
x' = x + ε · sign(∇_x l_f(f_w(x), y))  (1)
where ε is the attack strength, with a value range of 0 to 255, and sign(·) returns the sign of the gradient. PGD is a variant of FGSM: starting from x^0 = x, it generates the adversarial example using k gradient steps. The process can be described as:
x^{t+1} = Π_{x+S}( x^t + α · sign(∇_x l_f(f_w(x^t), y)) )  (2)
where α is a small step size and Π_{x+S} denotes projection onto the allowed perturbation set, so the adversarial example stays within the l_p-ball around the original input x. PGD has been shown experimentally to be a universal adversary among all first-order adversaries.
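As a concrete illustration of equations (1) and (2), here is a minimal PyTorch-style sketch of the FGSM and PGD attacks; the function names, the [0, 1] pixel range, and the optional random start are illustrative assumptions rather than code from the patent.

```python
import torch

def fgsm_attack(model, loss_fn, x, y, eps):
    """Single-step FGSM, Eq. (1): x' = x + eps * sign(grad_x loss); assumes inputs in [0, 1]."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def pgd_attack(model, loss_fn, x, y, eps, alpha, k, random_start=False):
    """k-step PGD, Eq. (2): signed-gradient steps projected back into the eps-ball around x."""
    x_adv = x.clone().detach()
    if random_start:
        # optional random initialization inside the eps-ball (the source of run-to-run variance)
        x_adv = (x_adv + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(k):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # project into the l_inf ball of radius eps around x, then back into the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1).detach()
    return x_adv
```

In the evaluation later in this description, ε = 8/255, α = 0.01, and k = 7 are used for PGD on CIFAR-10.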
DDN is a classical L2-norm attack method that can be regarded as a variant of C&W. DDN obtains adversarial examples faster than C&W and achieves a high attack success rate at a perturbation level similar to that of C&W. Therefore, DDN is commonly used to evaluate the robustness of a model against L2-norm attacks.
The main methods for resisting adversarial attacks fall into two categories: adversarial training and adjusting the network structure. Adversarial training is considered the most popular and effective defense method and is one of the most common defense baselines. Researchers suggest training robust models with adversarial examples generated by PGD, since PGD is a universal first-order adversary.
Modifying the network structure is another common defense method. Recent studies have demonstrated that adding a noise layer to the original network structure can improve the robustness of the model. The Random Self-Ensemble (RSE) method adds additive noise to the convolutional layers; the noise is sampled from a normal distribution with mean 0, but the variance of the distribution must be set manually. In contrast, Parametric Noise Injection (PNI) adds noise sampled from a normal distribution and learns a weight for each noise value through the network, with the mean and variance of the sampled noise taken from the convolutional layer weights. Learn2Perturb (L2P) is the latest extension of PNI, which adds the output of a noise layer directly to the output of a network layer; it lets the network learn the noise by alternately training the noise-injection modules and the network layers.
Although these methods achieve higher accuracy on perturbed data, they increase the training burden of the network and severely degrade accuracy on clean data.
Disclosure of Invention
To solve the above problems, the invention provides a robust model training method based on a small number of neuron connections (Few2Decide) for training an image classification robust model. The model discards some non-robust connections in the fully connected layer, and the remaining connections maintain robust predictions, thereby eliminating the huge computational cost of adversarial training. The method specifically comprises the following:
the method comprises a backbone network and a decision module, wherein the backbone network extracts features of an input image and extracts the latent features L of the input image at the last convolutional layer and the global average pooling layer, and the decision module comprises product operation, sorting, clipping, and summation;
the product operation computes the product of L and W to obtain the calculation results of the neuron connections, where W is the weight matrix of the fully connected layer with size n × m, n being the number of classes in the dataset and m the feature length of the image; the sorting orders the calculation results of the connections contained in each neuron; the clipping sets the first α and last β values of each sorted group of results to 0; the summation obtains a prediction score by summing the remaining non-zero results in each neuron, and the classification result is obtained from the prediction scores.
Further, the product operation uses the Hadamard product to obtain the calculation results of the neuron connections.
Further, sorting the calculation results of the neuron connections specifically means sorting the calculation results of the connections contained in each neuron from small to large.
Further, α and β are set to 1/3 of the image feature length.
Further, the classification result is obtained from the prediction scores by querying the index of the maximum prediction score.
Further, the robust model is trained using clean data and cross entropy loss as a loss function.
Further, the weights W are initialized using a uniform distribution and are fixed during the training phase.
The invention provides a simple method for training a robust model and eliminates the huge computational cost of adversarial training. Specifically, the invention designs a method to discard some non-robust neuron connections in the fully connected layer and use the remaining connections to compute a prediction score for each class. The proposed method does not add trainable parameters to the model, can obtain a highly robust model without using adversarial training, and significantly improves model robustness under various strong white-box attacks on the datasets commonly used to evaluate defense capability (CIFAR-10 and MNIST).
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a diagram of a robust model training method based on a small number of neuron connections.
FIG. 2 is a comparison of Few2Decide with other methods under different FGSM and PGD attack strengths.
FIG. 3 shows the calculation results of the neuron connections.
FIG. 4 shows the relationship between the perturbed-data accuracy of ResNet-56 under PGD attack and the number of attack steps k on the CIFAR-10 test set.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
To effectively defend against adversarial attacks, we first investigate the impact of adversarial attacks on the network output. The image feature distribution of a standard model (ResNet-56), with and without attack, is visualized using the t-SNE tool. Specifically, we collect the output of the last convolutional layer and then project it into three-dimensional space using PCA (Principal Component Analysis). Clean data features of the same category are tightly clustered. We visualize the clean data features of the class "truck" and the perturbed image features of the same class under FGSM and PGD attacks. It can be seen that the attack methods push the features of adversarial examples away from those of clean data.
Next, we show how changes in image features affect the final classification result at the fully connected layer. The length of an image feature L = {L_1, L_2, ..., L_63, L_64} is 64. The fully connected layer has 10 neurons with 64 weights per neuron. Given a weight matrix W, we can obtain the prediction scores of the 10 classes {P_0, P_1, ..., P_9} according to equation (3).
P_i = ∑_j W_ij · L_j,  j = 1, 2, 3, ..., 63, 64  (3)
We sort the calculation results of the 64 connections in each neuron from small to large. In addition, the calculation results under attack are shown. It can be seen that although the changes in neuron outputs fool the model, the middle of the distribution of the 64 connection results is not changed by the attack algorithm.
As can be seen from the above, although the perturbation added to the image changes the model's prediction, the calculation results of some connections within each neuron do not change much. Thus, we can divide the connections contained in each neuron into two types: robust connections and non-robust connections. (1) Robust connections: under attack, their calculation results show no obvious change and lie in the middle of the result distribution. (2) Non-robust connections: their calculation results vary significantly under attack and lie at the top or bottom of the result distribution. Therefore, the image class can be determined using only the robust connections in the fully connected layer.
A schematic of the proposed method (Few2Decide) is shown in FIG. 1. The method mainly comprises a backbone network and a decision module. W is the weight matrix of the fully connected layer; the size of W is n × m, where n is the number of classes in the dataset and m is the feature length of the image. The last convolutional layer of the backbone network and global average pooling are used to extract the latent features L of the input image, and the decision module is then used to calculate the model prediction scores. The decision module has four processes: product operation, sorting, clipping, and summation.
Product operation: there are two kinds of multiplication between matrices, the Hadamard product and the matmul (matrix) product; the Hadamard product refers to element-wise multiplication of two matrices of the same shape. The Hadamard product of L and W is used to obtain the calculation results of the neuron connections.
Sorting: the ten-class classifier has ten groups of results (W_ij · L_j, i = 0, 1, 2, ..., 9). After the Hadamard product step, the calculation results of the connections contained in each neuron are sorted from small to large.
Clipping: after sorting, the first α and last β values of each group of results are set to 0. The robust connections occupy approximately 1/3 of the image feature length, so α and β are both set to L/3 (for a 64-dimensional feature, roughly the 21 smallest and 21 largest results are zeroed).
Summation: the prediction scores are obtained by summing the non-zero results remaining in each neuron. The classification result is then obtained by querying the index of the maximum prediction score.
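A minimal PyTorch-style sketch of this four-step decision module is given below; the class name Few2DecideHead and the tensor layout are illustrative assumptions, not code from the patent.

```python
import torch
import torch.nn as nn

class Few2DecideHead(nn.Module):
    """Sketch of the decision module: Hadamard product, sort, clip, sum."""

    def __init__(self, num_classes: int, feat_len: int):
        super().__init__()
        # Fixed weight matrix W (n x m), initialized from U(0, 1) and never trained,
        # so it is registered as a buffer rather than a parameter.
        self.register_buffer("W", torch.rand(num_classes, feat_len))
        self.alpha = feat_len // 3   # number of smallest results to drop per neuron
        self.beta = feat_len // 3    # number of largest results to drop per neuron

    def forward(self, L: torch.Tensor) -> torch.Tensor:
        # L: (batch, m) pooled features from the last conv layer + global average pooling.
        # Product: element-wise (Hadamard) product, one result per neuron connection.
        prod = L.unsqueeze(1) * self.W.unsqueeze(0)          # (batch, n, m)
        # Sort: order each neuron's connection results from small to large.
        vals, _ = prod.sort(dim=-1)
        # Clip: zero out the first alpha and last beta results of each sorted group.
        mask = torch.ones_like(vals)
        mask[..., :self.alpha] = 0
        mask[..., vals.size(-1) - self.beta:] = 0
        # Sum: the remaining middle results give the prediction score of each class.
        return (vals * mask).sum(dim=-1)                     # (batch, n)
```

The scores from such a head can be passed straight to a cross-entropy loss; because W is registered as a buffer, only the backbone receives gradient updates, matching the description above.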
The above process differs from dropout in two respects: 1) the proposed method only disables some non-robust connections, whereas dropout randomly disables entire neurons; 2) the proposed method is applied in both the training and testing phases, whereas dropout replaces random deactivation with expected values in the testing phase.
Meanwhile, in the invention only clean data is used to train the model, with cross-entropy loss as the loss function. The weight matrix W is initialized from the uniform distribution U(0, 1). To speed up model convergence, the weights W are fixed during the training phase.
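Under the stated setup (clean data only, cross-entropy loss, fixed W), a training epoch could look like the following sketch; the loader, optimizer, and helper names are assumptions, not part of the patent.

```python
import torch
import torch.nn as nn

def train_one_epoch(backbone, head, loader, optimizer, device="cuda"):
    """Train on clean data only with cross-entropy loss; the decision head's W stays fixed."""
    criterion = nn.CrossEntropyLoss()
    backbone.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        features = backbone(images)     # latent features L (last conv layer + global average pooling)
        scores = head(features)         # Few2Decide decision module (see sketch above)
        loss = criterion(scores, labels)
        optimizer.zero_grad()
        loss.backward()                 # only backbone parameters receive gradients
        optimizer.step()
```

A typical call might construct the optimizer as torch.optim.SGD(backbone.parameters(), lr=0.1), though the patent does not specify the optimizer or learning-rate schedule.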
To evaluate the defense performance of the method, various models are trained with the Few2Decide method and their robustness against different attack methods is observed. In addition, the method is compared with typical vanilla PGD adversarial training and with other methods that achieve state-of-the-art defense performance by modifying the network structure, including Random Self-Ensemble (RSE), Adversarial Bayesian Neural Networks (Adv-BNN), Parametric Noise Injection (PNI), and Learn2Perturb (L2P).
The experiments use two commonly used datasets to evaluate model defense capability: CIFAR-10 and MNIST. The CIFAR-10 dataset contains natural images of 10 classes, consisting of 50,000 training images and 10,000 test images; each image has three RGB channels and is 32 × 32 pixels. The MNIST dataset is a set of grayscale images of handwritten digits, consisting of 60,000 training images and 10,000 test images; each image has a single channel and is 28 × 28 pixels. For both datasets, the same data augmentation strategy as Learn2Perturb (L2P) (i.e., random cropping and random flipping) is used during training. Furthermore, we set normalization as a non-trainable layer in front of the model so that the attack algorithm can add the adversarial perturbation directly to the clean data.
Backbone network: our method is evaluated on the two datasets using classical Residual Networks (ResNet) as the backbone. In particular, ResNet-(20, 32, 44, 56) is used to study the effect of network depth on the defense capability of different methods, and ResNet-20 ([1.5×], [2×], [4×]) is used to study the effect of network width. ResNet-20 [n×] indicates that the number of convolution kernels in each convolutional layer is increased by a factor of n.
Attack: to assess defense capability, the method is compared with other defense methods against the l∞-norm attacks FGSM and PGD. The attack algorithms follow the same configuration as the other methods. For the PGD attack, the attack strength ε is set to 8/255 on CIFAR-10 and 0.3 on MNIST, the number of iteration steps k is 7, and the step size α is 0.01. FGSM attacks use the same attack strength ε as PGD. The accuracy of the model under attack is evaluated on the complete test set. Since PGD has a random initialization process, 5 PGD attacks are performed in each evaluation, and the model accuracy is reported as (mean ± std)%. For DDN attacks, the default settings in the literature are used (J. Rony, L. G. Hafemann, L. S. Oliveira, I. Ben Ayed, R. Sabourin, and E. Granger, "Decoupling direction and norm for efficient gradient-based L2 adversarial attacks and defenses," in CVPR, 2019).
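The evaluation protocol just described (five PGD runs with random initialization, accuracy reported as (mean ± std)%) could be sketched as follows; pgd_attack refers to the earlier sketch, and the remaining names and defaults are assumptions.

```python
import torch

def evaluate_pgd(model, loss_fn, loader, eps=8 / 255, alpha=0.01, k=7, runs=5, device="cuda"):
    """Robust accuracy (in %) under PGD, reported as mean and std over several random-start runs."""
    accs = []
    for _ in range(runs):
        correct, total = 0, 0
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            x_adv = pgd_attack(model, loss_fn, x, y, eps, alpha, k, random_start=True)
            with torch.no_grad():
                correct += (model(x_adv).argmax(dim=1) == y).sum().item()
            total += y.numel()
        accs.append(100.0 * correct / total)
    accs = torch.tensor(accs)
    return accs.mean().item(), accs.std().item()
```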
TABLE 1. Comparison with the undefended network
Table 1 reflects the effectiveness of the proposed method; "Parameters" denotes the total number of parameters that need to be trained. To evaluate the effectiveness of the proposed module, the accuracy of the model with and without the method is first compared on clean data and on perturbed data. Clean data are the original test images in the dataset; perturbed data are formed by adding adversarial perturbations to clean data. As shown in Table 1, the backbone network using our method has fewer parameters than the undefended model (the original model without any modification), because our model does not use a traditional fully connected layer and its weights are not trained. Although we use the same backbone for CIFAR-10 and MNIST, the input channels of the first convolutional layer are set to 1 when the network is used for MNIST, which makes the MNIST model parameters fewer than those for CIFAR-10.
First, it is observed that attacks can significantly compromise the accuracy of a model, especially an undefended one. For example, the clean-data accuracy of ResNet-44 and ResNet-56 on the CIFAR-10 dataset exceeds 93%, but the accuracy drops to zero under PGD attack because the feature distributions of perturbed data and clean data differ greatly. In contrast, our method preserves the robust connections and thus enhances the model's ability to resist attacks. The backbone network using the proposed method therefore still maintains more than 60% accuracy under PGD attack.
Second, our method also reduces clean-data accuracy to some extent compared with the undefended model, because our model uses fewer neuron connections in the decision phase. The discarded neuron connections are also related to the labels, so when our model is applied to clean data the accuracy inevitably drops. However, the gain in robustness offsets this loss in clean-data accuracy. For example, when ResNet-56 is used as the backbone network, the clean-data accuracy of our method on CIFAR-10 decreases by 0.41% (93.3% → 92.89%), while the perturbed-data accuracy increases by 68.08%.
To further illustrate the effectiveness of our method, the proposed method is compared with current state-of-the-art methods, including vanilla adversarial training, PNI, Adv-BNN, and L2P. Consistent with the competing methods, the following experiments are performed on the CIFAR-10 dataset.
Because the network under the proposed method has no randomness, the accuracies on clean data and FGSM-perturbed data are fixed. For the stochastic methods, the results are expressed as (mean ± std)%. "Clean" denotes the accuracy of the backbone network on clean data. Some results are taken from the literature (Z. He, A. S. Rakin, and D. Fan, "Parametric noise injection: Trainable randomness to improve deep neural network robustness against adversarial attack," in CVPR, 2019) and (A. Jeddi, M. J. Shafiee, M. Karg, C. Scharfenberger, and A. Wong, "Learn2Perturb: An end-to-end feature perturbation learning to improve adversarial robustness," in CVPR, 2020). If we achieve higher accuracy than reported for a competing method, we report our own experimental results. The maximum of each type of accuracy is shown in bold.
TABLE 2. Performance comparison of the Few2Decide method with current state-of-the-art methods
Table 2 shows the comparison results for the different networks. First, it can be seen that all methods reduce clean-data accuracy compared with the undefended model. For example, when the proposed method uses ResNet-56 as the backbone network, the clean-data accuracies of the three competing methods are 86.0%, 77.2%, and 84.82%, respectively; the proposed method loses relatively less clean-data accuracy. Although our research focuses on model defense capability, it is also important to ensure that the model achieves satisfactory accuracy on clean data, so we consider the proposed method more efficient and practical. Second, increasing the depth and width of the network enhances the fitting capability of the model, making the learned features more accurate, which helps to find stable connections. As shown in Table 2, the defense capability of the competing methods does not increase with growing backbone depth and width. Taking the Adv-BNN and L2P methods as examples, under PGD attack their perturbed-data accuracy stays around 54.62% as the backbone depth increases from 32 to 56, and as the backbone width increases from ResNet-20 to ResNet-20 (4×) their perturbed-data accuracy even decreases. In contrast, the proposed method provides better performance as network capacity increases. For example, when the backbone depth increases from 20 to 56, the accuracy under FGSM attack increases from 64.84% to 75.41%, and the accuracy under PGD attack increases from 53.01% to 68.08%. In addition, increasing the network width also enhances the defense of our method: the results of ResNet-20 and ResNet-20 ([1.5×], [2×], [4×]) show that the perturbed-data accuracy of our model increases from 64.84% to 80.4% under FGSM attack and from 53.01% to 73.01% under PGD attack. This demonstrates that the proposed method is more adaptable than the competing methods, since we do not need to carefully design it for each network architecture separately.
The results reported in Table 3 are based on the highest accuracy reported in the literature. For the PGD attack, the attack strength is ε = 8/255 with k = 7. Some results are taken from the literature (A. Jeddi, M. J. Shafiee, M. Karg, C. Scharfenberger, and A. Wong, "Learn2Perturb: An end-to-end feature perturbation learning to improve adversarial robustness," in CVPR, 2020).
TABLE 3. Comparison of the proposed Few2Decide with state-of-the-art methods on CIFAR-10
The method proposed by the invention is also compared with other state-of-the-art methods that provide strong network models on the CIFAR-10 dataset. Because different methods adapt differently to backbone networks, the backbone used by each method is not considered and only the highest perturbed-data accuracy is reported. Table 3 shows that the proposed method achieves state-of-the-art adversarial accuracy on CIFAR-10 under PGD attack with a perturbation strength of 8/255. Furthermore, our method achieves higher clean-data accuracy than the other methods.
The experimental results above are based on a fixed attack strength. To evaluate the defense capability of these methods under a wide range of threat intensities, ResNet-56 networks are trained using different defense methods (including PNI, vanilla adversarial training, Few2Decide, and the undefended model) and their robust accuracy is evaluated under FGSM and PGD attacks of different strengths.
FIG. 2 compares Few2Decide with other methods under different FGSM and PGD attack strengths. For (a) and (b), the x-axis represents the attack strength ε/255 and the y-axis represents the accuracy remaining after each model is attacked. For (c), the x-axis represents the number of attack iterations k.
FIG. 2(a) shows the accuracy of several models under FGSM attack as ε increases from 1/255 to 20/255. For the PGD attack, increasing the attack strength ε and the number of iteration steps k increases its attack capability. When k = 7 and ε increases from 1/255 to 20/255, the model accuracy is as shown in FIG. 2(b). FIG. 2(c) shows the model accuracy when ε = 8/255 and k increases from 0 to 20.
It can be observed that as the attack strength increases, more adversarial noise is added to the clean data, so the accuracy of all methods decreases. All defense methods exhibit some defense capability, since their accuracy is always higher than that of the undefended model. Our Few2Decide method consistently outperforms all competing methods by a significant margin in all settings, which indicates that the proposed method can resist attacks of various strengths well. FIG. 2(c) also shows that our method provides stable defense, because the model accuracy does not decrease with an increasing number of PGD attack steps once the attack is saturated.
A defense method that is robust against attacks based on the l∞ norm does not necessarily improve accuracy under every other attack method. To verify that our method also defends against L2-norm-based attacks, we perform DDN attacks on our model. The DDN attack is a strong L2-norm attack, and it is difficult to reduce its success rate; however, the L2 norm of the adversarial perturbation reflects how difficult it is to attack a model.
TABLE 4. Comparison under the L2-norm DDN attack
The average L2 norm of the perturbation is reported in Table 4, where the values in parentheses are the attack success rates on the tested models.
For the undefended model ResNet-56, the average L2 norm of the adversarial perturbation is 0.109 and the success rate of the DDN attack is 100%. In contrast, the average perturbation L2 norm for our model increases to 0.336. For the model ResNet-20 [4×], the average L2 norm of the adversarial perturbation also increases over the undefended model. This indicates that our method enhances the robustness of the model, since the attack algorithm must use a higher noise level to fool it. Furthermore, our method reduces the success rate of the DDN attack. The reduced attack success rate and the increased noise level demonstrate that our model also defends against L2-norm attacks.
We also evaluate whether our model has learned robust connections. In addition to the quantitative evaluation above, in this section we visualize the calculation results of the robust connections selected by Few2Decide.
FIG. 3 shows the calculation results of the neuron connections selected by our method. The attack strength is set to 8/255. All results use ResNet-56 as the backbone network with our Few2Decide method, without adversarial training. In FIGS. 3(a) to 3(j), the abscissa is the index of the neuron connection and the ordinate is its calculation result; FIGS. 3(a) to 3(j) show the outputs of the ten fully connected neurons when the clean sample, the FGSM adversarial sample, and the PGD adversarial sample are input to the network, namely (72.4, 55.6, 169.9), (68.6, 57.1, 170.9), (74.2, 65.6, 208.3), (75.6, 64.8, 187.8), (56.3, 51.0, 169.5), (74.5, 71.4, 230.9), (67.7, 65.2, 162.7), (65.9, 58.6, 186.5), (113.0, 96.7, 298.5), and (64.1, 53.7, 148.3). FIG. 3(k) shows the given image and its label, and FIG. 3(l) shows the trend of the outputs of the 10 neurons.
As shown in FIG. 3, we use the 20th to 40th connections after sorting to make predictions, with ResNet-56 as the backbone network. The solid line is the calculation result on clean data, the long dashed line is the result under PGD attack, and the short dashed line is the result under FGSM attack. The attack strength is set to 8/255. When the network employs our defense method, the perturbed image fails to fool the classifier. For the single-step FGSM attack, when the perturbed image features are fed to the weights W, our model adjusts the connections used to compute the prediction scores, so the calculation results selected by our model do not change significantly. For the multi-step PGD attack, our model dynamically adjusts the neuron connections used after each attack step. Although the attack algorithm can still change the calculation results of the neurons, the distribution of the prediction scores is not changed; for example, the score of the ship class (category 8) is still the largest, and the relative ordering of the class scores does not change. Thus, our model has learned robust connections and is robust to adversarial attacks.
FIG. 4 shows the relationship between the perturbed-data accuracy of ResNet-56 under PGD attack and the number of attack steps k on the CIFAR-10 test set.
The high robustness provided by our model does not come from gradient obfuscation. We show that our approach does not rely on gradient obfuscation by comparing it with the vanilla model, which has been certified in the literature as having non-obfuscated gradients. As shown in FIG. 4, increasing the number of PGD iteration steps reduces the perturbed-data accuracy of both our model and the vanilla model. However, for both models the perturbed-data accuracy no longer decreases once the number of iteration steps k ≥ 20. If the robustness provided by our Few2Decide method came from gradient obfuscation, i.e., from incorrect gradients on individual samples, then adding attack steps should break our defense. We observe that our method retains its defense and remains superior to adversarial training even when k is increased from 0 to 100. We can therefore conclude that our defense method does not rely on gradient obfuscation.
While embodiments in accordance with the invention have been described above, these embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments described. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. The invention is limited only by the claims and their full scope and equivalents.

Claims (7)

1. A robust model training method based on a small number of neuron connections, characterized in that the method is used for training an image classification robust model and comprises a backbone network and a decision module;
the backbone network extracts the features of the input image and extracts the potential features L of the input image at the last convolutional layer and the global average pooling layer;
the decision module includes four processes: product operation, sorting, clipping, and summation;
the product operation computes the product of L and W to obtain the calculation results of the neuron connections, wherein W is the weight matrix of the fully connected layer, the size of W is n × m, n is the number of classes in the dataset, and m is the feature length of the image;
the sorting orders the calculation results of the neuron connections;
the clipping sets the first α and last β values of each sorted group of results to 0;
the summation obtains a prediction score by summing the remaining non-zero results in each neuron, and the classification result is obtained from the prediction scores.
2. The method of claim 1, wherein the product operation uses the Hadamard product to obtain the calculation results of the neuron connections.
3. The method of claim 1, wherein ranking the calculation results of the neuron connections is performed by ranking the calculation results of the connections included in each neuron from small to large.
4. The method of claim 1, wherein α and β are set to 1/3 of an image feature length.
5. The method according to claim 1, wherein obtaining the classification result from the prediction scores specifically comprises querying the index of the maximum prediction score.
6. The method of claim 1, wherein the robust model is trained using clean data and using cross-entropy loss as a loss function.
7. The method of claim 1, wherein the weights W are initialized using a uniform distribution and are fixed during a training phase.
CN202111140405.1A 2021-09-28 2021-09-28 Robust image classification model training method based on small number of neuron connections Active CN113780468B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111140405.1A CN113780468B (en) 2021-09-28 2021-09-28 Robust image classification model training method based on small number of neuron connections

Publications (2)

Publication Number Publication Date
CN113780468A true CN113780468A (en) 2021-12-10
CN113780468B CN113780468B (en) 2022-08-09

Family

ID=78853814

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111140405.1A Active CN113780468B (en) 2021-09-28 2021-09-28 Robust image classification model training method based on small number of neuron connections

Country Status (1)

Country Link
CN (1) CN113780468B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038471A (en) * 2017-12-27 2018-05-15 哈尔滨工程大学 A kind of underwater sound communication signal type Identification method based on depth learning technology
CN108805281A (en) * 2017-04-28 2018-11-13 英特尔公司 Graphics processing unit generates confrontation network
CN109887047A (en) * 2018-12-28 2019-06-14 浙江工业大学 A kind of signal-image interpretation method based on production confrontation network
CN110298384A (en) * 2019-06-03 2019-10-01 西华大学 Fight sample image generation method and device
CN110334808A (en) * 2019-06-12 2019-10-15 武汉大学 A kind of confrontation attack defense method based on confrontation sample training
CN110569916A (en) * 2019-09-16 2019-12-13 电子科技大学 Confrontation sample defense system and method for artificial intelligence classification
CN111368886A (en) * 2020-02-25 2020-07-03 华南理工大学 Sample screening-based label-free vehicle picture classification method
US20200293901A1 (en) * 2019-03-15 2020-09-17 International Business Machines Corporation Adversarial input generation using variational autoencoder
CN112446476A (en) * 2019-09-04 2021-03-05 华为技术有限公司 Neural network model compression method, device, storage medium and chip
CN112926661A (en) * 2021-02-26 2021-06-08 电子科技大学 Method for enhancing image classification robustness
CN113378949A (en) * 2021-06-22 2021-09-10 昆明理工大学 Dual-generation confrontation learning method based on capsule network and mixed attention

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
NICOLAS PAPERNOT et al.: "Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks", 2016 IEEE Symposium on Security and Privacy
杨朔 et al.: "Semantic segmentation of cyanobacteria based on deep generative adversarial networks", Journal of Computer Applications
王兴宾 et al.: "A survey of adversarial example attacks and defenses for deep neural networks", Journal of Guangzhou University (Natural Science Edition)

Also Published As

Publication number Publication date
CN113780468B (en) 2022-08-09

Similar Documents

Publication Publication Date Title
Qin et al. Detecting and diagnosing adversarial images with class-conditional capsule reconstructions
CN110348399B (en) Hyperspectral intelligent classification method based on prototype learning mechanism and multidimensional residual error network
Fu et al. Fast crowd density estimation with convolutional neural networks
CN110048827B (en) Class template attack method based on deep learning convolutional neural network
CN112308158A (en) Multi-source field self-adaptive model and method based on partial feature alignment
Horng et al. Multilevel image thresholding selection based on the firefly algorithm
CN110334749A (en) Confrontation attack defending model, construction method and application based on attention mechanism
CN113379618B (en) Optical remote sensing image cloud removing method based on residual dense connection and feature fusion
CN113806546A (en) Cooperative training-based method and system for defending confrontation of graph neural network
CN104732244A (en) Wavelet transform, multi-strategy PSO (particle swarm optimization) and SVM (support vector machine) integrated based remote sensing image classification method
CN113627543B (en) Anti-attack detection method
Ying et al. Human ear recognition based on deep convolutional neural network
Zunair et al. Unconventional wisdom: A new transfer learning approach applied to bengali numeral classification
Ding et al. Defending against adversarial attacks using random forest
CN109165698A (en) A kind of image classification recognition methods and its storage medium towards wisdom traffic
CN115062306A (en) Black box anti-attack method for malicious code detection system
Nguyen-Son et al. Opa2d: One-pixel attack, detection, and defense in deep neural networks
CN114049537B (en) Countermeasure sample defense method based on convolutional neural network
CN107766792A (en) A kind of remote sensing images ship seakeeping method
CN113780468B (en) Robust image classification model training method based on small number of neuron connections
Nabizadeh et al. A novel method for multi-level image thresholding using particle swarm Optimization algorithms
CN111723864A (en) Method and device for performing countermeasure training by using internet pictures based on active learning
CN115984667A (en) Fisher information-based antagonistic training generalization capability improving method
Liao et al. Convolution filter pruning for transfer learning on small dataset
Liu et al. Classifying hyperspectral images with capsule network and active learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant