CN113837360B - DNN robust model reinforcement method based on relational graph - Google Patents

Publication number
CN113837360B
Authority
CN
China
Prior art keywords
neurons
model
loss
neural network
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111012421.2A
Other languages
Chinese (zh)
Other versions
CN113837360A (en)
Inventor
陈晋音
陈宇冲
金海波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202111012421.2A priority Critical patent/CN113837360B/en
Publication of CN113837360A publication Critical patent/CN113837360A/en
Application granted granted Critical
Publication of CN113837360B publication Critical patent/CN113837360B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a DNN robust model reinforcement method based on a relational graph, comprising the steps of constructing a target model dataset, training a target model, constructing a model relational graph, constraining the critical path, and reconstructing the target model. The critical path in the neural network is searched for and constructed on the relational graph of the DNN model, a new loss function is computed for the propagation process and node behavior along that path, and the relational graph remaining after the critical path is extracted is combined with the new loss function and reconstructed back into the model, thereby improving the robustness of the model.

Description

DNN robust model reinforcement method based on relational graph
Technical Field
The invention relates to the fields of distributed machine learning and artificial intelligence security, and in particular to a DNN robust model reinforcement method based on a relational graph.
Background
Deep learning is being adopted in real-world applications at a striking rate, for example in computer vision, speech recognition, and natural language processing. However, adversarial examples, i.e., images containing perturbations that are imperceptible to humans yet mislead deep neural network (DNN) models, pose a potential security threat to deep learning systems such as face recognition systems, automatic verification systems, and autonomous driving systems.
On the one hand, many defense methods have been proposed over the last few years to increase model robustness against adversarial examples and to avoid potential hazards in real-world applications. These methods can be broadly divided into adversarial training, input transformation, model architecture transformation, and adversarial example detection. However, they mainly target the pixel space of the input image, and few of them analyze the effect of adversarial perturbations by studying the behavior of the model's intermediate layers. In recent research, Li et al. proposed a gradient-based influence propagation strategy for obtaining critical attack neurons, from which the critical attack paths of a neural network are constructed on the computational graph; noise propagation is then reduced by constraining the propagation process and node behavior along these paths, thereby improving the robustness of the model. An analogy can be drawn with a social network, where false information may pose a tremendous social threat by propagating rapidly between nodes. Nodes with a higher capacity for spreading information are more critical than other nodes, propagate false information more easily, and are therefore included in the critical path. To suppress the spread of false information effectively, an immunization strategy is generally adopted: a critical path is found in the graph and blocked, which reduces the spread of false information and improves the security of the social network. However, existing computational graphs lack generality, are disconnected from biology and neuroscience, and contain too much redundant information for critical paths, which are concerned only with information transfer.
On the other hand, it is widely believed that the performance of a neural network depends on its architecture, yet the relationship between the accuracy of a neural network and its underlying graph structure lacks systematic understanding. Recently, You et al. proposed a new way of representing a neural network as a graph, called a relational graph. This graph focuses on the exchange of information rather than merely on directed data flow. It removes many constraints of the computational graph (such as being directed, acyclic, bipartite, and single-input single-output) and characterizes the neural network by the clustering coefficient and average path length of the relational graph. However, this approach, which approximates the performance of the neural network as a smooth function of the clustering coefficient and average path length of its graph, can only build the original robust model; once that model is attacked with adversarial examples, the attack is difficult to explain and to defend against at a fine-grained level.
To address these two problems, the invention combines the critical path with the relational graph: the critical path in the neural network is searched for and constructed on the relational graph of the DNN model, and the relational graph remaining after the critical path is extracted is reconstructed back into the model, thereby improving the robustness of the model.
Disclosure of Invention
In order to further improve the effectiveness of the critical path in a neural network and to provide a fine-grained explanation for the relational graph representation of neural networks, the invention provides a DNN robust model reinforcement method based on a relational graph.
To achieve the above object, the invention provides the following technical solution:
A DNN robust model reinforcement method based on a relational graph comprises the following steps:
(1) Constructing a target model dataset: the target model dataset comprises n samples divided into a classes; d% of the samples of each class are extracted as the training set $D_{train}$ of the target model, and the remaining samples of each class serve as the test set $D_{test}$ of the target model; n, a, and d are natural numbers;
(2) Training a target model: a target model structure is constructed for the sample data, and unified hyperparameters are set for training on the training set $D_{train}$: the number of training epochs and the batch size are fixed, stochastic gradient descent is used, the learning rate follows a cosine schedule with an initial value of 0.1, and the loss function $Loss_c$ adds a regularization term with regularization parameter λ to the cross-entropy function:
[Equation (1): formula not reproduced in this text]
where the index i denotes the i-th sample, i = 1, 2, …, n, n is the number of samples, $x_i$ is an input sample, p(·) is the true label of the sample, q(·) is the prediction probability of the model, and w is a regularization coefficient; after training is finished, the model is saved;
(3) Constructing a model relational graph: first define the neural network, then define a general relational graph by means of information exchange, further extend the information exchange of the general relational graph to convolutional neural networks, and finally draw the relational graph;
(4) Constraining the critical path: first compute the influence between the neurons of two layers, then select the key neurons, and finally limit the loss gradient to obtain a loss function;
(5) Reconstructing the target model:
The key neurons obtained in step (4) are mapped onto the relational graph drawn in step (3), and the nodes and connecting edges of non-key neurons are removed from the relational graph to obtain a new relational graph structure; the relational graph is reconstructed back into a neural network structure, which is trained on the training set $D_{train}$ with the loss function obtained in step (4) to obtain a new robust target model.
Further, the neural network definition in step (3) is specifically: define a neural network graph G = (V, E), where $V = \{v_1, \ldots, v_n\}$ is the node set and $E \subseteq \{(v_i, v_j) \mid v_i, v_j \in V\}$ is the edge set, the subscripts i and j being the indices of two arbitrary nodes; each node v has a node feature vector $x_v$.
Further, defining a general relational graph by means of information exchange in step (3) is specifically: when the neural network graph G is associated with information exchange between neurons, i.e., data is passed into and out of the neurons in the graph, G is defined as a relational graph; each piece of information is transformed at each edge by an information function f(·) and then aggregated at each node by an aggregation function AGG(·); with M rounds of information exchange in total, the m-th round for node v can be described as:
$$x_v^{(m+1)} = \mathrm{AGG}^{(m)}\left(\left\{ f_v^{(m)}\left(x_u^{(m)}\right),\ \forall u \in N(v) \right\}\right) \qquad (2)$$
where the superscript m denotes the m-th round of information exchange, m = 1, 2, …, M; u and v are nodes in the graph G; $x_u^{(m)}$ is the input node feature of node u; $N(v) = \{u \mid v \vee (u, v) \in E\}$ is the set of all neighboring nodes of node v; and $x_v^{(m+1)}$ is the output node feature of node v.
Further, extending the information exchange of the general relational graph to convolutional neural networks in step (3) is specifically: the relational graph is applied to convolutional neural networks, whose input is an image tensor $X^{(m)}$; the node feature is generalized from a vector $x_i^{(m)}$ to a tensor $X_i^{(m)}$ that contains some channels of the input image $X^{(m)}$, and the information exchange is then generalized with the convolution operator:
$$X_i^{(m+1)} = \mathrm{AGG}\left(\left\{ W_{ij}^{(m)} * X_j^{(m)},\ \forall j \in N(i) \right\}\right) \qquad (3)$$
where $*$ is the convolution operator, $W_{ij}^{(m)}$ is a convolution filter, and i and j are the indices of two arbitrary nodes.
Further, drawing the relational graph in step (3) is specifically: the model trained and saved in step (2) is loaded and denoted ψ; each time, two layers of the model ψ are selected and the information exchange value X of each tensor in the two layers is computed with formula (3); a graph is then drawn with the tensors as vertex neurons and the information exchange values as edge weights; finally, all of the drawn graphs are connected together to form the graph representation of the neural network.
Further, computing the influence between the neurons of two layers in step (4) is specifically: the influence of one neuron on another is the sum of the absolute values of the gradients taken over the elements at each of its positions; for a single element z of the output $Z_l^j$ of the j-th neuron $F_l^j$ of layer l, the influence of the i-th neuron $F_{l-1}^i$ of layer l-1 on z is:
$$r_{l-1}^{i}(z) = \sum_{s \in A\left(Z_{l-1}^{i}\right)} \left| \frac{\partial z}{\partial s} \right| \qquad (4)$$
where the subscript l denotes the l-th layer, l = 1, 2, …, L, L is the number of layers of the neural network, $A(Z_{l-1}^{i})$ is the set of elements of the i-th neuron of layer l-1, and the function A(·) extracts the elements at the specified positions;
the influence between the two neurons i and j is denoted $r_{l-1,l}^{i,j}$, i.e., the sum, over every element of the output $Z_l^j$ of the j-th neuron $F_l^j$ of layer l, of the influence of the i-th neuron $F_{l-1}^i$ of layer l-1:
$$r_{l-1,l}^{i,j} = \sum_{z \in A\left(Z_l^j\right)} r_{l-1}^{i}(z) \qquad (5)$$
Further, selecting the key neurons in step (4) is specifically: given a number of samples, the loss gradient that measures the contribution of the i-th neuron of the last convolutional layer L to the model decision is first derived:
[Equation (6): formula not reproduced in this text]
where the gradient is taken with respect to the output of the i-th neuron in the last convolutional layer L;
next, the loss gradients of all neurons are sorted from largest to smallest, and the first k neurons are selected as the key neurons of the last layer L:
[Equation (7): formula not reproduced in this text]
where the function top_k(·) selects the first k elements and $F_L$ is the set of neurons of the last convolutional layer L; the key neurons of the preceding layers are then obtained recursively from the influence of layer L-1 on layer L using equation (7), giving the key neurons selected in each layer:
[Equation (8): formula not reproduced in this text]
finally, R(x) denotes the key neurons of sample x at the different layers:
[Equation (9): formula not reproduced in this text]
Further, limiting the loss gradient and obtaining the loss function in step (4) is specifically: when facing an adversarial attack, one intuitive way to constrain the critical path is to limit the loss gradient so as to reduce the influence of these neurons; a loss term $Loss_g$ can be obtained directly by limiting the gradients of the key neurons:
[Equation (10): formula not reproduced in this text]
adding this loss term to the cross-entropy loss gives the final loss function:
$$Loss = Loss_c + \delta\, Loss_g \qquad (11)$$
where $Loss_c$ is the cross-entropy loss function of step (2) and δ is a hyperparameter used to balance the two loss terms.
The technical concept of the invention is as follows: in the relational-graph-based deep neural network model reinforcement method, a DNN model is built and its network structure is converted into a relational graph, in which the key neurons of the neural network are searched for and the critical path of the network is constructed. The propagation process and node behavior along this path are then constrained more effectively, for example by constructing a new gradient loss function that weakens the propagation of noise, and the relational graph remaining after the critical path is extracted is combined with the new gradient loss function and reconstructed back into the model, thereby improving the robustness of the model. Finally, adversarial examples are generated with three adversarial attack methods and used to attack the model in order to verify the improvement in robustness.
The beneficial effects of the invention are mainly as follows: 1) Compared with the traditional approach of selecting the key neurons of the critical path on a computational graph, using the relational graph removes much of the redundant information in the computational graph and further improves the effectiveness of the critical path in the neural network. 2) Using the critical path provides a fine-grained explanation for the relational graph representation of the neural network, especially when the model is under attack. 3) Improving model robustness through the critical path operates only on some key neurons in the neural network, so the integrity of the network is largely preserved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a DNN robust model reinforcement method based on a relational graph.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description is presented by way of example only and is not intended to limit the scope of the invention.
As shown in fig. 1, a DNN robust model reinforcement method based on a relationship diagram includes the following steps:
(1) Construction of object model data sets
In the invention, the CIFAR-10 dataset and the ImageNet dataset are used to construct the relational graph and to verify robustness. Both are common datasets for image classification models. The CIFAR-10 dataset contains 60000 RGB color images of 32 × 32 pixels, divided into 10 classes of 6000 images each. 5000 images are extracted from each class as the training set $D_{train\_CIFAR\text{-}10}$ of the target model, and the remaining images of each class are used as the test set $D_{test\_CIFAR\text{-}10}$ of the target model. The ImageNet dataset contains 1000000 RGB color images of 224 × 224 pixels, divided into 1000 classes of 1000 images each. 500 images are extracted from each class as the training set $D_{train\_ImageNet}$ of the target model, and the remaining images of each class are used as the test set $D_{test\_ImageNet}$ of the target model.
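The per-class split described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `split_per_class`, the fixed seed, and the toy label list (10 classes with 6 samples each, standing in for CIFAR-10's 6000 per class) are hypothetical and not part of the patent.

```python
import random
from collections import defaultdict

def split_per_class(labels, train_per_class, seed=0):
    """Return (train_idx, test_idx), drawing train_per_class samples from every class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)                       # random draw within the class
        train_idx.extend(members[:train_per_class])
        test_idx.extend(members[train_per_class:])  # the rest of the class is test data
    return train_idx, test_idx

# Toy stand-in: 10 classes x 6 samples, 5 per class used for training.
labels = [c for c in range(10) for _ in range(6)]
train_idx, test_idx = split_per_class(labels, train_per_class=5)
```

With the real CIFAR-10 figures from the text, `train_per_class` would be 5000, leaving 1000 test images per class.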
(2) Target model training
For the CIFAR-10 dataset, the invention uses a 5-layer MLP with 512 hidden units as the target model structure; the input of the MLP is the 3072-dimensional flattened vector of a CIFAR-10 image, the output is a 10-dimensional prediction, and each MLP layer has a ReLU activation function and a BatchNorm layer. For the ImageNet dataset, ResNet-34 is used as the target model structure, which consists only of basic blocks of 3 × 3 convolutions. Unified hyperparameters are set for training on the training set $D_{train}$: the number of training epochs is 200 and the batch size is 128, stochastic gradient descent is used, the learning rate follows a cosine schedule with an initial value of 0.1, and the loss function $Loss_c$ adds a regularization term with regularization parameter λ to the cross-entropy function:
[Equation (1): formula not reproduced in this text]
where the index i denotes the i-th sample, i = 1, 2, …, n, n is the number of samples, $x_i$ is an input sample, p(·) is the true label of the sample, q(·) is the prediction probability of the model, and w is a regularization coefficient.
The model is saved after training is finished.
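A minimal sketch of the two training ingredients named above, the cosine learning rate and the regularized cross-entropy loss, may help make them concrete. Two points are assumptions rather than statements from the patent: the schedule is taken to be standard cosine annealing from 0.1 down to 0 over the 200 epochs, and the regularization term is taken to be an L2 penalty λ·Σw², since the exact formula is not available in this text. The function names are hypothetical.

```python
import math

def cosine_lr(epoch, total_epochs=200, base_lr=0.1):
    """Assumed schedule: cosine annealing from base_lr at epoch 0 to 0 at total_epochs."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

def cross_entropy(p, q, eps=1e-12):
    """-sum_k p_k * log(q_k) for one sample; p is the one-hot label, q the prediction."""
    return -sum(pk * math.log(qk + eps) for pk, qk in zip(p, q))

def loss_c(labels, preds, weights, lam):
    """Summed cross-entropy plus an ASSUMED L2 penalty lam * sum(w^2)."""
    ce = sum(cross_entropy(p, q) for p, q in zip(labels, preds))
    return ce + lam * sum(w * w for w in weights)

# Two toy samples with two classes each, and two toy weights.
value = loss_c(labels=[[1, 0], [0, 1]],
               preds=[[0.8, 0.2], [0.3, 0.7]],
               weights=[0.5, -0.5],
               lam=0.01)
```

In a framework such as PyTorch the L2 term is usually supplied through the optimizer's weight-decay setting rather than written into the loss by hand.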
(3) Constructing a model relation diagram, which specifically comprises the following substeps:
(3.1) definition of neural networks
Define a neural network graph G = (V, E), where $V = \{v_1, \ldots, v_n\}$ is the node set and $E \subseteq \{(v_i, v_j) \mid v_i, v_j \in V\}$ is the edge set, the subscripts i and j being the indices of two arbitrary nodes; each node v has a node feature vector $x_v$.
(3.2) defining a general relationship diagram with information exchange
When the neural network graph G of (3.1) is associated with information exchange between neurons, i.e., data is passed into and out of the neurons in the graph, G is defined as a relational graph. Specifically, the information exchange is defined by an information function, whose input is a node feature and whose output is a piece of information, and an aggregation function, whose input is a set of information and whose output is the updated node feature. In each round of information exchange, each node sends information to its neighbors and aggregates the incoming information from its neighbors. Each piece of information is transformed at each edge by the information function f(·) and then aggregated at each node by the aggregation function AGG(·). With M rounds of information exchange in total, the m-th round for node v can be described as:
$$x_v^{(m+1)} = \mathrm{AGG}^{(m)}\left(\left\{ f_v^{(m)}\left(x_u^{(m)}\right),\ \forall u \in N(v) \right\}\right) \qquad (2)$$
where the superscript m denotes the m-th round of information exchange, m = 1, 2, …, M; u and v are nodes in the graph G; $x_u^{(m)}$ is the input node feature of node u; $N(v) = \{u \mid v \vee (u, v) \in E\}$ is the set of all neighboring nodes of node v; and $x_v^{(m+1)}$ is the output node feature of node v.
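One round of the information exchange described above can be sketched as follows. The three-node graph, the scalar node features, the 0.5 edge weight, and the choice of "sum followed by ReLU" as the aggregation are all illustrative assumptions; the text only requires some information function f(·) and some aggregation function AGG(·).

```python
def exchange_round(features, neighbors, information_fn, aggregate_fn):
    """One round: every node v aggregates f(x_u) over all u in its neighbourhood N(v)."""
    return {
        v: aggregate_fn([information_fn(features[u]) for u in neighbors[v]])
        for v in features
    }

def relu(s):
    return max(0.0, s)

# Hypothetical 3-node graph; each neighbourhood here also contains the node itself.
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
features = {0: 1.0, 1: -2.0, 2: 3.0}

updated = exchange_round(
    features,
    neighbors,
    information_fn=lambda x: 0.5 * x,               # transform applied on each edge
    aggregate_fn=lambda infos: relu(sum(infos)),    # aggregation at each node
)
```

Repeating `exchange_round` M times, feeding each output back in as the next input, gives the M rounds referred to in the text.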
(3.3) extending the information exchange of the general relationship diagram to the convolutional neural network
The relational graph defined in step (3.2) is further applied to convolutional neural networks, whose input is an image tensor $X^{(m)}$; a tensor is a multilinear map defined on vector spaces. The node feature is generalized from a vector $x_i^{(m)}$ to a tensor $X_i^{(m)}$ that contains some channels of the input image $X^{(m)}$, and the information exchange is then generalized with the convolution operator, as defined by the following equation:
$$X_i^{(m+1)} = \mathrm{AGG}\left(\left\{ W_{ij}^{(m)} * X_j^{(m)},\ \forall j \in N(i) \right\}\right) \qquad (3)$$
where $*$ is the convolution operator, $W_{ij}^{(m)}$ is a convolution filter, and i and j are the indices of two arbitrary nodes.
(3.4) drawing a relationship diagram
The model trained and saved in step (2) is loaded and denoted ψ. Each time, two consecutive layers of the model ψ are selected and the information exchange value X of each tensor in the two layers is computed with formula (3); a graph is then drawn with the tensors as vertex neurons and the information exchange values as edge weights. Finally, all of the drawn graphs are connected together to form the graph representation of the neural network.
(4) Constraining a critical path comprising the sub-steps of:
(4.1) calculating the influence between two layers of neurons
In order to construct the critical attack path, the critical attack neurons of each layer must be extracted and connected. The influence of the neurons of the previous layer on the neurons of the next layer is first computed by back-propagation.
Specifically, the influence of one neuron on another is the sum of the absolute values of the gradients taken over the elements at each of its positions. For a single element z of the output $Z_l^j$ of the j-th neuron $F_l^j$ of layer l, the influence of the i-th neuron $F_{l-1}^i$ of layer l-1 on z is:
$$r_{l-1}^{i}(z) = \sum_{s \in A\left(Z_{l-1}^{i}\right)} \left| \frac{\partial z}{\partial s} \right| \qquad (4)$$
where the subscript l denotes the l-th layer, l = 1, 2, …, L, L is the number of layers of the neural network, $A(Z_{l-1}^{i})$ is the set of elements of the i-th neuron of layer l-1, and the function A(·) extracts the elements at the specified positions.
The influence between the two neurons i and j is denoted $r_{l-1,l}^{i,j}$, i.e., the sum, over every element of the output $Z_l^j$ of the j-th neuron $F_l^j$ of layer l, of the influence of the i-th neuron $F_{l-1}^i$ of layer l-1:
$$r_{l-1,l}^{i,j} = \sum_{z \in A\left(Z_l^j\right)} r_{l-1}^{i}(z) \qquad (5)$$
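The influence value described above (a sum of absolute gradients of every output element of one neuron with respect to every element of a neuron in the previous layer) can be illustrated numerically. The sketch below is an assumption-laden stand-in: it estimates the gradients with central finite differences instead of back-propagation, and the tiny linear layer `layer_fn` with weight table `W` is hypothetical.

```python
def influence(layer_fn, prev_outputs, i, j, eps=1e-6):
    """Sum over elements z of neuron j (layer l) and elements s of neuron i
    (layer l-1) of |dz/ds|, estimated with central finite differences."""
    total = 0.0
    for pos in range(len(prev_outputs[i])):
        plus = [list(o) for o in prev_outputs]
        minus = [list(o) for o in prev_outputs]
        plus[i][pos] += eps
        minus[i][pos] -= eps
        z_plus, z_minus = layer_fn(plus)[j], layer_fn(minus)[j]
        total += sum(abs((a - b) / (2 * eps)) for a, b in zip(z_plus, z_minus))
    return total

# Hypothetical layer: 2 input neurons and 2 output neurons, 2 elements per neuron.
W = [[1.0, -2.0], [0.0, 3.0]]  # W[j][i]: weight from input neuron i to output neuron j

def layer_fn(prev):
    # Output element p of neuron j is sum_i W[j][i] * prev[i][p].
    return [[sum(W[j][i] * prev[i][p] for i in range(2)) for p in range(2)]
            for j in range(2)]

prev_outputs = [[0.5, 1.0], [2.0, -1.0]]
# r[i][j]: influence of neuron i in the previous layer on neuron j in this layer.
r = [[influence(layer_fn, prev_outputs, i, j) for j in range(2)] for i in range(2)]
```

For this linear layer each of the two elements of neuron i affects exactly one element of neuron j with slope `W[j][i]`, so `r[i][j]` equals `2 * abs(W[j][i])`.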
(4.2) selecting Key neurons
Given a number of samples, the loss gradient that measures the contribution of the i-th neuron of the last convolutional layer L to the model decision is first derived:
[Equation (6): formula not reproduced in this text]
where the gradient is taken with respect to the output of the i-th neuron in the last convolutional layer L.
Next, the loss gradients of all neurons are sorted from largest to smallest, and the first k neurons are selected as the key neurons of the last layer L:
[Equation (7): formula not reproduced in this text]
where the function top_k(·) selects the first k elements and $F_L$ is the set of neurons of the last convolutional layer L. The key neurons of the preceding layers are then obtained recursively from the influence of layer L-1 on layer L using equation (7), giving the key neurons selected in each layer:
[Equation (8): formula not reproduced in this text]
Finally, R(x) denotes the key neurons of sample x at the different layers:
[Equation (9): formula not reproduced in this text]
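The selection procedure can be sketched as a top-k pick on the last layer followed by a backward recursion. The recursion rule used here is an assumption, because the corresponding formula is not available in this text: a neuron in an earlier layer is scored by the sum of its influence on the key neurons already chosen in the next layer. The names `critical_route` and the toy numbers are hypothetical.

```python
def top_k(scores, k):
    """Indices of the k largest scores, ordered from largest to smallest."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def critical_route(last_layer_grads, influences, k):
    """last_layer_grads: loss-gradient magnitude per neuron of the last layer.
    influences[l][i][j]: influence of neuron i in layer l on neuron j in layer l+1.
    Returns the key-neuron indices of every layer, first layer first."""
    route = [top_k(last_layer_grads, k)]
    for layer in reversed(influences):
        selected_next = route[0]
        # ASSUMED rule: score = total influence on the next layer's key neurons.
        scores = [sum(row[j] for j in selected_next) for row in layer]
        route.insert(0, top_k(scores, k))
    return route

# Toy network: two layers of three neurons each.
last_layer_grads = [0.1, 0.9, 0.4]
influences = [[[2.0, 0.0, 1.0],
               [0.0, 0.2, 0.5],
               [1.0, 3.0, 0.0]]]
route = critical_route(last_layer_grads, influences, k=2)
```

Here the last layer keeps neurons 1 and 2, and the first layer keeps neurons 2 and 0, whose combined influence on those two is largest.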
(4.3) limiting loss gradients
When facing an adversarial attack, one intuitive way to constrain the critical path is to limit the loss gradient so as to reduce the influence of these neurons. A loss term $Loss_g$ can be obtained directly by limiting the gradients of the key neurons:
[Equation (10): formula not reproduced in this text]
Adding this loss term to the cross-entropy loss gives the final loss function:
$$Loss = Loss_c + \delta\, Loss_g \qquad (11)$$
where $Loss_c$ is the cross-entropy loss function of step (2) and δ is a hyperparameter used to balance the two loss terms.
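How the two terms combine can be shown with a short sketch. The weighted sum follows equation (11) directly; the form of the gradient term is an assumption (here, the sum of absolute gradients over the key neurons only), because the exact expression for that term is not available in this text. Function names and numbers are hypothetical.

```python
def gradient_loss(neuron_grads, key_indices):
    """ASSUMED form of the gradient term: sum of |gradient| over the key neurons only."""
    return sum(abs(neuron_grads[i]) for i in key_indices)

def total_loss(loss_c, loss_g, delta):
    """Equation (11): Loss = Loss_c + delta * Loss_g."""
    return loss_c + delta * loss_g

neuron_grads = [0.2, -0.3, 0.05, 0.4]   # toy per-neuron loss gradients
loss = total_loss(loss_c=0.7,
                  loss_g=gradient_loss(neuron_grads, key_indices=[1, 3]),
                  delta=0.5)
```

A larger δ penalizes large gradients on the key neurons more strongly at the expense of the classification term.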
(5) Reconstructing a target model
The key neurons obtained in step (4.2) are mapped onto the relational graph drawn in step (3.4), and the nodes and connecting edges of non-key neurons are removed from the relational graph to obtain a new relational graph structure. The relational graph is then reconstructed back into a neural network structure, which is trained on the training set $D_{train}$ with the loss function obtained in step (4.3) to obtain a new robust target model.
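The pruning part of this step can be sketched on a dictionary-based graph. The representation is an assumption made for illustration: nodes are `(layer, index)` pairs, edges carry a weight, and the reconstructed network is summarized only by how many neurons each layer keeps. None of these names come from the patent.

```python
def prune_relational_graph(edges, key_neurons):
    """edges: {((layer, i), (layer + 1, j)): weight}; key_neurons: {layer: [indices]}.
    Keeps only the edges whose two endpoints are both key neurons."""
    keep = {(layer, idx) for layer, idxs in key_neurons.items() for idx in idxs}
    return {e: w for e, w in edges.items() if e[0] in keep and e[1] in keep}

def layer_widths(key_neurons):
    """Number of neurons each layer keeps in the reconstructed network."""
    return {layer: len(idxs) for layer, idxs in sorted(key_neurons.items())}

# Hypothetical fully connected graph between two layers of three neurons.
edges = {((0, i), (1, j)): 0.1 * (i + 1) * (j + 1)
         for i in range(3) for j in range(3)}
key_neurons = {0: [2, 0], 1: [1, 2]}

pruned = prune_relational_graph(edges, key_neurons)
widths = layer_widths(key_neurons)
```

The pruned edge set and the per-layer widths are what a rebuild step would need in order to instantiate the smaller network before retraining.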
Several adversarial attack methods are employed, including the FGSM, CW, and PGD attacks. For each attack, 1000 generated adversarial examples are randomly selected from each dataset to attack the original target model and the reconstructed robust target model. The three attacks use different parameters: for the FGSM attack, ε = 2; for the CW attack, the $L_2$-norm attack is used with initial value c = 0.01, confidence k = 0, and iteration count epoch = 200; for the PGD attack, ε = 2, step size α = ε/10, and iteration count epoch = 20.
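To make the FGSM and PGD settings concrete, here is a minimal sketch against a two-feature logistic-regression classifier. The classifier, its weights, and the value ε = 0.2 are illustrative assumptions chosen for the toy feature scale; they are not the patent's models or its ε = 2. The step size α = ε/10 and the 20 iterations do follow the text.

```python
import math

def sign(v):
    return (v > 0) - (v < 0)

def logit(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(x, w, b):
    return 1 if logit(x, w, b) > 0 else 0

def fgsm(x, y, w, b, eps):
    """One FGSM step: x + eps * sign(d loss / d x) for the logistic cross-entropy."""
    prob = 1.0 / (1.0 + math.exp(-logit(x, w, b)))
    grad = [(prob - y) * wi for wi in w]          # gradient of the loss w.r.t. x
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

def pgd(x, y, w, b, eps, steps=20):
    """Iterated FGSM with step eps/10, projected back into the eps-ball around x."""
    alpha = eps / 10
    x_adv = list(x)
    for _ in range(steps):
        x_adv = fgsm(x_adv, y, w, b, alpha)
        x_adv = [min(max(a, xi - eps), xi + eps) for a, xi in zip(x_adv, x)]
    return x_adv

# Hypothetical classifier and a correctly classified sample with label 1.
w, b = [1.0, -1.0], 0.0
x, y = [0.3, 0.1], 1
x_fgsm = fgsm(x, y, w, b, eps=0.2)
x_pgd = pgd(x, y, w, b, eps=0.2)
```

Both perturbed inputs stay within ε of the original in every coordinate, yet they push the logit below zero so that the toy classifier flips its prediction.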
When a model is subjected to adversarial attacks, its accuracy, precision, and recall are commonly used as evaluation indexes of robustness.
Accuracy: the accuracy Acc is the ratio of the number of samples correctly classified by the classifier to the total number of samples of a given test dataset:
$$Acc = \frac{TP + TN}{TP + FP + TN + FN}$$
where TP is the number of positive samples predicted as positive, FP is the number of negative samples predicted as positive, FN is the number of positive samples predicted as negative, and TN is the number of negative samples predicted as negative. The higher the accuracy, the better the robustness. In the experiments, the accuracy of the robust target model on the CIFAR-10 dataset improved by 30.5% on average over the original target model, and the accuracy of the robust target model on the CIFAR-10 dataset improved by 23.6% over the original target model.
Precision: the precision Pre is the proportion of correctly identified positive samples (TP) among all samples identified as positive (TP + FP), i.e., the fraction of the samples judged to be positive that are judged correctly:
$$Pre = \frac{TP}{TP + FP}$$
The higher the precision, the better the robustness. In the experiments, the precision of the robust target model on the CIFAR-10 dataset improved by 22.6% on average over the original target model under the three attacks, and the precision of the robust target model on the CIFAR-10 dataset improved by 18.6% over the original target model.
Recall: the recall Rec is the proportion of positive samples that are successfully predicted among all positive samples:
$$Rec = \frac{TP}{TP + FN}$$
The higher the recall, the better the robustness. In the experiments, the recall of the robust target model on the CIFAR-10 dataset improved by 16.7% on average over the original target model under the three attacks, and the recall of the robust target model on the CIFAR-10 dataset improved by 10.8% over the original target model.
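The three evaluation indexes follow from the four confusion-matrix counts, as the short sketch below shows. The label lists are made-up toy data for a binary case, not experimental results, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

def precision(tp, fp, fn, tn):
    return tp / (tp + fp)

def recall(tp, fp, fn, tn):
    return tp / (tp + fn)

# Toy predictions on eight samples.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
acc, pre, rec = accuracy(*counts), precision(*counts), recall(*counts)
```

For multi-class datasets such as CIFAR-10 these quantities are typically computed per class and then averaged; the averaging scheme is not specified in the text.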
The foregoing embodiments describe the technical solution and advantages of the invention in detail. It should be understood that the foregoing is merely the presently preferred embodiment of the invention and is not intended to limit it; any changes, additions, substitutions, and equivalents made within the scope of the principles of the invention shall be included within the scope of protection of the invention.

Claims (4)

1. The DNN robust model reinforcement method based on the relation diagram is characterized by comprising the following steps of:
(1) Constructing a target model dataset: the target model dataset comprises n samples divided into a classes, each sample being an RGB color image; d% of the samples of each class are extracted as the training set $D_{train}$ of the target model, and the remaining samples of each class serve as the test set $D_{test}$ of the target model; n, a, and d are natural numbers;
(2) Training a target model: a target model structure is constructed for the sample data, and unified hyperparameters are set for training on the training set $D_{train}$: the number of training epochs and the batch size are fixed, stochastic gradient descent is used, the learning rate follows a cosine schedule with an initial value of 0.1, and the loss function $Loss_c$ adds a regularization term with regularization parameter λ to the cross-entropy function:
[Equation (1): formula not reproduced in this text]
where the index i denotes the i-th sample, i = 1, 2, …, n, n is the number of samples, $x_i$ is an input sample, p(·) is the true label of the sample, q(·) is the prediction probability of the model, and w is a regularization coefficient; after training is finished, the model is saved;
(3) Constructing a model relational graph: first define the neural network, then define a general relational graph by means of information exchange, further extend the information exchange of the general relational graph to convolutional neural networks, and finally draw the relational graph;
defining the neural network specifically comprises: define a neural network graph G = (V, E), where $V = \{v_1, \ldots, v_n\}$ is the node set and $E \subseteq \{(v_i, v_j) \mid v_i, v_j \in V\}$ is the edge set, the subscripts i and j being the indices of two arbitrary nodes; each node v has a node feature vector $x_v$;
defining a general relational graph by means of information exchange is specifically: when the neural network graph G is associated with information exchange between neurons, i.e., data is passed into and out of the neurons in the graph, G is defined as a relational graph; each piece of information is transformed at each edge by an information function f(·) and then aggregated at each node by an aggregation function AGG(·); with M rounds of information exchange in total, the m-th round for node v can be described as:
$$x_v^{(m+1)} = \mathrm{AGG}^{(m)}\left(\left\{ f_v^{(m)}\left(x_u^{(m)}\right),\ \forall u \in N(v) \right\}\right) \qquad (2)$$
where the superscript m denotes the m-th round of information exchange, m = 1, 2, …, M; u and v are nodes in the graph G; $x_u^{(m)}$ is the input node feature of node u; $N(v) = \{u \mid v \vee (u, v) \in E\}$ is the set of all neighboring nodes of node v; and $x_v^{(m+1)}$ is the output node feature of node v;
extending the information exchange of the general relational graph to convolutional neural networks is specifically: the relational graph is applied to convolutional neural networks, whose input is an image tensor $X^{(m)}$; the node feature is generalized from a vector $x_i^{(m)}$ to a tensor $X_i^{(m)}$ that contains some channels of the input image $X^{(m)}$, and the information exchange is then generalized with the convolution operator:
$$X_i^{(m+1)} = \mathrm{AGG}\left(\left\{ W_{ij}^{(m)} * X_j^{(m)},\ \forall j \in N(i) \right\}\right) \qquad (3)$$
where $*$ is the convolution operator, $W_{ij}^{(m)}$ is a convolution filter, and i and j are the indices of two arbitrary nodes;
drawing the relational graph is specifically: the model trained and saved in step (2) is loaded and denoted ψ; each time, two layers of the model ψ are selected and the information exchange value X of each tensor in the two layers is computed with formula (3); a graph is then drawn with the tensors as vertex neurons and the information exchange values as edge weights; finally, all of the drawn graphs are connected together to form the graph representation of the neural network;
(4) Constraining the critical path: firstly, calculating the influence between two layers of neurons, then selecting key neurons, and finally limiting the loss gradient and obtaining a loss function;
(5) Reconstructing a target model:
mapping the key neurons obtained in step (4) onto the relational graph drawn in step (3), removing the nodes and connecting edges of the non-key neurons from the relational graph to obtain a new relational graph structure, reconstructing the relational graph back into a neural network structure, and training on the training set $D_{train}$ with the loss function obtained in step (4) to obtain a new robust target model.
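The information-exchange round of claim 1 and the pruning of non-key nodes in step (5) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the graph is a plain adjacency dictionary, node features are scalars rather than tensors, and the names `exchange_round` and `prune_to_key_nodes` are hypothetical helpers, not part of the claimed method.

```python
from typing import Callable, Dict, Hashable, List, Set

Node = Hashable
Graph = Dict[Node, Set[Node]]  # node v -> its neighbourhood N(v)


def exchange_round(
    graph: Graph,
    features: Dict[Node, float],
    message: Callable[[float], float],
    aggregate: Callable[[List[float]], float],
) -> Dict[Node, float]:
    """One round: transform every neighbour feature with f, then aggregate at v."""
    return {
        v: aggregate([message(features[u]) for u in neighbours])
        for v, neighbours in graph.items()
    }


def prune_to_key_nodes(graph: Graph, key_nodes: Set[Node]) -> Graph:
    """Drop non-key nodes and every edge that touches one of them."""
    return {v: nbrs & key_nodes for v, nbrs in graph.items() if v in key_nodes}


# Toy path graph a - b - c. Each neighbourhood contains the node itself,
# which is an assumption of this sketch.
graph = {"a": {"a", "b"}, "b": {"a", "b", "c"}, "c": {"b", "c"}}
features = {"a": 1.0, "b": 2.0, "c": 4.0}

# f doubles the feature, AGG is a plain sum (both chosen only for illustration).
updated = exchange_round(graph, features, message=lambda x: 2.0 * x, aggregate=sum)

# Keep only nodes "a" and "b", as if they were the key neurons of step (4).
pruned = prune_to_key_nodes(graph, key_nodes={"a", "b"})
```

In a convolutional network the scalar features would be channel tensors and `message` a convolution with a filter, but the control flow of a round stays the same.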
2. The DNN robust model reinforcement method based on the relational graph according to claim 1, wherein the influence between two layers of neurons is calculated in step (4) as follows: the influence of one neuron on another is the sum of the absolute values of the gradients taken with respect to the elements at each of its positions; for a single element $z$ in the output $F_l^j(x)$ of the $j$-th neuron $F_l^j$ of the $l$-th layer, the influence value on it of the $i$-th neuron $F_{l-1}^i$ of layer $l-1$, denoted $\hat{I}_{l-1}^{i}(z)$, is:

$$\hat{I}_{l-1}^{i}(z) = \sum_{s \in S_{l-1}^{i}} \left| \frac{\partial z}{\partial A\left(F_{l-1}^{i}(x), s\right)} \right| \quad (4)$$

wherein the subscript $l$ denotes the $l$-th layer, $l = 1, 2, \dots, L$, $L$ is the number of layers of the neural network, $S_{l-1}^{i}$ is the set of elements of the $i$-th neuron of layer $l-1$, and the function $A(\cdot)$ extracts the element at the specified position;

$I_l^{i,j}$ denotes the influence value between the two neurons $i$ and $j$, i.e. the influence of the $i$-th neuron $F_{l-1}^i$ of layer $l-1$ summed over every element of the output $F_l^j(x)$ of the $j$-th neuron $F_l^j$ of the $l$-th layer:

$$I_l^{i,j} = \sum_{z \in F_l^j(x)} \hat{I}_{l-1}^{i}(z) \quad (5)$$
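As a rough illustration of the influence value between two neurons, the sketch below sums the absolute gradients of every output element of neuron j with respect to every element of neuron i. It makes several assumptions that are not in the claim: gradients are approximated by central finite differences instead of backpropagation, each neuron is a flat list of floats, and `toy_layer` is a made-up layer used only to exercise the function.

```python
from typing import Callable, List

Layer = Callable[[List[List[float]]], List[List[float]]]


def influence(layer: Layer, x: List[List[float]], i: int, j: int,
              eps: float = 1e-6) -> float:
    """Sum of |dz/ds| over elements z of output neuron j and elements s of input neuron i."""
    total = 0.0
    for s in range(len(x[i])):
        plus = [row[:] for row in x]
        minus = [row[:] for row in x]
        plus[i][s] += eps
        minus[i][s] -= eps
        out_plus, out_minus = layer(plus)[j], layer(minus)[j]
        # Central difference approximates dz/ds for every element z of neuron j.
        total += sum(abs(p - m) / (2 * eps) for p, m in zip(out_plus, out_minus))
    return total


def toy_layer(x: List[List[float]]) -> List[List[float]]:
    """Hypothetical layer: two input neurons (two elements each), two output neurons."""
    a, b = x
    return [
        [2.0 * a[0] - 1.0 * a[1], 3.0 * b[0]],  # output neuron 0
        [0.5 * b[0] + 0.5 * b[1]],              # output neuron 1
    ]


x = [[1.0, 2.0], [3.0, 4.0]]
i00 = influence(toy_layer, x, i=0, j=0)  # analytically |2| + |-1| = 3
i10 = influence(toy_layer, x, i=1, j=0)  # analytically |3| = 3
i01 = influence(toy_layer, x, i=0, j=1)  # analytically 0
i11 = influence(toy_layer, x, i=1, j=1)  # analytically 0.5 + 0.5 = 1
```

For a real network an automatic-differentiation framework would replace the finite differences; the double sum over output and input elements is the part that mirrors the claim.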
3. The DNN robust model reinforcement method based on the relational graph according to claim 2, wherein the key neurons are selected in step (4) as follows: given a number of samples, first computing the loss gradient with which the $i$-th neuron of the last convolutional layer $L$ contributes to the model decision:

$$g_L^i = \frac{\partial \mathrm{Loss}_c}{\partial F_L^i(x)} \quad (6)$$

wherein $F_L^i(x)$ denotes the output of the $i$-th neuron in the last convolutional layer $L$ and $\mathrm{Loss}_c$ is the cross-entropy loss of step (2);

next, sorting the loss gradients of all neurons from large to small and selecting the first $k$ neurons as key neurons, with $\hat{F}_L$ denoting the key neurons selected in the last layer $L$:

$$\hat{F}_L = \mathrm{top\_k}\left(F_L;\ g_L\right) \quad (7)$$

wherein the function $\mathrm{top\_k}(\cdot)$ selects the first $k$ and $F_L$ is the set of neurons of the last convolutional layer $L$; then, applying equation (7) recursively to the influence of layer $L-1$ on layer $L$ to obtain the key neurons of the preceding layers, with $\hat{F}_l$ denoting the key neurons selected in each layer and $I_l^{i,j}$ the influence value of the $i$-th neuron of layer $l-1$ on the $j$-th neuron of layer $l$:

$$\hat{F}_{l-1} = \mathrm{top\_k}\left(F_{l-1};\ I_l^{i,j},\ F_l^j \in \hat{F}_l\right) \quad (8)$$

finally, $R(x)$ denotes the key neurons of sample $x$ at the different layers:

$$R(x) = \left\{\hat{F}_1, \hat{F}_2, \dots, \hat{F}_L\right\} \quad (9)$$
4. The DNN robust model reinforcement method based on the relational graph according to claim 3, wherein in step (4) the loss gradient is limited and the loss function is obtained as follows: when facing adversarial attacks, an intuitive way to constrain the critical path is to limit the loss gradients so as to reduce the influence exerted through these neurons; a loss term is obtained directly by limiting the gradients of the key neurons:

$$\mathrm{Loss}_g = \sum_{F_l^i \in R(x)} \left\| \frac{\partial \mathrm{Loss}_c}{\partial F_l^i(x)} \right\| \quad (10)$$
adding this loss term to the cross-entropy loss gives the final loss function:

$$\mathrm{Loss} = \mathrm{Loss}_c + \delta\,\mathrm{Loss}_g \quad (11)$$

wherein $\mathrm{Loss}_c$ denotes the cross-entropy loss function of step (2), and $\delta$ is a hyperparameter used to balance the two loss terms.
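A minimal numeric sketch of the combined loss of formula (11) follows. It assumes the gradient term is the sum of L2 norms of the loss gradients at the key neurons; the claim does not fix the norm, so that choice, the helper names, and the numbers are all illustrative.

```python
import math
from typing import List, Sequence


def cross_entropy(probs: Sequence[float], label: int) -> float:
    """Cross-entropy of one sample given predicted class probabilities."""
    return -math.log(probs[label])


def gradient_penalty(key_gradients: List[List[float]]) -> float:
    """Sum of L2 norms of the loss gradients at key neurons (norm is an assumption)."""
    return sum(math.sqrt(sum(g * g for g in grad)) for grad in key_gradients)


def total_loss(probs: Sequence[float], label: int,
               key_gradients: List[List[float]], delta: float) -> float:
    """Loss = Loss_c + delta * Loss_g."""
    return cross_entropy(probs, label) + delta * gradient_penalty(key_gradients)


# Made-up values: one sample, two key neurons with two gradient elements each.
probs = [0.5, 0.25, 0.25]
key_gradients = [[3.0, 4.0], [0.0, 1.0]]
loss = total_loss(probs, label=0, key_gradients=key_gradients, delta=0.1)
```

A larger `delta` pushes training harder toward small gradients at the key neurons, at the cost of weighting the classification term less.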
CN202111012421.2A 2021-08-31 2021-08-31 DNN robust model reinforcement method based on relational graph Active CN113837360B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111012421.2A CN113837360B (en) 2021-08-31 2021-08-31 DNN robust model reinforcement method based on relational graph

Publications (2)

Publication Number Publication Date
CN113837360A CN113837360A (en) 2021-12-24
CN113837360B CN113837360B (en) 2024-03-29

Family

ID=78961750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111012421.2A Active CN113837360B (en) 2021-08-31 2021-08-31 DNN robust model reinforcement method based on relational graph

Country Status (1)

Country Link
CN (1) CN113837360B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112183717A (en) * 2020-08-28 2021-01-05 北京航空航天大学 Neural network training method and device based on critical path
CN113283599A (en) * 2021-06-11 2021-08-20 浙江工业大学 Anti-attack defense method based on neuron activation rate

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11715002B2 (en) * 2018-05-10 2023-08-01 Microsoft Technology Licensing, Llc Efficient data encoding for deep neural network training

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant