CN115392434A - Depth model reinforcement method based on graph structure variation test - Google Patents
- Publication number: CN115392434A (application number CN202210953766.6A)
- Authority
- CN
- China
- Prior art keywords
- model
- test
- variation
- original target
- operator
- Prior art date
- Legal status: Withdrawn (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F11/3688—Test management for test execution, e.g. scheduling of test suites
- G06N3/061—Physical realisation of neural networks using biological neurons
- G06N3/08—Neural networks; learning methods
- G06V10/764—Image or video recognition using machine learning, using classification, e.g. of video objects
- G06V10/82—Image or video recognition using machine learning, using neural networks
Abstract
The invention provides a depth model reinforcement method based on graph structure variation testing. Different variation operators are injected into a deep learning system to construct different variation test models, and the detection degree of the variations is analyzed to evaluate the quality of the test models and to rank them. After the models are mapped into graph structures, graph network indexes with guiding significance are found according to the ranking results; a new robust graph structure is then generated and reconstructed back into the model, improving the robustness of the model.
Description
Technical Field
The invention relates to the field of distributed machine learning and artificial intelligence safety, in particular to a depth model reinforcement method based on a graph structure variation test.
Background
Deep learning has achieved tremendous success in many areas over the past few decades, including autonomous driving, artificial intelligence, video surveillance, and the like. However, with a series of recent catastrophic accidents related to deep learning, the robustness and safety of deep learning systems have become a major concern. This problem is further exacerbated by the adversarial test-generation technique proposed by Christian Szegedy et al., which adds a misleading perturbation, imperceptible to humans, to an original image. Such newly generated test samples are called adversarial examples, and they constitute a potential security threat to deep learning systems such as face recognition systems, automatic verification systems, and automatic driving systems.
Over the past few years, a number of defense methods have been proposed to increase the robustness of models against adversarial examples and avoid potential hazards in real-world applications. These methods can be broadly divided into adversarial training, input transformation, model-architecture transformation, and adversarial-example detection. However, most of the above methods target the pixel space of the input image; few analyze the influence of adversarial perturbations by studying the interlayer structure of the model. This is because, although it is generally believed that the performance of a neural network depends on its network parameters and training samples, the relationship between the accuracy of a neural network and its underlying graph structure lacks systematic understanding. In a recent study, You et al. proposed a new method of representing neural networks as graphs, called relational graphs, which focus primarily on the exchange of information rather than only on the directed data flow. However, this method can only construct an originally robust model; once the model is attacked by adversarial examples, it is difficult to interpret and defend at a fine granularity. Meanwhile, due to the unique characteristics of deep learning systems, a new quality-evaluation standard is also required to guide them. Recently, Ma et al. proposed a mutation-testing framework specifically for deep learning systems, which injects faults into the deep learning system by defining a set of source-level mutation operators and a set of model-level mutation operators, and finally evaluates the quality of test data by analyzing the degree to which the injected faults are detected. However, mutation testing only detects faults in a local area and cannot provide an instructive optimization method for the overall structure of the model.
Most recently, Laura et al. conducted exploratory experiments on structural indexes of the underlying graph of a simulated human-brain network: they simulated the network structure of the human brain using network topology and modular-organization computation methods, drew a brain structure diagram, and counted indexes including characteristic path length and clustering coefficients to guide research. However, these graph indexes lack multi-level, multi-view study of complex deep learning networks, and also lack an interpretable theoretical basis for changes in network robustness.
Aiming at the above problems, the invention provides a method that injects different variation operators into a deep learning system to construct different variation test models, evaluates the quality of the models by analyzing the detection degree of the variations so as to rank the models, and finally, after the models are mapped into graph structures, finds graph network indexes with guiding significance according to the ranking results, thereby generating a new robust graph structure that is reconstructed back into the model and improving the robustness of the model.
Disclosure of Invention
In order to further explore the relationship between a multi-level and multi-view complex network structure and the robustness of a model and provide a fine-grained explanation with instructive significance for a network graph structure representation mode, the invention provides a depth model reinforcement method based on graph structure mutation test.
In order to achieve the purpose, the invention provides the following technical scheme:
a depth model reinforcement method based on graph structure variation testing comprises the following steps:
1) Acquiring training data and test data;
2) Constructing an original target model, and training the original target model based on the acquired training data;
3) Constructing a variation test model of the original target model, the variation test model comprising a variation test model generated using a source-level variation operator and/or a model-level variation operator; the trained mutation test model generated by using the model-level mutation operator is directly stored, and the untrained mutation test model generated by using the source-level mutation operator is trained and stored on the basis of the acquired training data and by using the same parameters as those of the original target model;
4) Carrying out variation detection on the variation test models based on the test data, and calculating the variation score of each variation test model:

MutationScore(T', m') = |killclass(T', m')| / K

where M' is the set of variation test models, m' ∈ M' indexes a variation test model, K is the number of label classes of the test data, and killclass(T', m') is the set of classes of test data in T' that kill the variation test model m'. The test data are divided into K classes according to their labels, with x_i denoting the test data of label class i. A test data point t in the test data T' kills the x_i class data in the variation test model m' if the following conditions are satisfied:

I. t is correctly classified as x_i by the original target model m;

II. t is not correctly classified as x_i by the variation test model m';
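The kill condition and variation score above can be sketched as follows; this is a minimal illustration under assumed interfaces (plain lists of integer labels and predictions), not the patent's implementation.

```python
# Hedged sketch of the kill condition (I and II) and the variation score.
def killed_classes(labels, orig_preds, mut_preds):
    """Classes x_i killed: some t is classified correctly by the original
    model (condition I) but not by the variation test model (condition II)."""
    killed = set()
    for y, p_orig, p_mut in zip(labels, orig_preds, mut_preds):
        if p_orig == y and p_mut != y:
            killed.add(y)
    return killed

def mutation_score(labels, orig_preds, mut_preds, num_classes):
    """Variation score: fraction of the K label classes killed by T'."""
    return len(killed_classes(labels, orig_preds, mut_preds)) / num_classes

labels     = [0, 0, 1, 1, 2, 2]   # true classes of six test points
orig_preds = [0, 0, 1, 1, 2, 2]   # original model: all correct
mut_preds  = [0, 1, 1, 1, 0, 2]   # mutant errs on one class-0 and one class-2 point
score = mutation_score(labels, orig_preds, mut_preds, num_classes=3)
```

Here the mutant kills classes 0 and 2 out of K = 3, giving a score of 2/3.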
5) Constructing graph structures of an original target model and each variation test model, wherein nodes in the graph structures correspond to neurons of each layer of neural network, edges in the graph structures correspond to connecting lines between two neuron nodes with propagation relation between each two layers, and the weights of the edges are components of a feature vector matrix of the corresponding nodes when the edges propagate from the previous layer to the next layer;
6) Calculating the graph structure indexes corresponding to the graph structures of the original target model and of each variation test model; readjusting the growth direction of the graph structure of the original target model according to the trend of each graph structure index with respect to the variation score; and finally reconstructing the adjusted graph structure back into the model to realize model reinforcement.
The technical conception of the invention is as follows: in the depth model reinforcement method based on the graph structure variation test, different variation operators are injected into a deep learning system to construct different variation test models; the detection degree of the variations is analyzed to evaluate the quality of the test models and to rank them; after the models are mapped into graph structures, graph network indexes with guiding significance are found according to the ranking results, a new robust graph structure is generated and reconstructed back into the model, and the robustness of the model is thereby improved.
Further, the source-level variation operators include:

the LR operator: randomly deletes one layer from the untrained original target model structure;

the LA operator: randomly adds one layer to the untrained original target model structure;

and/or the AFR operator: randomly deletes all activation functions of one layer in the untrained original target model structure.
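The three source-level operators act on the untrained architecture. A minimal sketch, under the assumption that the architecture is modeled as a list of layer specs (sizes plus an activation flag); the spec format and function names are illustrative only.

```python
# Illustrative source-level operators editing an architecture spec before training.
import random

def lr_operator(layers, rng):
    """Layer Removal: delete a random layer whose input/output sizes match."""
    candidates = [i for i, l in enumerate(layers) if l["in"] == l["out"]]
    mutated = list(layers)
    del mutated[rng.choice(candidates)]
    return mutated

def la_operator(layers, rng):
    """Layer Addition: insert a size-preserving new layer after a random layer."""
    i = rng.randrange(len(layers))
    new_layer = {"in": layers[i]["out"], "out": layers[i]["out"], "act": "relu"}
    return layers[: i + 1] + [new_layer] + layers[i + 1 :]

def afr_operator(layers, rng):
    """Activation Function Removal: drop all activations in one random layer."""
    mutated = [dict(l) for l in layers]
    mutated[rng.randrange(len(mutated))]["act"] = None
    return mutated

rng = random.Random(0)
arch = [{"in": 3072, "out": 512, "act": "relu"},
        {"in": 512, "out": 512, "act": "relu"},
        {"in": 512, "out": 10, "act": None}]
mutants = [lr_operator(arch, rng), la_operator(arch, rng), afr_operator(arch, rng)]
```

Each call returns a new spec and leaves the original architecture untouched, so repeating a call five times yields five independent variation test models, as in the embodiment.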
Further, the model-level variation operators include:

the GF operator: changes the values of weights in the trained original target model structure based on the Gaussian distribution of those weights;

the WS operator: randomly selects neurons in the trained original target model structure and shuffles the weights connecting them to the previous layer;

the NEB operator: randomly selects one layer of the trained original target model structure and sets the weights of all its neurons to 0;

the NAI operator: randomly selects one layer of the trained original target model structure and negates the weights of all its neurons;

and/or the NS operator: randomly selects the weights of 20% of the neurons of one layer of the trained original target model structure and randomly exchanges them.
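Three of these operators can be sketched directly on a trained weight matrix; numpy stands in here for a real deep-learning framework's tensors, and the function names are illustrative assumptions.

```python
# Hedged sketch of model-level operators on one layer's weight matrix.
import numpy as np

def gf_operator(w, sigma, rng):
    """Gaussian fuzzing: perturb each weight, clipped to [w - 3*sigma, w + 3*sigma]."""
    noise = rng.normal(0.0, sigma, size=w.shape)
    return np.clip(w + noise, w - 3 * sigma, w + 3 * sigma)

def neb_operator(w):
    """Neuron effect block: zero the connection weights so no neuron in this
    layer affects the next layer."""
    return np.zeros_like(w)

def ns_operator(w, rng, frac=0.2):
    """Neuron switch: randomly exchange the weight rows of 20% of the neurons."""
    n = w.shape[0]
    idx = rng.choice(n, size=max(2, int(n * frac)), replace=False)
    out = w.copy()
    out[idx] = w[rng.permutation(idx)]
    return out

rng = np.random.default_rng(0)
w = rng.normal(size=(10, 5))   # toy layer: 10 neurons, 5 outgoing weights each
w_gf = gf_operator(w, 0.1, rng)
w_neb = neb_operator(w)
w_ns = ns_operator(w, rng)
```

Note that NS only permutes existing values, so the multiset of weights is preserved; GF stays within the 3σ band around each original weight.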
Further, in the GF operator, the value of a weight w in the trained original target model structure is changed within the range [w - 3σ, w + 3σ], where σ is the standard deviation of the Gaussian distribution of the weights in the trained original target model structure.
Further, the acquired training data and test data are image data, the original target model is an image classification network model, and the trained model is obtained by the following training method:
Each training sample in the training data is taken as input and its predicted class as output; the constructed original target model is trained by minimizing the loss between the output and the true label of the sample, obtaining the trained original model.
Further, step 4) also includes:

calculating the average error rate of each variation test model m' ∈ M' on the test data T', used to measure the overall behavioral difference effect AveErrorRate(T', M') of the variation models:

AveErrorRate(T', M') = (1/|M'|) Σ_{m'∈M'} ErrorRate(T', m')

where ErrorRate(T', m') denotes the error rate of the test data T' on the variation test model m'; one or several variation test models m' whose contribution to the overall behavioral difference effect AveErrorRate(T', M') is high are excluded.
Further, in step 6), the graph structure index includes: characteristic path length, degree, and/or clustering coefficient.
Further, in step 6), the method further includes: and adopting one or more adversarial attack methods to carry out adversarial attack on the reconstructed model so as to evaluate the improvement of the robustness of the reinforced model.
The method can be applied to the construction and reinforcement of various neural network models, and is particularly suitable for K-class image classification network models. Specifically, a construction method of an image classification network model based on the graph structure variation test comprises the following steps:
1) Acquiring training data and test data, wherein the training data and the test data are both images;
2) Constructing an original target model and training it on the acquired training data: each training sample in the training data is taken as input and its predicted class as output; the constructed original target model is trained by minimizing the loss between the output and the true label of the sample, obtaining the trained original model.
3) Constructing a variation test model of the original target model, the variation test model comprising a variation test model generated using a source-level variation operator and/or a model-level variation operator;
4) Performing variation detection on the variation test model based on the test data, and calculating the variation score of each variation test model;
5) Constructing graph structures of an original target model and each variation test model;
6) Calculating the graph structure indexes corresponding to the graph structures of the original target model and of each variation test model; readjusting the growth direction of the graph structure of the original target model according to the trend of each graph structure index with respect to the variation score; and finally reconstructing the adjusted graph structure back into the model to realize model reinforcement.
The beneficial effects of the invention are mainly reflected in the following: 1) Compared with traditional model testing methods, the variation operators provided by the variation test modify and test the model at a finer granularity, so the model accuracy does not decrease obviously and experimental diversity is maintained. 2) The model-level variation operators in the variation test operate on only a few neurons of the neural network, largely preserving the integrity of the neural network. 3) The graph-structure-guided robustness optimization of the model structure provides a referable new direction for future research on model robustness.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of the variant test workflow of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples, while indicating the scope of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1 to 2, a depth model reinforcement method based on a graph structure variation test, in this embodiment, taking image classification model reinforcement as an example, includes the following steps:
1) Building a target model dataset
In the present embodiment, the CIFAR-10 dataset is used for constructing the brain-like graph of the image classification model and verifying its robustness. The CIFAR-10 dataset contains 60000 RGB color images of size 32 × 32, divided into 10 classes of 6000 samples each, with 50000 training samples and 10000 test samples. The method uses the 50000 CIFAR-10 training samples as the training set of the target model and the 10000 test samples as its test set.
2) Training original models
Each training sample is taken as input and its class as output; the constructed original model is trained by minimizing the loss between the output and the true label of the sample, obtaining the trained original model. In this embodiment, three 5-layer MLPs with 512 hidden units are used as the original model structure of the image classification model on the CIFAR-10 dataset. The input of each MLP is the 3072-dimensional flattened vector of a CIFAR-10 image (32 × 32 × 3), the output is the 10-dimensional image-class prediction probability, and each MLP layer has a ReLU activation function and a BatchNorm regularization layer. Training uses unified hyper-parameters: 200 training epochs, batch size 128, stochastic gradient descent (SGD), and a cosine learning-rate schedule with initial learning rate 0.1. The loss function adds a regularization term with coefficient λ to the cross-entropy function, expressed as the following formula, but not limited thereto:

L = -(1/n) Σ_{i=1}^{n} p(y_i) log q(y_i) + λ ||W||_2

where p(·) represents the true label of a sample, q(·) represents the prediction probability of the model, y_i represents an input sample, W represents the model parameters, n is the total number of samples, λ is the regularization coefficient, and ||·||_2 is the two-norm.
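The regularized loss described above can be sketched numerically; the one-hot labels, toy probabilities, and weight matrix below are illustrative assumptions.

```python
# Hedged numpy sketch of cross-entropy plus a lambda * ||W||_2 penalty.
import numpy as np

def regularized_loss(p, q, weights, lam):
    ce = -np.mean(np.sum(p * np.log(q + 1e-12), axis=1))  # mean cross-entropy
    l2 = lam * np.sqrt(np.sum(weights ** 2))              # two-norm penalty
    return ce + l2

p = np.array([[1.0, 0.0], [0.0, 1.0]])   # true one-hot labels p(.)
q = np.array([[0.9, 0.1], [0.2, 0.8]])   # model prediction probabilities q(.)
W = np.array([[0.3, -0.4], [0.5, 0.1]])  # toy model parameters
loss = regularized_loss(p, q, W, lam=0.01)
```

The penalty term only raises the loss, so the regularized loss is always at least the plain cross-entropy for λ > 0.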
3) Constructing variation test model
3.1) Defining the source-level variation operators

Layer Removal (LR): the LR operator randomly deletes one layer of the deep neural network, under the condition that the input and output structures of the deleted layer are the same. The operation is performed 5 times; each time a layer not previously selected is randomly chosen from the untrained original model structure and deleted, constructing 5 variation test models.

Layer Addition (LA): in contrast to the LR operator, the LA operator adds one layer to the deep neural network structure. The operation is likewise performed 5 times; each time a layer not previously selected is randomly chosen from the untrained original model structure and a new layer is added after it, constructing 5 variation test models.

Activation Function Removal (AFR): the activation function plays an important role in the nonlinearity of a deep neural network, which gives the network its high representational power. The AFR operator randomly deletes all activation functions of one layer, to simulate the situation where a developer forgets to add an activation layer. Again, 5 operations are performed; each time a layer not previously selected is randomly chosen from the untrained original model structure and all of its activation functions are deleted, constructing 5 variation test models.
3.2) Defining the model-level variation operators

Gaussian Fuzzing (GF): weights are basic elements of a deep neural network, describing the importance of the connections between neurons, and are an important component of the network's decision logic. A natural way to change a weight is to fuzz its value, changing the importance of the connection it represents. The GF operator follows a Gaussian distribution N(w, σ²) to change a given weight w, where σ is a user-configurable standard-deviation parameter. The GF operator mostly fuzzes a weight to a value near it (typically the fuzzed value lies in [w - 3σ, w + 3σ] with high probability). The operation is performed 5 times, each time with a different σ, applying the fuzzing to the trained original model to construct 5 variation test models.

Weight Shuffling (WS): the output of a neuron is usually determined by the neurons of the preceding layer, each connected through a weight. The WS operator randomly selects a neuron and shuffles the weights connecting it to the previous layer. The operation is performed 5 times; each time one layer of the trained original model is randomly selected and the incoming weights of 10% of its neurons are shuffled, constructing 5 variation test models.

Neuron Effect Block (NEB): when a test data point is fed into the deep neural network, it is processed and propagated through connections of different weights and layers of neurons until a final result is produced. Each neuron contributes to the final decision of the deep neural network according to its connection strength. The NEB operator blocks a neuron's effect on all connected neurons in the next layer by resetting its next-layer connection weights to zero. The operation is performed 5 times; each time one layer of the trained original model is randomly selected and the weights of all of its neurons are set to zero, constructing 5 variation test models.

Neuron Activation Inverse (NAI): the activation function plays a key role in the nonlinear behavior of a neural network. Many widely used activation functions (e.g., ReLU) exhibit different behavior depending on their activation state. The NAI operator inverts the activation state of a neuron, which can be achieved by changing the sign of the neuron's output value before its activation function is applied. This helps generate more variant neuron-activation patterns, each of which can expose new mathematical properties (e.g., linear properties) of the deep neural network. The operation is performed 5 times; each time one layer of the trained original model is randomly selected and the weights of all neurons in that layer are negated to invert their activation, constructing 5 variation test models.

Neuron Switch (NS): the neurons of one layer of a deep neural network typically have different effects and influence on the neurons of the next layer. The NS operator exchanges, within one layer, the effects and influence of several neurons on the next layer. The operation is performed 5 times; each time one layer of the trained original model is randomly selected and the weights of 20% of its neurons are randomly exchanged, constructing 5 variation test models.
3.3) Training and saving the models
And directly storing the trained mutation test model generated by using the model-level mutation operator, and training and storing the untrained mutation test model generated by using the source-level mutation operator by using the parameters in the step 2).
4) Variation detection
4.1 Define a detection index
For a K classification problem, Z = { x = 1 ,...,x K For all K classes of input data, x i One class of data (i =1,2 \8230; K) is shown. For a test data point T in the test set T ', T kills x in the variant test model m' if the following condition is satisfied i Class data (where M ' is equal to M ', M ' is a set of each variation test model, and this embodiment includes 8 variation test models, and each variation test model includes 5 variation test models):
I. t is correctly classified as x by the original model m i ;
II. t is incorrectly classified as x by the mutation test model m i 。
The variant score MutationScore (T ', m ') for the variant test model m ' is defined as follows, where killclass (T ', m ') is the set of classes in T ' where the test data kills the m ' variant test model:
in general, it is difficult to accurately predict the behavior difference caused by mutation operators. Therefore, as a preferred solution, in order to avoid introducing too much behavior difference between the DL variation model and the original model, a DL variation model quality control procedure is proposed. The error rate at T 'for each variant test pattern m' was measured. If the error rate of m ' is too high for T ', it is considered that m ' is not a good variant test sample because it introduces large behavioral differences. These high error rate variant test models were excluded from M' for further analysis. Defining the Average Error Rate (AER) of each mutation test model M ' E M ' on T ' to measure the overall behavior difference effect introduced by all mutation operators:
therein, sigma m′∈M′ ErrorRate (T ', m') represents the error rate of the test data T 'in the variation test model m' test;
4.2) Ranking the variation test models

Variation detection is performed on each saved variation test model using the definitions in 4.1). First, the few abnormal variation test models with an obviously higher Average Error Rate (AER) are excluded; the remaining variation test models are then ranked by variation score from high to low. In general, the higher the score, the lower the robustness of the model.
5) Constructing model graph structures
5.1 Define a neural network
Define graph G = (V, E), where y = { V = 1 ,...,v n Is the set of nodes and is the node set,is a set of edges and each node v has a feature vector W of one node v 。
5.2 Computation graph of definition model
Defining a set of graph nodes V = { V } using a forward propagation algorithm 1 ,...,v n Is the neuron of each layer, set of edgesFor two neuron nodes v with propagation relation between every two layers i ,v j The weight of the edge is set as the component of the eigenvector matrix of the corresponding node when the edge propagates from the previous layer to the next layer, taking the fully-connected network as an example, the weight is described as follows by a formula:
W v =[w i1 ,w i2 ,...,w ij ,...]
wherein for each component w ij I denotes a position where the subscript of the neuron in the previous layer network to which the weight is connected, and j denotes a position where the subscript of the neuron in the next layer to which the weight is connected. In general, each neuron of the previous layer of network in the fully connected network has a connecting edge with all neurons of the next layer, namely from 1 st to nth.
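The mapping from a fully connected network to this graph can be sketched as follows; nodes are (layer, neuron-index) pairs and plain dicts stand in for a graph library, which is an illustrative assumption.

```python
# Hedged sketch: build the edge set of step 5.2) from an MLP's weight matrices.
import numpy as np

def model_to_graph(weight_matrices):
    """weight_matrices[l] has shape (n_l, n_{l+1}); edge ((l, i), (l+1, j))
    carries the propagation weight w_ij as its attribute."""
    edges = {}
    for layer, w in enumerate(weight_matrices):
        n_prev, n_next = w.shape
        for i in range(n_prev):
            for j in range(n_next):
                edges[((layer, i), (layer + 1, j))] = float(w[i, j])
    return edges

rng = np.random.default_rng(0)
mlp = [rng.normal(size=(4, 3)), rng.normal(size=(3, 2))]  # toy 4-3-2 MLP
g = model_to_graph(mlp)
```

For the toy 4-3-2 network this yields 4·3 + 3·2 = 18 edges, one per neuron-to-neuron connection between adjacent layers.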
5.3 A computational graph structure was constructed for each of the models tested for variation in 4.2) using the computational graph approach defined in 5.. 2).
6) Reconstruction model under guidance of graph structure indexes
6.1) Defining the graph structure indexes. In this embodiment, the following three graph structure indexes are adopted:
(1) Characteristic path length: the characteristic path length is an index of network efficiency, defined as the average shortest path length of the network. The distance matrix used to compute the shortest paths must be a connection-length matrix, usually obtained by mapping the weights to lengths. The most common weighted path length is used here as the computation criterion:

L_cp = (1/(n(n-1))) Σ_{i≠j} d(v_i, v_j)

where the distances d(v_i, v_j) are shortest path lengths computed from the edge weights w_{ij} defined in step 5.2); l = 1, 2, ..., L denotes the l-th layer of the neural network, L being the total number of layers, and l is generally taken as the layer containing the neuron with subscript i.
(2) Degree: the degree of a node is the number of edges in the graph connected to that node. Denoting by L_i the degree of the i-th node, it is defined as:

L_i = |{v_j : e_ij ∈ E ∧ e_ji ∈ E}|

where v_i denotes the i-th node, e_ij denotes the connecting edge between nodes i and j, and ∧ denotes logical conjunction.
(3) Clustering coefficient: the clustering coefficient describes the degree of clustering among the vertices of a graph, specifically the degree of interconnection among the neighbors of a node. Denoting by CL_i the clustering coefficient of the i-th node, it is defined as:

CL_i = 2 |{e_jk : v_j, v_k ∈ N_i, e_jk ∈ E}| / (L_i (L_i - 1))

where each e denotes a connecting edge between two nodes, N_i is the set of neighbor nodes of node i, and L_i is the degree of node i.
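The three indexes can be illustrated on a small undirected graph; for simplicity this sketch uses unit edge lengths for shortest paths (the patent maps weights w_ij to lengths), and the adjacency-set representation is an assumption.

```python
# Toy computation of degree, clustering coefficient, and characteristic path length.
from collections import deque

def degree(adj, v):
    return len(adj[v])

def clustering_coefficient(adj, v):
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return 2.0 * links / (k * (k - 1))

def characteristic_path_length(adj):
    """Average shortest path length over all ordered node pairs (BFS, unit lengths)."""
    nodes = list(adj)
    total, pairs = 0, 0
    for s in nodes:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        for t in nodes:
            if t != s:
                total += dist[t]
                pairs += 1
    return total / pairs

# triangle 0-1-2 plus a pendant node 3 attached to node 2
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

On this graph node 2 has degree 3, its clustering coefficient is 1/3 (one link among its three neighbors), and the characteristic path length is 4/3.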
6.2) Reconstructing the model

Using the method in 6.1), compute each graph index for the graph structures of the original target model and of each variation test model. Observe how the indexes change as the variation test models are traversed in order of variation score, and find the indexes with an obvious trend to guide the growth direction of the graph structure. Taking the characteristic path length as an example: if the characteristic path length is observed to increase as the variation score increases, then a larger characteristic path length implies lower model robustness; therefore the characteristic path length of the original target model is adjusted to a smaller value and fixed, each weight is then computed backwards to obtain a new graph structure, and finally the graph structure is reconstructed back into a model, yielding m_new.
7) Finally, adversarial attacks are applied to both the original model and the reconstructed model, and the robustness of the reconstructed model is evaluated.
Three adversarial attack methods are employed: the FGSM attack, the CW attack, and the PGD attack. For each attack, 1000 samples are randomly selected from each data set to generate adversarial examples. The three attacks use different parameters: for the FGSM attack, ε = 2; for the CW attack, the L2-norm attack is used, with initial value c = 0.01, confidence k = 0, and iteration count epoch = 200; for the PGD attack, ε = 2, step size a = ε/10, and iteration count epoch = 20.
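As an illustration of the simplest of these attacks, FGSM perturbs an input one ε-step along the sign of the loss gradient. The sketch below applies it to a logistic-regression "model" so that the gradient can be written in closed form; real experiments would attack the deep model with a framework such as PyTorch, and the ε = 2 setting above presumes pixel values on a 0–255 scale.

```python
import numpy as np

def fgsm(x, y, w, b, epsilon):
    """x: input vector, y: label in {0, 1}, (w, b): logistic model.
    Returns x perturbed one epsilon-step along sign(grad_x loss)."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # sigmoid prediction
    grad_x = (p - y) * w                     # d(cross-entropy)/dx
    return x + epsilon * np.sign(grad_x)
```

CW and PGD differ only in how the perturbation is optimised: PGD repeats such signed steps with projection back into the ε-ball, while CW solves an L2-penalised optimisation over epoch iterations.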
Evaluation index of model robustness: when a model is subjected to adversarial attack, accuracy is commonly used as the robustness evaluation index.

Accuracy: for a given test data set, accuracy is the ratio of the number of samples correctly classified by the classifier to the total number of samples:

Accuracy = (TP + TN) / (TP + FP + FN + TN)

wherein TP denotes a positive class judged as positive, FP a negative class judged as positive, FN a positive class judged as negative, and TN a negative class judged as negative. The higher the accuracy under attack, the better the robustness and the more stable the model. Experiments show that, averaged over the three attacks, the accuracy of the reconstructed model on the CIFAR-10 data set is 30.7% higher than that of the original model.
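The accuracy formula above reduces to a one-line computation from the four confusion-matrix counts; the counts in the usage example are made up for illustration.

```python
def accuracy(tp, fp, fn, tn):
    """Accuracy = correctly classified samples / all samples."""
    return (tp + tn) / (tp + fp + fn + tn)
```

For example, 40 true positives, 10 false positives, 10 false negatives and 40 true negatives give an accuracy of 0.8.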
The above-mentioned embodiments are intended to illustrate the technical solutions and advantages of the present invention. It should be understood that they are only preferred embodiments of the present invention and are not intended to limit it; any modifications, additions, equivalents, and the like made within the scope of the principles of the present invention shall fall within the protection scope of the present invention.
Claims (8)
1. A depth model reinforcement method based on graph-structure variation testing, characterized by comprising the following steps:
1) Acquiring training data and test data;
2) Constructing an original target model, and training the original target model based on the acquired training data;
3) Constructing variation test models of the original target model, the variation test models comprising models generated using source-level variation operators and/or model-level variation operators; a trained variation test model generated using a model-level variation operator is stored directly, while an untrained variation test model generated using a source-level variation operator is first trained on the acquired training data, using the same parameters as the original target model, and then stored;
4) Carrying out variation detection on the variation test model based on the test data, and calculating the variation score of each variation test model:
where M' is the set of variation test models, m' is the index of a variation test model within the set, K is the number of label classes of the test data, and KilledClasses(T', m') denotes the set of test-data classes in the test data T' that kill the variation test model m'; a test data point t in the test data T' kills the class-x_i data of the variation test model m' if the following conditions are satisfied:
I. t is correctly classified as x_i by the original target model m;

II. t is not correctly classified as x_i by the variation test model m'; wherein the test data are divided into K classes according to the number of label types, and x_i denotes the set of test data whose label type is i;
5) Constructing graph structures of the original target model and of each variation test model, wherein nodes in the graph structure correspond to the neurons of each layer of the neural network, edges correspond to connections between two neuron nodes having a propagation relation between adjacent layers, and the weight of each edge is the component of the feature vector matrix of the corresponding node when propagating from the front layer to the rear layer;
6) Calculating the graph-structure indices corresponding to the graph structures of the original target model and of each variation test model, readjusting the growth direction of the graph structure of the original target model according to the trend of each index of the variation test models with respect to the variation score, and finally reconstructing the adjusted graph structure back into a model to achieve model reinforcement.
2. The method of claim 1, wherein the source-level variation operator comprises:
LR operator: randomly deleting one layer from the untrained original target model structure;
LA operator: randomly adding one layer to the untrained original target model structure;
and/or the AFR operator: randomly deleting all activation functions of one layer from the untrained original target model structure.
3. The method of claim 1, wherein the model-level variation operator comprises:
GF operator: perturbing the values of the weights in the trained original target model structure based on the Gaussian distribution of those weights;
WS operator: randomly selecting a neuron in the trained original target model structure and shuffling the weights connecting it to the previous layer;
NEB operator: setting to 0 the weights of all neurons in a layer randomly selected from the trained original target model structure;
NAI operator: setting to their negative values the weights of all neurons in a layer randomly selected from the trained original target model structure;
and/or the NS operator: randomly selecting the weights of 20% of the neurons in one layer of the trained original target model structure and randomly exchanging them.
4. The method according to claim 3, wherein, in the GF operator, the values of the weights in the trained original target model structure are changed within the range [w − 3σ, w + 3σ], where σ is the standard deviation of the Gaussian distribution of the weights in the trained original target model structure.
5. The method of claim 1, wherein the acquired training data and test data are image data, and the original target model is an image classification network model, and is obtained by training as follows:
taking each training sample in the training data as input and the predicted class of the sample as output, and training the constructed original target model by minimizing the loss between the output and the true label of the sample, to obtain the trained original model.
6. The method according to claim 1, wherein the step 4) further comprises:
calculating the average error rate AveErrorRate(T', M') of the variation test models m' ∈ M' on the test data T', which measures the overall behavior difference effect of the variation models:

wherein Σ_{m'∈M'} ErrorRate(T', m') denotes the sum over the variation test models m' of the error rate of the test data T' on m';

and excluding the one or several variation test models m' whose overall behavior difference effect AveErrorRate(T', m') is excessively high.
7. The method according to claim 1, wherein in step 6), the graph structure index comprises: characteristic path length, degree, and/or clustering coefficient.
8. The method according to claim 7, wherein step 6) further comprises: adopting one or more adversarial attack methods to attack the reconstructed model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210953766.6A CN115392434A (en) | 2022-08-10 | 2022-08-10 | Depth model reinforcement method based on graph structure variation test |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115392434A true CN115392434A (en) | 2022-11-25 |
Family
ID=84117864
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210953766.6A Withdrawn CN115392434A (en) | 2022-08-10 | 2022-08-10 | Depth model reinforcement method based on graph structure variation test |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115392434A (en) |
Cited By (1)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---|
CN116361190A (en) * | 2023-04-17 | 2023-06-30 | 南京航空航天大学 | Deep learning variation test method based on neuron correlation guidance
CN116361190B (en) * | 2023-04-17 | 2023-12-05 | 南京航空航天大学 | Deep learning variation test method based on neuron correlation guidance
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20221125 |