CN114048837A - Deep neural network model reinforcement method based on distributed brain-like map - Google Patents

Deep neural network model reinforcement method based on distributed brain-like map

Info

Publication number
CN114048837A
CN114048837A (application number CN202111234229.8A)
Authority
CN
China
Prior art keywords
graph
model
brain
layer
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111234229.8A
Other languages
Chinese (zh)
Inventor
陈晋音
陈宇冲
贾澄钰
郑海斌
金海波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202111234229.8A priority Critical patent/CN114048837A/en
Publication of CN114048837A publication Critical patent/CN114048837A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep neural network model reinforcement method based on a distributed brain-like graph. A critical path is applied to the distributed brain-like graph to generate a trunk brain-like graph, so that the propagation process and node behaviors along the path can be constrained more effectively, for example by constructing a new gradient loss function to weaken the propagation of noise. Graph-network indexes such as the characteristic path length and the participation coefficient are then used to grow a new robust brain-like graph structure on the critical-path-guided trunk brain-like graph and to reconstruct the model, which improves the robustness of the model and thereby reinforces it. Because the method uses a brain-like graph, it shows a closer connection with biological neural networks; it operates on only a few key neurons of the neural network and largely preserves the integrity of the network.

Description

Deep neural network model reinforcement method based on distributed brain-like map
Technical Field
The invention relates to the field of distributed machine learning and artificial intelligence safety, in particular to a deep neural network model reinforcement method based on a distributed brain-like map.
Background
With the great improvement of software performance and hardware computing power in modern society, artificial intelligence has been widely applied to fields such as computer vision, natural language processing and complex network analysis, and has achieved good results. However, Christian et al. showed that adding a misleading perturbation imperceptible to humans to an original image to generate a new sample can cause a model to give an erroneous output with high confidence. Such newly generated samples are called adversarial samples, and they pose a potential security threat to deep learning systems such as face recognition, automatic verification and autonomous driving systems.
Over the past few years, a number of defense methods have been proposed to increase the robustness of models to adversarial samples and avoid potential hazards in real-world applications. These methods can be broadly divided into adversarial training, input transformation, model architecture transformation, and adversarial sample detection. However, most of the above methods target the pixel space of the input image, and few analyze the influence of adversarial perturbations by studying the inter-layer structure of the model. This is because, although it is generally believed that the performance of a neural network depends on its architecture, the relationship between the accuracy of a neural network and its underlying graph structure is not systematically understood. In a recent study, You et al. proposed a new way of representing neural networks as graphs, called relational graphs, which focus primarily on the exchange of information rather than only on the directed data flow. However, this method can only construct an originally robust model; once the model is attacked by adversarial samples it is difficult to interpret and defend from a fine-grained perspective, and the relational graph still does not depart from the traditional graph network and lacks a deep connection with biological neural networks. Separately, Laura et al. recently simulated the network structure of the human brain using network topology and modular organization, drew a brain network graph and computed indexes including the characteristic path length and the participation coefficient to guide research, but this form of guidance lacks an interpretable theoretical basis for the robustness of neural networks.
Meanwhile, Li et al. recently proposed a method for identifying critical attack neurons based on a gradient-influence propagation strategy, constructing a critical attack path of a neural network on the computational graph, and improving the robustness of the model by constraining the propagation process and node behaviors on the path to weaken the propagation of noise. By analogy, in social networks false information can pose a huge social threat through rapid propagation between nodes: nodes with higher information capacity are more critical than other nodes, are more prone to passing false information, and are included in the critical path. To effectively suppress the propagation of false information, an immunization strategy is commonly adopted, i.e., a critical path is found and blocked in the graph so that the propagation of false information is reduced and the security of the social network is improved. However, existing graph-network representations lack universality and are disconnected from biology and neuroscience.
To address these problems, the invention provides a method that applies the critical path to a distributed brain-like graph to generate a trunk brain-like graph, and grows a new robust brain-like graph structure on the trunk brain-like graph under the guidance of meaningful graph-network indexes to reconstruct the model, thereby improving the robustness of the model.
Disclosure of Invention
In order to further deepen the connection between deep neural networks and biological neural networks and to provide a fine-grained explanation for the brain-like graph representation of neural networks, the invention provides a deep neural network model reinforcement method based on a distributed brain-like map.
In order to achieve the purpose, the invention provides the following technical scheme: a deep neural network model reinforcement method based on a distributed brain-like map comprises the following steps:
(1) selecting sample data from the target model dataset;
(2) constructing a target model for the sample data selected in the step (1); training the target model, and finally storing the trained target model;
(3) defining a neural network, then defining a calculation graph of a single neural network, inputting the target model obtained by training in the step (2), and constructing an original distributed brain-like graph;
(4) calculating the influence between neurons of every two adjacent layers in the neural network, selecting key neurons, mapping the key neurons onto the distributed brain-like graph constructed in step (3), and obtaining a critical-path-guided distributed brain-like graph structure;
(5) defining graph-network indexes, respectively calculating the indexes of the original distributed brain-like graph obtained in step (3) and of the critical-path-guided distributed brain-like graph structure obtained in step (4), generating a new brain-like graph structure, and reconstructing a new target model.
Further, step (1) is specifically: the target model data set comprises n sample data divided into a classes, and d% of the sample data is extracted from each class as the training set D_train of the target model; n, a and d are natural numbers.
Further, the step (2) is specifically:
(2.1) constructing a target model for the sample data selected in step (1); the target model adopts a distributed structure, with three identical sub-models m_1, m_2, m_3 set respectively for the three RGB channels of the image, and finally an output model m_out for normalizing the feature matrix;
(2.2) setting uniform hyper-parameters for all the models set in step (2.1) and training them on the training set D_train set in step (1), specifically: the number of training epochs, the batch size, the optimizer, the learning rate and the loss function are set; the optimizer adopts stochastic gradient descent, the learning rate follows a cosine schedule with initial value 0.1, and the loss function Loss_c adds a regularization term with coefficient λ on the basis of the cross-entropy function:
Loss_c = −∑_i p(x_i) log q(x_i) + λ‖θ‖²

where p(·) denotes the true label of a sample, q(·) denotes the prediction probability of the model, x_i denotes an input sample, θ denotes the model parameters, and λ is the regularization coefficient;
and (2.3) repeating the training until the accuracy of the target model converges, and then storing the trained target model.
Further, the step (3) is specifically:
(3.1) defining a neural network: define the graph G = (V, E), where V = {v_1, ..., v_n} is the set of nodes, E ⊆ V × V is the set of edges, and each node v has a feature vector W_v;
(3.2) defining the computational graph of a single model: using the forward propagation algorithm, the graph node set V = {v_1, ..., v_n} is defined as the set of all neurons, and an edge e ∈ E is the connection between two neuron nodes in adjacent layers that have a propagation relation; the weight of an edge is set to the component of the feature-vector matrix of the corresponding node when propagating from the previous layer to the next layer, described by the formula:

W_v = [w_i1, w_i2, …, w_ij]

where, for each component w_ij, i denotes the index (position) of the neuron in the previous layer connected by the weight, and j denotes the index (position) of the neuron in the next layer connected by the weight;
(3.3) constructing a distributed brain-like graph: inputting the target model trained in step (2), first calculating the feature vector of each neuron node of each model, then drawing the computational graph of each model according to the definition in step (3.2), and finally connecting the computational graphs of all the drawn sub-models, with the set weights as connecting edges, to the computational graph of the output model, generating the original distributed brain-like graph G_ori.
Further, the step (4) is specifically as follows:
(4.1) calculating the influence between two layers of neurons: for the j-th neuron F_l^j of layer l with output f_l^j, the influence value of the i-th neuron F_{l-1}^i of layer l-1 on a single element z of f_l^j is

φ(z, F_{l-1}^i) = ∑_{a ∈ A(F_{l-1}^i)} |∂z/∂a|

where the subscript l denotes the l-th layer, l = 1, 2, …, L, L is the total number of layers of the neural network, A(F_{l-1}^i) is the set of elements of the i-th neuron at layer l-1, and the A(·) function extracts the elements at the specified locations;

φ(F_l^j, F_{l-1}^i) denotes the influence value between the two neurons i and j, i.e. the influence of the i-th neuron F_{l-1}^i of layer l-1 on the output f_l^j of the j-th neuron F_l^j of layer l, obtained as the sum over each element of f_l^j:

φ(F_l^j, F_{l-1}^i) = ∑_{z ∈ f_l^j} φ(z, F_{l-1}^i)
(4.2) selecting key neurons: given a sample x, the loss gradient contributed to the model decision by the i-th neuron of the last convolutional layer L is first computed as ∂Loss_c(x)/∂f_L^i, where f_L^i denotes the output of the i-th neuron in layer L;

the loss gradients of all neurons are then sorted from large to small and the first k neurons are selected as key neurons; R^L(x) denotes the key neurons selected in the last layer L:

R^L(x) = top_k({ |∂Loss_c(x)/∂f_L^i| : F_L^i ∈ F_L })

where the top_k(·) function selects the first k and F_L is the neuron set of the last convolutional layer L; then, based on the influence of layer l-1 on layer l, R^{l-1}(x) denotes the key neurons taken at each layer:

R^{l-1}(x) = top_k({ ∑_{F_l^j ∈ R^l(x)} φ(F_l^j, F_{l-1}^i) : F_{l-1}^i ∈ F_{l-1} })

finally, the key neurons of sample x in the different layers are denoted R(x):

R(x) = { R^1(x), R^2(x), …, R^L(x) };
(4.3) limiting the loss gradient: a loss term is obtained by limiting the gradients of the key neurons:

Loss_g = ∑_{F_l^i ∈ R(x)} |∂Loss_c(x)/∂f_l^i|

and this loss term is added to the cross-entropy loss to obtain the final loss function:

Loss = Loss_c + δ·Loss_g

where Loss_c is the cross-entropy loss function of step (2), and δ is a hyper-parameter used to balance the loss terms;
(4.4) mapping the critical path: mapping the key neurons obtained in step (4.2) onto the distributed brain-like graph drawn in step (3.3), and removing the nodes and connecting edges of non-key neurons from the brain-like graph to obtain the critical-path-guided distributed brain-like graph structure G_path.
Further, the step (5) is specifically as follows:
(5.1) defining graph-network indexes: the graph-network indexes comprise the characteristic path length and the participation coefficient; the characteristic path length is specifically the average shortest path length of the network and is used to measure efficiency; the participation coefficient measures the distribution of a node's connections among the network communities;
(5.2) growing a new brain-like structure and reconstructing the model:
respectively calculating the graph-network indexes defined in step (5.1) for the original distributed brain-like graph G_ori obtained in step (3) and for the critical-path-guided distributed brain-like graph structure G_path obtained in step (4), and observing the change trend of the indexes, wherein the direction in which the brain-like graph grows is guided by the characteristic path length and the distribution weight ratio of each sub-model is guided by the participation coefficient; the generated new brain-like graph structure is reconstructed back into the target model, obtaining the sub-models m_1′, m_2′, m_3′ and the output model m_out′.
The technical concept of the invention is as follows: in the deep neural network model reinforcement method based on a distributed brain-like graph, a trunk brain-like graph is generated by applying the critical path to the distributed brain-like graph, so that the propagation process and node behaviors on the path are constrained more effectively, for example by constructing a new gradient loss function to weaken the propagation of noise; graph-network indexes such as the characteristic path length and the participation coefficient are then used to grow a new robust brain-like graph structure on the critical-path-guided trunk brain-like graph and reconstruct the model, improving its robustness. Finally, adversarial samples are generated with three state-of-the-art adversarial attack methods, and the attacks on the model are used to verify the improvement in robustness.
The beneficial effects of the invention are mainly reflected in: 1) using brain-like graphs to select key neurons reveals a closer connection to biological neural networks than the computational graphs used in traditional critical-path methods and the newly proposed relational graphs; 2) the critical-path method provides a fine-grained explanation for the brain-like graph representation of a neural network, and improving model robustness via the critical path operates only on a few key neurons, largely preserving the integrity of the neural network; 3) the distributed structure breaks through the network construction method in which all features of the original image are directly flattened into a matrix with the same weight ratio, and realizes a weight-ratio setting with a reference basis under the guidance of the critical path and graph-network indexes.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of a model building brainlike map in accordance with the present invention;
FIG. 3 is a schematic diagram of the invention showing the selection of key neurons in the relationship graph.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1 to 3, the invention provides a deep neural network model reinforcement method based on a distributed brain-like map, which comprises the following steps:
(1) constructing a target model data set, specifically:
the target model data set comprises n sample data divided into a classes; d% of the sample data is extracted from each class as the training set D_train of the target model, and the remaining sample data of each class is taken as the test set D_test of the target model; n, a and d are natural numbers;
in the embodiment of the invention, the CIFAR-10 data set is used for constructing the brain-like graph and verifying the robustness. The CIFAR-10 data set comprises 60000 RGB color images, each image is 32 x 32 in size and is divided into 10 types, and each type comprises 6000 samples in total. Wherein 50000 training samples and 10000 testing samples are obtained. The invention takes all 50000 training samples from the 50000 training samples of the CIFAR-10 data set as the training set D of the target modeltrainTaking all 10000 test sets D as target models from the test samplestest
(2) The method comprises the following steps of constructing and training a target model:
(2.1) constructing a target model for the sample data selected in step (1); the target model adopts a distributed structure, with three identical sub-models m_1, m_2, m_3 set respectively for the three RGB channels of the image, and finally an output model m_out for normalizing the output feature matrix.
In the embodiment of the invention, three 5-layer MLPs with 512 hidden units are used as the target sub-model structure for the CIFAR-10 data set, and one 2-layer MLP with 32 hidden units is used as the target output model structure; the input of the MLP is the 3072-dimensional flattened vector of a CIFAR-10 image (32 × 32 × 3), the output is a 10-dimensional prediction, and each MLP layer has a ReLU activation function and a BatchNorm regularization layer.
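A minimal sketch of such a distributed target model (not part of the original text): three identical 5-layer MLP sub-models m1, m2, m3, one per RGB channel, and a 2-layer output model m_out with 32 hidden units. Feeding each sub-model the flattened 32×32 single-channel plane, the 512-dimensional sub-model output and the concatenation into m_out are assumptions of this sketch.

import torch
import torch.nn as nn

class SubMLP(nn.Module):
    """5-layer MLP with 512 hidden units; each layer has ReLU and BatchNorm (m1, m2, m3)."""
    def __init__(self, in_dim=32 * 32, hidden=512, out_dim=512, depth=5):
        super().__init__()
        layers, d = [], in_dim
        for _ in range(depth - 1):
            layers += [nn.Linear(d, hidden), nn.BatchNorm1d(hidden), nn.ReLU()]
            d = hidden
        layers.append(nn.Linear(d, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):            # x: (batch, 32*32), one flattened colour channel
        return self.net(x)

class DistributedModel(nn.Module):
    """One sub-model per RGB channel plus a small output model m_out."""
    def __init__(self):
        super().__init__()
        self.subs = nn.ModuleList([SubMLP() for _ in range(3)])      # m1, m2, m3
        self.m_out = nn.Sequential(                                  # 2-layer output MLP
            nn.Linear(3 * 512, 32), nn.ReLU(), nn.Linear(32, 10))

    def forward(self, x):            # x: (batch, 3, 32, 32)
        feats = [m(x[:, c].flatten(1)) for c, m in enumerate(self.subs)]
        return self.m_out(torch.cat(feats, dim=1))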
(2.2) setting uniform hyper-parameters for all the target models set in step (2.1) and training them on the training set D_train set in step (1): the number of training epochs, the batch size, the optimizer, the learning rate and the loss function are set; the optimizer adopts stochastic gradient descent, the learning rate follows a cosine schedule with initial value 0.1, and the loss function Loss_c adds a regularization term with coefficient λ on the basis of the cross-entropy function:

Loss_c = −∑_i p(x_i) log q(x_i) + λ‖θ‖²

where p(·) denotes the true label of a sample, q(·) denotes the prediction probability of the model, x_i denotes an input sample, θ denotes the model parameters, and λ is the regularization coefficient.
In the embodiment of the invention, uniform hyper-parameters are set: the number of training epochs is 200, the batch size is 128, stochastic gradient descent (SGD) is adopted, the learning rate follows a cosine schedule with initial value 0.1, and a regularization term with coefficient λ is added to the cross-entropy function, giving the loss function Loss_c above.
(2.3) repeating the training until the accuracy of the target model has essentially converged and no longer improves, and then storing the trained target model.
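A minimal training-loop sketch for this step, assuming the DistributedModel and train_loader sketched above; the momentum value and λ = 1e-4 are assumptions, not values given in the original text. It uses SGD, a cosine learning-rate schedule starting at 0.1, and cross-entropy plus an L2 term as Loss_c.

import torch
import torch.nn as nn

def train_target_model(model, train_loader, epochs=200, lr=0.1, lam=1e-4, device="cpu"):
    """Train with SGD and a cosine learning-rate schedule; Loss_c is the
    cross-entropy plus a lambda-weighted L2 term over the model parameters."""
    model.to(device)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            reg = sum((p ** 2).sum() for p in model.parameters())
            loss_c = ce(model(x), y) + lam * reg
            opt.zero_grad()
            loss_c.backward()
            opt.step()
        sched.step()
    return model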
(3) Defining a neural network, then defining a computational graph of a single neural network, inputting the target model obtained by training in the step (2), and constructing a distributed brain-like graph, wherein the method comprises the following substeps:
(3.1) defining a neural network: define the graph G = (V, E), where V = {v_1, ..., v_n} is the set of nodes, E ⊆ V × V is the set of edges, and each node v has a feature vector W_v.
(3.2) defining the computational graph of a single model:

Using the forward propagation algorithm, the graph node set V = {v_1, ..., v_n} is defined as the set of all neurons, and an edge e ∈ E is the connection between two neuron nodes in adjacent layers that have a propagation relation; the weight of an edge is set to the component of the feature-vector matrix of the corresponding node when propagating from the previous layer to the next layer. Taking a fully connected network as an example, this is described by the formula:

W_v = [w_i1, w_i2, …, w_ij]

where, for each component w_ij, i denotes the index (position) of the neuron in the previous layer connected by the weight, and j denotes the index (position) of the neuron in the next layer connected by the weight. In general, each neuron of the previous layer in a fully connected network has a connecting edge to every neuron of the next layer, i.e. from the 1st to the j-th.
(3.3) constructing a distributed brain-like graph:

The target model trained in step (2) is loaded; the feature vector of each neuron node of each model is first calculated, the computational graph of each model is drawn according to the definition in step (3.2), and finally the computational graphs of all the drawn sub-models are connected, with the set weights as connecting edges, to the computational graph of the output model, generating the original distributed brain-like graph G_ori.
(4) The method for constraining the critical path specifically comprises the following substeps:
(4.1) calculating the influence between two layers of neurons:

For any model, in order to construct a critical attack path, critical attack neurons need to be extracted from each layer and connected. The influence of the neurons of the previous layer on the neurons of the next layer is first calculated by means of back propagation.

Specifically, the influence of one neuron on another is the sum of the absolute values of the gradients of the elements at each of its positions with respect to the other neuron. For the j-th neuron F_l^j of layer l with output f_l^j, the influence value of the i-th neuron F_{l-1}^i of layer l-1 on a single element z of f_l^j is

φ(z, F_{l-1}^i) = ∑_{a ∈ A(F_{l-1}^i)} |∂z/∂a|

where the subscript l denotes the l-th layer, l = 1, 2, …, L, L is the total number of layers of the neural network, A(F_{l-1}^i) is the set of elements of the i-th neuron at layer l-1, and the A(·) function extracts the elements at the specified locations.

φ(F_l^j, F_{l-1}^i) denotes the influence value between the two neurons i and j, i.e. the influence of the i-th neuron F_{l-1}^i of layer l-1 on the output f_l^j of the j-th neuron F_l^j of layer l, obtained as the sum over each element of f_l^j:

φ(F_l^j, F_{l-1}^i) = ∑_{z ∈ f_l^j} φ(z, F_{l-1}^i)
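A minimal autograd sketch of this influence computation (not part of the original text): for two adjacent activation tensors it returns a matrix of φ(F_l^j, F_{l-1}^i) values, summing absolute gradients over the elements (here, the batch dimension) as described above.

import torch

def influence_matrix(act_prev, act_next):
    """phi[j, i]: influence of neuron i of layer l-1 on neuron j of layer l,
    i.e. the sum of |dz/da| over the elements z of neuron j's output and the
    elements a of neuron i. act_next must have been computed from act_prev."""
    n_prev, n_next = act_prev.shape[1], act_next.shape[1]
    phi = torch.zeros(n_next, n_prev)
    for j in range(n_next):
        grad, = torch.autograd.grad(act_next[:, j].sum(), act_prev,
                                    retain_graph=True)
        phi[j] = grad.abs().sum(dim=0)        # sum of absolute gradients per neuron i
    return phi

# Example with a single fully connected layer (shapes only, for illustration):
a = torch.randn(8, 4, requires_grad=True)     # activations of layer l-1
z = torch.relu(torch.nn.Linear(4, 3)(a))      # outputs of layer l
print(influence_matrix(a, z).shape)           # torch.Size([3, 4])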
(4.2) selecting key neurons:

Given a sample x, the loss gradient contributed to the model decision by the i-th neuron of the last convolutional layer L is first derived:

∂Loss_c(x) / ∂f_L^i

where f_L^i denotes the output of the i-th neuron in layer L.

The loss gradients of all neurons are then sorted from large to small and the first k neurons are selected as key neurons; R^L(x) denotes the key neurons selected in the last layer L:

R^L(x) = top_k({ |∂Loss_c(x)/∂f_L^i| : F_L^i ∈ F_L })

where the top_k(·) function selects the first k and F_L is the neuron set of the last convolutional layer L. The key neurons of the earlier layers are then obtained recursively with the formula of step (4.1), based on the influence of layer l-1 on layer l; R^{l-1}(x) denotes the key neurons taken at each layer:

R^{l-1}(x) = top_k({ ∑_{F_l^j ∈ R^l(x)} φ(F_l^j, F_{l-1}^i) : F_{l-1}^i ∈ F_{l-1} })

Finally, the key neurons of sample x in the different layers are denoted R(x):

R(x) = { R^1(x), R^2(x), …, R^L(x) }
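A minimal sketch of this key-neuron selection R(x), reusing the influence_matrix helper sketched above; the choice of k and the treatment of the batch dimension are assumptions. The last layer is ranked by the absolute loss gradient of each neuron's output, and earlier layers are ranked by their summed influence on the key neurons of the layer above.

import torch

def select_key_neurons(acts, loss_c, k=10):
    """acts: list of per-layer activation tensors (batch x width), all on the
    computation graph of loss_c. Returns [R^1(x), ..., R^L(x)] as index tensors."""
    grad_last, = torch.autograd.grad(loss_c, acts[-1], retain_graph=True)
    scores = grad_last.abs().sum(dim=0)                       # |dLoss_c/df_L^i| per neuron
    key = [scores.topk(min(k, scores.numel())).indices]       # R^L(x)
    for l in range(len(acts) - 1, 0, -1):
        phi = influence_matrix(acts[l - 1], acts[l])          # (n_l, n_{l-1})
        back = phi[key[0]].sum(dim=0)                         # influence on R^l(x)
        key.insert(0, back.topk(min(k, back.numel())).indices)
    return key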
(4.3) limiting the loss gradient:

In the face of an adversarial attack, an intuitive way to constrain the critical path is to limit the loss gradient so as to reduce the impact caused by these neurons. A loss term can be derived directly by limiting the gradients of the key neurons:

Loss_g = ∑_{F_l^i ∈ R(x)} |∂Loss_c(x)/∂f_l^i|

This loss term is added to the cross-entropy loss to obtain the final loss function:

Loss = Loss_c + δ·Loss_g

where Loss_c represents the cross-entropy loss function of step (2), and δ is a hyper-parameter used to balance these loss terms.
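A minimal sketch of this reinforced loss Loss = Loss_c + δ·Loss_g (not part of the original text): it assumes a model that exposes its layers as an iterable attribute `layers`, reuses select_key_neurons from the sketch above, and treats δ, λ and k as placeholder values.

import torch
import torch.nn as nn

def reinforced_loss(model, x, y, delta=0.1, lam=1e-4, k=10):
    """Loss_c plus a delta-weighted penalty Loss_g on the loss gradients that flow
    through the key neurons R(x), so that noise on the critical path is suppressed."""
    acts, h = [], x
    for layer in model.layers:                   # assumed attribute: ordered layers
        h = layer(h)
        acts.append(h)
    loss_c = nn.functional.cross_entropy(h, y) \
        + lam * sum((p ** 2).sum() for p in model.parameters())
    key = select_key_neurons(acts, loss_c, k=k)  # R(x), from the sketch above
    loss_g = 0.0
    for a, idx in zip(acts, key):
        g, = torch.autograd.grad(loss_c, a, retain_graph=True, create_graph=True)
        loss_g = loss_g + g[:, idx].abs().sum()  # limit gradients of key neurons
    return loss_c + delta * loss_g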
(4.4) mapping the critical path:

The key neurons obtained in step (4.2) are mapped onto the distributed brain-like graph drawn in step (3.3), and the nodes and connecting edges of non-key neurons are removed from the brain-like graph, yielding the critical-path-guided distributed brain-like graph structure G_path.
(5) Reconstructing the model under the guidance of the graph network indexes, specifically comprising the following substeps:
(5.1) defining the graph-network indexes:

Characteristic path length: the characteristic path length is an index for measuring efficiency, defined as the average shortest path length of the network. The distance matrix used to calculate the shortest paths must be a connection-length matrix, usually obtained by mapping weights to lengths. The most common weighted path length is used here as the calculation criterion, with the formula:

WPL = ∑ w_ij · l

where w_ij is the edge weight defined in step (3.2) and l is the layer subscript of step (4.1), l = 1, 2, …, L, with L the total number of layers of the neural network; l is generally taken as the layer in which the neuron with subscript i of w_ij is located.
Participation coefficient: the participation coefficient measures the distribution of a node's connections among the network communities. When the coefficient is 0, the connections of the node are completely confined to its own block; the closer the participation coefficient is to 1, the more evenly the node's connections are distributed among the blocks. Mathematically, the participation coefficient P_i of node i is:

P_i = 1 − ∑_{s=1}^{C} (S_is / S_i)²

where S_is is the sum of the connection weights of node i to the nodes in block s, S_i is the total connection weight of node i, and C is the total number of blocks. Here, each sub-model is set as a block;
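A minimal networkx sketch of these two indexes for a graph built as in step (3.3); the absolute value on the weights and the node-name parsing follow the naming scheme assumed in the graph-construction sketch above.

import networkx as nx

def weighted_path_length(G, layer_of):
    """WPL = sum over edges of |w_ij| * l, with l the layer index of the source neuron i."""
    return sum(abs(d["weight"]) * layer_of(u) for u, _, d in G.edges(data=True))

def participation_coefficient(G, block_of):
    """P_i = 1 - sum_s (S_is / S_i)^2, where S_is is node i's connection weight into
    block s and S_i its total connection weight; block_of maps a node to its block."""
    U = G.to_undirected()
    P = {}
    for i in U.nodes():
        per_block = {}
        for n in U.neighbors(i):
            w = abs(U[i][n]["weight"])
            per_block[block_of(n)] = per_block.get(block_of(n), 0.0) + w
        S_i = sum(per_block.values())
        P[i] = 0.0 if S_i == 0 else 1.0 - sum((s / S_i) ** 2 for s in per_block.values())
    return P

# With the node names "m1_L0N5", "mout_L1N2", ... used in the earlier sketch:
layer_of = lambda node: int(node.split("L")[1].split("N")[0]) + 1
block_of = lambda node: node.split("_")[0]        # each sub-model is one block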
(5.2) growing a new brain-like structure and reconstructing the model:

The graph-network indexes of the original distributed brain-like graph G_ori obtained in step (3) and of the critical-path-guided distributed brain-like graph structure G_path obtained in step (4) are calculated respectively, and the change trend of the two indexes of step (5.1) is observed; the direction in which the brain-like graph grows is guided by the characteristic path length, and the distribution weight ratio of each sub-model is guided by the participation coefficient. Finally, the grown new brain-like graph structure is generated and converted back into a model, i.e. reconstructed back into the target model, obtaining the robust target model with sub-models m_1′, m_2′, m_3′ and output model m_out′.
(6) Carrying out adversarial attacks on the target model to generate adversarial samples, and verifying the improvement in model robustness by attacking the model:
the embodiment of the invention adopts a plurality of adversarial attack methods, including FGSM attack, CW attack and PGD attack. For each attack, 1000 randomly generated challenge samples were chosen from each data set for attack. The three attacks set different parameters, wherein for the FGSM attack, the parameter epsilon is set to 2; for CW attacks, L is used2In the norm attack, an initial value c is set to be 0.01, a confidence coefficient k is set to be 0, and the iteration number epoch is set to be 200; for PGD attack, the parameter ∈ 2, the step size α ∈/10, and the number of iterations epoch ═ 20.
Evaluation index of model robustness: when the model is subjected to an adversarial attack, the accuracy is commonly used as the evaluation index of robustness.

Accuracy: for a given test data set, the accuracy is the ratio of the number of samples correctly classified by the classifier to the total number of samples:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

where TP denotes positive samples judged as positive, FP denotes negative samples judged as positive, FN denotes positive samples judged as negative, and TN denotes negative samples judged as negative; the higher the accuracy under attack, the better the robustness. In the experiments, the accuracy of the robust target model on the CIFAR-10 data set under the three attacks is improved by 42.3% on average compared with the original target model.
The above-described embodiments are intended to illustrate the technical solutions and advantages of the present invention. It should be understood that they are only preferred embodiments of the present invention and are not intended to limit the invention; any modifications, additions or equivalents made within the scope of the principles of the present invention shall be included in the protection scope of the present invention.

Claims (6)

1. A deep neural network model reinforcement method based on a distributed brain-like map is characterized by comprising the following steps:
(1) selecting sample data from the target model dataset;
(2) constructing a target model for the sample data selected in the step (1); training the target model, and finally storing the trained target model;
(3) defining a neural network, then defining a calculation graph of a single neural network, inputting the target model obtained by training in the step (2), and constructing an original distributed brain-like graph;
(4) calculating the influence between neurons of every two adjacent layers in the neural network, selecting key neurons, mapping the key neurons onto the distributed brain-like graph constructed in step (3), and obtaining a critical-path-guided distributed brain-like graph structure;
(5) defining graph-network indexes, respectively calculating the indexes of the original distributed brain-like graph obtained in step (3) and of the critical-path-guided distributed brain-like graph structure obtained in step (4), generating a new brain-like graph structure, and reconstructing a new target model.
2. The method for deep neural network model reinforcement based on the distributed brain-like map according to claim 1, wherein step (1) is specifically: the target model data set comprises n sample data divided into a classes, and d% of the sample data is extracted from each class as the training set D_train of the target model; n, a and d are natural numbers.
3. The method for deep neural network model reinforcement based on the distributed brain-like map according to claim 1, wherein the step (2) is specifically as follows:
(2.1) constructing a target model for the sample data selected in step (1); the target model adopts a distributed structure, with three identical sub-models m_1, m_2, m_3 set respectively for the three RGB channels of the image, and finally an output model m_out for normalizing the feature matrix;
(2.2) setting uniform hyper-parameters for all the models set in step (2.1) and training them on the training set D_train set in step (1), specifically: the number of training epochs, the batch size, the optimizer, the learning rate and the loss function are set; the optimizer adopts stochastic gradient descent, the learning rate follows a cosine schedule with initial value 0.1, and the loss function Loss_c adds a regularization term with coefficient λ on the basis of the cross-entropy function:

Loss_c = −∑_i p(x_i) log q(x_i) + λ‖θ‖²

where p(·) denotes the true label of a sample, q(·) denotes the prediction probability of the model, x_i denotes an input sample, θ denotes the model parameters, and λ is the regularization coefficient;
and (2.3) repeating the training until the accuracy of the target model converges, and then storing the trained target model.
4. The method for deep neural network model reinforcement based on the distributed brain-like map according to claim 1, wherein the step (3) is specifically as follows:
(3.1) defining a neural network: define the graph G = (V, E), where V = {v_1, ..., v_n} is the set of nodes, E ⊆ V × V is the set of edges, and each node v has a feature vector W_v;
(3.2) defining the computational graph of a single model: using the forward propagation algorithm, the graph node set V = {v_1, ..., v_n} is defined as the set of all neurons, and an edge e ∈ E is the connection between two neuron nodes in adjacent layers that have a propagation relation; the weight of an edge is set to the component of the feature-vector matrix of the corresponding node when propagating from the previous layer to the next layer, described by the formula:

W_v = [w_i1, w_i2, …, w_ij]

where, for each component w_ij, i denotes the index (position) of the neuron in the previous layer connected by the weight, and j denotes the index (position) of the neuron in the next layer connected by the weight;
(3.3) constructing a distributed brain-like graph: inputting the target model trained in step (2), first calculating the feature vector of each neuron node of each model, then drawing the computational graph of each model according to the definition in step (3.2), and finally connecting the computational graphs of all the drawn sub-models, with the set weights as connecting edges, to the computational graph of the output model, generating the original distributed brain-like graph G_ori.
5. The method for deep neural network model reinforcement based on the distributed brain-like map according to claim 1, wherein the step (4) is specifically as follows:
(4.1) calculating the influence between two layers of neurons: for the j-th neuron F_l^j of layer l with output f_l^j, the influence value of the i-th neuron F_{l-1}^i of layer l-1 on a single element z of f_l^j is

φ(z, F_{l-1}^i) = ∑_{a ∈ A(F_{l-1}^i)} |∂z/∂a|

where the subscript l denotes the l-th layer, l = 1, 2, …, L, L is the total number of layers of the neural network, A(F_{l-1}^i) is the set of elements of the i-th neuron at layer l-1, and the A(·) function extracts the elements at the specified locations;

φ(F_l^j, F_{l-1}^i) denotes the influence value between the two neurons i and j, i.e. the influence of the i-th neuron F_{l-1}^i of layer l-1 on the output f_l^j of the j-th neuron F_l^j of layer l, obtained as the sum over each element of f_l^j:

φ(F_l^j, F_{l-1}^i) = ∑_{z ∈ f_l^j} φ(z, F_{l-1}^i)
(4.2) selecting key neurons: given a sample x, the loss gradient contributed to the model decision by the i-th neuron of the last convolutional layer L is first computed as ∂Loss_c(x)/∂f_L^i, where f_L^i denotes the output of the i-th neuron in layer L;

the loss gradients of all neurons are then sorted from large to small and the first k neurons are selected as key neurons; R^L(x) denotes the key neurons selected in the last layer L:

R^L(x) = top_k({ |∂Loss_c(x)/∂f_L^i| : F_L^i ∈ F_L })

where the top_k(·) function selects the first k and F_L is the neuron set of the last convolutional layer L; then, based on the influence of layer l-1 on layer l, R^{l-1}(x) denotes the key neurons taken at each layer:

R^{l-1}(x) = top_k({ ∑_{F_l^j ∈ R^l(x)} φ(F_l^j, F_{l-1}^i) : F_{l-1}^i ∈ F_{l-1} })

finally, the key neurons of sample x in the different layers are denoted R(x):

R(x) = { R^1(x), R^2(x), …, R^L(x) };
(4.3) limiting the loss gradient: a loss term is obtained by limiting the gradients of the key neurons:

Loss_g = ∑_{F_l^i ∈ R(x)} |∂Loss_c(x)/∂f_l^i|

and this loss term is added to the cross-entropy loss to obtain the final loss function:

Loss = Loss_c + δ·Loss_g

where Loss_c is the cross-entropy loss function of step (2), and δ is a hyper-parameter used to balance the loss terms;
(4.4) mapping the critical path: the key neurons obtained in step (4.2) are mapped onto the distributed brain-like graph drawn in step (3.3), and the nodes and connecting edges of non-key neurons are removed from the brain-like graph, obtaining the critical-path-guided distributed brain-like graph structure G_path.
6. The method for deep neural network model reinforcement based on the distributed brain-like map according to claim 1, wherein the step (5) is specifically as follows:
(5.1) defining graph-network indexes: the graph-network indexes comprise the characteristic path length and the participation coefficient; the characteristic path length is specifically the average shortest path length of the network and is used to measure efficiency; the participation coefficient measures the distribution of a node's connections among the network communities;
(5.2) growing a new brain-like structure and reconstructing the model:
respectively calculating the graph-network indexes defined in step (5.1) for the original distributed brain-like graph G_ori obtained in step (3) and for the critical-path-guided distributed brain-like graph structure G_path obtained in step (4), and observing the change trend of the indexes, wherein the direction in which the brain-like graph grows is guided by the characteristic path length and the distribution weight ratio of each sub-model is guided by the participation coefficient; the generated new brain-like graph structure is reconstructed back into the target model, obtaining the sub-models m_1′, m_2′, m_3′ and the output model m_out′.
CN202111234229.8A 2021-10-22 2021-10-22 Deep neural network model reinforcement method based on distributed brain-like map Pending CN114048837A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111234229.8A CN114048837A (en) 2021-10-22 2021-10-22 Deep neural network model reinforcement method based on distributed brain-like map

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111234229.8A CN114048837A (en) 2021-10-22 2021-10-22 Deep neural network model reinforcement method based on distributed brain-like map

Publications (1)

Publication Number Publication Date
CN114048837A true CN114048837A (en) 2022-02-15

Family

ID=80206082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111234229.8A Pending CN114048837A (en) 2021-10-22 2021-10-22 Deep neural network model reinforcement method based on distributed brain-like map

Country Status (1)

Country Link
CN (1) CN114048837A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117764120A (en) * 2024-02-22 2024-03-26 天津普智芯网络测控技术有限公司 Picture identification architecture capable of reducing single event fault influence


Similar Documents

Publication Publication Date Title
CN112784881B (en) Network abnormal flow detection method, model and system
CN110909926A (en) TCN-LSTM-based solar photovoltaic power generation prediction method
CN109242223B (en) Quantum support vector machine evaluation and prediction method for urban public building fire risk
CN111324990A (en) Porosity prediction method based on multilayer long-short term memory neural network model
CN112668804B (en) Method for predicting broken track of ground wave radar ship
CN110070116B (en) Segmented selection integration image classification method based on deep tree training strategy
WO2020095321A2 (en) Dynamic structure neural machine for solving prediction problems with uses in machine learning
CN107832789B (en) Feature weighting K nearest neighbor fault diagnosis method based on average influence value data transformation
Hegazy et al. Dimensionality reduction using an improved whale optimization algorithm for data classification
CN111611785A (en) Generation type confrontation network embedded representation learning method
CN114548591A (en) Time sequence data prediction method and system based on hybrid deep learning model and Stacking
CN115051864B (en) PCA-MF-WNN-based network security situation element extraction method and system
CN112580728A (en) Dynamic link prediction model robustness enhancing method based on reinforcement learning
CN116248392A (en) Network malicious traffic detection system and method based on multi-head attention mechanism
CN115982141A (en) Characteristic optimization method for time series data prediction
Regazzoni et al. A physics-informed multi-fidelity approach for the estimation of differential equations parameters in low-data or large-noise regimes
CN115051929A (en) Network fault prediction method and device based on self-supervision target perception neural network
CN114048837A (en) Deep neural network model reinforcement method based on distributed brain-like map
CN113239809B (en) Underwater sound target identification method based on multi-scale sparse SRU classification model
CN109886405A (en) It is a kind of inhibit noise based on artificial neural network structure's optimization method
Akinwale Adio et al. Translated Nigeria stock market prices using artificial neural network for effective prediction
CN111126758B (en) Academic team influence propagation prediction method, academic team influence propagation prediction equipment and storage medium
CN117421667A (en) Attention-CNN-LSTM industrial process fault diagnosis method based on improved gray wolf algorithm optimization
CN115392434A (en) Depth model reinforcement method based on graph structure variation test
CN115131646A (en) Deep network model compression method based on discrete coefficient

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination