CN112784915B - Image classification method for optimizing decision boundary to enhance robustness of deep neural network - Google Patents

Image classification method for optimizing decision boundary to enhance robustness of deep neural network

Info

Publication number
CN112784915B
CN112784915B
Authority
CN
China
Prior art keywords
layer
decision
network
node
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110133058.3A
Other languages
Chinese (zh)
Other versions
CN112784915A (en)
Inventor
刘波
杜宾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202110133058.3A priority Critical patent/CN112784915B/en
Publication of CN112784915A publication Critical patent/CN112784915A/en
Application granted granted Critical
Publication of CN112784915B publication Critical patent/CN112784915B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an image classification method that optimizes decision boundaries to enhance the robustness of a deep neural network: adversarial samples are computed from a decision-domain model of the network, and adversarial training optimizes the distribution of the network's decision boundaries, improving robustness. The method comprises the following steps: compute the decision-domain model corresponding to a trained deep neural network from its parameters; use the model and the training samples to judge which of the network's decision boundaries are sensitive to perturbation; compute, from the training data, adversarial samples of the samples lying near the perturbation-sensitive decision boundaries; mix the adversarial samples into the training samples and adversarially train the network, so that optimizing the network parameters moves the decision boundaries opposite to the sensitive direction. This reduces the sensitivity of the decision boundaries to perturbation and yields a deep neural network with high robustness, improving the network's resistance to interference in classification tasks.

Description

Image classification method for optimizing decision boundary to enhance robustness of deep neural network
Technical Field
The invention belongs to the field of artificial intelligence, and particularly relates to an image classification method that optimizes decision boundaries to enhance the robustness of deep neural networks.
Background
Deep neural networks can be used for classification and regression tasks, and today they perform well on a wide variety of classification problems. When a deep neural network performs classification, it relies mainly on the decision domains into which it partitions the sample space: each decision domain corresponds to one category, and when a sample point falls into a decision domain the network assigns the sample to that domain's category. However, for some special image samples, adding a small perturbation is enough to make the network misjudge the image's category; such samples are adversarial samples. The existence of adversarial samples shows that the robustness of current neural networks is weak and their classification is not stable enough: in deployment, external factors such as small fluctuations in sensor signals can cause data to be misclassified into another category, leading to incorrect operations being executed.
Take autonomous driving as an example. Deep neural networks are currently used to map the signals obtained from vehicle sensors to the current operation, closing the control loop so that the vehicle is adjusted in real time according to incoming information. But vehicles run under complex conditions: temperature, humidity, and even jolting of the vehicle can affect its sensor signals, which is equivalent to adding perturbation to the input information. When this information is used to judge road conditions, a neural network that is not robust enough is likely to misclassify, causing the vehicle to make a wrong judgment, and fatal problems can occur.
To solve this problem, it is necessary to understand how to improve the robustness of the network so that it is not affected by perturbations. When a sample point lies near a decision boundary of a decision domain, a suitable perturbation may push it across the boundary into a decision domain of a different class, turning it into an adversarial sample. To avoid this, the decision domains near such potential adversarial samples require fine adjustment. If the decision domains generated by the network can be computed, their boundaries can be adjusted according to the sample data, enhancing the robustness of the network. The adjustment is performed by training the network with a portion of adversarial samples, so that backpropagation updates the network parameters and the decision boundaries move relative to the direction of the adversarial samples. This yields a network with strong robustness and reduces the likelihood of errors when the network handles the varied data of real life.
Disclosure of Invention
Existing models can change their classification of an image when a small perturbation, imperceptible to the naked eye, is added, making the classification inaccurate. To address this, the invention provides an image classification method that optimizes decision boundaries to enhance the robustness of a deep neural network. Its innovation is a novel set of methods for computing the network's decision boundaries and their topological properties in order to construct a decision-domain model of the current network, and for adjusting the decision boundaries with a strategy chosen from the analyzed distribution of samples around them: adversarial samples are computed and generated from samples close to the decision boundaries, and adversarial training adjusts the network parameters corresponding to the perturbation-sensitive boundaries. The optimized neural network is less sensitive to perturbed samples, so the robustness of the network is improved.
In a neural network, the presence of piecewise-linear activation functions such as ReLU and LeakyReLU means that the composite expression obtained by combining each node's expression with the activation function corresponds to a linear hyperplane in the decision space. These hyperplanes divide the sample space into several subspaces with different output functions; finally, the subspaces are further divided according to which output node's function is maximal, yielding the decision domains, and the hyperplanes bounding a decision domain are called decision boundaries. All sample points inside a decision domain share the same maximally responding output node, i.e., the sample category is consistent within each decision domain. The decision-domain model can therefore be regarded as an equivalent representation of the classification network, so computing the network's decision domains makes it possible to decide intuitively, from the properties of the decision boundaries, which nodes' parameters to adjust.
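To make this correspondence concrete, here is a minimal Python sketch (illustrative only, not part of the claimed method): each hidden ReLU node of a toy fully connected layer defines a hyperplane w_j · x + b_j = 0, and the sign pattern of a sample records which side of each hyperplane it falls on.

```python
import numpy as np

# Toy fully connected ReLU layer: node j defines the hyperplane w_j . x + b_j = 0,
# which splits the input space into a part where the node outputs w_j . x + b_j
# and a part where it outputs 0.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 2))       # 4 hidden nodes, 2-D input
b1 = rng.normal(size=4)

def activation_pattern(x):
    """Which side of each node's hyperplane the sample x falls on."""
    return (W1 @ x + b1) > 0       # True: output w.x+b; False: output 0

x = np.array([0.3, -1.2])
print(activation_pattern(x))       # identifies one region of the partition
```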
Adversarial samples typically arise from sample points that lie too close to a decision boundary: only a relatively small perturbation is needed to push them across. To avoid this, the decision boundaries can be adjusted so that the samples near them become as sparse as possible, for example by balancing the distance from a boundary to the centers of the two categories, or by directly adjusting the positions of the boundaries according to the data distribution. However, the decision boundaries cannot be adjusted directly by following the parameters of the neural network, because the positions of the boundaries are unknown and a given boundary is related to the parameters of many nodes. A decision-domain model must therefore be constructed from the neural network's parameters, so that analysis and optimization can reduce the occurrence of adversarial samples and enhance the robustness of the network.
The method integrates this way of adjusting decision boundaries into a deep neural network for the image classification task, obtaining a deep neural network with stronger robustness and finally completing the image classification task. It solves the problem that existing deep neural networks have low robustness to interference signals, reduces the probability of misclassified samples, and improves accuracy to a certain extent, so that the deep neural network is more stable when applied to fields such as intelligent driving. The overall flow of the invention is shown in figure 1.
Step 1: and (5) building a network.
Step 2: construct the decision-domain model
To compute adversarial samples, it is necessary to know which decision domain a sample falls in and its distance to the nearest boundary of that domain, i.e., the nearest decision boundary. The decision domains and the decision-domain model must therefore be computed first.
The decision-domain model uses all fully connected layer parameters of the network, the weight matrices W and the thresholds B, to represent regions of the fully connected layers' input data space as a number of computed inequality groups; the equation obtained by setting each inequality to zero is a decision boundary. Each inequality group corresponds to one category of the classification task, the groups and categories standing in a many-to-one relationship. All convolution layers and max-pooling layers are used to preprocess the input image x into the fully connected layer input data x_0; x and x_0 have the same category. The category of x_0 is determined by checking which group of constraint inequalities x_0 satisfies: it is the category corresponding to the satisfied group.
To compute the decision domains, the network must first be pre-trained to obtain its parameters. Training proceeds exactly as for an ordinary convolutional neural network, with the cross entropy as the loss function:

$$L=-\frac{1}{m}\sum_{i=1}^{m}\sum_{c=1}^{n}y_{i,c}\,\log\hat{y}_{i,c}$$

where m is the number of images in one training batch, y_{i,c} is the one-hot label, and ŷ_{i,c} is the network's softmax output. The fully connected layers adopt ReLU as the activation function, with expression:
f(x)=max(0,x)
After pre-training, the network weights are saved as a ckpt file containing two groups of parameters: the parameters Ker of all convolution layers, and the parameters of all fully connected layers, namely all the weight matrices W and the thresholds B.
In the computation, the convolution and max-pooling layers are regarded as data preprocessing: the parameters Ker of all convolution layers and the max-pooling layers process the input image sample x to obtain the fully connected layer input x_0.
To obtain the inequality groups, iterative computation must proceed layer by layer through the fully connected layers. Because the computation differs between them, the layers are divided into three parts — the input layer, the output layer, and the intermediate layers, corresponding respectively to the first layer, the last layer, and the remaining layers — and each part computes in its own way according to its role. Let t denote the layer index, w_t (w_t ∈ W) the layer's weights and b_t (b_t ∈ B) its thresholds; layer t has N_t nodes, j_t denotes the j-th node of layer t, and the weight and threshold of that node are w_{tj} and b_{tj}.
The input layer takes the output x_0 of the M_1-th max-pooling layer as input and computes with the input-layer weights w_{1j} and thresholds b_{1j}, so the expression of the j-th node of the first layer of a deep neural network with activation function σ can be written as:

$$y_{1j} = \sigma\Bigl(\sum_{k=1}^{d} w_{1j,k}\,x_{0,k} + b_{1j}\Bigr)$$

where d denotes the dimension of x_0 and t = 1 for the input layer, so the general formulas below simply substitute t = 1.

Owing to the activation function σ, the space containing the data output by the M_1-th max-pooling layer is divided by the hyperplane corresponding to y_{1j} = w_{1j} · x_0 + b_{1j} = 0 into the two parts y_{1j} > 0 and y_{1j} ≤ 0; this hyperplane is a decision boundary. In the part where y_{1j} > 0 the output function of the j-th node is w_{1j} · x_0 + b_{1j}, and in the other part the output function of the j-th node is 0, so one node corresponds to one hyperplane and one output function. Since the first layer has N_1 nodes, the N_1 nodes yield N_1 hyperplanes, and the intersections of these hyperplanes divide the space containing the M_1-th max-pooling layer's output into several regions, the number of which depends on how the hyperplanes intersect; the samples of each region satisfy a different function mapping, i.e., the hyperplane corresponding to a node's activation discriminant splits the sample space into two different regions. By analogy, activating each node of the first layer yields a discriminant that divides the input space once more, finally producing several regions, each corresponding to one group of output functions of the N_1 first-layer nodes, determined by whether the region satisfies each node's discriminant being greater than or less than 0.
The inequality groups computed at this point cover, in theory, all regions that the interleaved hyperplanes could possibly produce; however, depending on how the hyperplanes actually intersect, some of these regions may not exist. Whether each region enclosed by decision boundaries exists must therefore be computed, and the non-existent regions deleted. The input layer finally outputs all feasible inequality groups obtained by this computation, together with the output function of each node under each group.
The intermediate layers, starting from the second layer, are computed as follows: layer t receives all the inequality groups output by layer t−1, together with the output functions of the nodes under each group, as its input, and combines them with the layer's weights w_{tj} and thresholds b_{tj} to compute the decision boundaries generated by this layer. Substituting the expressions output by the previous layer expresses this layer's nodes in terms of the original input x_0:

$$y_{tj} = \sigma\Bigl(\sum_{i=1}^{N_{t-1}} w_{tj,i}\,f_{(t-1)i}(x_0) + b_{tj}\Bigr)$$

where t denotes the current layer index, j_t the j-th node of layer t, and f_{(t-1)i} the output function of node i of layer t−1 under the current inequality group. Because of how a fully connected layer is computed, the index i of the current node's weights runs over the nodes of the previous layer, so the nested inner expression is written with the previous layer's weights w_{(t-1)i}.
When t = 2, x_0 is replaced by the first layer's output functions, yielding the discriminants of this layer in the sample space after activation. Because the input accepted by this layer depends on the previous layer's output functions, and each inequality group output by the previous layer corresponds to a different set of output functions, each group must be processed once when computing this layer's nodes: the inequalities of this layer's nodes are computed from the output functions of the previous layer's nodes under that group, and the new inequalities are appended to the previous layer's group to form a new inequality group for this layer. At the same time, the output functions of this layer's nodes under each inequality group are obtained and passed on, together with the groups, as this layer's output for the computation of the next layer.
For the output layer: in classification tasks the output layer of a deep neural network usually processes its outputs with methods such as one-hot or softmax and finally selects the node with the strongest response, taking its ordinal as the category; in essence, the position of the maximally responding output node is the basis of classification. Taking these characteristics into account, this layer accepts all inequality groups output by the last intermediate layer, with their corresponding output functions, as its input, and uses the layer's node weights w_{tj} and thresholds b_{tj} to compute the decision boundaries it generates. Unlike in the intermediate layers, a node's output is not used by itself as a discriminant and output function; instead, the equations of the differences between one fixed node's output and the outputs of the other nodes,

$$y_{M_2,i} - y_{M_2,k} = 0, \qquad k \neq i,$$

are used as hyperplanes to divide the space further. A region may satisfy y_{M_2,i} − y_{M_2,k} ≥ 0 for all k ≠ i, i.e., the i-th output node is always larger than the outputs of the other nodes in this region, so the samples in this region can be classified into class i. All the inequalities computed through the fully connected layers used so far then jointly represent an inequality group corresponding to a decision domain of class i enclosed by the decision boundaries corresponding to those inequalities, and the newly computed inequalities are appended to the original group to form a new inequality group. Repeating this computation expands each inequality group output by the previous layer into N_{M_2} inequality groups, one per category. Every inequality group output by the previous layer is processed in this way until the computation is complete; the model then outputs a number of inequality groups with the category of each, and the computation of the decision domains is finished, the equation obtained by setting each inequality in a group to zero being a decision boundary.
Because the decision domains are represented by constraint inequality groups, the redundant inequalities in each group — those that do not form a boundary of the feasible region — must be deleted before adjacent decision domains are merged. The test uses the simplex method, turning the inequality constraint of the edge under examination into an equality constraint: if that inequality forms a boundary of the feasible region, the problem has a solution for any objective function; otherwise it has none.
The simplex method is then applied again to the retained decision-domain boundaries to find optimal solutions, repeatedly modifying the objective function: when the solution is unbounded for some objective function, the boundary is open in the corresponding direction, i.e., the decision domain containing it is an open region. This step is repeated until the openness of every boundary of every decision domain has been determined, and the solutions under the corresponding objective functions are saved.
After the redundant inequalities are removed, two decision domains are adjacent when their inequality groups contain inequalities with the same weights and thresholds but opposite signs. If the solutions obtained in the previous step for the adjacent edges are identical under the corresponding objective functions, the adjacent edges of the two decision domains coincide completely; if the categories of the two decision domains are also identical, the domains are merged and the corresponding boundary is deleted.
In the linear programming problem it can be proved that a region determined by a group of linear inequalities is convex. Merging adjacent decision domains whose adjacent edges coincide completely is the inverse of one of the divisions performed during partitioning, so the decision domain obtained after merging is a new inequality group consisting of a subset of the original linear inequalities and still corresponds to a convex feasible region.
Meanwhile, the merge of an open decision domain with any decision domain is an open decision domain, so the closedness of the merged decision domain is obtained at the same time.
The decision domains can be merged further: as before, two decision domains are adjacent when their inequality groups contain inequalities with the same weights and thresholds but opposite signs. If the solutions obtained for the adjacent edges in the linear programs are not identical under the corresponding objective functions, the adjacent edges of the two decision domains do not coincide completely. In that case the merge necessarily produces an interior angle greater than 180° where the edges fail to coincide, so a decision domain obtained by this kind of merging is concave.
Traverse all boundaries of the closed decision domains, ignoring signs: when the boundary set of one decision domain is entirely a subset of the boundary set of another, the latter is a multiply connected region and its convexity or concavity is meaningless.
The decision domains and their topological properties have thus been computed and can be used to compute adversarial samples.
Step 3: verify the decision-domain model
Verification of the decision-domain model rests mainly on the consistency between the decision domains and the deep neural network in classifying samples: the higher the consistency, the more accurate the decision-domain model. Verification proceeds as follows: substitute the image samples of the training and validation sets and compare whether the category given by the decision domains agrees with that given by the deep neural network, maintaining three counters: correct (consistent), invalid, and total (the number of samples). When the decision-domain model's category for an input image agrees with the neural network's classification, correct is incremented by one; total equals the number of samples used to verify the model's accuracy; and correct/total expresses the consistency of the two models. If some sample lies in none of the decision domains, the decision-domain computation is in error and must be adjusted.
If correct = total, the decision-domain model is completely consistent and can be used to compute adversarial samples.
If correct < total while invalid = 0, a region that should be separable may have failed to split in two because of limited numerical precision in the computation, so that images of one category are misclassified; the precision of the computation must then be increased and the decision domains recomputed. Where the accuracy is acceptable, the resulting decision domains may also be used directly, except that some of the generated samples may fail to be misclassified as adversarial samples and must be deleted after testing.
If invalid ≠ 0, a decision domain that should exist was deleted because it was judged not to exist during the computation; the feasibility test of the inequality groups may be at fault, and the computation method must be modified and the decision domains recomputed.
Step 4: construct adversarial samples
Input the images x of the training and test sets into the network built in step 1 and obtain the corresponding output x_0 of the M_1-th max-pooling layer. Traverse all x_0 and, within the decision domain containing each x_0, find the boundary nearest to x_0 and the distance d to that boundary. When d is smaller than a set threshold θ, the sample is considered liable to cross the decision boundary under a slight perturbation and thus to be misclassified; these are the sample points sought. Compute the distances from all sample points to all boundaries of their decision domains — i.e., to the hyperplanes obtained by setting each inequality's expression equal to 0 — and find all samples satisfying the following condition, together with the parameters (w, b) of the hyperplane each sample can cross:

$$d = \frac{\lvert w \cdot x_0 + b \rvert}{\lVert w \rVert} < \theta$$
Let ε be a sufficiently small random number and form the interference term p = d + ε; the interference so obtained is just sufficient for x_0 to cross the decision boundary. Deconvolution is then used to obtain p′ in the input dimensions, and the adversarial sample x′ = x + p′ is constructed for retraining.
Step 5: adversarially train the network
Retrain the network with the adversarial samples obtained in step 4, so that backpropagation adjusts the decision boundaries to move toward the adversarial samples and the perturbation p needed to misclassify a sample increases, thereby improving the robustness of the network. The network is then used for image classification tasks.
The advantage of the method is that the decision boundaries are computed exactly and the adversarial samples are then solved for directly, which makes computing adversarial samples more tractable and faster and avoids the iterative search for adversarial samples required by other methods. Classifying with the optimized neural network improves the robustness of the network and its classification accuracy while affecting the original accuracy as little as possible.
Drawings
Fig. 1 is an overall flowchart.
FIG. 2 is a flowchart of the decision-domain construction part.
FIG. 3 is a schematic diagram of sample space decision boundary partitioning.
Detailed Description
The specific implementation details of the method are as follows:
step 1: building a network
The specific network may take the LeNet network used in the experiments of the invention as the base network, adjusted appropriately to the data set;
the input sample size is 32 x 3.
The network structure is as follows:
first convolution layer: 6 convolution kernels of 5 x 3 are used, so the convolution kernels are (5 x 1) x 6 in scale; the step size of the convolution operation is 1, the result after the first convolution is processed by using Relu as an activation function, and the output size is 28×28×6.
First max-pooling layer: max pooling with a 2 × 2 window; the output size is 14 × 14 × 6;
second convolution layer: the result after the second convolution is processed using Relu as the activation function with an output size of 10 x 16, with 16 convolution kernels of 5 x 6 being used, so that the convolution kernel scale is (5 x 6) 16, the step size of the convolution operation is 1.
Second max-pooling layer: max pooling with a 2 × 2 window; the output size is 5 × 5 × 16.
First fully connected layer: the previous layer's result is flattened into a one-dimensional vector of size 400 × 1, so the input size is 400 × 1; the number of nodes is 120, the layer's weight matrix w has size 120 × 400, and the output size is 120 × 1. The output of the fully connected layer is processed with ReLU as the activation function.
Second fully connected layer: the number of nodes is 84, so w has size 84 × 120; the input is 120 × 1 and the output 84 × 1. The output is processed with ReLU as the activation function.
Third fully connected layer: the number of nodes is 10; w has size 10 × 84, the input size is 84 × 1, and the output is 10 × 1.
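A minimal sketch of this structure is given below. The patent does not specify a framework, so PyTorch is assumed here; the class and variable names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):
    """LeNet variant from the description: 2 conv + 2 max-pool + 3 FC layers."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)   # 32x32x3 -> 28x28x6
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)  # 14x14x6 -> 10x10x16
        self.pool = nn.MaxPool2d(2, 2)                # 2x2 max pooling
        self.fc1 = nn.Linear(400, 120)                # 5*5*16 = 400 inputs
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))          # -> 14x14x6
        x = self.pool(F.relu(self.conv2(x)))          # -> 5x5x16
        x = torch.flatten(x, 1)                       # x_0, the FC-layer input
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)                            # raw class scores
```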
Step 2: construct the decision-domain model
To compute adversarial samples, it is necessary to know which decision domain a sample falls in and its distance to the nearest boundary of that domain, i.e., the nearest decision boundary. The decision domains and the decision-domain model must therefore be computed first.
The decision-domain model uses all fully connected layer parameters of the network, the weight matrices W and the thresholds B, to represent regions of the fully connected layers' input data space as a number of computed inequality groups; the equation obtained by setting each inequality to zero is a decision boundary. Each inequality group corresponds to one category of the classification task, the groups and categories standing in a many-to-one relationship. All convolution layers and max-pooling layers are used to preprocess the input image x into the fully connected layer input data x_0; x and x_0 have the same category. The category of x_0 is determined by checking which group of constraint inequalities x_0 satisfies: it is the category corresponding to the satisfied group.
To compute the decision domains, the network must first be pre-trained to obtain its parameters; training is the same as for an ordinary convolutional neural network. In the experiments the network was trained on Cifar10, a data set of 60,000 images in ten categories of 6,000 each: airplane; automobile; bird; cat; deer; dog; frog; horse; ship; truck. The loss function is the cross entropy:

$$L=-\frac{1}{m}\sum_{i=1}^{m}\sum_{c=1}^{n}y_{i,c}\,\log\hat{y}_{i,c}$$

where m is the number of images in one training batch, y_{i,c} is the one-hot label, and ŷ_{i,c} is the network's softmax output.
the network adopts Relu as an activation function, and the expression is as follows
f(x)=max(0,x)
After pre-training, the network weights are saved as a ckpt file containing two groups of parameters: the parameters Ker of all convolution layers, and the parameters of all fully connected layers, namely all the weight matrices W and the thresholds B.
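A hedged sketch of this pre-training stage follows (the ckpt file suggests TensorFlow, but to stay with one language PyTorch's state_dict is used here as a stand-in; all names are illustrative):

```python
import torch
import torchvision
import torchvision.transforms as T

def pretrain(model, epochs=20, lr=1e-3, device="cpu"):
    train_set = torchvision.datasets.CIFAR10(
        root="./data", train=True, download=True, transform=T.ToTensor())
    loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
    criterion = torch.nn.CrossEntropyLoss()    # the cross-entropy loss above
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            criterion(model(x), y).backward()  # backpropagation
            optimizer.step()
    # save Ker (conv parameters) and W, B (FC parameters) for the model build
    torch.save(model.state_dict(), "lenet_pretrained.pt")
```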
In the computation, the convolution and max-pooling layers are regarded as data preprocessing: the parameters Ker of all convolution layers and the max-pooling layers process the input image sample x to obtain the fully connected layer input x_0.
To obtain the inequality groups, iterative computation must proceed layer by layer through the fully connected layers. Because the computation differs between them, the fully connected layers are divided into three parts — the input layer, the output layer, and the intermediate layers, corresponding respectively to the first layer, the last layer, and the remaining layers — and each part computes in its own way according to its characteristics. Let t denote the layer index, w_t (w_t ∈ W) the layer's weights and b_t (b_t ∈ B) its thresholds; layer t has N_t nodes, j_t denotes the j-th node of layer t, and the weight and threshold of that node are w_{tj} and b_{tj}.
The input layer takes the output x_0 of the M_1-th max-pooling layer as input and computes with the input-layer weights w_{1j} and thresholds b_{1j}, so the expression of the j-th node of the first layer of a deep neural network with activation function σ can be written as:

$$y_{1j} = \sigma\Bigl(\sum_{k=1}^{d} w_{1j,k}\,x_{0,k} + b_{1j}\Bigr)$$

where d denotes the dimension of x_0 and t = 1 for the input layer, so the general formulas below simply substitute t = 1.

Owing to the activation function σ, the space containing the data output by the M_1-th max-pooling layer is divided by the hyperplane corresponding to y_{1j} = w_{1j} · x_0 + b_{1j} = 0 into the two parts y_{1j} > 0 and y_{1j} ≤ 0; this hyperplane is a decision boundary. In the part where y_{1j} > 0 the output function of the j-th node is w_{1j} · x_0 + b_{1j}, and in the other part the output function of the j-th node is 0, so one node corresponds to one hyperplane and one output function. Since the first layer has N_1 nodes, the N_1 nodes yield N_1 hyperplanes, and the intersections of these hyperplanes divide the space containing the M_1-th max-pooling layer's output into several regions, the number of which depends on how the hyperplanes intersect; the samples of each region satisfy a different function mapping, i.e., the hyperplane corresponding to a node's activation discriminant splits the sample space into two different regions. By analogy, activating each node of the first layer yields a discriminant that divides the input space once more, finally producing several regions, each corresponding to one group of output functions of the N_1 first-layer nodes, determined by whether the region satisfies each node's discriminant being greater than or less than 0.
The inequality groups computed at this point cover, in theory, all regions that the interleaved hyperplanes could possibly produce; however, depending on how the hyperplanes actually intersect, some of these regions may not exist. Whether each region enclosed by decision boundaries exists must therefore be computed, and the non-existent regions deleted. The input layer finally outputs all feasible inequality groups obtained by this computation, together with the output function of each node under each group.
For example, taking the first layer: the cases in which each node satisfies its activation discriminant can be permuted and combined to obtain inequality groups such as:

$$\begin{cases} w_{11}\cdot x_0 + b_{11} > 0 \\ w_{12}\cdot x_0 + b_{12} < 0 \\ \;\;\vdots \\ w_{1N_1}\cdot x_0 + b_{1N_1} > 0 \end{cases}$$

where N_1 is the number of input-layer nodes.
If the inequality group has a solution, the corresponding region exists; if it has none, the region does not exist. Whether the group has a solution is determined with the simplex method: the inequality group is converted into the feasible region of a linear programming problem; if the feasible region exists, the problem has a unique or unbounded solution for any objective function, and otherwise no solution. The output function is determined by the inequality corresponding to each node: for example, if the sign of the first node's inequality is "greater than", its output function is w_{11} · x_0 + b_{11}; if the second node's sign is "less than", its output function is 0; and if the last node's sign is "greater than", its output function is w_{1N_1} · x_0 + b_{1N_1}. The group of output functions of the first-layer nodes corresponding to the inequality group above is therefore

$$\{\,w_{11}\cdot x_0 + b_{11},\; 0,\; \dots,\; w_{1N_1}\cdot x_0 + b_{1N_1}\,\}$$
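A hedged sketch of this feasibility test, using scipy.optimize.linprog as a stand-in for the simplex procedure described above (function names are illustrative; strict inequalities are approximated with a small margin):

```python
import numpy as np
from scipy.optimize import linprog

def region_exists(W1, b1, signs, margin=1e-9):
    """Is {x : signs[j] * (w_j . x + b_j) > 0 for all j} non-empty?

    signs[j] = +1 marks node j active, -1 inactive. Each constraint is
    rewritten as -signs[j] * (w_j . x) <= signs[j] * b_j - margin.
    """
    A_ub = -(signs[:, None] * W1)
    b_ub = signs * b1 - margin
    n = W1.shape[1]
    # A zero objective suffices for a pure feasibility test.
    res = linprog(c=np.zeros(n), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * n, method="highs")
    return res.status == 0        # 0: feasible; 2: infeasible region

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)
print(region_exists(W1, b1, np.array([1, -1, 1, 1])))
```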
The intermediate layers, starting from the second layer, are computed as follows: layer t receives all the inequality groups output by layer t−1, together with the output functions of the nodes under each group, as its input, and combines them with the layer's weights w_{tj} and thresholds b_{tj} to compute the decision boundaries generated by this layer. Substituting the expressions output by the previous layer expresses this layer's nodes in terms of the original input x_0:

$$y_{tj} = \sigma\Bigl(\sum_{i=1}^{N_{t-1}} w_{tj,i}\,f_{(t-1)i}(x_0) + b_{tj}\Bigr)$$

where t denotes the current layer index, j_t the j-th node of layer t, and f_{(t-1)i} the output function of node i of layer t−1 under the current inequality group. Because of how a fully connected layer is computed, the index i of the current node's weights runs over the nodes of the previous layer, so the nested inner expression is written with the previous layer's weights w_{(t-1)i}.
When t = 2, x_0 is replaced by the first layer's output functions, yielding the discriminants of this layer in the sample space after activation. Because the input accepted by this layer depends on the previous layer's output functions, and each inequality group output by the previous layer corresponds to a different set of output functions, each group must be processed once when computing this layer's nodes: the inequalities of this layer's nodes are computed from the output functions of the previous layer's nodes under that group, and the new inequalities are appended to the previous layer's group to form a new inequality group for this layer. At the same time, the output functions of this layer's nodes under each inequality group are obtained and passed on, together with the groups, as this layer's output for the computation of the next layer.
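Within one region every node's output function is affine in x_0, so the substitution described above amounts to composing affine maps under a fixed activation pattern. A hedged NumPy sketch of that bookkeeping (illustrative names):

```python
import numpy as np

def compose_layer(A_prev, c_prev, W_t, b_t, pattern_t):
    """Substitute f_{t-1}(x0) = A_prev @ x0 + c_prev into layer t.

    pattern_t fixes which nodes of layer t are active in this region.
    Returns (A_t, c_t) with f_t(x0) = A_t @ x0 + c_t, plus the rows (R, s)
    of the new inequalities pattern_t[j] * (R[j] @ x0 + s[j]) > 0 that are
    appended to the region's inequality group.
    """
    R = W_t @ A_prev                # pre-activation as a function of x0
    s = W_t @ c_prev + b_t
    act = (pattern_t > 0).astype(float)
    return act[:, None] * R, act * s, R, s   # inactive nodes output zero
```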
For the output layer: in classification tasks the output layer of a deep neural network usually processes its outputs with methods such as one-hot or softmax and finally selects the node with the strongest response, taking its ordinal as the category; in essence, the position of the maximally responding output node is the basis of classification. Taking these characteristics into account, this layer accepts all inequality groups output by the last intermediate layer, with their corresponding output functions, as its input, and uses the layer's node weights w_{tj} and thresholds b_{tj} to compute the decision boundaries it generates. Unlike in the intermediate layers, a node's output is not used by itself as a discriminant and output function; instead, the equations of the differences between one fixed node's output and the outputs of the other nodes,

$$y_{M_2,i} - y_{M_2,k} = 0, \qquad k \neq i,$$

are used as hyperplanes to divide the space further. A region may satisfy y_{M_2,i} − y_{M_2,k} ≥ 0 for all k ≠ i, i.e., the i-th output node is always larger than the outputs of the other nodes in this region, so the samples in this region can be classified into class i. All the inequalities computed through the fully connected layers used so far then jointly represent an inequality group corresponding to a decision domain of class i enclosed by the decision boundaries corresponding to those inequalities, and the newly computed inequalities are appended to the original group to form a new inequality group. Repeating this computation expands each inequality group output by the previous layer into N_{M_2} inequality groups, one per category. Every inequality group output by the previous layer is processed in this way until the computation is complete; the model then outputs a number of inequality groups with the category of each, and the computation of the decision domains is finished, the equation obtained by setting each inequality in a group to zero being a decision boundary.
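Continuing the sketch above, the output-layer step for one region can be written as follows (hedged, illustrative names): given the affine output functions of the last intermediate layer, it forms the n−1 pairwise-difference inequalities stating that node i beats every other node.

```python
import numpy as np

def class_region_inequalities(A_last, c_last, W_out, b_out, i):
    """Inequalities y_i - y_k >= 0 (k != i) for the class-i region.

    Output functions: y = W_out @ (A_last @ x0 + c_last) + b_out.
    Each returned row encodes R[m] @ x0 + s[m] >= 0.
    """
    G = W_out @ A_last              # outputs as affine functions of x0
    h = W_out @ c_last + b_out
    R = G[i] - np.delete(G, i, axis=0)
    s = h[i] - np.delete(h, i)
    return R, s                     # append to the region's inequality group
```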
Because a decision domain is represented by a group of constraint inequalities, the redundant inequalities in each group — those that do not form a boundary of the feasible region — must be deleted before adjacent decision domains are merged; otherwise two domains may satisfy the adjacency condition without actually being adjacent, since a redundant inequality lies outside the feasible region and essentially does not intersect it.
The test uses the simplex method, turning the inequality constraint of the edge being examined into an equality constraint. The optimal solution must then fall on that equality boundary of the decision domain: if the constraint forms a boundary of the feasible region, the problem has a solution for any objective function, and otherwise none. At the same time, if there is an optimal solution for every objective function, the decision domain is considered closed; if the solution is unbounded under some objective function, the boundary is open in the direction corresponding to that objective function.
To facilitate later steps, the solution of each decision domain under the corresponding objective function is saved for subsequent use.
If a decision domain has an optimal linear-programming solution under every objective function, i.e., it is bounded in every direction, the decision domain is considered closed; otherwise it is open.
Equivalently, for the hyperplane corresponding to any boundary: if the boundary is an open face, the decision domain is unconstrained in the direction in which that hyperplane is open. Hence if any boundary is open, the decision domain must be open.
The openness of each boundary of a decision domain is judged from the linear-programming solutions: check the solutions under all objective functions; if every solution is a unique optimum, the boundary is closed, and if some optimum is unbounded, the boundary is open. The closedness of the decision domain follows: if all its boundaries are closed, the domain is closed, and if any boundary is open, the domain is open.
This step is repeated until the openness of every decision domain has been solved.
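A hedged sketch of the redundancy and openness tests with scipy.optimize.linprog (a stand-in for the simplex solutions described above; status code 3 marks an unbounded program):

```python
import numpy as np
from scipy.optimize import linprog

def edge_is_real(A, b, j):
    """Does inequality j of {x : A @ x <= b} form part of the feasible boundary?

    Force it to an equality and test feasibility of the remaining group.
    """
    n = A.shape[1]
    res = linprog(c=np.zeros(n),
                  A_ub=np.delete(A, j, axis=0), b_ub=np.delete(b, j),
                  A_eq=A[j:j + 1], b_eq=b[j:j + 1],
                  bounds=[(None, None)] * n, method="highs")
    return res.status == 0          # infeasible -> inequality j is redundant

def domain_is_open(A, b):
    """Probe boundedness along +/- coordinate axes; any unbounded LP -> open."""
    n = A.shape[1]
    for c in np.vstack([np.eye(n), -np.eye(n)]):
        res = linprog(c=c, A_ub=A, b_ub=b,
                      bounds=[(None, None)] * n, method="highs")
        if res.status == 3:         # unbounded in this direction
            return True
    return False
```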
After the redundant inequalities are removed, two decision domains are adjacent when their inequality groups contain inequalities with the same weights and thresholds but opposite signs. If the solutions obtained in the linear programs for the adjacent edges are identical under the corresponding objective functions, the adjacent edges of the two decision domains coincide completely; if the categories of the two decision domains are also identical, the domains are merged and the corresponding boundary is deleted.
In the linear programming problem it can be proved that a region determined by a group of linear inequalities is convex. Merging adjacent decision domains whose adjacent edges coincide completely is the inverse of one of the divisions performed during partitioning, so the decision domain obtained after merging is a new inequality group consisting of a subset of the original linear inequalities and still corresponds to a convex feasible region.
Meanwhile, the merge of an open decision domain with any decision domain is an open decision domain, so the closedness of the merged decision domain is obtained at the same time.
After the decision domains whose adjacent edges coincide completely have been merged, the decision domains whose adjacent edges do not coincide completely are merged next. As before, two decision domains are adjacent when their inequality groups contain inequalities with the same weights and thresholds but opposite signs. If the solutions obtained for the adjacent edges in the linear programming problems are not identical under the corresponding objective functions, the adjacent edges of the two decision domains do not coincide completely. In that case the merge necessarily produces an interior angle greater than 180° where the edges fail to coincide, so a decision domain obtained by this kind of merging is concave.
Traverse the boundaries of all closed decision domains. For any two decision domains, ignoring signs, when the boundary set of one is entirely a subset of the boundary set of the other, the latter is a multiply connected region and its convexity or concavity is meaningless.
With the decision domains and their topological properties computed, the model can be used to compute adversarial samples.
Step 3: verify the decision-domain model
The decision-domain model is verified according to the consistency between the decision domains and the deep neural network in classifying samples: the higher the consistency, the more accurate the decision-domain model. Verification proceeds by substituting the image samples of the training and validation sets and comparing whether the category given by the decision domains agrees with that given by the deep neural network, maintaining three counters: correct (consistent), invalid, and total (the number of samples). When the decision-domain model's category for an input image agrees with the neural network's classification, correct is incremented by one; total equals the number of samples used to verify the model's accuracy; and correct/total expresses the consistency of the two models. If some sample lies in none of the decision domains, the decision-domain computation is in error and must be adjusted.
If correct = total, the decision-domain model is completely consistent and can be used to compute adversarial samples.
If correct < total while invalid = 0, a region that should be separable may have failed to split in two because of limited numerical precision in the computation, so that images of one category are misclassified; the precision of the computation must then be increased and the decision domains recomputed. Where the accuracy is acceptable, the resulting decision domains may also be used directly, except that some of the generated samples may fail to be misclassified as adversarial samples and must be deleted after testing.
If invalid ≠ 0, a decision domain that should exist was deleted because it was judged not to exist during the computation; the feasibility test of the inequality groups may be at fault, and the computation method must be modified and the decision domains recomputed.
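A minimal sketch of this consistency check (classify_by_domain is an assumed helper that returns the category of the inequality group containing x_0, or None when x_0 satisfies no group):

```python
def verify_consistency(samples_x0, net_classes, classify_by_domain):
    """Compare decision-domain classification against network classification.

    samples_x0: FC-layer inputs x_0; net_classes: the network's predictions.
    Returns the consistency ratio correct/total and the invalid count.
    """
    correct = invalid = 0
    total = len(samples_x0)
    for x0, c in zip(samples_x0, net_classes):
        domain_class = classify_by_domain(x0)
        if domain_class is None:
            invalid += 1            # model error: x0 lies in no decision domain
        elif domain_class == c:
            correct += 1            # decision domains agree with the network
    return correct / total, invalid
```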
Step 4: construct adversarial samples
Input the images x of the training and test sets into the network built in step 1 and obtain the corresponding output x_0 of the M_1-th max-pooling layer. Traverse all x_0 and, within the decision domain containing each x_0, find the boundary nearest to x_0 and the distance d to that boundary. When d is smaller than a set threshold θ, the sample is considered liable to cross the decision boundary under a subtle perturbation and thus to be misclassified. Compute the distances from all sample points to all boundaries of their decision domains — i.e., to the hyperplanes obtained by setting each inequality's expression equal to 0 — and find all samples satisfying the following condition, together with the parameters (w, b) of the hyperplane each sample can cross:

$$d = \frac{\lvert w \cdot x_0 + b \rvert}{\lVert w \rVert} < \theta$$
Let ε be a sufficiently small random number and form the interference term p = d + ε, which is just sufficient to let x_0 cross the decision boundary. So that the interference can act directly on the input image x, deconvolution is used to obtain p′ for p in the input dimensions, and the adversarial sample x′ = x + p′ is constructed for retraining.
Since crossing a decision boundary may lead into another decision domain of the same class, in which case the image is not misclassified, it must be checked here whether each constructed adversarial sample is actually misclassified by the network; samples that do not cause misclassification are deleted.
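A hedged sketch of the search for perturbable samples (the step along the hyperplane normal is an assumption made for illustration; the patent maps the resulting p back to the input space by deconvolution):

```python
import numpy as np

def nearest_boundary(x0, hyperplanes):
    """Distance from x0 to each boundary w . x + b = 0 of its decision domain."""
    best = None
    for w, b in hyperplanes:
        d = abs(w @ x0 + b) / np.linalg.norm(w)
        if best is None or d < best[0]:
            best = (d, w, b)
    return best

def candidate_perturbation(x0, hyperplanes, theta=0.1, eps=1e-3):
    """If x0 lies within theta of a boundary, step just across it."""
    d, w, b = nearest_boundary(x0, hyperplanes)
    if d >= theta:
        return None                       # sample is not near any boundary
    direction = -np.sign(w @ x0 + b) * w / np.linalg.norm(w)
    return (d + eps) * direction          # magnitude p = d + eps, as above
```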
Step 5: adversarially train the network
Adversarially train the network with the samples obtained in step 4. To keep the data split used in adversarial training from affecting how the network fits the training data distribution, the adversarial data are added to the training data and the combined set is shuffled. During training, backpropagation adjusts the decision boundaries to move toward the adversarial samples, increasing the perturbation p required to misclassify a sample and thereby improving the robustness of the network. The network is then used for the image classification task; its robustness is improved with almost no effect on accuracy.
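A hedged sketch of this adversarial training stage (PyTorch as above; adv_x are the constructed samples x′ with their true labels adv_y, and all names are illustrative):

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

def adversarial_retrain(model, train_set, adv_x, adv_y, epochs=5, lr=1e-4):
    """Mix adversarial samples into the training data, shuffle, and retrain."""
    adv_set = TensorDataset(adv_x, adv_y)
    loader = DataLoader(ConcatDataset([train_set, adv_set]),
                        batch_size=128, shuffle=True)  # shuffling mixes the data
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            criterion(model(x), y).backward()  # moves boundaries past adv samples
            optimizer.step()
    return model
```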
Experimental results
The robustness of the network obtained here and of the network before optimization is tested by attacking both with adversarial samples produced by the adversarial-sample generation algorithm DeepFool, using the Cifar10 test set as the reference for generating the adversarial samples. After adversarial training is complete, the adversarially trained neural network is attacked again with the DeepFool algorithm, and the networks' classification accuracy on the adversarial samples is compared as the criterion of robustness.
It can be seen that the classification accuracy of the optimized network on the original data set changes little — unlike ordinary adversarial training, the accuracy does not decrease but even rises slightly — while the classification accuracy under DeepFool adversarial attack nearly doubles, which demonstrates the improved robustness of the network.

Claims (2)

1. An image classification method for optimizing decision boundaries to enhance the robustness of a deep neural network, characterized by comprising the following steps:
step 1: setting up a network, and inputting the network as pictures;
step 2: training a network, and constructing a decision domain model according to parameters obtained after training;
step 3: generating an countermeasure sample by using the constructed decision domain model;
step 4: using the network built in the antagonistic sample training step 1 to enable the decision boundary of the network to move towards the antagonistic sample direction, so that the sensitivity of the network to disturbance is reduced, and inputting the image to be classified into the obtained neural network model for classification;
in particular,
the network structure in step 1 is specified as follows:
the network structure sequentially comprises a first convolution layer, a first maximum pooling layer and an Mth layer 1 Roll base layer, mth 1 The maximum pooling layer sequentially comprises a first full-connection layer, a second full-connection layer and an Mth layer 2 Full tie layer, where M 1 Representing the number of convolutional layers and pooling layers, M 1 Is any positive integer from zero to positive infinity, M 2 Represents the number of the full connection layers, M 2 Is any positive integer greater than 2; m is M 1 Is 2, M 2 3;
the decision-domain model in step 2 is divided into three parts: an input-layer decision-domain model, an output-layer decision-domain model, and intermediate-layer decision-domain models, where the input-layer decision-domain model corresponds to the first fully connected layer, the output-layer decision-domain model corresponds to the M_2-th fully connected layer, and the intermediate-layer decision-domain models correspond to the other fully connected layers;
the input layer decision domain model is formed by all nodes in the first full connection layerThe corresponding decision boundaries are formed together, and the process of constructing the input layer decision domain model is as follows: after one node is activated, a discriminant is obtained, the discriminant divides an input space once, each node of a first full-connection layer is traversed, the input space is continuously divided, a plurality of areas are finally obtained, wherein the discriminant of the node is obtained by enabling an activation function of the node to be equal to 0, a hyperplane corresponding to the discriminant is a decision boundary, and the input space is a value range corresponding to the output of an M1 maximum pooling layer; the node weight is obtained by network pre-training, and the full-connection layer parameters comprise a weight matrix W and a threshold B; the corresponding layer number is represented by t, and the weight of the corresponding layer is w t (w t E W), threshold value b t (b t E B), node number of t layer is N t By j t Representing the jth node of the t layer, the weight of the jth node of the t layer isThreshold value of->
the middle-layer decision domain model is constructed as follows: the spatial regions obtained from the previous layer are further divided according to the discriminant obtained after each node in the current layer is activated; traversing every node of the current fully connected layer divides the regions of the previous layer repeatedly and finally yields a number of regions;
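Since each node contributes one hyperplane w·x + b = 0, a region of the input-layer or middle-layer decision domain can be identified by the sign pattern of the pre-activations. The sketch below (plain NumPy, ReLU activations, and all names are assumptions) enumerates that pattern for a given point:

```python
import numpy as np

def region_code(x0, weights, biases):
    """Identify the decision-domain region containing x0 by one bit per
    node per fully connected layer: node j of layer t contributes the
    hyperplane w_t^j . x + b_t^j = 0, and the bit records on which side
    of it the point falls."""
    codes = []
    h = x0
    for W, b in zip(weights, biases):  # one (W, b) pair per FC layer
        pre = W @ h + b                # pre-activations of this layer
        codes.append(pre > 0)          # side of each hyperplane
        h = np.maximum(pre, 0)         # ReLU output feeds the next partition
    return codes
```

Two inputs with identical sign patterns lie in the same region of the decision domain.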
the output layer decision domain model is used for classifying categories, and the construction process is as follows: for n-categorical tasks, mth 2 The full connection layer is provided with n nodes, and the region corresponding to each category is obtained through the n nodes, wherein the boundary condition of the region corresponding to the ith category is composed of n-1 inequality, and the method specifically comprises the following steps:
wherein,represents the Mth 2 An output function of the first node of the full connection layer;
the output layer decision domain model receives all inequality groups output by the last middle layer and corresponding output functions as the input of the layer, and uses the parameters and weights of the nodes of the layerAnd threshold->Calculating decision boundaries generated by the layer;
the process of generating the challenge sample is as follows,
inputting the pictures x in the training set and the test set into the network constructed in the step 1, and obtaining the Mth 1 The corresponding output of the maximum pooling layer is x 0 Traversing all x 0 Find the distance x in the decision domain where it is located 0 The nearest boundary, and the distance d from the boundary, when d is less than a set threshold θ, the interference term p=d+ε is obtained, where ε represents a sufficiently small random number; then p ' in the input dimension is obtained using deconvolution to construct the challenge sample x ' =x+p '.
2. The image classification method for optimizing decision boundaries to enhance deep neural network robustness according to claim 1, characterized in that: after the decision domain model is constructed, the decision domain model is verified, specifically by evaluating the classification consistency between the decision domain model and the network constructed in step 1; when the preset requirement is met, the decision domain model is used to generate the adversarial samples, and otherwise the decision domain model is reconstructed.
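As a minimal illustration of the verification in claim 2, consistency could be scored as the agreement rate between the labels assigned by the decision domain model and by the step-1 network; the 0.99 requirement shown is an assumed placeholder, not a value from the claim.

```python
import numpy as np

def consistency(network_labels, domain_labels):
    """Fraction of samples on which the decision domain model and the
    trained network assign the same class."""
    network_labels = np.asarray(network_labels)
    domain_labels = np.asarray(domain_labels)
    return float(np.mean(network_labels == domain_labels))

# Rebuild the decision domain model until, e.g.:
# consistency(net_preds, domain_preds) >= 0.99
```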
CN202110133058.3A 2021-01-29 2021-01-29 Image classification method for optimizing decision boundary to enhance robustness of deep neural network Active CN112784915B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110133058.3A CN112784915B (en) 2021-01-29 2021-01-29 Image classification method for optimizing decision boundary to enhance robustness of deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110133058.3A CN112784915B (en) 2021-01-29 2021-01-29 Image classification method for optimizing decision boundary to enhance robustness of deep neural network

Publications (2)

Publication Number Publication Date
CN112784915A (en) 2021-05-11
CN112784915B (en) 2024-03-22

Family

ID=75760108

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110133058.3A Active CN112784915B (en) 2021-01-29 2021-01-29 Image classification method for optimizing decision boundary to enhance robustness of deep neural network

Country Status (1)

Country Link
CN (1) CN112784915B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469339B (en) * 2021-06-30 2023-09-22 山东大学 Automatic driving neural network robustness verification method and system based on dimension reduction

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107463951A (en) * 2017-07-19 2017-12-12 清华大学 A kind of method and device for improving deep learning model robustness
CN110569916A (en) * 2019-09-16 2019-12-13 电子科技大学 Confrontation sample defense system and method for artificial intelligence classification
CN111178504A (en) * 2019-12-17 2020-05-19 西安电子科技大学 Information processing method and system of robust compression model based on deep neural network
CN111753881A (en) * 2020-05-28 2020-10-09 浙江工业大学 Defense method for quantitatively identifying anti-attack based on concept sensitivity
CN111950626A (en) * 2020-08-10 2020-11-17 上海交通大学 EM-based image classification deep neural network model robustness evaluation method

Also Published As

Publication number Publication date
CN112784915A (en) 2021-05-11

Similar Documents

Publication Publication Date Title
CN108647583B (en) Face recognition algorithm training method based on multi-target learning
CN108563119B (en) Unmanned ship motion control method based on fuzzy support vector machine algorithm
CN107578061A (en) Based on the imbalanced data classification issue method for minimizing loss study
Al-Anazi et al. Support-vector regression for permeability prediction in a heterogeneous reservoir: a comparative study
CN105913450A (en) Tire rubber carbon black dispersity evaluation method and system based on neural network image processing
CN108492298B (en) Multispectral image change detection method based on generation countermeasure network
KR20180036709A (en) Media classification
US20040002928A1 (en) Pattern recognition method for reducing classification errors
CN110991257B (en) Polarized SAR oil spill detection method based on feature fusion and SVM
US20220129712A1 (en) Deep neural network hardener
CN111079790B (en) Image classification method for constructing class center
CN112784915B (en) Image classification method for optimizing decision boundary to enhance robustness of deep neural network
CN112215269A (en) Model construction method and device for target detection and neural network architecture
Yu et al. A white-box testing for deep neural networks based on neuron coverage
CN113569726B (en) Pedestrian detection method combining automatic data amplification and loss function search
CN114549909A (en) Pseudo label remote sensing image scene classification method based on self-adaptive threshold
CN109558803A (en) SAR target discrimination method based on convolutional neural networks Yu NP criterion
CN111639688A (en) Local interpretation method of Internet of things intelligent model based on linear kernel SVM
CN112465253B (en) Method and device for predicting links in urban road network
CN114998731A (en) Intelligent terminal navigation scene perception identification method
Liu et al. A united classification system of X-ray image based on fuzzy rule and neural networks
Xiufeng et al. Fusion of the targets of AIS and radar based on a stacked auto-encoder
AU2021101713A4 (en) Remote sensing image building recognition method based on activated representational substitution
CN118246483A (en) Neural network model prediction method based on reasonable granularity criterion and fuzzy rule
Van Hauwermeiren et al. Sensor characterization for supporting de-noising and auto-calibration of sensor data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant