CN115081618A - Method and device for improving robustness of deep neural network model - Google Patents

Method and device for improving robustness of deep neural network model

Info

Publication number
CN115081618A
Authority
CN
China
Prior art keywords
unit
layer
neural network
deep neural
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210687161.7A
Other languages
Chinese (zh)
Inventor
刘祥龙 (Liu Xianglong)
张崇智 (Zhang Chongzhi)
刘艾杉 (Liu Aishan)
徐一涛 (Xu Yitao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202210687161.7A
Publication of CN115081618A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses a method and a device for improving the robustness of a deep neural network model. The method first introduces a neural unit sensitivity index that measures how differently each neural unit in the deep neural network model behaves on an original sample versus the corresponding adversarial sample; the sensitive units in the deep neural network model are then identified from this index; finally, the sensitive units of each preset layer are collected into sensitive unit sets and, with the pair set of original and adversarial samples as the training set, at least one preset layer of the model to be detected is trained. The method can thus effectively locate the weaknesses of a deep neural network model, repair them in a targeted way, and improve the model's robustness to adversarial samples.

Description

Method and device for improving robustness of deep neural network model
This application is a divisional application of the invention application No. 201911421395.1, entitled "Method and device for improving robustness of deep neural network model", filed on December 31, 2019.
Technical Field
The invention relates to a method for improving the robustness of a deep neural network model, and to a corresponding device, and belongs to the technical field of machine learning.
Background
A deep neural network is a multi-layer neural network structure in which each layer contains multiple neural units. In recent years, deep neural networks have shown excellent performance in many fields such as computer vision, speech recognition, and natural language processing. Despite this success, adversarial samples, inputs containing small perturbations imperceptible to humans that nevertheless mislead the model into misclassification, have attracted extensive attention. In fields where deep neural networks are widely deployed, such as autonomous driving and face recognition, adversarial samples pose potential security threats that can cause property loss and even casualties.
To avoid the potential threats adversarial samples pose to real-world applications, many methods for improving the adversarial robustness of deep neural networks have been proposed. These methods can be classified into adversarial training, input transformation, special model architecture design, and adversarial sample detection.
From another perspective, the deep neural network is treated as a "black box" model because of its complex structure and the large number of nonlinear operations involved, and adversarial samples are therefore of great value for understanding its internal behavior. Understanding adversarial samples reveals the defects and weaknesses of a model, which helps us understand and train robust deep neural networks. Experts in the field have proposed several model robustness analysis methods based on adversarial perturbations: some use adversarial samples to understand the internal representations of the deep neural network, while others analyze the vulnerability of a model by measuring how much each region of an input image influences the model when it faces an adversarial sample.
From a higher-level point of view, the robustness of a model to noise can be seen as a global insensitivity property. A deep neural network can reduce its performance degradation on adversarial samples by learning representations that are insensitive to them.
Disclosure of Invention
The invention aims to provide a method for improving the robustness of a deep neural network model.
Another technical problem to be solved by the present invention is to provide a device for improving robustness of a deep neural network model.
In order to achieve the purpose, the invention adopts the following technical scheme:
according to a first aspect of the embodiments of the present invention, there is provided a method for improving robustness of a deep neural network model, including the following steps:
constructing a pair set; the pair set is composed of original samples and their corresponding adversarial samples;
selecting at least one preset layer from all layers of the model to be detected;
for each preset layer, calculating layer by layer the neural unit sensitivity of each neural unit on the pair set;
obtaining a sensitive unit set of each preset layer according to the neural unit sensitivity of each neural unit in the preset layer;
and taking the pair set as a training set, and training each preset layer according to its sensitive unit set.
Preferably, constructing the pair set specifically includes:
obtaining original samples;
attacking the model to be detected on each original sample with the gradient-based PGD white-box attack method to generate the corresponding adversarial sample;
forming a pair from each original sample and its unique corresponding adversarial sample;
all pairs together constituting the pair set.
Preferably, the PGD white-box attack method is:

$$x^{t+1} = \Pi_{x+S}\left(x^t + \alpha \cdot \mathrm{sign}\left(\nabla_x \mathcal{L}(\theta, x^t, y)\right)\right)$$

where sign(·) denotes the sign function, x the original sample, y the class label of the original sample, θ the current model parameters, α the step size of a single iteration, Π_{x+S}(·) the projection function, and ∇_x 𝓛(θ, x, y) the gradient of the loss function 𝓛 with respect to x.
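For illustration, the iteration above can be sketched in PyTorch roughly as follows. This is a minimal sketch under assumed conditions (a classification model, cross-entropy as the loss 𝓛, and an L∞ ball of radius eps as the perturbation set S); the function name pgd_attack and the hyperparameter values are illustrative, not part of the patent.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """PGD iteration: x_{t+1} = Proj_{x+S}(x_t + alpha * sign(grad_x L(theta, x_t, y)))."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)        # L(theta, x_t, y)
        grad = torch.autograd.grad(loss, x_adv)[0]     # gradient of the loss w.r.t. x
        x_adv = x_adv.detach() + alpha * grad.sign()   # single-iteration step with sign(.)
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)  # projection onto x + S (L-inf ball)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)           # keep inputs in a valid range
    return x_adv.detach()
```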
Preferably, the neural unit sensitivity is calculated as:

$$\sigma\left(F_j^l; \hat{D}\right) = \frac{1}{N}\sum_{i=1}^{N}\left\|F_j^l(x_i) - F_j^l(x_i')\right\|$$

where σ(F_j^l; D̂) denotes the neural unit sensitivity of neural unit F_j^l on the pair set D̂, and F_j^l(x_i) and F_j^l(x_i') denote the outputs of neural unit F_j^l for the original sample x_i and the corresponding adversarial sample x_i', respectively.
Preferably, obtaining the sensitive unit set of each preset layer according to the neural unit sensitivity of each neural unit in the preset layer specifically includes:
for each preset layer of the model to be detected:
sorting the neural units in the current layer from high to low by neural unit sensitivity;
selecting the top k (k ≥ 1) neural units by neural unit sensitivity to form the sensitive unit set of the current layer:
$$\Omega_l = \mathrm{top}\text{-}k(F_l, \sigma)$$

where Ω_l denotes the sensitive unit set of the current layer l, top-k(·) selects the k neural units with the largest values of the neural unit sensitivity metric, F_l denotes the neural units of the current layer l, and σ denotes the neural unit sensitivities.
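A one-line sketch of this top-k selection, assuming the per-unit sensitivities of the current layer are stored in a tensor (for instance the output of the unit_sensitivity sketch above); the returned indices identify the sensitive unit set Ω_l.

```python
import torch

def top_k_sensitive_units(sensitivity: torch.Tensor, k: int) -> torch.Tensor:
    """Omega_l = top-k(F_l, sigma): indices of the k units with the largest sensitivity."""
    return torch.topk(sensitivity, k).indices
```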
Preferably, taking the pair set as a training set and training each preset layer according to its sensitive unit set specifically includes:
constructing an adversarial training loss function and a stabilization loss function;
obtaining an overall optimization loss function from the adversarial training loss function and the stabilization loss function;
and, taking the pair set as a training set, training each preset layer with the overall optimization loss function.
Preferably, the adversarial training loss function is:

$$\mathcal{L}_{adv} = \max_{\|x' - x\| < \epsilon} \mathcal{L}_{CE}(\theta, x', y)$$

where 𝓛_CE denotes the cross-entropy loss function, x the original sample, x' the adversarial sample, y the class label of the original sample, and θ the current model parameters.
Preferably, the stabilization loss function is:

$$\mathcal{L}_{stab} = \sum_{l \in S}\sum_{F_j^l \in \Omega_l}\left\|F_j^l(x) - F_j^l(x')\right\|$$

where S denotes the set of preset deep neural network layer numbers, Ω_l denotes the sensitive unit set of the l-th layer of the model to be detected, and F_j^l(x) and F_j^l(x') denote the outputs of neural unit F_j^l for the original sample x and the corresponding adversarial sample x', respectively.
Preferably, the overall optimization loss function is:

$$\mathcal{L} = \mathcal{L}_{adv} + \lambda \mathcal{L}_{stab}$$

where λ is the loss function coefficient.
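Putting the two terms together, a hedged sketch of the overall optimization loss might look as follows. It assumes the adversarial logits and the sensitive-unit activations for a batch of pairs have already been collected (one way to collect them is sketched in the detailed description below), and the helper name overall_loss is illustrative.

```python
import torch
import torch.nn.functional as F

def overall_loss(logits_adv, y, sensitive_feats, lam=1.0):
    """L = L_adv + lambda * L_stab.
    sensitive_feats: list of (clean, adv) activation pairs, each restricted to the
    sensitive units Omega_l of one preset layer l in S."""
    l_adv = F.cross_entropy(logits_adv, y)                         # adversarial training loss
    l_stab = sum((c - a).abs().sum() for c, a in sensitive_feats)  # stabilization loss
    return l_adv + lam * l_stab
```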
According to a second aspect of the embodiments of the present invention, there is provided an apparatus for improving robustness of a deep neural network model, including a processor and a memory, where the processor reads a computer program in the memory and is configured to perform the following operations:
constructing a pair set; the pair set is composed of original samples and their corresponding adversarial samples;
selecting at least one preset layer from all layers of the model to be detected;
for each preset layer, calculating layer by layer the neural unit sensitivity of each neural unit on the pair set;
obtaining a sensitive unit set of each preset layer according to the neural unit sensitivity of each neural unit in the preset layer;
and taking the pair set as a training set, and training each preset layer according to its sensitive unit set.
Compared with the prior art, the invention first proposes the neural unit sensitivity index, which measures how differently each neural unit in the deep neural network model behaves on an original sample versus the corresponding adversarial sample; it then identifies the sensitive units in the deep neural network model, which contribute substantially to the model's classification errors and thus constitute the weaknesses of the deep neural network; finally, the sensitive units form the sensitive unit sets and, taking the pair set as a training set, at least one preset layer of the model to be detected is trained. The invention can thus effectively find the weaknesses of a deep neural network model, repair them in a targeted way, and improve the robustness of the deep neural network model to adversarial samples.
Drawings
FIG. 1 is a flow chart of the overall method provided by the present invention;
FIG. 2 is a flow chart of achieving sensitive unit stabilization with the PGD white-box attack method according to the present invention;
FIG. 3 is a diagram of selecting a sensitive unit set according to the present invention;
FIG. 4 is a structural diagram of the apparatus provided by the present invention.
Detailed Description
The technical contents of the invention are described in detail below with reference to the accompanying drawings and specific embodiments.
In the field of machine learning, adversarial samples (adversarial examples) are input samples formed by deliberately adding subtle perturbations to data, causing the model to give an erroneous output with high confidence. On a neural network with human-level accuracy, such data points can be deliberately constructed by an optimization process so that the error rate approaches 100%, and the model's output at such an input point differs greatly from its output at nearby data points. In many cases the two inputs are so similar that a human observer cannot perceive the difference between the original sample and the adversarial sample, yet the network makes very different predictions.
The robustness of a deep neural network model (hereinafter, model) in the face of noise can be regarded as a globally insensitive behavior: under noise the loss remains small and the prediction stays consistent with the original one. The robustness of the deep neural network model under a noise setting is defined as follows:
$$\|x_i - x_j\| < \epsilon \;\Rightarrow\; \mathcal{L}(x_i) \approx \mathcal{L}(x_j)$$

where x_i and x_j are samples randomly chosen from the data set D, 𝓛 denotes the loss function, ||·|| is the metric measuring the distance between two samples, and ε is a very small value. The idea of this definition is: if two examples are very similar, their test losses should also be close.
This definition should apply equally to the original sample x ∈ D and the adversarial sample x' ∈ D'. In practice, however, adversarial samples can mislead a less robust classifier into making wrong predictions:

$$F(x') \neq y \quad \text{s.t.} \quad \|x - x'\| < \epsilon$$
Intuitively, when a model is insensitive to adversarial samples, the original sample x and the adversarial sample x' have similar representations in the hidden layers of the model, leading to similar final predictions. The invention therefore improves the adversarial robustness of the deep neural network model by attending to the representation deviation between an original sample and its corresponding adversarial sample in each hidden layer of the model.
As shown in fig. 1, an embodiment of the present invention provides a method for improving robustness of a deep neural network model, including the following steps:
101. Constructing a pair set; the pair set is composed of original samples and their corresponding adversarial samples;
Constructing the pair set specifically comprises the following steps:
1011. Obtaining original samples;
For the deep neural network model to be detected, an original sample data set D = {x_i | i = 1, ..., N} is given;
1012. Attacking the model to be detected on each original sample with the gradient-based PGD white-box attack method to generate the corresponding adversarial sample;
the PGD white box attack method comprises the following steps:
Figure BDA0003698411600000061
in the formula: sign (·) represents a symbolic function, x represents an original sample, y represents a class label of the original sample, θ represents a current model parameter, α represents a single iteration step value, Π x+S (. cndot.) represents a projection function,
Figure BDA0003698411600000062
representing loss function
Figure BDA0003698411600000063
For a gradient of x.
This yields the generated adversarial sample set D' = {x'_i | i = 1, ..., N}.
1013. Pairing each original sample with its unique corresponding adversarial sample;
1014. All pairs together constitute the pair set:

$$\hat{D} = \{(x_i, x_i') \mid i = 1, \ldots, N\}$$
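Steps 1011 to 1014 could be sketched as follows, reusing the hypothetical pgd_attack function from the earlier sketch; the DataLoader interface and the batching are assumptions.

```python
def build_pair_set(model, loader, eps=8/255, alpha=2/255, steps=10):
    """Builds the pair set D_hat = {(x_i, x'_i)}: one adversarial sample per original sample."""
    model.eval()
    pairs = []
    for x, y in loader:                  # original samples from D
        x_adv = pgd_attack(model, x, y, eps=eps, alpha=alpha, steps=steps)
        pairs.append((x, x_adv, y))      # keep the label for later training
    return pairs
```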
102. Selecting at least one preset layer from all layers of the model to be detected;
Experimental data show that training a small number of preset layers is more efficient than training all layers of the deep neural network model, while still achieving a good repair effect. The invention therefore improves the robustness of the whole deep neural network model by training only a small number of layers.
According to experimental data, in this embodiment the last five layers of the model to be detected are selected as the preset layers, which yields the best robustness improvement.
103. For each preset layer, calculating layer by layer the neural unit sensitivity of each neural unit on the pair set. The neural unit sensitivity is calculated as:

$$\sigma\left(F_j^l; \hat{D}\right) = \frac{1}{N}\sum_{i=1}^{N}\left\|F_j^l(x_i) - F_j^l(x_i')\right\| \quad (2)$$

where σ(F_j^l; D̂) denotes the neural unit sensitivity of neural unit F_j^l on the pair set D̂, and F_j^l(x_i) and F_j^l(x_i') denote the outputs of neural unit F_j^l for the original sample x_i and the corresponding adversarial sample x_i', respectively.
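One way to obtain the per-unit outputs F_j^l needed by formula (2) is to register forward hooks on the preset layers. The sketch below assumes a PyTorch model whose preset layers can be addressed by name and, for convolutional layers, treats one channel as one neural unit by averaging over spatial positions; this granularity is an assumption, since the patent does not fix what counts as a single unit.

```python
import torch

def layer_outputs(model, layer_names, x):
    """One forward pass; returns {layer_name: (batch, num_units) activations}."""
    acts, handles = {}, []
    for name, module in model.named_modules():
        if name in layer_names:
            handles.append(module.register_forward_hook(
                lambda m, inp, out, n=name: acts.__setitem__(n, out.detach())))
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    # Flatten to (batch, num_units): average the spatial dims of conv feature maps.
    return {n: a.flatten(2).mean(-1) if a.dim() > 2 else a for n, a in acts.items()}
```

Formula (2) then reduces to averaging the per-pair deviations of these outputs, for instance with the unit_sensitivity sketch shown earlier.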
104. Obtaining the sensitive unit set of each preset layer according to the neural unit sensitivity of each neural unit in the preset layer; specifically, this comprises the following steps:
1041. For each preset layer of the model to be detected:
1042. Sorting the neural units in the current layer from high to low by neural unit sensitivity;
1043. Selecting the top k (k ≥ 1) neural units by neural unit sensitivity to form the sensitive unit set of the current layer:

$$\Omega_l = \mathrm{top}\text{-}k(F_l, \sigma) \quad (3)$$

where Ω_l denotes the sensitive unit set of the current layer l, top-k(·) selects the k neural units with the largest values of the neural unit sensitivity metric, F_l denotes the neural units of the current layer l, and σ denotes the neural unit sensitivities.
105. Taking the pair set as a training set, training each preset layer according to its sensitive unit set; specifically, this comprises the following steps:
1051. Constructing an adversarial training loss function and a stabilization loss function;
The adversarial training loss function is:

$$\mathcal{L}_{adv} = \max_{\|x' - x\| < \epsilon} \mathcal{L}_{CE}(\theta, x', y) \quad (4)$$

where 𝓛_CE denotes the cross-entropy loss function, x the original sample, x' the adversarial sample, y the class label of the original sample, and θ the current model parameters.
The stabilization loss function is:

$$\mathcal{L}_{stab} = \sum_{l \in S}\sum_{F_j^l \in \Omega_l}\left\|F_j^l(x) - F_j^l(x')\right\| \quad (5)$$

where S denotes the set of preset deep neural network layer numbers, Ω_l denotes the sensitive unit set of the l-th layer of the model to be detected, and F_j^l(x) and F_j^l(x') denote the outputs of neural unit F_j^l for the original sample x and the corresponding adversarial sample x', respectively. In this embodiment S comprises the last five layers.
1052. Obtaining the overall optimization loss function from the adversarial training loss function and the stabilization loss function;
The overall optimization loss function is:

$$\mathcal{L} = \mathcal{L}_{adv} + \lambda \mathcal{L}_{stab} \quad (6)$$

where λ is the loss function coefficient.
1053. Taking the pair set as a training set, training each preset layer with the overall optimization loss function.
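A hedged sketch of one optimization step of 1053 is given below. It assumes the sensitive unit sets have been computed as per-layer channel indices, treats the clean activations as fixed stabilization targets, and uses plain SGD; the names training_step and omega are illustrative rather than the patent's terminology.

```python
import torch
import torch.nn.functional as F

def training_step(model, opt, x, x_adv, y, layer_names, omega, lam=1.0):
    """One step of minimizing L = L_adv + lambda * L_stab on a batch of pairs.
    omega: {layer_name: LongTensor of sensitive-unit (channel) indices Omega_l}."""
    acts, handles = {}, []
    for name, module in model.named_modules():
        if name in layer_names:
            handles.append(module.register_forward_hook(
                lambda m, inp, out, n=name: acts.__setitem__(n, out)))

    logits_adv = model(x_adv)          # adversarial pass (keeps gradients)
    acts_adv = dict(acts)
    acts.clear()
    with torch.no_grad():
        model(x)                       # clean pass: stabilization targets
    acts_clean = dict(acts)

    l_adv = F.cross_entropy(logits_adv, y)
    l_stab = sum((acts_clean[n][:, omega[n]] - acts_adv[n][:, omega[n]]).abs().sum()
                 for n in layer_names)
    loss = l_adv + lam * l_stab

    opt.zero_grad()
    loss.backward()
    opt.step()
    for h in handles:
        h.remove()
    return float(loss)
```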
The method first defines the neural unit sensitivity index for the performance difference of each neural unit in the model between an original sample and the corresponding adversarial sample, and then defines the neural units most sensitive to adversarial samples as sensitive units; these sensitive units contribute substantially to the model's classification errors and thus constitute the defects of the deep neural network. After analyzing the model defects, the model is trained with the sensitive unit stabilization method, which makes the internal representations of the sensitive units for adversarial samples similar to those for the original samples during training, so that the model learns representations insensitive to adversarial samples. The method can effectively find the weaknesses of a model, repair them in a targeted way, and improve the robustness of the deep neural network model to adversarial samples.
As shown in FIG. 2, implementing sensitive unit stabilization with the PGD white-box attack method comprises the following steps:
step 1, generating a countermeasure sample D ' ═ { x ' by using a PGD white box attack method ' i 1, N, and combined to obtain a set of even pairs
Figure BDA0003698411600000086
Step 2, calculating layer by layer the neural unit sensitivity of the deep neural network on the pair set; the neural unit sensitivity is calculated as in formula (2):

$$\sigma\left(F_j^l; \hat{D}\right) = \frac{1}{N}\sum_{i=1}^{N}\left\|F_j^l(x_i) - F_j^l(x_i')\right\| \quad (2)$$

where σ(F_j^l; D̂) denotes the neural unit sensitivity of neural unit F_j^l on the pair set D̂, and F_j^l(x_i) and F_j^l(x_i') denote the outputs of neural unit F_j^l for the original sample x_i and the corresponding adversarial sample x_i', respectively.
Step 3, judging whether the sensitivity is lower than a threshold value;
As shown in FIG. 3, for layers 1 to k of the deep neural network model, the sensitivities of the individual neural units of each layer are calculated. A sensitivity threshold is set, and for each layer it is determined whether the sensitivity of each neural unit falls below the threshold.
In FIG. 3, taking the second layer as an example, the dashed line in the table represents the threshold: the neural units below the dashed line have sensitivity below the threshold and are discarded, while the neural units above the dashed line have sensitivity greater than or equal to the threshold and are the sensitive units of the second layer.
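FIG. 3 thus describes a threshold-based variant of the selection in formula (3): instead of a fixed k, every unit whose sensitivity reaches the threshold is kept. A minimal sketch of that variant, with the threshold value as an assumed hyperparameter:

```python
import torch

def threshold_sensitive_units(sensitivity: torch.Tensor, tau: float) -> torch.Tensor:
    """Keep units with sensitivity >= tau; units below the dashed line are discarded."""
    return torch.nonzero(sensitivity >= tau).flatten()
```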
Step 4, obtaining a sensitivity result of each neural unit of the deep neural network, and selecting the sensitive units layer by layer;
Step 5, selecting several layers for targeted repair with the sensitive unit stabilization method;
After the sensitive units of layers 1 to k of the deep neural network model have been selected, several of these layers are chosen as the layers to be repaired (i.e., the preset layers in the above embodiment).
The pair set D̂ = {(x_i, x'_i) | i = 1, ..., N} is taken as the training set, and the repaired layers are trained with the sensitive unit stabilization method. The sensitive unit stabilization method is realized with the overall optimization loss function of formula (6):

$$\mathcal{L} = \mathcal{L}_{adv} + \lambda \mathcal{L}_{stab}$$

where λ is the loss function coefficient, 𝓛 is the overall optimization loss function, and 𝓛_stab is the stabilization loss function.
Furthermore, the invention also provides a device for improving the robustness of the deep neural network model. As shown in fig. 4, the apparatus includes a processor and a memory, and may further include a communication component, a sensor component, a power component, a multimedia component, and an input/output interface according to actual needs. The memory, the communication component, the sensor component, the power supply component, the multimedia component and the input/output interface are all connected with the processor. As mentioned above, the memory in the node device may be Static Random Access Memory (SRAM), Electrically Erasable Programmable Read Only Memory (EEPROM), Erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), magnetic memory, flash memory, etc., and the processor may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processing (DSP) chip, etc. Other communication components, sensor components, power components, multimedia components, etc. may be implemented using common components found in existing smartphones and are not specifically described herein.
In another aspect, in the apparatus for improving robustness of a deep neural network model, the processor reads the computer program in the memory and is configured to: construct a pair set, the pair set being composed of original samples and their corresponding adversarial samples; calculate layer by layer, for the model to be detected, the neural unit sensitivity of each neural unit on the pair set; select the sensitive unit set of each layer according to the neural unit sensitivity of each neural unit; and combine the per-layer sensitive unit sets into the sensitive unit set of the model to be detected, and train at least one preset layer of the model to be detected.
Compared with the prior art, the invention first proposes the neural unit sensitivity index, which measures how differently each neural unit in the model behaves on an original sample versus the corresponding adversarial sample; it then defines the neural units most sensitive to adversarial samples as sensitive units, which contribute substantially to the model's classification errors and constitute the weaknesses of the deep neural network; finally, the invention provides a method for improving model robustness, sensitive unit stabilization, which makes the internal representations of adversarial samples similar to those of the original samples during training, so that the model learns representations insensitive to adversarial samples. The invention can effectively find the weaknesses of a model, repair them in a targeted way, and improve the robustness of the deep neural network model to adversarial samples.
The method and the device for improving the robustness of the deep neural network model provided by the invention have been explained in detail above. It will be apparent to those skilled in the art that any obvious modification made to them without departing from the spirit of the invention will infringe the patent right of the invention and incur the corresponding legal liability.

Claims (10)

1. A method for improving robustness of a deep neural network model is characterized by comprising the following steps:
constructing a pair set; the pair set is composed of original samples and their corresponding adversarial samples;
selecting at least one preset layer from all layers of the model to be detected;
for each preset layer, calculating layer by layer the neural unit sensitivity of each neural unit on the pair set;
obtaining a sensitive unit set of each preset layer according to the neural unit sensitivity of each neural unit in the preset layer;
and taking the pair set as a training set, and training each preset layer according to its sensitive unit set.
2. The method for improving robustness of a deep neural network model as claimed in claim 1, wherein constructing the pair set specifically comprises:
obtaining original samples;
attacking the model to be detected on each original sample with the gradient-based PGD white-box attack method to generate the corresponding adversarial sample;
forming a pair from each original sample and its unique corresponding adversarial sample;
all pairs together constituting the pair set.
3. The method for improving robustness of a deep neural network model as claimed in claim 2, wherein the PGD white-box attack method is:

$$x^{t+1} = \Pi_{x+S}\left(x^t + \alpha \cdot \mathrm{sign}\left(\nabla_x \mathcal{L}(\theta, x^t, y)\right)\right)$$

where sign(·) denotes the sign function, x the original sample, y the class label of the original sample, θ the current model parameters, α the step size of a single iteration, Π_{x+S}(·) the projection function, and ∇_x 𝓛(θ, x, y) the gradient of the loss function 𝓛 with respect to x.
4. The method for improving robustness of a deep neural network model as claimed in claim 1, wherein the neural unit sensitivity is calculated as:

$$\sigma\left(F_j^l; \hat{D}\right) = \frac{1}{N}\sum_{i=1}^{N}\left\|F_j^l(x_i) - F_j^l(x_i')\right\|$$

where σ(F_j^l; D̂) denotes the neural unit sensitivity of neural unit F_j^l on the pair set D̂, and F_j^l(x_i) and F_j^l(x_i') denote the outputs of neural unit F_j^l for the original sample x_i and the corresponding adversarial sample x_i', respectively.
5. The method for improving robustness of a deep neural network model according to claim 1, wherein obtaining the sensitive unit set of each preset layer according to the neural unit sensitivity of each neural unit in the preset layer specifically comprises:
for each preset layer of the model to be detected:
sorting the neural units in the current layer from high to low by neural unit sensitivity;
selecting the top k (k ≥ 1) neural units by neural unit sensitivity to form the sensitive unit set of the current layer:

$$\Omega_l = \mathrm{top}\text{-}k(F_l, \sigma)$$

where Ω_l denotes the sensitive unit set of the current layer l, top-k(·) selects the k neural units with the largest values of the neural unit sensitivity metric, F_l denotes the neural units of the current layer l, and σ denotes the neural unit sensitivities.
6. The method for improving robustness of a deep neural network model according to claim 1, wherein taking the pair set as a training set and training each preset layer according to its sensitive unit set specifically comprises:
constructing an adversarial training loss function and a stabilization loss function;
obtaining an overall optimization loss function from the adversarial training loss function and the stabilization loss function;
and, taking the pair set as a training set, training each preset layer with the overall optimization loss function.
7. The method for improving robustness of a deep neural network model of claim 6, wherein the adversarial training loss function is:

$$\mathcal{L}_{adv} = \max_{\|x' - x\| < \epsilon} \mathcal{L}_{CE}(\theta, x', y)$$

where 𝓛_CE denotes the cross-entropy loss function, x the original sample, x' the adversarial sample, y the class label of the original sample, and θ the current model parameters.
8. The method for improving robustness of a deep neural network model of claim 7, wherein the stabilization loss function is:

$$\mathcal{L}_{stab} = \sum_{l \in S}\sum_{F_j^l \in \Omega_l}\left\|F_j^l(x) - F_j^l(x')\right\|$$

where S denotes the set of preset deep neural network layer numbers, Ω_l denotes the sensitive unit set of the l-th layer of the model to be detected, and F_j^l(x) and F_j^l(x') denote the outputs of neural unit F_j^l for the original sample x and the corresponding adversarial sample x', respectively.
9. The method for improving robustness of a deep neural network model of claim 8, wherein the overall optimization loss function is:

$$\mathcal{L} = \mathcal{L}_{adv} + \lambda \mathcal{L}_{stab}$$

where λ is the loss function coefficient.
10. An apparatus for improving robustness of a deep neural network model, comprising a processor and a memory, the processor reading a computer program in the memory for performing the following operations:
constructing a pair set; the pair set is composed of original samples and their corresponding adversarial samples;
selecting at least one preset layer from all layers of the model to be detected;
for each preset layer, calculating layer by layer the neural unit sensitivity of each neural unit on the pair set;
obtaining a sensitive unit set of each preset layer according to the neural unit sensitivity of each neural unit in the preset layer;
and taking the pair set as a training set, and training each preset layer according to its sensitive unit set.
CN202210687161.7A 2019-12-31 2019-12-31 Method and device for improving robustness of deep neural network model Pending CN115081618A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210687161.7A CN115081618A (en) 2019-12-31 2019-12-31 Method and device for improving robustness of deep neural network model

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911421395.1A CN111210018A (en) 2019-12-31 2019-12-31 Method and device for improving robustness of deep neural network model
CN202210687161.7A CN115081618A (en) 2019-12-31 2019-12-31 Method and device for improving robustness of deep neural network model

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201911421395.1A Division CN111210018A (en) 2019-12-31 2019-12-31 Method and device for improving robustness of deep neural network model

Publications (1)

Publication Number Publication Date
CN115081618A 2022-09-20

Family

ID=70786482

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201911421395.1A Pending CN111210018A (en) 2019-12-31 2019-12-31 Method and device for improving robustness of deep neural network model
CN202210687161.7A Pending CN115081618A (en) 2019-12-31 2019-12-31 Method and device for improving robustness of deep neural network model

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201911421395.1A Pending CN111210018A (en) 2019-12-31 2019-12-31 Method and device for improving robustness of deep neural network model

Country Status (1)

Country Link
CN (2) CN111210018A (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112364885B (en) * 2020-10-12 2022-10-11 浙江大学 Confrontation sample defense method based on interpretability of deep neural network model
CN112100628B (en) * 2020-11-16 2021-02-05 支付宝(杭州)信息技术有限公司 Method and device for protecting safety of neural network model
CN112882382B (en) * 2021-01-11 2022-03-29 大连理工大学 Geometric method for evaluating robustness of classified deep neural network

Also Published As

Publication number Publication date
CN111210018A (en) 2020-05-29

Similar Documents

Publication Publication Date Title
CN108615048B (en) Defense method for image classifier adversity attack based on disturbance evolution
CN115081618A (en) Method and device for improving robustness of deep neural network model
CN111754519B (en) Class activation mapping-based countermeasure method
CN112396129B (en) Challenge sample detection method and universal challenge attack defense system
CN113297572B (en) Deep learning sample-level anti-attack defense method and device based on neuron activation mode
CN111753881A (en) Defense method for quantitatively identifying anti-attack based on concept sensitivity
CN110659486A (en) System and method for detecting malicious files using two-level file classification
Park et al. Host-based intrusion detection model using siamese network
CN113378988B (en) Particle swarm algorithm-based robustness enhancement method and device for deep learning system
CN112329837B (en) Countermeasure sample detection method and device, electronic equipment and medium
CN112633280A (en) Countermeasure sample generation method and system
CN111783853A (en) Interpretability-based method for detecting and recovering neural network confrontation sample
CN112926661A (en) Method for enhancing image classification robustness
Sivasangari et al. SQL injection attack detection using machine learning algorithm
CN117201122A (en) Unsupervised attribute network anomaly detection method and system based on view level graph comparison learning
Zhao et al. Natural backdoor attacks on deep neural networks via raindrops
Ding et al. Towards backdoor attack on deep learning based time series classification
CN113343123A (en) Training method and detection method for generating confrontation multiple relation graph network
CN116994044A (en) Construction method of image anomaly detection model based on mask multi-mode generation countermeasure network
CN115879119A (en) Robust visual Transformer visual perception method and device for resisting general patch attack
CN113378985A (en) Countermeasure sample detection method and device based on layer-by-layer correlation propagation
CN113822442A (en) Method and system for generating countermeasure sample
CN111209567B (en) Method and device for judging perceptibility of improving robustness of detection model
Pavlitskaya et al. Measuring overfitting in convolutional neural networks using adversarial perturbations and label noise
CN113392901A (en) Confrontation sample detection method based on deep learning model neural pathway activation characteristics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination