WO2023067669A1 - Learning device, learning method, and learning program - Google Patents

Learning device, learning method, and learning program Download PDF

Info

Publication number
WO2023067669A1
WO2023067669A1 PCT/JP2021/038503
Authority
WO
WIPO (PCT)
Prior art keywords
learning
model
adversarial
adversarial example
data
Prior art date
Application number
PCT/JP2021/038503
Other languages
French (fr)
Japanese (ja)
Inventor
智也 山下
真徳 山田
Original Assignee
Nippon Telegraph and Telephone Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corporation
Priority to PCT/JP2021/038503 priority Critical patent/WO2023067669A1/en
Priority to JP2023553921A priority patent/JPWO2023067669A1/ja
Publication of WO2023067669A1 publication Critical patent/WO2023067669A1/en

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning


Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

A learning device (10) determines an identification difficulty of a model for an Adversarial Example. The higher the determined identification difficulty of the model for the Adversarial Example, the larger the weight the learning device gives, in the loss function used by MART (Misclassification Aware adveRsarial Training) to train the model, to the value of the difference between the identification result for the Adversarial Example and the identification result for the data before the noise of the Adversarial Example is added, and the learning device trains the model accordingly. The learning device (10) then identifies input data using the trained model.

Description

LEARNING DEVICE, LEARNING METHOD, AND LEARNING PROGRAM
The present invention relates to a learning device, a learning method, and a learning program.
Conventionally, there are Adversarial Examples, in which noise is added to data so as to cause a model (for example, a classifier) to make an erroneous decision. MART (Misclassification Aware adveRsarial Training) is a method for training a model that is robust to such Adversarial Examples.
MART is a form of Adversarial Training: it trains the model according to a policy determined from the model's identification difficulty for the original input data (the data before the noise is added).
With MART, however, even when the model can correctly recognize the original input data, training it to correctly identify the label of the Adversarial Example can be too demanding a task for the model. As a result, a model trained by MART may not be able to classify Adversarial Examples accurately. An object of the present invention is therefore to train a model with high identification accuracy for Adversarial Examples.
To solve the above problem, the present invention comprises: a data acquisition unit that acquires training data for a model for outputting identification results of input data including Adversarial Examples; a difficulty determination unit that determines the identification difficulty of the model for an Adversarial Example; and a learning processing unit that trains the model such that, the higher the determined identification difficulty of the model for the Adversarial Example, the larger the weight given, in the loss function used for training the model by MART (Misclassification Aware adveRsarial Training), to the value of the difference between the identification result for the Adversarial Example and the identification result for the data before the noise of the Adversarial Example is added.
According to the present invention, a model with high identification accuracy for Adversarial Examples can be trained.
FIG. 1 is a diagram showing a configuration example of the learning device. FIG. 2 is a flowchart showing an example of a processing procedure of the learning device. FIG. 3 is a flowchart showing an example of a processing procedure of the learning device. FIG. 4 is a diagram showing an application example of the learning device. FIG. 5 is a diagram showing experimental results for models trained by the learning device. FIG. 6 is a diagram showing a configuration example of a computer that executes the learning program.
Hereinafter, an embodiment for carrying out the present invention will be described with reference to the drawings. The present invention is not limited to this embodiment.
[Overview of the learning device]
First, an overview of the learning device of this embodiment will be described with reference to FIG. 1. The learning device performs Adversarial Training of a model by a method that improves on the existing MART method.
Specifically, the learning device first determines the identification difficulty of the model being trained for an Adversarial Example. If that identification difficulty is low, the learning device trains the model with emphasis on identifying the correct label of the Adversarial Example, rather than on keeping the model's output for the Adversarial Example from deviating greatly from its output for the original input data (the data before the noise of the Adversarial Example is added).
On the other hand, if the identification difficulty of the model for the Adversarial Example is high, the learning device trains the model with emphasis on keeping the model's output for the Adversarial Example from deviating greatly from its output for the original input data, rather than on identifying the correct label of the Adversarial Example.
In this way, the learning device 10 determines the training policy of the model on the basis of its identification accuracy for Adversarial Examples (Robust Accuracy) and trains the model accordingly, so a model with high identification accuracy can be trained.
[Prerequisite knowledge]
Here, Adversarial Examples and Adversarial Training as used in this embodiment will be described.
[Adversarial Example]
An Adversarial Example is an attack technique that causes a model to make an erroneous decision by adding noise to the input data that is too small to be perceived by the human eye. The objective function of an Adversarial Example is shown in Equation (1).
$$x' = \mathop{\mathrm{arg\,max}}_{\|x' - x\| \le \epsilon} \; l(\theta, x', y) \qquad (1)$$
In Equation (1), l(θ, x, y) is the loss function of the model, θ is the parameters of the model, x is the input data of the model, and y is the identification result for the input data x output by the model. Typical algorithms for generating Adversarial Examples include FGSM (Fast Gradient Sign Method) and PGD (Projected Gradient Descent). For example, FGSM adds noise to the input data x according to Equation (2).
$$x' = x + \epsilon \cdot \mathrm{sign}\bigl(\nabla_x\, l(\theta, x, y)\bigr) \qquad (2)$$
In Equation (2), ε is the magnitude of the noise. FGSM adds noise to the input data x in the direction that increases the value of the model's loss function.
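By way of illustration, the following is a minimal sketch of the FGSM update of Equation (2) in PyTorch. The framework, the function name fgsm_example, the use of cross-entropy as the loss l(θ, x, y), and the clamping of inputs to [0, 1] are assumptions made for this sketch and are not specified in the present disclosure.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon):
    """Generate an FGSM Adversarial Example per Equation (2):
    x' = x + epsilon * sign(grad_x l(theta, x, y))."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)      # l(theta, x, y)
    loss.backward()
    # Step in the direction that increases the loss function of the model.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()        # keep inputs in a valid range
```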
PGD adds noise to the input data x according to the algorithm shown in Equation (3).
$$x^{(t+1)} = \Pi_{\|x' - x\| \le \epsilon}\Bigl(x^{(t)} + \alpha \cdot \mathrm{sign}\bigl(\nabla_x\, l(\theta, x^{(t)}, y)\bigr)\Bigr), \quad x^{(0)} = x \qquad (3)$$
where α is the step size and Π denotes projection onto the ε-ball around x.
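A corresponding sketch of the PGD iteration of Equation (3) follows; the step size α, the number of iterations, and the use of an L-infinity ε-ball are likewise assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def pgd_example(model, x, y, epsilon, alpha=0.01, steps=10):
    """Generate a PGD Adversarial Example per Equation (3): repeat a signed
    gradient step of size alpha and project back into the epsilon-ball around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)               # l(theta, x^(t), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)  # projection onto the epsilon-ball
            x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
```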
[Adversarial Training]
Adversarial Training is a training method for learning a model that is robust to Adversarial Examples. The objective function of Adversarial Training is shown in Equation (4).
$$\min_\theta \; \mathbb{E}_{(x,y)}\Bigl[\max_{\|x' - x\| \le \epsilon} l(\theta, x', y)\Bigr] \qquad (4)$$
In general Adversarial Training, the parameters θ of the model are updated according to the algorithm shown in Equation (5).
$$\theta \leftarrow \theta - \eta\, \nabla_\theta\, l(\theta, x', y) \qquad (5)$$
where η is the learning rate.
Here, l(θ, x, y) is a loss function, for which a CE (Cross Entropy) function or the like is mainly used. In Equation (5), x′ is an Adversarial Example of the input data x. This algorithm differs from general learning algorithms in that Adversarial Examples are used as the training data.
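The parameter update of Equation (5) can then be sketched as one training step; reusing the pgd_example sketch above as the attack and cross-entropy as the loss are again illustrative assumptions.

```python
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon):
    """One update of Equation (5): the loss is computed on an Adversarial
    Example x' generated from x, not on the clean input itself."""
    x_adv = pgd_example(model, x, y, epsilon)   # x' (any attack may be substituted)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)     # CE loss on the Adversarial Example
    loss.backward()
    optimizer.step()                            # theta <- theta - eta * grad_theta l
    return loss.item()
```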
[MART]
MART trains the model using the loss function shown in Equation (6).
$$l_{\mathrm{MART}}(\theta, x, y) = \mathrm{BCE}\bigl(p(x', \theta), y\bigr) + \lambda \cdot \mathrm{KL}\bigl(p(x, \theta)\,\|\,p(x', \theta)\bigr) \cdot \bigl(1 - p_y(x, \theta)\bigr) \qquad (6)$$
BCE in this loss function is the Boosted Cross Entropy function, given by Equation (7).
$$\mathrm{BCE}\bigl(p(x', \theta), y\bigr) = -\log p_y(x', \theta) - \log\Bigl(1 - \max_{k \ne y} p_k(x', \theta)\Bigr) \qquad (7)$$
The function denoted KL in Equation (6) is the Kullback-Leibler divergence, which is used as a measure of the distance between probability distributions.
By using the loss function in Equation (6), MART can determine the training policy of the model on the basis of the identification difficulty (1 - p_y(x, θ)) for the original input data.
When the model can correctly identify the original input data (that is, when the identification difficulty is low), training emphasizes the first term (the BCE term) of the loss function in Equation (6); in other words, training emphasizes having the model identify the Adversarial Example as the correct label.
Conversely, when the model cannot correctly identify the original input data (that is, when the identification difficulty is high), training emphasizes the second term (the KL term) of the loss function in Equation (6); in other words, training emphasizes keeping the model's output for the Adversarial Example from deviating greatly from its output for the original input data.
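Under the same illustrative assumptions (PyTorch, a batch of clean inputs x, Adversarial Examples x_adv, and integer labels y), the MART loss of Equations (6) and (7) can be sketched roughly as follows; the small constants added inside the logarithms are only for numerical stability.

```python
import torch
import torch.nn.functional as F

def mart_loss(model, x, x_adv, y, lam=5.0):
    """MART loss per Equations (6) and (7): a Boosted Cross Entropy term on the
    Adversarial Example plus a KL term weighted by (1 - p_y(x, theta))."""
    eps = 1e-12
    p_nat = F.softmax(model(x), dim=1)       # p(x, theta)
    p_adv = F.softmax(model(x_adv), dim=1)   # p(x', theta)
    p_adv_y = p_adv.gather(1, y.unsqueeze(1)).squeeze(1)   # p_y(x', theta)
    p_nat_y = p_nat.gather(1, y.unsqueeze(1)).squeeze(1)   # p_y(x, theta)

    # Equation (7): penalize a low correct-class probability and a high
    # probability on the most confusing wrong class.
    p_wrong_max = p_adv.scatter(1, y.unsqueeze(1), 0.0).max(dim=1).values
    bce = -torch.log(p_adv_y + eps) - torch.log(1.0 - p_wrong_max + eps)

    # KL(p(x, theta) || p(x', theta)), weighted by the identification
    # difficulty (1 - p_y(x, theta)) for the original input data.
    kl = torch.sum(p_nat * (torch.log(p_nat + eps) - torch.log(p_adv + eps)), dim=1)
    difficulty = 1.0 - p_nat_y

    return (bce + lam * kl * difficulty).mean()
```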
The loss function used in MART is designed on the intuition that, if the model cannot correctly recognize the original input data, training it to recognize the Adversarial Example as the correct label is too demanding a task for the model.
However, even when the model can correctly recognize the original input data, training it to recognize the correct label of the Adversarial Example may still be too demanding a task for the model.
Therefore, the learning device of this embodiment determines, on the basis of the model's identification difficulty for the Adversarial Example, whether training the model to identify the Adversarial Example as the correct label is too demanding for that model.
[Configuration example of the learning device]
A configuration example of the learning device 10 will be described with reference to FIG. 1. The learning device 10 includes, for example, an input unit 11, an output unit 12, a communication control unit 13, a storage unit 14, and a control unit 15.
The input unit 11 is an interface that receives input of various data. For example, the input unit 11 receives input of data used for the learning processing and prediction processing described later. The output unit 12 is an interface that outputs various data. For example, the output unit 12 outputs the label of data predicted by the control unit 15.
The communication control unit 13 is realized by a NIC (Network Interface Card) or the like, and controls communication between the control unit 15 and an external device such as a server via a network. For example, the communication control unit 13 controls communication between the control unit 15 and a management device (see FIG. 4) that manages the data to be learned.
The storage unit 14 is realized by a semiconductor memory device such as a RAM (Random Access Memory) or flash memory, or by a storage device such as a hard disk or an optical disk, and stores the parameters of the model trained by the learning processing described later.
The control unit 15 is realized using, for example, a CPU (Central Processing Unit) or the like, and executes a processing program stored in the storage unit 14. The control unit 15 thereby functions as an acquisition unit 15a, a learning unit 15b, and a prediction unit 15c, as illustrated in FIG. 1.
The acquisition unit 15a acquires data used for the learning processing and prediction processing described later via the input unit 11 or the communication control unit 13.
The learning unit 15b trains a model (Adversarial Training) for predicting the label of input data including Adversarial Examples. The learning unit 15b includes a difficulty determination unit 151 and a learning processing unit 152.
The difficulty determination unit 151 determines the identification difficulty of the model being trained for an Adversarial Example. For example, the difficulty determination unit 151 inputs a training Adversarial Example into the model being trained and obtains the probability p_y(x′, θ) that the model identifies the Adversarial Example as the correct label. The difficulty determination unit 151 then takes (1 - p_y(x′, θ)) as the identification difficulty of the model for the Adversarial Example.
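A minimal sketch of the quantity computed by the difficulty determination unit 151, assuming a PyTorch classifier and batched inputs:

```python
import torch
import torch.nn.functional as F

def identification_difficulty(model, x_adv, y):
    """Identification difficulty (1 - p_y(x', theta)): one minus the probability
    that the model assigns to the correct label of the Adversarial Example."""
    with torch.no_grad():
        p_adv = F.softmax(model(x_adv), dim=1)
        p_adv_y = p_adv.gather(1, y.unsqueeze(1)).squeeze(1)
    return 1.0 - p_adv_y
```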
The learning processing unit 152 determines the training policy of the model on the basis of the identification difficulty for the Adversarial Example determined by the difficulty determination unit 151, and trains the model.
For example, the higher the identification difficulty of the model for the Adversarial Example, the larger the weight the learning processing unit 152 gives, in the loss function used for training the model by MART, to the value of the difference between the identification result for the Adversarial Example and the identification result for the data before the noise of the Adversarial Example is added.
That is, the learning processing unit 152 trains the model with emphasis on identifying the Adversarial Example as the correct label, rather than on keeping the model's output for the Adversarial Example from deviating greatly from its output for the original input data.
Conversely, the lower the identification difficulty of the model for the Adversarial Example, the smaller the weight the learning processing unit 152 gives, in the loss function used for training the model by MART, to the value of the difference between the identification result for the Adversarial Example and the identification result for the data before the noise of the Adversarial Example is added.
That is, the learning processing unit 152 trains the model with emphasis on keeping the model's output for the Adversarial Example from deviating greatly from its output for the original input data, rather than on identifying the Adversarial Example as the correct label.
For example, the learning processing unit 152 trains the model using the loss function shown in Equation (8).
$$l(\theta, x, y) = \mathrm{BCE}\bigl(p(x', \theta), y\bigr) + \lambda \cdot \mathrm{KL}\bigl(p(x, \theta)\,\|\,p(x', \theta)\bigr) \cdot \bigl(1 - p_y(x', \theta)\bigr) \qquad (8)$$
The difference from the loss function used in MART (see Equation (6)) is that the identification-difficulty factor (1 - p_y(x, θ)) that determines the training policy of the model is replaced with the identification difficulty (1 - p_y(x′, θ)) for the Adversarial Example. Because the learning processing unit 152 thus uses the identification difficulty of the model being trained for the Adversarial Example when determining the training policy, a model with high identification accuracy for Adversarial Examples can be trained.
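A sketch of the loss of Equation (8), which differs from the mart_loss sketch above only in the factor that weights the KL term; the function name and framework remain assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def proposed_loss(model, x, x_adv, y, lam=5.0):
    """Loss of Equation (8): the KL term is weighted by the identification
    difficulty (1 - p_y(x', theta)) for the Adversarial Example instead of
    the difficulty (1 - p_y(x, theta)) for the original input data."""
    eps = 1e-12
    p_nat = F.softmax(model(x), dim=1)
    p_adv = F.softmax(model(x_adv), dim=1)
    p_adv_y = p_adv.gather(1, y.unsqueeze(1)).squeeze(1)

    p_wrong_max = p_adv.scatter(1, y.unsqueeze(1), 0.0).max(dim=1).values
    bce = -torch.log(p_adv_y + eps) - torch.log(1.0 - p_wrong_max + eps)

    kl = torch.sum(p_nat * (torch.log(p_nat + eps) - torch.log(p_adv + eps)), dim=1)
    difficulty = 1.0 - p_adv_y   # the only change relative to Equation (6)

    return (bce + lam * kl * difficulty).mean()
```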
The prediction unit 15c predicts (identifies) the label of input data using the model trained by the learning unit 15b. For example, the prediction unit 15c uses the trained model to calculate the probability of each label for newly acquired data and outputs the label with the highest probability. As a result, the learning device 10 can output the correct label even when, for example, the input data is an Adversarial Example.
[Learning processing]
Next, an example of the learning processing procedure performed by the learning device 10 will be described with reference to FIG. 2. The processing shown in FIG. 2 is started, for example, when an operation input instructing the start of the learning processing is received.
First, the acquisition unit 15a acquires training data including Adversarial Examples (S1). Next, the learning unit 15b trains a model representing the probability distribution of the labels of the input data, using the training data and the loss function (see Equation (8)) (S2). The learning unit 15b stores the parameters of the model trained in S2 in the storage unit 14.
[Prediction processing]
Next, an example of the processing in which the learning device 10 predicts the label of input data will be described with reference to FIG. 3. The processing shown in FIG. 3 is started, for example, when an operation input instructing the start of the prediction processing is received.
First, the acquisition unit 15a acquires the data whose label is to be predicted (S11). Next, the prediction unit 15c predicts the label of the data acquired in S11 using the model trained by the learning unit 15b (S12). For example, the prediction unit 15c uses the trained model to calculate p(x′) for the data x′ acquired in S11 and outputs the label with the highest probability. As a result, the learning device 10 can output the correct label even if, for example, the data x′ is an Adversarial Example.
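A minimal sketch of this prediction step (S12), assuming a trained PyTorch classifier:

```python
import torch
import torch.nn.functional as F

def predict_label(model, x):
    """Compute the class probabilities with the trained model and output the
    label with the highest probability."""
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)
    return probs.argmax(dim=1)
```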
[Application example of the learning device]
The learning device 10 described above may be applied to object recognition processing in images. An application example for this case will be described with reference to FIG. 4.
For example, the learning device 10 performs model training (Adversarial Training) using teacher data (training data) acquired from a data acquisition device and the loss function described above. Thereafter, when image data is acquired from the data acquisition device, the learning device 10 predicts the label of the acquired image data using the trained model. The learning device 10 then outputs an object recognition result based on the prediction result.
[Experiment]
Next, experimental results for a model trained by the learning device 10 will be described. In the experiment, a model was trained with MART and a model was trained with the learning device 10 on the CIFAR10 dataset, and the identification accuracy of the two models was compared. The model used in the experiment was ResNet18. The training parameters were set to match the settings in the MART paper (Non-Patent Document 1), and the hyperparameter λ took the values 1, 2, ..., 10 used in that paper.
FIG. 5 shows the evaluation results for the model trained by MART and the model trained by the learning device 10 (Propose). In FIG. 5, NatAcc is the accuracy of the model on the original input data, and RobAcc is the accuracy of the model on Adversarial Examples. The accuracy shown in FIG. 5 is the accuracy at the epoch with the highest accuracy for the Adversarial Examples.
As shown in FIG. 5, it was confirmed that Propose exceeds MART in RobAcc except when λ = 7. Furthermore, when the RobAcc of each epoch at λ = 7 was examined, it was confirmed that there was an epoch at which RobAcc swung considerably upward.
[System configuration, etc.]
The constituent elements of the devices shown in the drawings are functional and conceptual, and do not necessarily need to be physically configured as shown. That is, the specific form of distribution and integration of the devices is not limited to the illustrated form, and all or part of them can be distributed or integrated functionally or physically in arbitrary units according to various loads, usage conditions, and the like. Furthermore, all or any part of the processing functions performed by each device may be realized by a CPU and a program executed by the CPU, or as hardware using wired logic.
Of the processes described in the above embodiment, all or part of the processes described as being performed automatically can also be performed manually, and all or part of the processes described as being performed manually can also be performed automatically by known methods. In addition, the processing procedures, control procedures, specific names, and information including various data and parameters shown in the above description and drawings can be changed arbitrarily unless otherwise specified.
[Program]
The learning device 10 described above can be implemented by installing a program (learning program) on a desired computer as packaged software or online software. For example, by causing an information processing device to execute the above program, the information processing device can be made to function as the learning device 10. The information processing device here includes mobile communication terminals such as smartphones, mobile phones, and PHS (Personal Handyphone System) terminals, as well as terminals such as PDAs (Personal Digital Assistants).
FIG. 6 is a diagram showing an example of a computer that executes the learning program. The computer 1000 has, for example, a memory 1010 and a CPU 1020. The computer 1000 also has a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These units are connected by a bus 1080.
The memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM (Random Access Memory) 1012. The ROM 1011 stores a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1090. The disk drive interface 1040 is connected to a disk drive 1100. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100. The serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120. The video adapter 1060 is connected to, for example, a display 1130.
The hard disk drive 1090 stores, for example, an OS 1091, application programs 1092, program modules 1093, and program data 1094. That is, the program that defines each process executed by the learning device 10 is implemented as a program module 1093 in which computer-executable code is written. The program modules 1093 are stored, for example, in the hard disk drive 1090. For example, a program module 1093 for executing processing equivalent to the functional configuration of the learning device 10 is stored in the hard disk drive 1090. The hard disk drive 1090 may be replaced by an SSD (Solid State Drive).
The data used in the processing of the above-described embodiment are stored as program data 1094, for example, in the memory 1010 or the hard disk drive 1090. The CPU 1020 reads the program modules 1093 and the program data 1094 stored in the memory 1010 or the hard disk drive 1090 into the RAM 1012 as necessary and executes them.
The program modules 1093 and the program data 1094 are not limited to being stored in the hard disk drive 1090; for example, they may be stored in a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program modules 1093 and the program data 1094 may be stored in another computer connected via a network (LAN (Local Area Network), WAN (Wide Area Network), or the like) and read by the CPU 1020 from the other computer via the network interface 1070.
REFERENCE SIGNS LIST
10 learning device
11 input unit
12 output unit
13 communication control unit
14 storage unit
15 control unit
15a acquisition unit
15b learning unit
15c prediction unit
151 difficulty determination unit
152 learning processing unit

Claims (6)

  1. A learning device comprising:
     a data acquisition unit that acquires training data for a model for outputting identification results of input data including Adversarial Examples;
     a difficulty determination unit that determines an identification difficulty of the model for an Adversarial Example; and
     a learning processing unit that trains the model such that, the higher the determined identification difficulty of the model for the Adversarial Example, the larger the weight given, in a loss function used for training the model by MART (Misclassification Aware adveRsarial Training), to the value of the difference between the identification result for the Adversarial Example and the identification result for the data before the noise of the Adversarial Example is added.

  2. The learning device according to claim 1, wherein the learning processing unit trains the model such that, the lower the determined identification difficulty of the model for the Adversarial Example, the smaller the weight given, in the loss function, to the value of the difference between the identification result for the Adversarial Example and the identification result for the data before the noise of the Adversarial Example is added.

  3. The learning device according to claim 1, wherein the difficulty determination unit determines the identification difficulty of the model for the Adversarial Example by inputting the Adversarial Example into the model being trained.

  4. The learning device according to claim 1, further comprising an identification unit that identifies input data using the model on which the Adversarial Training has been performed.

  5. A learning method executed by a learning device, the method comprising:
     a step of acquiring training data for a model for outputting identification results of input data including Adversarial Examples;
     a step of determining an identification difficulty of the model for an Adversarial Example; and
     a step of training the model such that, the higher the determined identification difficulty of the model for the Adversarial Example, the larger the weight given, in a loss function used for training the model by MART (Misclassification Aware adveRsarial Training), to the value of the difference between the identification result for the Adversarial Example and the identification result for the data before the noise of the Adversarial Example is added.

  6. A learning program for causing a computer to execute:
     a step of acquiring training data for a model for outputting identification results of input data including Adversarial Examples;
     a step of determining an identification difficulty of the model for an Adversarial Example; and
     a step of training the model such that, the higher the determined identification difficulty of the model for the Adversarial Example, the larger the weight given, in a loss function used for training the model by MART (Misclassification Aware adveRsarial Training), to the value of the difference between the identification result for the Adversarial Example and the identification result for the data before the noise of the Adversarial Example is added.
PCT/JP2021/038503 2021-10-18 2021-10-18 Learning device, learning method, and learning program WO2023067669A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2021/038503 WO2023067669A1 (en) 2021-10-18 2021-10-18 Learning device, learning method, and learning program
JP2023553921A JPWO2023067669A1 (en) 2021-10-18 2021-10-18

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/038503 WO2023067669A1 (en) 2021-10-18 2021-10-18 Learning device, learning method, and learning program

Publications (1)

Publication Number Publication Date
WO2023067669A1 true WO2023067669A1 (en) 2023-04-27

Family

ID=86058898

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/038503 WO2023067669A1 (en) 2021-10-18 2021-10-18 Learning device, learning method, and learning program

Country Status (2)

Country Link
JP (1) JPWO2023067669A1 (en)
WO (1) WO2023067669A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255531A (en) * 2021-05-31 2021-08-13 腾讯科技(深圳)有限公司 Method and device for processing living body detection model, computer equipment and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113255531A (en) * 2021-05-31 2021-08-13 腾讯科技(深圳)有限公司 Method and device for processing living body detection model, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
* Wang, Yisen; Zou, Difan; Yi, Jinfeng; Bailey, James; Ma, Xingjun; Gu, Quanquan: "Improving Adversarial Robustness Requires Revisiting Misclassified Examples", ICLR 2020 Conference Blind Submission, 20 December 2019 (2019-12-20), pages 1-14, XP093058462 *

Also Published As

Publication number Publication date
JPWO2023067669A1 (en) 2023-04-27

Similar Documents

Publication Publication Date Title
US10909455B2 (en) Information processing apparatus using multi-layer neural network and method therefor
US20210287136A1 (en) Systems and methods for generating models for classifying imbalanced data
US20150170053A1 (en) Personalized machine learning models
US20180032917A1 (en) Hierarchical classifiers
JP6039768B1 (en) ADJUSTMENT DEVICE, ADJUSTMENT METHOD, AND ADJUSTMENT PROGRAM
WO2020090413A1 (en) Classification device, classification method, and classification program
US11941867B2 (en) Neural network training using the soft nearest neighbor loss
CN108470071B (en) Data processing method and device
JP6450032B2 (en) Creation device, creation method, and creation program
JPWO2019092931A1 (en) Discriminant model generator, discriminant model generation method and discriminant model generation program
JP2009282685A (en) Information processor, information processing method, and program
JP2012048624A (en) Learning device, method and program
WO2021174814A1 (en) Answer verification method and apparatus for crowdsourcing task, computer device, and storage medium
WO2023067669A1 (en) Learning device, learning method, and learning program
CN116340752A (en) Predictive analysis result-oriented data story generation method and system
Liu et al. Evolutionary Voting‐Based Extreme Learning Machines
US20230005122A1 (en) Image forgery detection via pixel-metadata consistency analysis
JP5633424B2 (en) Program and information processing system
US20230027309A1 (en) System and method for image de-identification to humans while remaining recognizable by machines
US11113569B2 (en) Information processing device, information processing method, and computer program product
WO2020040312A1 (en) Learning device, learning method, and prediction system
WO2023195120A1 (en) Training device, training method, and training program
WO2019221206A1 (en) Creation device, creation method, and program
JP2020181265A (en) Information processing device, system, information processing method, and program
KR20200103173A (en) A method for learning documents

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21961322

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023553921

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE