WO2023188241A1

WO2023188241A1 - Generation method, generation program, and information processing device

Info

Publication number: WO2023188241A1
Application number: PCT/JP2022/016443
Authority: WO
Inventors: 海斗岸; 郁也森川; 俊也清水
Original assignee: 富士通株式会社
Priority date: 2022-03-31
Filing date: 2022-03-31
Publication date: 2023-10-05

Abstract

The present invention can efficiently generate an adversarial sample with a specific perturbation size by comprising: a process of generating a plurality of pieces of candidate data on the basis of training data to be used for training a classification model; and a process of determining an adversarial sample from among the plurality of pieces of candidate data on the basis of the level of confidence that each of the plurality of pieces of candidate data will be classified into a class associated with the training data and a distance between each of the plurality of pieces of candidate data and the surface of a sphere having a radius constituted by a target perturbation size with the training data serving as the center thereof.

Description

Generation method, generation program, and information processing device

The present invention relates to a generation method, a generation program, and an information processing device.

In recent years, the use of machine learning has been rapidly expanding. On the other hand, security issues related to machine learning have been pointed out.

For example, an adversarial sample attack is known that causes misclassification in a machine learning model (classification model) that classifies data.

Performing machine learning for a classification model can be said to find boundaries (decision boundaries) in the feature space that can separate data points to be classified into different classes. In adversarial sample attacks, data points that slightly exceed the decision boundary are intentionally created to induce misclassification of the model.

Note that the attack data points created in the adversarial sample attack may also be referred to as adversarial samples.

FIG. 7 is a diagram illustrating an adversarial sample.

In this FIG. 7, symbol A is image data representing the number 7, and symbols B and C represent adversarial samples generated based on the image data indicated by symbol A.

The adversarial sample indicated by code B is an adversarial sample generated so that AI (Artificial Intelligence) incorrectly recognizes it as 3, and the perturbation size ε=0.1. The adversarial sample shown by code C is an adversarial sample generated so that the AI misrecognizes it as 2, and the perturbation size ε=0.2.

FIG. 8 is a diagram for explaining adversarial samples.

Adversarial samples are generated based on training data, that is, classified data points (hereinafter referred to as original points). The perturbation size, which is the distance between the original point and the adversarial sample, is denoted by the symbol ε.

Additionally, adversarial training for classification models is known as a defense method against adversarial sample attacks. In adversarial training, adversarial samples are added to the training data of the classification model, and the decision boundaries corresponding to the adversarial samples are updated.

Here, in adversarial training, by training the classification model with strong adversarial samples that can more easily fool the classification model, it is possible to make the classification model less likely to be fooled. Therefore, it is expected that if the adversarial samples used for training are strong, it will be possible to generate a classification model that is difficult to fool.

Also, by adjusting the perturbation size little by little during adversarial training, it is possible to make the classification model less susceptible to deception. For example, in adversarial training, by first training with training data with a small perturbation size and then training with a large perturbation size, it is possible to generate a classification model that is difficult to make mistakes even with large perturbations. Therefore, by making the perturbation sizes of adversarial samples used for training constant, it is possible to efficiently generate a classification model that is difficult to fool.

In the past, it has been known to generate strong adversarial samples using, for example, a regularization-based method.

In this method, candidate points for adversarial samples are generated based on the original points. Calculate the confidence that the generated candidate point is classified into a class other than the target class. The target class is a class whose classification destination is to be changed from the original class using an adversarial sample. Also, the distance between the candidate point and the original point is calculated. Then, the adversarial sample candidates are updated so that the sum (a+b) of the confidence (a) of the candidate point and the distance (b) between the candidate point and the original point is minimized, and the candidate closest to the original point is selected. , is determined as the adversarial sample to generate.

JP 2017-49996 Publication JP2020-112967A International Publication No. 2020/230699 US Patent No. 10296813 US Patent Application Publication No. 2020/0065664

In order to efficiently implement adversarial training, we would like to use strong adversarial samples with perturbation sizes limited to a certain value as training data.

However, in the adversarial sample generation method using the conventional regularization-based method described above, the sum (a + b) of the confidence of the candidate point (a) and the distance (b) between the candidate point and the original point is minimized. The adversarial sample candidates are updated as follows. As a result, the value of the distance (b) between the candidate point and the original point is also minimized, and the perturbation size changes. Therefore, it is difficult to generate adversarial samples with a constant perturbation size, and it is difficult to use them for adversarial training using a constant perturbation size.

Note that by extracting only the adversarial samples with a specific perturbation size from among the adversarial samples generated by the adversarial sample generation method using the conventional regularization-based method described above, target samples can be obtained. However, this method requires a huge amount of original points (original training data) and is unrealistic.

In one aspect, the present invention aims to enable efficient generation of adversarial samples having a specific perturbation size.

Therefore, this generation method is a generation method for generating adversarial samples used for training a class classification model, and includes a process for generating a plurality of candidate data based on training data used for training the class classification model. , the confidence that each of the plurality of candidate data is classified into the class associated with the training data, and a sphere whose radius is the target perturbation size centered on the training data from each of the plurality of candidate data. and determining a hostile sample from among the plurality of candidate data based on the distance to the surface.

According to one embodiment, adversarial samples with a specific perturbation size can be efficiently generated.

1 is a diagram schematically showing the configuration of an information processing device as an example of an embodiment. FIG. 2 is a diagram for explaining processing by a hostile sample candidate updating unit of an information processing device as an example of an embodiment. 7 is a flowchart for explaining processing of a hostile sample generation unit of an information processing device as an example of an embodiment. FIG. 3 is a diagram illustrating the generation results of hostile samples by the hostile sample generation unit of the information processing device as an example of the embodiment in comparison with a conventional method. FIG. 3 is a diagram illustrating the generation results of hostile samples by the hostile sample generation unit of the information processing device as an example of the embodiment in comparison with a conventional method. 1 is a diagram illustrating a hardware configuration of an information processing device as an example of an embodiment. FIG. 3 is a diagram illustrating an adversarial sample. FIG. 2 is a diagram for explaining an adversarial sample.

Hereinafter, embodiments of the present generation method, generation program, and information processing device will be described with reference to the drawings. However, the embodiments shown below are merely illustrative, and there is no intention to exclude the application of various modifications and techniques not specified in the embodiments. That is, this embodiment can be modified and implemented in various ways without departing from the spirit thereof. Furthermore, each figure is not intended to include only the constituent elements shown in the figure, but may include other functions.

(A) Configuration FIG. 1 is a diagram schematically showing the configuration of an information processing device 1 as an example of an embodiment.

The information processing device 1 has a function as a training data generation unit 100 that generates training data used in adversarial training.

Additionally, the training data generation unit 100 has a function as an adversarial sample generation unit 101 that generates adversarial samples.

As shown in FIG. 1, the adversarial sample generation unit 101 has the functions of a candidate point generation unit 102, a confidence calculation unit 103, a distance calculation unit 104, and an adversarial sample candidate update unit 105.

The candidate point generation unit 102 generates candidate data that is a candidate for an adversarial sample based on training data used for training a class classification model (machine learning model). The training data has already been classified. Such classified training data may be referred to as original points. Further, candidate data that is a candidate for a hostile sample may be referred to as a candidate point. Furthermore, candidate data that is a candidate for a hostile sample may be referred to as hostile sample candidate data or hostile sample candidate points. A class classification model may simply be called a classification model.

The candidate point generation unit 102 may generate candidate points, for example, by copying the original point. That is, in the feature space, a point having the same coordinates as the original point may be used as a candidate point. Further, the candidate point generation unit 102 may generate candidate points by randomly moving the original point within a predetermined range in the feature space. That is, the candidate point generation unit 102 may generate candidate points by randomly moving the original point a little.

Information on candidate points generated by the candidate point generation unit 102 is stored in a predetermined storage area such as the memory 12 or the storage device 13 (see FIG. 6).

The certainty calculation unit 103 calculates the certainty that the candidate point is classified into a class other than the target class. The confidence level corresponds to the confidence level of classification into the class associated with the training data. The target class is a class whose classification destination is to be changed from the original class using an adversarial sample. For example, in FIG. 8, the class indicated by an x corresponds to the target class for the original point indicated by a circle.

Confidence is a predicted value of the degree of certainty of classification. The confidence that a candidate point is classified into a class other than the target class can simply be called the confidence of the candidate point. Further, the confidence level of a candidate point may be expressed by the symbol s.

For example, if the target class is t for the confidence level z _i of each class output by the classification model, the confidence level s that a candidate point is classified into a class other than the target class can be calculated using the following formula.
s = max(z _i :i≠t)-z _t
Note that the reliability of the candidate point may be determined based on, for example, the value of a softmax function or the value of logit, which is the output value of a neuron in the output layer of a neural network model.

The confidence calculation unit 103 stores the calculated confidence s of the candidate point in a predetermined storage area such as the memory 12 or the storage device 13.

The distance calculation unit 104 calculates the distance (shortest distance) between the candidate point and the spherical surface whose center is the original point and whose radius is the perturbation size ε ₀ . The perturbation size ε ₀ is the perturbation size of the adversarial sample generated by the adversarial sample generation unit 101, and may be set to an arbitrary value by the user or the like. The perturbation size ε ₀ may be referred to as the target perturbation size ε ₀ . Furthermore, a spherical surface whose center is the original point and whose radius is the target perturbation size ε ₀ may be referred to as the target spherical surface.

The distance calculation unit 104 calculates, for example, the distance between the candidate point and the target spherical surface. The distance between the candidate point and the target sphere surface may be, for example, a Euclidean distance. The distance between the candidate point and the target sphere surface may be expressed by the symbol d.

The distance calculation unit 104 stores the calculated distance d between the candidate point and the target sphere surface in a predetermined storage area such as the memory 12 or the storage device 13.

The adversarial sample candidate updating unit 105 updates the candidate point ( multiple candidate points are generated by updating the adversarial sample candidate points). The updating of a candidate point by the adversarial sample candidate updating unit 105 may also be referred to as moving the candidate point. The adversarial sample candidate updating unit 105 generates a plurality of candidate data (candidate points) based on training data (original points).

FIG. 2 is a diagram for explaining processing by the hostile sample candidate updating unit 105 of the information processing device 1 as an example of the embodiment.

The adversarial sample candidate updating unit 105 updates the candidate points so that the value of (cs + d) is minimized. Note that c is a coefficient indicating which of the confidence level s of the candidate point and the distance d between the candidate point and the target sphere surface is given higher priority.

In the example shown in FIG. 2, the gradient vector of the confidence level s ₁ of the candidate point P1 is represented by the symbol S ₁ . Further, the gradient vector of the distance d ₁ between the candidate point P 1 and the surface of the target sphere is represented by the symbol D ₁ . The candidate point P1 is generated by the candidate point generation unit 102 copying the original point P0. In FIG. 2, the original point P0 and the candidate point P1 have the same coordinates in the feature amount space.

Furthermore, in the example shown in FIG. 2, the gradient vector with the confidence level s ₁ of the candidate point P1 is represented by the symbol S ₁ . Further, the gradient vector of the distance d ₁ between the candidate point P 1 and the surface of the target sphere is represented by the symbol D ₁ .

The adversarial sample candidate updating unit 105 generates a candidate point P2 by updating the candidate point P1 so that the value of (cs ₁ + d ₁ ) is minimized. That is, the hostile sample candidate updating unit 105 updates the candidate point P1 to generate a candidate point P2.

In the example shown in FIG. 2, the vector corresponding to the confidence level _s2 of the candidate point P2 is represented by the symbol _S2 . Further, a vector corresponding to the distance d ₂ between the candidate point P2 and the surface of the target sphere is represented by the symbol D ₂ .

The adversarial sample candidate updating unit 105 generates a candidate point P3 by updating the candidate point P2 so that the value of (cs ₂ + d ₂ ) is minimized. That is, the hostile sample candidate updating unit 105 updates the candidate point P2 to generate a candidate point P3.

Furthermore, in the example shown in FIG. 2, the gradient vector with the confidence level _s3 of the candidate point P3 is represented by the symbol _S3 . Further, the gradient vector of the distance d ₃ between the candidate point P3 and the surface of the target sphere is represented by the symbol D ₃ .

The hostile sample candidate updating unit 105 generates a candidate point P4 by updating the candidate point P3 so that the value of (cs ₃ + d ₃ ) is minimized. That is, the hostile sample candidate updating unit 105 updates the candidate point P3 to generate a candidate point P4.

Note that when the confidence levels s ₀ to s ₃ of candidate points are not particularly distinguished, they are expressed as confidence levels s. Furthermore, when the distances d ₀ to d ₃ between the candidate point and the target sphere surface are not particularly distinguished, they are expressed as distances d.

The hostile sample candidate updating unit 105 generates new candidate points by updating (moving) the previously generated candidate points. The adversarial sample candidate update unit 105 stores information about each candidate point generated by the update in a predetermined storage area such as the memory 12 or the storage device 13.

The adversarial sample generation unit 101 calculates the confidence s of the candidate point by the certainty calculation unit 103, the distance d between the candidate point and the target sphere surface by the distance calculation unit 104, and the adversarial sample candidate update unit 105. The updating of candidate points is repeatedly executed until a predetermined number of times (first termination condition) is reached.

Additionally, the adversarial sample generation unit 101 optimizes the value of the coefficient c while updating it using a method such as binary search. The adversarial sample generation unit 101 repeatedly updates the coefficient c until a predetermined number of times (second termination condition) is reached. Thereby, the adversarial sample generation unit 101 generates a plurality of candidate points (adversarial sample candidate points) based on the original point.

The adversarial sample generation unit 101 determines, as an adversarial sample, the candidate point whose perturbation size is closest to the target perturbation size ε ₀ from among the plurality of candidate points generated by the adversarial sample candidate update unit 105 .

That is, the adversarial sample generation unit 101 generates a spherical surface whose radius is ε ₀ and the target perturbation size centered on the original point (training data) from each of the plurality of candidate data and the confidence s of each of the plurality of candidate data. An adversarial sample is determined from among a plurality of candidate points (candidate data) based on the distance d to.

The training data generation unit 100 attaches a label (correct label) of the original point used to generate the adversarial sample to the generated adversarial sample and uses it as training data for a classification model (machine learning model). .

(B) Operation The processing of the hostile sample generation unit 101 of the information processing device 1 as an example of the embodiment configured as described above will be described according to the flowchart (steps S1 to S7) shown in FIG.

Prior to this process, the candidate point generation unit 102 generates candidate points based on the original points. The candidate point generation unit 102 also initializes the coefficient c. In initializing the coefficient c, an arbitrary value may be set for the coefficient c.

In step S1, the candidate point generation unit 102 updates the coefficient c by, for example, binary search.

In step S2, the certainty calculation unit 103 calculates the certainty s that the candidate point to be processed is classified into a class other than the target class. The candidate point to be processed corresponds to candidate data extracted from a plurality of candidate data (candidate points), and the candidate point generated by the candidate point generation unit 102 based on the original point and the adversarial sample generation unit 101 are combined. , a plurality of candidate points (adversarial sample candidate points) generated based on the original point, and a plurality of candidate points (candidate point data).

In step S3, the distance calculation unit 104 calculates the distance d between the candidate point and the target spherical surface (a spherical surface whose center is the original point and whose radius is the perturbation size ε ₀ ).

In step S4, the adversarial sample candidate updating unit 105 updates the candidate point based on the confidence s of the candidate point calculated by the certainty calculation unit 103 and the distance d between the candidate point and the target sphere surface calculated by the distance calculation unit 104. , update the adversarial sample. The adversarial sample candidate updating unit 105 updates the candidate points so that the value of (cs + d) is minimized.

In step S5, the candidate point generation unit 102 checks whether the processes of steps S2 to S4 have been executed a specified number of times, that is, whether the first termination condition is satisfied. As a result of the confirmation, if the first termination condition is not satisfied, that is, if the number of processing steps S2 to S4 has not reached the specified number of times (see No route in step S5), the process returns to step S2.

On the other hand, if the first termination condition is satisfied, that is, if the number of times steps S2 to S4 are processed reaches the specified number (see the Yes route in step S5), the process moves to step S6.

In step S6, it is checked whether the coefficient c has been updated a specified number of times, that is, whether the second termination condition is satisfied. As a result of the confirmation, if the second termination condition is not satisfied, that is, if the number of updates of the coefficient c has not reached the specified number of times (see No route in step S6), the process returns to step S1. By repeating this No route in step S6, a plurality of different candidate points are generated depending on the value of each coefficient c.

On the other hand, if the second termination condition is satisfied, that is, if the number of updates of the coefficient c reaches the specified number (see the Yes route in step S6), the process moves to step S7.

In step S7, the adversarial sample generation unit 101 determines, as an adversarial sample, the candidate point whose perturbation size is closest to the target perturbation size ε ₀ from among the plurality of candidate points generated by the adversarial sample candidate update unit 105. .

The generated adversarial sample is attached with a label (correct label) of the original point used to generate the adversarial sample, and is used as training data for a classification model (machine learning model).

(C) Effects According to the information processing device 1 as an example of the embodiment, the adversarial sample candidate updating unit 105 updates (cs + Update the candidate points so that the value of d) is minimized. Then, the adversarial sample generation unit 101 determines, as an adversarial sample, the candidate point whose perturbation size is closest to the target perturbation size ε ₀ from among the plurality of candidate points generated by the adversarial sample candidate update unit 105.

As a result, it is possible to generate an adversarial sample that barely crosses the decision boundary and has an optimal deviation from the original point (original data). That is, it is possible to easily generate strong adversarial samples that are effective in adversarial training to generate a classification model that is difficult to fool.

Further, the adversarial sample candidate updating unit 105 determines a candidate point whose perturbation size is closest to the target perturbation size ε ₀ as an adversarial sample. As a result, training data with a constant perturbation size ε ₀ can be obtained. Therefore, it is possible to easily generate a plurality of adversarial samples with a constant effective perturbation size in adversarial training for generating a classification model that is difficult to fool.

FIGS. 4 and 5 are diagrams illustrating the generation results of hostile samples by the hostile sample generation unit 101 of the information processing device 1 as an example of the embodiment in comparison with the conventional method.

4 and 5 show examples in which adversarial samples were generated using the regularization-based method (C&W) [N. Carlini and D. Wagner, 2016].

4 and 5, for image classification AI (Artificial Intelligence), adversarial samples generated by C&W using a conventional method, and adversarial samples generated by C&W by the adversarial sample generation unit 101 of this information processing device. is displayed as a histogram with perturbation size ε.

FIG. 4 shows the distribution of adversarial samples generated when ε=1 is fixed, and FIG. 5 shows the distribution of adversarial samples generated when ε=2 is fixed.

As shown in FIGS. 4 and 5, the adversarial sample generation unit 101 of the information processing apparatus can generate many adversarial samples with a set specific perturbation size ε.

In adversarial training, by first training with training data with a small perturbation size (e.g. ε=0.1) and then training with a large perturbation size (e.g. ε=0.2), classification is difficult even with large perturbations. A model can be generated.

(D) Others FIG. 6 is a diagram illustrating the hardware configuration of the information processing device 1 as an example of the embodiment.

The information processing device 1 includes, for example, a processor 11, a memory 12, a storage device 13, a graphic processing device 14, an input interface 15, an optical drive device 16, a device connection interface 17, and a network interface 18 as components. These components 11 to 18 are configured to be able to communicate with each other via a bus 19.

A processor (control unit) 11 controls the entire information processing device 1 . Processor 11 may be a multiprocessor. The processor 11 includes, for example, a CPU, an MPU (Micro Processing Unit), a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), an FPGA (Field Programmable Gate Array), and a GPU (Graphics Processing Unit). It may be any one of the following. Further, the processor 11 may be a combination of two or more types of elements among CPU, MPU, DSP, ASIC, PLD, FPGA, and GPU.

Then, when the processor 11 executes a control program (generation program: not shown) for the information processing device 1, the function as the training data generation section 100 illustrated in FIG. 1 is realized.

Note that the information processing device 1 realizes the function as the training data generation unit 100 by executing a program (generation program, OS program) recorded on a computer-readable non-temporary recording medium, for example.

A program that describes the processing content to be executed by the information processing device 1 can be recorded on various recording media. For example, a program to be executed by the information processing device 1 can be stored in the storage device 13. The processor 11 loads at least a portion of the program in the storage device 13 into the memory 12 and executes the loaded program.

Furthermore, the program to be executed by the information processing device 1 (processor 11) may be recorded on a non-temporary portable recording medium such as the optical disk 16a, the memory device 17a, or the memory card 17c. The program stored in the portable recording medium becomes executable after being installed in the storage device 13 under the control of the processor 11, for example. Furthermore, the processor 11 can also directly read and execute a program from a portable recording medium.

The memory 12 is a storage memory including ROM (Read Only Memory) and RAM (Random Access Memory). The RAM of the memory 12 is used as a main storage device of the information processing device 1. At least a part of the program to be executed by the processor 11 is temporarily stored in the RAM. The memory 12 also stores various data necessary for processing by the processor 11.

The storage device 13 is a storage device such as a hard disk drive (HDD), SSD (Solid State Drive), or storage class memory (SCM), and stores various data. The storage device 13 is used as an auxiliary storage device of the information processing device 1. The storage device 13 stores an OS program, a control program, and various data. The control program includes a generation program.

Note that a semiconductor storage device such as an SCM or a flash memory can also be used as the auxiliary storage device. Further, a plurality of storage devices 13 may be used to configure RAID (Redundant Arrays of Inexpensive Disks).

Furthermore, the storage device 13 may store various data generated when the training data generation unit 100 described above executes each process.

A monitor 14a is connected to the graphic processing device 14. The graphics processing device 14 displays images on the screen of the monitor 14a according to instructions from the processor 11. Examples of the monitor 14a include a display device using a CRT (Cathode Ray Tube), a liquid crystal display device, and the like.

A keyboard 15a and a mouse 15b are connected to the input interface 15. The input interface 15 transmits signals sent from the keyboard 15a and mouse 15b to the processor 11. Note that the mouse 15b is an example of a pointing device, and other pointing devices can also be used. Other pointing devices include touch panels, tablets, touch pads, trackballs, and the like.

The optical drive device 16 uses laser light or the like to read data recorded on the optical disc 16a. The optical disc 16a is a portable, non-temporary recording medium on which data is readably recorded by light reflection. Examples of the optical disc 16a include a DVD (Digital Versatile Disc), a DVD-RAM, a CD-ROM (Compact Disc Read Only Memory), and a CD-R (Recordable)/RW (ReWritable).

The device connection interface 17 is a communication interface for connecting peripheral devices to the information processing device 1. For example, a memory device 17a or a memory reader/writer 17b can be connected to the device connection interface 17. The memory device 17a is a non-temporary recording medium equipped with a communication function with the device connection interface 17, such as a USB (Universal Serial Bus) memory. The memory reader/writer 17b writes data to or reads data from the memory card 17c. The memory card 17c is a card-type non-temporary recording medium.

The network interface 18 is connected to a network. The network interface 18 sends and receives data via the network. Other information processing devices, communication devices, etc. may be connected to the network.

The disclosed technology is not limited to the embodiments described above, and can be implemented with various modifications without departing from the spirit of the present embodiments. Each configuration and each process of this embodiment can be selected or selected as necessary, or may be combined as appropriate.

For example, in the information processing device 1 described above, the training data generation unit 100 (adversarial sample generation unit 101) has a function as the candidate point generation unit 102, but the present invention is not limited to this.

For example, the function of the candidate point generation unit 102 may be performed in another information processing device connected to the information processing device 1 via a network or the like. Information on candidate points generated by another information processing device may be received via a network or the like, and used by the adversarial sample generation unit 101 to generate an adversarial sample.

In the embodiment described above, for example, as shown in the flowchart of FIG. 3, an example is shown in which one adversarial sample is generated from one candidate point, but the present invention is not limited to this. The adversarial sample generation unit 101 selects a plurality of candidate points whose perturbation size is within a predetermined range with respect to the target perturbation size ε ₀ from among the plurality of candidate points generated by the adversarial sample candidate update unit 105 as adversarial samples. It may be determined as

That is, the adversarial sample candidate updating unit 105 may determine, as an adversarial sample, candidate data for which the difference between the perturbation size and the target perturbation size ε ₀ is smaller than the reference among the plurality of candidate points (candidate data).

Furthermore, the present embodiment can be implemented and manufactured by those skilled in the art based on the above-mentioned disclosure.

1 Information processing device 11 Processor (control unit)
12 Memory 13 Storage device 14 Graphic processing device 14a Monitor 15 Input interface 15a Keyboard 15b Mouse 16 Optical drive device 16a Optical disk 17 Device connection interface 17a Memory device 17b Memory reader/writer 17c Memory card 18 Network interface 19 Bus 100 Training data generation section 10 1 Adversarial sample generation unit 102 Candidate point generation unit 103 Confidence calculation unit 104 Distance calculation unit 105 Adversarial sample update unit

Claims

A generation method for generating adversarial samples used for training a class classification model, the method comprising:
A process of generating a plurality of candidate data based on training data used for training the class classification model;
The confidence that each of the plurality of candidate data is classified into a class associated with the training data, and a spherical surface whose radius is the target perturbation size centered on the training data from each of the plurality of candidate data. A generation method characterized in that a computer executes a process of determining a hostile sample from among the plurality of candidate data based on the distance to.
The process of generating the plurality of candidate data includes:
Based on the candidate data extracted from among the plurality of candidate data, updating is performed so that a value based on the confidence of the extracted candidate data and the distance between the extracted candidate data and the spherical surface is minimized. 2. The generation method according to claim 1, further comprising a process of generating new candidate data.
The process of determining a hostile sample from among the plurality of candidate data includes:
3. Generation according to claim 1, further comprising a process of determining candidate data, among the plurality of candidate data, for which a difference between a perturbation size and the target perturbation size is smaller than a reference, as the adversarial sample. Method.
Generate multiple candidate data based on the training data used for training the class classification model,
The confidence that each of the plurality of candidate data is classified into a class associated with the training data, and a spherical surface whose radius is the target perturbation size centered on the training data from each of the plurality of candidate data. A generation program that causes a computer to execute a process of determining an adversarial sample to be used for training the classification model from among the plurality of candidate data based on the distance to the object.
The process of generating the plurality of candidate data includes:
Based on the candidate data extracted from among the plurality of candidate data, updating is performed so that a value based on the confidence of the extracted candidate data and the distance between the extracted candidate data and the spherical surface is minimized. 5. The generation program according to claim 4, further comprising a process of generating new candidate data.
The process of determining a hostile sample from among the plurality of candidate data includes:
6. Generation according to claim 4, further comprising a process of determining candidate data, among the plurality of candidate data, for which a difference between a perturbation size and the target perturbation size is smaller than a reference, as the adversarial sample. program.
Generate multiple candidate data based on the training data used for training the class classification model,
The confidence that each of the plurality of candidate data is classified into a class associated with the training data, and a spherical surface whose radius is the target perturbation size centered on the training data from each of the plurality of candidate data. An information processing apparatus comprising: a processing unit that determines an adversarial sample to be used for training the classification model from among the plurality of candidate data based on the distance to the object.
The process of generating the plurality of candidate data includes:
Based on the candidate data extracted from among the plurality of candidate data, updating is performed so that a value based on the confidence of the extracted candidate data and the distance between the extracted candidate data and the spherical surface is minimized. 8. The information processing apparatus according to claim 7, further comprising a process of generating new candidate data.
The process of determining a hostile sample from among the plurality of candidate data includes:
The information according to claim 7 or 8, further comprising a process of determining, as the adversarial sample, candidate data in which a difference between a perturbation size and the target perturbation size is smaller than a reference among the plurality of candidate data. Processing equipment.