WO2024009362A1 - Anomaly detection device, anomaly detection method, and anomaly detection program - Google Patents

Anomaly detection device, anomaly detection method, and anomaly detection program

Info

Publication number
WO2024009362A1
Authority
WO
WIPO (PCT)
Prior art keywords
rate range
data
anomaly
false positive
learning
Prior art date
Application number
PCT/JP2022/026621
Other languages
French (fr)
Japanese (ja)
Inventor
充敏 熊谷
Original Assignee
Nippon Telegraph and Telephone Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corporation
Priority to PCT/JP2022/026621
Publication of WO2024009362A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00: Error detection; Error correction; Monitoring
    • G06F 11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Definitions

  • the present invention relates to an anomaly detection device, an anomaly detection method, and an anomaly detection program.
  • anomaly detection technology that learns normal patterns from datasets and identifies whether given unknown data is abnormal has been used in various fields such as intrusion detection, medical image diagnosis, and industrial system monitoring.
  • in anomaly detection technology, it is important to solve the partial AUC (pAUC) maximization problem, which maximizes the detection rate while keeping the false positive rate below a certain value (see Non-Patent Document 1).
  • the pAUC maximization method requires data labeled as abnormal or normal, so it cannot be used in cases where such labeled data is not available.
  • in anomaly detection, anomalous data is rare, and it may be difficult to collect anomalous data for the target task.
  • the present invention has been made in view of the above, and an object of the present invention is to make learning that maximizes pAUC in abnormality detection easy.
  • an anomaly detection device according to the present invention comprises an acquisition unit that acquires one or more datasets including anomaly data of tasks related to a target task to be processed for anomaly detection, together with a false detection rate range for each dataset, and a learning unit that uses the acquired datasets and false detection rate ranges to learn a model that, when normal data and a specified false detection rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in that false detection rate range.
  • FIG. 1 is a diagram for explaining the outline of an abnormality detection device.
  • FIG. 2 is a diagram for explaining the outline of the abnormality detection device.
  • FIG. 3 is a schematic diagram illustrating a schematic configuration of the abnormality detection device.
  • FIG. 4 is a diagram for explaining the processing of the model learning section.
  • FIG. 5 is a flowchart showing the learning processing procedure.
  • FIG. 6 is a flowchart showing the detection processing procedure.
  • FIG. 7 is a diagram illustrating a computer that executes an abnormality detection program.
  • FIGS. 1 and 2 are diagrams for explaining the outline of an abnormality detection device.
  • the anomaly detection device detects data different from a normal pattern as an anomaly using an anomaly detection model learned to maximize partial AUC (pAUC).
  • in the anomaly detector s, using a threshold t for abnormality determination, a sample x is determined to be abnormal if s(x) > t and normal if s(x) ≤ t.
  • the ROC curve in this case represents the relationship between the false positive rate (FPR) and the true positive rate (TPR) when the threshold t for abnormality determination is changed.
  • the area under the ROC curve indicated by diagonal lines in FIG. 1(a) is called AUC.
  • AUC is used as a performance index of an anomaly detector.
  • pAUC in the false positive rate range (α, β) means the value obtained by restricting the FPR range of the AUC to [α, β], as indicated by diagonal lines in FIG. 1(b), and normalizing the resulting area.
  • the anomaly detection device of this embodiment performs learning using a plurality of related data sets including normal data and abnormal data so as to be able to detect an abnormality in a target task data set containing only normal data.
  • the related data set is a related task data set that is similar to the target task data set.
  • for example, in a service provided to a plurality of users, for the target task data set concerning a certain new user, data sets of related tasks concerning other users who have been in service for a long time are examples of related data sets.
  • it is difficult to collect a data set including abnormal data for a new user, but it may be possible to collect such a data set for a user who has been operating for a long time.
  • the anomaly detection device performs learning of the anomaly detector of the target task by using the related data sets, which include normal data and abnormal data, for learning. Specifically, the anomaly detection device performs learning to maximize pAUC in [α, β] by inputting the false detection rate range (α, β). For example, as shown in FIG. 2, the anomaly detection device performs learning using various related data sets (normal data sets 1, 2, 3, ...) and various false positive rate ranges ((a1, b1), (a2, b2), (a3, b3), ...). This enables the anomaly detection device to learn an anomaly detection learning model that can be generalized to unknown data sets.
  • FIG. 3 is a schematic diagram illustrating a schematic configuration of an abnormality detection device.
  • the abnormality detection device 1 according to this embodiment is realized by a general-purpose computer such as a workstation or a personal computer, and executes an abnormality detection process to be described later.
  • the abnormality detection device 1 of this embodiment includes a learning section 10 that performs learning processing and a detection section 20 that performs detection processing.
  • the learning unit 10 uses a plurality of related data sets to learn an anomaly detection learning model 14a that maximizes pAUC within an arbitrary false positive rate range.
  • the detection unit 20 uses the anomaly detection learning model 14a output by learning by the learning unit 10 to output an anomaly detector that maximizes pAUC within a desired false detection rate range for the data of the target task. Then, the detection unit 20 uses the output abnormality detector to detect an abnormality in the data of the target task.
  • the detection unit 20 may be implemented in the same hardware as the learning unit 10, or may be implemented in different hardware.
  • the learning section 10 includes a learning data input section 11 , a feature extraction section 12 , a model learning section 13 , and a storage section 14 .
  • the learning data input unit 11 is realized using an input device such as a keyboard or a mouse, and inputs various instruction information to the control unit in response to input operations by an operator.
  • the learning data input unit 11 functions as an acquisition unit and acquires one or more datasets including abnormal data of tasks related to the target task to be processed for abnormality detection, together with the false detection rate range of each dataset.
  • as the false detection rate ranges, a set of predetermined ranges {(a1, b1), (a2, b2), (a3, b3), ...} may be obtained in advance.
  • the related data set may be input to the learning unit 10 from an external server device or the like via a communication control unit (not shown) implemented by a NIC (Network Interface Card) or the like.
  • the control unit is realized using a CPU (Central Processing Unit) that executes a processing program, and functions as a feature extraction unit 12 and a model learning unit 13.
  • the feature extraction unit 12 converts each sample of the acquired related data set into a feature vector in preparation for processing in the model learning unit 13, which will be described later.
  • the feature vector is a representation of the features of necessary data as an n-dimensional numerical vector.
  • the feature extraction unit 12 performs conversion into a feature vector using a method commonly used in machine learning. For example, when the data is text, the feature extraction unit 12 can apply a method using morphological analysis, a method using n-grams, a method using delimiters, etc.
  • the model learning section 13 functions as a learning section.
  • the model learning unit 13 uses the acquired related data sets and false positive rate ranges to learn an anomaly detection learning model 14a that, when normal data and a specified false positive rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in that range.
  • the model learning unit 13 learns an anomaly detection learning model 14a using a permutation-invariant neural network.
  • the model learning unit 13 also learns the anomaly detection learning model 14a so as to output an anomaly detector using a differentiable model such as a feedforward neural network.
  • the differentiable model is, for example, an autoencoder or a one-class SVM.
  • FIG. 4 is a diagram for explaining the processing of the model learning section.
  • FIG. 4 exemplifies the pseudo code of the processing of the model learning unit 13.
  • the purpose here is that, when the anomaly detection unit 23 (described later) receives a target data set S⁻ that is not included in the related data sets and a false positive rate range input (α, β), an anomaly detector that maximizes pAUC over [α, β] is obtained.
  • the model learning unit 13 generates an abnormality detection learning model 14a through learning. Then, an anomaly detection unit 23, which will be described later, outputs an anomaly detector s from the target data set S - and the false detection rate range ( ⁇ , ⁇ ) using the anomaly detection learning model 14a. In that case, the feature extraction unit 22, which will be described later, converts S ⁇ into a vector representation z shown in the following equation (3).
  • f and g are arbitrary neural networks. Since the "sum" of f does not depend on the order of the samples in the target data set S⁻, one vector z is determined for the set S⁻ by the above equation (3).
  • the neural network is not particularly limited, and for example, any permutation-invariant neural network such as "maximum value” or set transformer can be applied.
  • the anomaly detector s is a function that outputs an anomaly score for the sample x, and is defined by a neural network shown in the following equation (4).
  • the linear weight parameter w(α, β) is defined by another neural network that receives the two-dimensional input (α, β). Since this anomaly detector s depends on the vector representation z of normal data and the false positive rate range (α, β), its properties change when these values change. That is, when a vector representation z of normal data and a false positive rate range (α, β) are newly given, the goal of the model learning unit 13 is to output an anomaly detector s that maximizes pAUC in that false positive rate range.
  • the model learning unit 13 generates an anomaly detection learning model 14a by learning using the related data set.
  • normal data selected from the related data set is represented by S - .
  • the objective function of the abnormality detection learning model 14a is expressed by the following equations (5) and (6).
  • the learning parameters of the abnormality detection learning model 14a are the parameters of the neural models f, g, and w.
  • \tilde{pAUC} in the above equation (5) is a function obtained by replacing the indicator function I of \hat{pAUC} in the above equation (6) with a differentiable sigmoid function.
  • Q represents normal/abnormal data randomly sampled from the same related data set as S⁻.
  • R is a set of false positive rate ranges specified by the user in advance.
  • the storage unit 14 is realized by a semiconductor memory device such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk.
  • the learned abnormality detection learning model 14a is stored in the storage unit 14 of this embodiment.
  • the detection section 20 includes a data input section 21 , a feature extraction section 22 , an anomaly detection section 23 , and a result output section 24 .
  • the data input unit 21 is realized using an input device such as a keyboard or a mouse, and inputs various instruction information to the control unit and receives data in response to input operations by an operator.
  • the data input unit 21 receives input of a data set of a target task, a user-specified false positive rate range, and test data of a target task to be subjected to abnormality detection processing.
  • this information may be input to the detection unit 20 from an external server device or the like via a communication control unit (not shown) implemented by a NIC or the like. Further, the data input section 21 may be the same hardware as the learning data input section 11.
  • the control unit is realized using a CPU or the like that executes a processing program, and includes a feature extraction unit 22 and an abnormality detection unit 23.
  • the feature extraction unit 22 converts each sample of the acquired target data set into a feature vector in preparation for processing in the anomaly detection unit 23.
  • the abnormality detection section 23 functions as a detection section. That is, the anomaly detection unit 23 inputs the normal data of the target task and the specified false detection rate range to the learned anomaly detection learning model 14a, and uses the output anomaly detector to detect anomalies in the data of the target task. Specifically, as described above, the anomaly detection unit 23 receives the target data set S⁻, which is not included in the related data sets, and the false positive rate range input (α, β), and obtains an anomaly detector that maximizes pAUC over the false positive rate range [α, β]. Further, the abnormality detection unit 23 uses the output abnormality detector to determine whether each test data item of the target task is normal or abnormal.
  • the result output unit 24 is realized by a display device such as a liquid crystal display, a printing device such as a printer, an information communication device, etc., and outputs the result of the abnormality detection process to the operator. For example, the determination result of whether the input test data of the target task is normal or abnormal is output.
  • the anomaly detection process of the anomaly detection device 1 includes a learning process by the learning section 10 and a detection process by the detecting section 20.
  • FIG. 5 is a flowchart illustrating the learning processing procedure.
  • the flowchart in FIG. 5 starts, for example, at the timing when the user inputs an operation instructing to start the learning process.
  • the learning data input unit 11 receives input of a plurality of related data sets, each of which includes normal data and abnormal data, and a false detection rate range of each data set (step S1).
  • the feature extraction unit 12 converts each sample of the received related data set into a feature vector (step S2).
  • the model learning unit 13 learns an anomaly detection learning model 14a that, when normal data and a specified false positive rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in that range (step S3).
  • the model learning unit 13 generates the anomaly detection learning model 14a by learning using the input related data set and the false positive rate range.
  • This anomaly detection learning model 14a outputs an anomaly detector when normal data and a specified false detection rate range are input.
  • the anomaly detector outputs an anomaly score for the input data so as to maximize pAUC in the specified false detection rate range.
  • the model learning unit 13 stores the learned anomaly detection learning model 14a in the storage unit 14. This completes the series of learning processes.
  • FIG. 6 is a flowchart illustrating the detection processing procedure.
  • the flowchart in FIG. 6 is started, for example, at the timing when the user inputs an operation instructing the start of the estimation process.
  • the data input unit 21 receives normal data of the target task and a specified false positive rate range (step S11), and the feature extraction unit 22 converts each received sample (normal data) into a feature vector (step S12).
  • the anomaly detection unit 23 inputs the normal data of the target task and the specified false detection rate range to the learned anomaly detection learning model 14a, and tests the target task using the output anomaly detector. An abnormality in the data is detected (step S13).
  • the anomaly detection unit 23 receives the target data set S⁻, which is not included in the related data sets, and the false positive rate range input (α, β), and obtains an anomaly detector that maximizes pAUC in the false positive rate range [α, β]. Further, the abnormality detection unit 23 inputs the test data of the target task to the output abnormality detector and obtains a determination result as to whether each test data item is normal or abnormal.
  • the result output unit 24 outputs the abnormality detection result, that is, the determined result of whether it is normal or abnormal (step S14). This completes the series of detection processes.
  • the learning data input unit 11 acquires one or more data sets containing abnormal data of tasks related to the target task to be processed for anomaly detection, together with the false detection rate range of each data set.
  • the model learning unit 13 uses the acquired data sets and false positive rate ranges to learn an anomaly detection learning model 14a that, when normal data and a specified false positive rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in that range.
  • the model learning unit 13 learns an anomaly detection learning model 14a using a permutation-invariant neural network.
  • the model learning unit 13 also learns the anomaly detection learning model 14a so as to output an anomaly detector using a differentiable model.
  • by learning with related data sets that include abnormal data, the anomaly detection device 1 can obtain an anomaly detector that maximizes pAUC in a desired false positive rate range and detect anomalies in the data of the target task, even when only normal data is available for the target task. Further, once the anomaly detection learning model 14a is generated by learning on the related data sets, an anomaly detector can be obtained for any normal data set without relearning. Therefore, it is possible to perform highly accurate abnormality detection without relearning, which requires expensive computation. For example, anomalies can be detected even on low-resource computers, on which anomaly detection is generally difficult. In this way, learning that maximizes pAUC in abnormality detection becomes easy.
  • the anomaly detection unit 23 inputs the normal data of the target task and the specified false detection rate range to the learned anomaly detection learning model 14a, and uses the output anomaly detector to detect anomalies in the test data of the target task. Thereby, even if only normal data is obtained for the target task, it is possible to perform highly accurate abnormality detection that maximizes pAUC within a desired false detection rate range.
  • the anomaly detection device 1 can be implemented by installing an anomaly detection program that executes the above-described anomaly detection process on a desired computer as packaged software or online software. For example, by causing the information processing device to execute the above abnormality detection program, the information processing device can be made to function as the abnormality detection device 1.
  • information processing devices include mobile communication terminals such as smartphones, mobile phones, and PHSs (Personal Handyphone Systems), as well as slate terminals such as PDAs (Personal Digital Assistants).
  • the functions of the abnormality detection device 1 may be implemented in a cloud server.
  • FIG. 7 is a diagram showing an example of a computer that executes the abnormality detection program.
  • Computer 1000 includes, for example, memory 1010, CPU 1020, hard disk drive interface 1030, disk drive interface 1040, serial port interface 1050, video adapter 1060, and network interface 1070. These parts are connected by a bus 1080.
  • the memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM 1012.
  • the ROM 1011 stores, for example, a boot program such as BIOS (Basic Input Output System).
  • Hard disk drive interface 1030 is connected to hard disk drive 1031.
  • Disk drive interface 1040 is connected to disk drive 1041.
  • a removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1041, for example.
  • a mouse 1051 and a keyboard 1052 are connected to the serial port interface 1050.
  • a display 1061 is connected to the video adapter 1060.
  • the hard disk drive 1031 stores, for example, an OS 1091, an application program 1092, a program module 1093, and program data 1094. Each piece of information described in the above embodiments is stored in, for example, the hard disk drive 1031 or the memory 1010.
  • the abnormality detection program is stored in the hard disk drive 1031, for example, as a program module 1093 in which commands to be executed by the computer 1000 are written. Specifically, a program module 1093 in which each process executed by the abnormality detection device 1 described in the above embodiment is described is stored in the hard disk drive 1031.
  • data used for information processing by the abnormality detection program is stored as program data 1094 in, for example, the hard disk drive 1031.
  • the CPU 1020 reads out the program module 1093 and program data 1094 stored in the hard disk drive 1031 to the RAM 1012 as necessary, and executes each of the above-described procedures.
  • the program module 1093 and program data 1094 related to the abnormality detection program are not limited to being stored in the hard disk drive 1031; for example, they may be stored in a removable storage medium and read by the CPU 1020 via the disk drive 1041 or the like.
  • alternatively, the program module 1093 and program data 1094 related to the abnormality detection program may be stored in another computer connected via a network such as a LAN (Local Area Network) or a WAN (Wide Area Network) and read by the CPU 1020 via the network interface 1070.
  • 1 Anomaly detection device; 10 Learning section; 11 Learning data input section; 12 Feature extraction section; 13 Model learning section; 14 Storage section; 14a Anomaly detection learning model; 20 Detection section; 21 Data input section; 22 Feature extraction section; 23 Anomaly detection section; 24 Result output section

Abstract

A training data input unit (11) acquires one or more data sets including abnormal data of a task related to a target task of abnormality detection processing, together with a false detection rate range for each of the data sets. Using the acquired data sets and false detection rate ranges, a model training unit (13) trains an abnormality detection training model (14a) which, when normal data and a designated false detection rate range are input, outputs an abnormality detector that performs abnormality detection with pAUC maximized in the designated false detection rate range.

Description

Anomaly detection device, anomaly detection method, and anomaly detection program
 The present invention relates to an anomaly detection device, an anomaly detection method, and an anomaly detection program.
 In recent years, anomaly detection technology, which learns normal patterns from a dataset and identifies whether given unknown data is anomalous, has been used in various fields such as intrusion detection, medical image diagnosis, and industrial system monitoring.
 In anomaly detection, it is important to solve the partial AUC (pAUC) maximization problem, that is, maximizing the detection rate while keeping the false detection rate at or below a certain value (see Non-Patent Document 1).
 However, with conventional techniques it can be difficult to solve the pAUC maximization problem in anomaly detection. For example, the pAUC maximization method requires data labeled as anomalous or normal, so it cannot be used when such labeled data is not available. At the same time, anomalous data is rare in anomaly detection, and it may be difficult to collect anomalous data for the target task.
 The present invention has been made in view of the above, and an object of the present invention is to make learning that maximizes pAUC in anomaly detection easy.
 In order to solve the above problems and achieve the object, an anomaly detection device according to the present invention comprises: an acquisition unit that acquires one or more datasets containing anomalous data of tasks related to a target task to be subjected to anomaly detection, together with a false detection rate range for each dataset; and a learning unit that uses the acquired datasets and false detection rate ranges to learn a model that, when normal data and a specified false detection rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in that false detection rate range.
 According to the present invention, learning that maximizes pAUC in anomaly detection becomes easy.
FIG. 1 is a diagram for explaining the outline of the anomaly detection device.
FIG. 2 is a diagram for explaining the outline of the anomaly detection device.
FIG. 3 is a schematic diagram illustrating the schematic configuration of the anomaly detection device.
FIG. 4 is a diagram for explaining the processing of the model learning unit.
FIG. 5 is a flowchart showing the learning processing procedure.
FIG. 6 is a flowchart showing the detection processing procedure.
FIG. 7 is a diagram illustrating a computer that executes the anomaly detection program.
 Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings. Note that the present invention is not limited to this embodiment. In the description of the drawings, the same parts are denoted by the same reference numerals.
[Overview of the anomaly detection device]
 First, FIGS. 1 and 2 are diagrams for explaining the outline of the anomaly detection device. The anomaly detection device detects data that differs from the normal pattern as an anomaly, using an anomaly detection model trained to maximize the partial AUC (pAUC).
 Here, suppose that the anomaly detector s uses a threshold t for anomaly determination, so that a sample x is determined to be anomalous if s(x) > t and normal if s(x) ≤ t. The ROC curve in this case represents the relationship between the false positive rate (FPR) and the true positive rate (TPR) as the threshold t is varied. The area under the ROC curve, indicated by hatching in FIG. 1(a), is called the AUC. The AUC is used as a performance index of an anomaly detector.
 The pAUC in the false detection rate range (α, β) is the value obtained by restricting the FPR range of the AUC to [α, β], as indicated by hatching in FIG. 1(b), and normalizing the resulting area. For example, the pAUC for α = 0 and β = 0.1 is an index that evaluates the performance of an anomaly detector when the false detection rate is at most 0.1.
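A minimal numpy sketch of this quantity, written directly from the definition above (the function name and the trapezoidal integration are illustrative choices, not part of the original text):

```python
import numpy as np

def pauc(scores_normal, scores_anomaly, alpha, beta):
    """Empirical pAUC over the FPR range [alpha, beta], normalized to [0, 1]."""
    thresholds = np.sort(np.concatenate([scores_normal, scores_anomaly]))[::-1]
    fpr = np.array([(scores_normal > t).mean() for t in thresholds])   # false positive rate at each threshold
    tpr = np.array([(scores_anomaly > t).mean() for t in thresholds])  # true positive rate at each threshold
    mask = (fpr >= alpha) & (fpr <= beta)
    if mask.sum() < 2:
        return 0.0
    return np.trapz(tpr[mask], fpr[mask]) / (beta - alpha)  # area inside the band, normalized

# Example: with alpha=0 and beta=0.1, only operating points with a false
# detection rate of at most 0.1 contribute to the score.
```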
 In order to obtain, by learning, an anomaly detector that maximizes pAUC, data labeled as anomalous or normal is required as training data, but anomalous data is rare and may be difficult to obtain. The anomaly detection device of this embodiment therefore performs learning using a plurality of related datasets containing both normal data and anomalous data, so that anomaly detection can be performed on a target-task dataset containing only normal data.
 Here, a related dataset is a dataset of a related task that is similar to the dataset of the target task. For example, in a service provided to a plurality of users, for a target-task dataset concerning a certain new user, the datasets of related tasks concerning other users who have been served for a long time are examples of related datasets. In this case, it is difficult to collect a dataset containing anomalous data for the new user, but it may be possible to collect datasets containing anomalous data for users who have been served for a long time.
 In this way, the anomaly detection device learns the anomaly detector of the target task by using related datasets containing normal data and anomalous data for learning. Specifically, given a false detection rate range (α, β) as input, the anomaly detection device performs learning so as to maximize pAUC over [α, β]. For example, as shown in FIG. 2, the anomaly detection device performs learning using various related datasets (normal datasets 1, 2, 3, ...) and various false detection rate ranges ((a1, b1), (a2, b2), (a3, b3), ...). This enables the anomaly detection device to learn an anomaly detection learning model that generalizes to unknown datasets. That is, by inputting the dataset of the target task and a desired false detection rate range (α', β') to the trained anomaly detection learning model, an anomaly detector that maximizes pAUC over [α', β'] is output.
[Configuration of the anomaly detection device]
 Next, FIG. 3 is a schematic diagram illustrating the schematic configuration of the anomaly detection device. The anomaly detection device 1 according to this embodiment is realized by a general-purpose computer such as a workstation or a personal computer, and executes the anomaly detection process described later.
 As shown in FIG. 3, the anomaly detection device 1 of this embodiment includes a learning unit 10 that performs the learning process and a detection unit 20 that performs the detection process. The learning unit 10 uses a plurality of related datasets to learn an anomaly detection learning model 14a that maximizes pAUC within an arbitrary false detection rate range. The detection unit 20 uses the anomaly detection learning model 14a output by the learning of the learning unit 10 to output an anomaly detector that maximizes pAUC within a desired false detection rate range for the data of the target task. The detection unit 20 then uses the output anomaly detector to detect anomalies in the data of the target task. The detection unit 20 may be implemented on the same hardware as the learning unit 10 or on different hardware.
[Learning unit]
 The learning unit 10 includes a learning data input unit 11, a feature extraction unit 12, a model learning unit 13, and a storage unit 14.
 The learning data input unit 11 is realized using an input device such as a keyboard or a mouse, and inputs various instruction information to the control unit in response to input operations by an operator. In this embodiment, the learning data input unit 11 functions as an acquisition unit, and acquires one or more datasets containing anomalous data of tasks related to the target task to be subjected to anomaly detection, together with the false detection rate range of each dataset. As the false detection rate ranges, a set of predetermined ranges {(a1, b1), (a2, b2), (a3, b3), ...} may be obtained in advance.
 Note that the related datasets may be input to the learning unit 10 from an external server device or the like via a communication control unit (not shown) implemented by a NIC (Network Interface Card) or the like.
 The control unit is realized using a CPU (Central Processing Unit) or the like that executes a processing program, and functions as the feature extraction unit 12 and the model learning unit 13.
 The feature extraction unit 12 converts each sample of the acquired related datasets into a feature vector in preparation for the processing in the model learning unit 13 described later. Here, a feature vector is a representation of the features of the required data as an n-dimensional numerical vector. The feature extraction unit 12 performs the conversion into feature vectors using methods commonly used in machine learning. For example, when the data is text, the feature extraction unit 12 can apply a method based on morphological analysis, a method based on n-grams, a method based on delimiters, and so on.
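As one concrete illustration of such a conversion (the vectorizer settings below are only an example, not something prescribed by the text), character n-gram counts can be computed with a standard library:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Character 2-gram counts as a simple feature vector for text data.
vectorizer = CountVectorizer(analyzer="char", ngram_range=(2, 2))
X = vectorizer.fit_transform(["GET /index.html HTTP/1.1",
                              "GET /admin.php?id=1%27-- HTTP/1.1"])
print(X.toarray().shape)  # (2, number of distinct character 2-grams)
```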
 The model learning unit 13 functions as a learning unit. That is, the model learning unit 13 uses the acquired related datasets and false detection rate ranges to learn an anomaly detection learning model 14a that, when normal data and a specified false detection rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in that false detection rate range.
 Specifically, the model learning unit 13 learns the anomaly detection learning model 14a using a permutation-invariant neural network. The model learning unit 13 also learns the anomaly detection learning model 14a so as to output an anomaly detector based on a differentiable model such as a feedforward neural network. The differentiable model is, for example, an autoencoder or a one-class SVM.
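A minimal sketch of one such differentiable detector, an autoencoder whose reconstruction error serves as the anomaly score (the layer sizes and class name are illustrative assumptions, not taken from the text):

```python
import torch.nn as nn

class AutoencoderScorer(nn.Module):
    def __init__(self, dim_in, dim_code=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim_in, 32), nn.ReLU(), nn.Linear(32, dim_code))
        self.dec = nn.Sequential(nn.Linear(dim_code, 32), nn.ReLU(), nn.Linear(32, dim_in))

    def forward(self, x):
        # Per-sample reconstruction error, used as the anomaly score.
        return ((x - self.dec(self.enc(x))) ** 2).mean(dim=1)
```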
 Here, FIG. 4 is a diagram for explaining the processing of the model learning unit. FIG. 4 shows pseudo code of the processing of the model learning unit 13.
 First, assume that the target dataset S⁻ containing only normal data is expressed by equation (1).
 (Equation (1) is rendered as an image in the original document and is not reproduced here.)
 Then, assume that T related datasets expressed by the following equation (2) are given to the model learning unit 13. It is also assumed that the dimension D of the feature vectors is the same for all datasets.
 (Equation (2) is rendered as an image in the original document and is not reproduced here.)
 The purpose here is that, when the anomaly detection unit 23 described later receives a target dataset S⁻ not included in the related datasets and a false detection rate range input (α, β), an anomaly detector that maximizes pAUC over [α, β] is obtained.
 The model learning unit 13 generates the anomaly detection learning model 14a by learning. The anomaly detection unit 23 described later then outputs an anomaly detector s from the target dataset S⁻ and the false detection rate range (α, β) using the anomaly detection learning model 14a. In that case, the feature extraction unit 22 described later converts S⁻ into the vector representation z shown in the following equation (3).
 (Equation (3) is rendered as an image in the original document and is not reproduced here.)
 Here, f and g are arbitrary neural networks. Since the "sum" over f does not depend on the order of the samples in the target dataset S⁻, a single vector z is determined for the set S⁻ by equation (3). The neural networks are not particularly limited; for example, any permutation-invariant construction, such as taking the maximum instead of the sum, or a set transformer, can be applied.
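The equation image for (3) is not reproduced here, but the explanation above pins down its structure; one reconstruction consistent with that description (the exact notation is an assumption) is:

```latex
z = g\left( \sum_{x \in S^{-}} f(x) \right)
```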
 The anomaly detector s is a function that outputs an anomaly score for a sample x, and is defined by the neural network shown in the following equation (4).
 (Equation (4) is rendered as an image in the original document and is not reproduced here.)
 The linear weight parameter w(α, β) is defined by another neural network that takes the two-dimensional input (α, β). Since this anomaly detector s depends on the vector representation z of the normal data and on the false detection rate range (α, β), its behavior changes when these values change. That is, when a vector representation z of normal data and a false detection rate range (α, β) are newly given, the goal of the model learning unit 13 is to output an anomaly detector s that maximizes pAUC over the false detection rate range (α, β).
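A minimal PyTorch sketch of a model with this structure: a permutation-invariant encoder produces z from a set of normal feature vectors, and a small network maps the two-dimensional input (α, β) to the weight used for scoring. The layer sizes, the class name, and the way the score combines x, z, and w are assumptions; the text only fixes that the detector depends on z and (α, β) and uses a linear weight w(α, β):

```python
import torch
import torch.nn as nn

class PAUCConditionedDetector(nn.Module):
    def __init__(self, dim_in, dim_hidden=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.ReLU(),
                               nn.Linear(dim_hidden, dim_hidden))
        self.g = nn.Sequential(nn.Linear(dim_hidden, dim_hidden), nn.ReLU(),
                               nn.Linear(dim_hidden, dim_hidden))
        self.w = nn.Sequential(nn.Linear(2, dim_hidden), nn.ReLU(),
                               nn.Linear(dim_hidden, dim_hidden))      # w(alpha, beta)
        self.phi = nn.Sequential(nn.Linear(dim_in + dim_hidden, dim_hidden), nn.ReLU(),
                                 nn.Linear(dim_hidden, dim_hidden))

    def embed(self, normal_set):
        # Permutation-invariant set representation: z = g(sum_x f(x)).
        return self.g(self.f(normal_set).sum(dim=0))

    def score(self, x, z, alpha, beta):
        # Anomaly score: linear in features conditioned on z, with the weight
        # produced from the false-detection-rate range (alpha, beta).
        w = self.w(torch.tensor([alpha, beta], dtype=x.dtype))
        h = self.phi(torch.cat([x, z.unsqueeze(0).expand(x.shape[0], -1)], dim=1))
        return h @ w   # one anomaly score per row of x
```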
 The model learning unit 13 generates the anomaly detection learning model 14a by learning using the related datasets. Here, normal data selected from a related dataset is denoted by S⁻. The objective function of the anomaly detection learning model 14a is expressed by the following equations (5) and (6). The learning parameters of the anomaly detection learning model 14a are the parameters of the neural networks f, g, and w.
 (Equations (5) and (6) are rendered as images in the original document and are not reproduced here.)
 Here, \tilde{pAUC} in equation (5) is the function obtained by replacing the indicator function I of \hat{pAUC} in equation (6) with a differentiable sigmoid function. Q denotes normal and anomalous data randomly sampled from the same related dataset as S⁻. R is the set of false detection rate ranges specified by the user in advance. By optimizing the objective of equations (5) and (6), learning is performed so that, for the anomaly detector s defined by the normal data S⁻ and the false detection rate range (α, β), the pAUC computed on Q is maximized. A stochastic gradient method is used for this learning.
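A minimal PyTorch sketch of this kind of sigmoid-relaxed pAUC surrogate for one sampled task: only the normal scores whose rank corresponds to an FPR inside [α, β] are used as thresholds, and the indicator comparisons are replaced by a sigmoid. The temperature parameter and the rank rounding are illustrative assumptions, not details given in the text:

```python
import math
import torch

def soft_pauc(scores_anom, scores_norm, alpha, beta, temperature=1.0):
    """Differentiable approximation of pAUC over the FPR range [alpha, beta]."""
    n_norm = scores_norm.shape[0]
    j_lo = int(math.ceil(alpha * n_norm))
    j_hi = max(int(math.floor(beta * n_norm)), j_lo + 1)
    # Normal scores sorted in descending order; the j-th largest one used as a
    # threshold corresponds to a false detection rate of roughly j / n_norm.
    sorted_norm, _ = torch.sort(scores_norm, descending=True)
    thresholds = sorted_norm[j_lo:j_hi]
    # Soft count of anomaly scores ranked above each threshold inside the band.
    diff = scores_anom.unsqueeze(1) - thresholds.unsqueeze(0)
    return torch.sigmoid(diff / temperature).mean()
```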
 Returning to FIG. 3, the storage unit 14 is realized by a semiconductor memory device such as a RAM (Random Access Memory) or a flash memory, or by a storage device such as a hard disk or an optical disk. The learned anomaly detection learning model 14a is stored in the storage unit 14 of this embodiment.
[Detection unit]
 The detection unit 20 includes a data input unit 21, a feature extraction unit 22, an anomaly detection unit 23, and a result output unit 24.
 The data input unit 21 is realized using an input device such as a keyboard or a mouse, and inputs various instruction information to the control unit and receives data in response to input operations by an operator. In this embodiment, the data input unit 21 receives input of the dataset of the target task, a user-specified false detection rate range, and the test data of the target task to be subjected to the anomaly detection process.
 Note that this information may be input to the detection unit 20 from an external server device or the like via a communication control unit (not shown) implemented by a NIC or the like. The data input unit 21 may also be the same hardware as the learning data input unit 11.
 The control unit is realized using a CPU or the like that executes a processing program, and includes the feature extraction unit 22 and the anomaly detection unit 23.
 Like the feature extraction unit 12 of the learning unit 10, the feature extraction unit 22 converts each sample of the acquired target dataset into a feature vector in preparation for the processing in the anomaly detection unit 23.
 The anomaly detection unit 23 functions as a detection unit. That is, the anomaly detection unit 23 inputs the normal data of the target task and the specified false detection rate range to the learned anomaly detection learning model 14a, and uses the output anomaly detector to detect anomalies in the data of the target task. Specifically, as described above, the anomaly detection unit 23 inputs the target dataset S⁻, which is not included in the related datasets, and the false detection rate range input (α, β), and obtains an anomaly detector that maximizes pAUC over the false detection rate range [α, β]. The anomaly detection unit 23 then uses the output anomaly detector to determine whether each test data item of the target task is normal or anomalous.
 The result output unit 24 is realized by a display device such as a liquid crystal display, a printing device such as a printer, an information communication device, or the like, and outputs the result of the anomaly detection process to the operator. For example, it outputs the determination result of whether the input test data of the target task is normal or anomalous.
[Anomaly detection process]
 Next, the anomaly detection process performed by the anomaly detection device 1 will be described with reference to FIGS. 5 and 6. The anomaly detection process of the anomaly detection device 1 includes the learning process by the learning unit 10 and the detection process by the detection unit 20.
[Learning process]
 FIG. 5 is a flowchart illustrating the learning processing procedure. The flowchart in FIG. 5 is started, for example, when the user inputs an operation instructing the start of the learning process.
 First, the learning data input unit 11 receives input of a plurality of related datasets, each containing normal data and anomalous data, and the false detection rate range of each dataset (step S1). Next, the feature extraction unit 12 converts each sample of the received related datasets into a feature vector (step S2).
 Next, the model learning unit 13 uses the input related datasets and false detection rate ranges to learn an anomaly detection learning model 14a that, when normal data and a specified false detection rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in that false detection rate range (step S3).
 That is, the model learning unit 13 generates the anomaly detection learning model 14a by learning using the input related datasets and false detection rate ranges. This anomaly detection learning model 14a outputs an anomaly detector when normal data and a specified false detection rate range are input. The anomaly detector outputs an anomaly score for the input data so as to maximize pAUC in the specified false detection rate range.
 The model learning unit 13 also stores the learned anomaly detection learning model 14a in the storage unit 14. This completes the series of learning processes.
[Detection process]
 Next, FIG. 6 is a flowchart illustrating the detection processing procedure. The flowchart in FIG. 6 is started, for example, when the user inputs an operation instructing the start of the estimation process.
 First, the data input unit 21 receives the normal data of the target task and a specified false detection rate range (step S11), and the feature extraction unit 22 converts each received sample (normal data) into a feature vector (step S12).
 Next, the anomaly detection unit 23 inputs the normal data of the target task and the specified false detection rate range to the learned anomaly detection learning model 14a, and uses the output anomaly detector to detect anomalies in the test data of the target task (step S13).
 That is, the anomaly detection unit 23 inputs the target dataset S⁻, which is not included in the related datasets, and the false detection rate range input (α, β), and obtains an anomaly detector that maximizes pAUC over the false detection rate range [α, β]. The anomaly detection unit 23 then inputs the test data of the target task to the output anomaly detector and obtains a determination result as to whether each test data item is normal or anomalous.
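Continuing the earlier sketches, applying the trained model to a new target task requires no re-training; the threshold choice below (the 95th percentile of the normal scores) is purely an illustrative assumption, since the text leaves the choice of the decision threshold t open:

```python
import torch

# Target-task inference: only normal data and a desired FPR range are needed.
z_target = model.embed(target_normal_features)                   # feature vectors of the target task's normal data
test_scores = model.score(test_features, z_target, 0.0, 0.05)    # detector specialized to the FPR range [0, 0.05]
normal_scores = model.score(target_normal_features, z_target, 0.0, 0.05)
threshold = torch.quantile(normal_scores, 0.95)                  # one possible way to set the threshold t
is_anomalous = test_scores > threshold                           # True = judged anomalous
```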
 The result output unit 24 then outputs the anomaly detection result, that is, the determination of whether each item is normal or anomalous (step S14). This completes the series of detection processes.
[Effects]
 As described above, in the anomaly detection device 1, the learning data input unit 11 acquires one or more datasets containing anomalous data of tasks related to the target task to be subjected to anomaly detection, together with the false detection rate range of each dataset. The model learning unit 13 uses the acquired datasets and false detection rate ranges to learn an anomaly detection learning model 14a that, when normal data and a specified false detection rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in that false detection rate range.
 Specifically, the model learning unit 13 learns the anomaly detection learning model 14a using a permutation-invariant neural network. The model learning unit 13 also learns the anomaly detection learning model 14a so as to output an anomaly detector based on a differentiable model.
 In this way, by learning with related datasets that contain anomalous data, the anomaly detection device 1 can obtain an anomaly detector that maximizes pAUC in a desired false detection rate range and detect anomalies in the data of the target task, even when only normal data is available for the target task. Moreover, once the anomaly detection learning model 14a has been generated by learning on the related datasets, an anomaly detector can be obtained for any normal dataset without re-training. Therefore, highly accurate anomaly detection is possible without re-training that requires expensive computation. For example, anomaly detection becomes possible even on low-resource computers, on which anomaly detection is generally difficult. In this way, learning that maximizes pAUC in anomaly detection becomes easy.
 In addition, the anomaly detection unit 23 inputs the normal data of the target task and the specified false detection rate range to the learned anomaly detection learning model 14a, and uses the output anomaly detector to detect anomalies in the test data of the target task. Thereby, even when only normal data is available for the target task, highly accurate anomaly detection that maximizes pAUC in a desired false detection rate range is possible.
[Program]
 A program in which the processing executed by the anomaly detection device 1 according to the above embodiment is described in a computer-executable language can also be created. In one embodiment, the anomaly detection device 1 can be implemented by installing, on a desired computer, an anomaly detection program that executes the above anomaly detection process as packaged software or online software. For example, by causing an information processing device to execute the above anomaly detection program, the information processing device can be made to function as the anomaly detection device 1. Information processing devices in this sense include mobile communication terminals such as smartphones, mobile phones, and PHSs (Personal Handyphone Systems), as well as slate terminals such as PDAs (Personal Digital Assistants). The functions of the anomaly detection device 1 may also be implemented on a cloud server.
 FIG. 7 is a diagram showing an example of a computer that executes the anomaly detection program. The computer 1000 includes, for example, a memory 1010, a CPU 1020, a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These components are connected by a bus 1080.
 The memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1031. The disk drive interface 1040 is connected to a disk drive 1041. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1041, for example. A mouse 1051 and a keyboard 1052, for example, are connected to the serial port interface 1050. A display 1061, for example, is connected to the video adapter 1060.
 Here, the hard disk drive 1031 stores, for example, an OS 1091, an application program 1092, a program module 1093, and program data 1094. Each piece of information described in the above embodiment is stored in, for example, the hard disk drive 1031 or the memory 1010.
 The anomaly detection program is stored in the hard disk drive 1031, for example, as a program module 1093 in which instructions to be executed by the computer 1000 are written. Specifically, a program module 1093 describing each process executed by the anomaly detection device 1 of the above embodiment is stored in the hard disk drive 1031.
 Data used for information processing by the anomaly detection program is stored as program data 1094 in, for example, the hard disk drive 1031. The CPU 1020 then reads the program module 1093 and the program data 1094 stored in the hard disk drive 1031 into the RAM 1012 as necessary and executes each of the procedures described above.
 Note that the program module 1093 and the program data 1094 related to the anomaly detection program are not limited to being stored in the hard disk drive 1031; for example, they may be stored in a removable storage medium and read by the CPU 1020 via the disk drive 1041 or the like. Alternatively, the program module 1093 and the program data 1094 related to the anomaly detection program may be stored in another computer connected via a network such as a LAN (Local Area Network) or a WAN (Wide Area Network) and read by the CPU 1020 via the network interface 1070.
 以上、本発明者によってなされた発明を適用した実施形態について説明したが、本実施形態による本発明の開示の一部をなす記述および図面により本発明は限定されることはない。すなわち、本実施形態に基づいて当業者等によりなされる他の実施形態、実施例および運用技術等は全て本発明の範疇に含まれる。 Although embodiments to which the invention made by the present inventor is applied have been described above, the present invention is not limited by the description and drawings that form part of the disclosure of the present invention by this embodiment. That is, all other embodiments, examples, operational techniques, etc. made by those skilled in the art based on this embodiment are included in the scope of the present invention.
1 Anomaly detection device
10 Learning unit
11 Learning data input unit
12 Feature extraction unit
13 Model learning unit
14 Storage unit
14a Anomaly detection learning model
20 Detection unit
21 Data input unit
22 Feature extraction unit
23 Anomaly detection unit
24 Result output unit

Claims (6)

  1.  An anomaly detection device comprising:
     an acquisition unit that acquires one or more datasets each including anomaly data of a task related to a target task to be processed for anomaly detection, and a false positive rate range of each dataset; and
     a learning unit that learns, using the acquired datasets and the false positive rate ranges, a model that, when normal data and a specified false positive rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in the specified false positive rate range.
  2.  The anomaly detection device according to claim 1, further comprising a detection unit that inputs normal data of the target task and a specified false positive rate range into the learned model, and detects an anomaly in data of the target task using the output anomaly detector.
  3.  The anomaly detection device according to claim 1, wherein the learning unit learns the model using a permutation-invariant neural network.
  4.  The anomaly detection device according to claim 1, wherein the learning unit learns the model so as to output the anomaly detector using a differentiable model.
  5.  An anomaly detection method executed by an anomaly detection device, the method comprising:
     an acquisition step of acquiring one or more datasets each including anomaly data of a task related to a target task to be processed for anomaly detection, and a false positive rate range of each dataset; and
     a learning step of learning, using the acquired datasets and the false positive rate ranges, a model that, when normal data and a specified false positive rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in the specified false positive rate range.
  6.  An anomaly detection program for causing a computer to execute:
     an acquisition step of acquiring one or more datasets each including anomaly data of a task related to a target task to be processed for anomaly detection, and a false positive rate range of each dataset; and
     a learning step of learning, using the acquired datasets and the false positive rate ranges, a model that, when normal data and a specified false positive rate range are input, outputs an anomaly detector that performs anomaly detection with pAUC maximized in the specified false positive rate range.
PCT/JP2022/026621 2022-07-04 2022-07-04 Abnormality detection device, abnormality detection method, and abnormal detection program WO2024009362A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/026621 WO2024009362A1 (en) 2022-07-04 2022-07-04 Abnormality detection device, abnormality detection method, and abnormal detection program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/026621 WO2024009362A1 (en) 2022-07-04 2022-07-04 Abnormality detection device, abnormality detection method, and abnormal detection program

Publications (1)

Publication Number Publication Date
WO2024009362A1 true WO2024009362A1 (en) 2024-01-11

Family

ID=89452938

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/026621 WO2024009362A1 (en) 2022-07-04 2022-07-04 Abnormality detection device, abnormality detection method, and abnormal detection program

Country Status (1)

Country Link
WO (1) WO2024009362A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017102540A (en) * 2015-11-30 2017-06-08 日本電信電話株式会社 Classification device, method, and program
JP2020170214A (en) * 2019-04-01 2020-10-15 株式会社東芝 Time series data analysis method, time-series data analyzer and computer program
WO2021070394A1 (en) * 2019-10-11 2021-04-15 日本電信電話株式会社 Learning device, classification device, learning method, and learning program
WO2021199226A1 (en) * 2020-03-31 2021-10-07 日本電気株式会社 Learning device, learning method, and computer-readable recording medium

Similar Documents

Publication Publication Date Title
JP7223839B2 (en) Computer-implemented methods, computer program products and systems for anomaly detection and/or predictive maintenance
JP6594044B2 (en) Method for detecting anomalies in real time series
KR101711936B1 (en) Generalized pattern recognition for fault diagnosis in machine condition monitoring
EP3385889A1 (en) Abnormality detection system, abnormality detection method, abnormality detection program, and method for generating learned model
JP2019061565A (en) Abnormality diagnostic method and abnormality diagnostic device
Cai et al. A new fault detection method for non-Gaussian process based on robust independent component analysis
JP2015170121A (en) Abnormality diagnosis device and program
EP2342603A2 (en) Method and apparatus for creating state estimation models in machine condition monitoring
CN107942956A (en) Information processor, information processing method, message handling program and recording medium
JP2018190127A (en) Determination device, analysis system, determination method and determination program
JP6866930B2 (en) Production equipment monitoring equipment, production equipment monitoring method and production equipment monitoring program
Rajaraman et al. A methodology for fault detection, isolation, and identification for nonlinear processes with parametric uncertainties
CN114584377A (en) Flow anomaly detection method, model training method, device, equipment and medium
JP2020187667A (en) Information processing apparatus and information processing method
CN112380073B (en) Fault position detection method and device and readable storage medium
WO2024009362A1 (en) Abnormality detection device, abnormality detection method, and abnormal detection program
JP7331940B2 (en) LEARNING DEVICE, ESTIMATION DEVICE, LEARNING METHOD, AND LEARNING PROGRAM
CN115932144A (en) Chromatograph performance detection method, device, equipment and computer medium
US11593700B1 (en) Network-accessible service for exploration of machine learning models and results
Zhu et al. Generic process visualization using parametric t-SNE
JP7331938B2 (en) LEARNING DEVICE, ESTIMATION DEVICE, LEARNING METHOD, AND LEARNING PROGRAM
Parpoula et al. On change-point analysis-based distribution-free control charts with Phase I applications
JP7347547B2 (en) Event analysis support device, event analysis support method, and program
WO2023223510A1 (en) Learning device, learning method, and learning program
JP7306460B2 (en) Adversarial instance detection system, method and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22950157

Country of ref document: EP

Kind code of ref document: A1