KR20220073652A

KR20220073652A - Method and apparatus for anomaly detection by reward-based retraining of GAN

Info

Publication number: KR20220073652A
Application number: KR1020210156841A
Authority: KR
Inventors: 김용국
Original assignee: 세종대학교산학협력단
Priority date: 2020-11-26
Filing date: 2021-11-15
Publication date: 2022-06-03

Abstract

The present invention discloses a method and apparatus for anomaly detection by reward-based retraining of a generative adversarial network (GAN). According to the present invention, there is provided an anomaly detection apparatus comprising a processor and a memory connected to the processor, wherein the memory stores program instructions executable by the processor such that: a generator receives a latent vector and generates a virtual image; a discriminator receives a mini-batch of a training dataset together with the virtual image and outputs a probability value; whether the reward increases is determined by comparing the validation loss, given by the loss function of the generator and the loss function of the discriminator, with an initial best validation loss; and, when the reward is determined to increase, the mini-batch is fed to the discriminator again to perform retraining.

Description

Method and apparatus for anomaly detection by reward-based retraining of GAN

The present invention relates to a method and apparatus for anomaly detection by reward-based retraining of a generative adversarial network (GAN).

Video anomaly detection is a technique for detecting abnormal actions or situations that appear in a sequence of frames. Because such anomalies occur infrequently, it is difficult to obtain a diverse set of anomalous data.

In addition, because footage such as CCTV video can contain many kinds of abnormal situations, labeling them is difficult or time-consuming.

For example, when detecting scenes of a person riding a bicycle in a park, simply learning from the raw video frames rarely achieves reasonable performance for classifying such scenes. Prior studies have therefore tried to maximize the training features within a frame by using optical flow vector estimation or filters designed to detect motion across consecutive frames.

However, these approaches require the developer to find parameters optimized for a specific situation, and they cope poorly with diverse situations.

Korean Laid-open Patent Publication No. 10-2020-0085490

To solve the above problems of the prior art, the present invention proposes a method and apparatus for anomaly detection by reward-based retraining of a generative adversarial network that can detect anomalies efficiently even in an environment where training data is insufficient.

To achieve the above object, according to one embodiment of the present invention, there is provided an anomaly detection apparatus using reward-based retraining of a generative adversarial network, comprising: a processor; and a memory connected to the processor, wherein the memory stores program instructions executable by the processor such that a generator receives a latent vector and generates a virtual image, a discriminator receives a mini-batch of a training dataset and the virtual image and outputs a probability value, whether the reward increases is determined by comparing the validation loss, given by the loss function of the generator and the loss function of the discriminator, with an initial best validation loss, and, when the reward is determined to increase, the mini-batch is fed to the discriminator again to perform retraining.

The loss function of the discriminator may include a real loss function and a fake loss function, and the program instructions may feed the mini-batch to the discriminator again for retraining when the comparison between the initial best validation loss and the sum of the generator's loss function, the real loss function, and the fake loss function yields a value greater than a preset reward constant.

In the retraining with the mini-batch, the initial best validation loss may be replaced by the sum of the generator's loss function, the real loss function, and the fake loss function.

The program instructions may repeat the retraining with the mini-batch until the reward no longer increases.

The discriminator may be a network in which a plurality of convolutional layers and an LSTM are combined.

After the loss functions of the generator and the discriminator have been optimized through the retraining, the program instructions may receive a test sample and calculate an anomaly score for the test sample.

According to another aspect of the present invention, there is provided an anomaly detection apparatus using reward-based retraining of a generative adversarial network, comprising: a generator that receives a latent vector and generates a virtual image; a discriminator that receives a mini-batch of a training dataset and the virtual image and outputs a probability value; and a reward increase determination unit that determines whether the reward increases by comparing the validation loss, given by the loss function of the generator and the loss function of the discriminator, with an initial best validation loss, wherein the discriminator and the reward increase determination unit are defined as an inner loop for anomaly detection, and, when the reward increase determination unit determines that the reward increases, the mini-batch is fed to the discriminator again to perform retraining.

According to yet another aspect of the present invention, there is provided a method for detecting anomalies through reward-based retraining of a generative adversarial network in an apparatus including a processor and a memory, the method comprising: generating, by a generator, a virtual image from a latent vector received as input; receiving, by a discriminator, a mini-batch of a training dataset and the virtual image, and outputting a probability value; determining whether the reward increases by comparing the validation loss, given by the loss function of the generator and the loss function of the discriminator, with an initial best validation loss; and, when the reward is determined to increase, feeding the mini-batch to the discriminator again to perform retraining.

According to yet another aspect of the present invention, there is provided a computer-readable recording medium storing a program for performing the above method.

According to the present invention, anomaly detection efficiency can be improved by retraining on the samples in a given dataset that are important for anomaly detection.

FIG. 1 is a diagram illustrating a network configuration for anomaly detection according to a preferred embodiment of the present invention.
FIG. 2 is a diagram illustrating an algorithm according to the present embodiment.
FIG. 3 is a diagram illustrating a detailed configuration of the network according to the present embodiment.
FIG. 4 is a diagram showing the number of retraining iterations for each mini-batch according to the present embodiment.
FIG. 5 shows the losses of the outer loop and the inner loop according to the present embodiment.
FIG. 6 shows the anomaly detection efficiency according to the present embodiment.

Since the present invention can be modified in various ways and can have various embodiments, specific embodiments are illustrated in the drawings and described in detail.

However, this is not intended to limit the present invention to specific embodiments, and it should be understood to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

FIG. 1 is a diagram illustrating a network configuration for anomaly detection according to a preferred embodiment of the present invention.

As shown in FIG. 1, the anomaly detection network according to the present embodiment may be a generative adversarial network (GAN) in which a generator 100 and a discriminator 102 learn in opposition to each other.

The generator 100 takes a latent vector as input and generates a virtual image, and the discriminator 102 takes a real image and the virtual image as input and outputs a probability value. Here, the probability value is 1 for real and 0 for fake.

As training is repeated, the distribution of the images output by the generator 100 converges to the distribution of real images.

The network according to the present embodiment may include an outer loop (external loop) that contains both the generator 100 and the discriminator 102, and an inner loop (internal loop) that contains the discriminator 102 and a reward increase determination unit 104, which takes the output of the discriminator 102 as input and determines whether the reward increases.

According to the present embodiment, anomaly detection is performed through reward-driven retraining in the inner loop.

In general, given a training dataset, each sample (mini-batch) of the dataset is fed to the network during training, and each training sample is used the same number of times over the entire training period (one epoch).

However, some of the training samples encountered during network training may be particularly useful for anomaly detection.

With this in mind, the present embodiment introduces the concept of reward into the GAN so that useful training samples can be used repeatedly.

In the present embodiment, starting from an initial best validation loss, consider an iteration t within an epoch n that uses a training-sample mini-batch M. If the iteration validation loss of the network optimized in this iteration is the best so far (that is, it improves on the current best validation loss), the same mini-batch M is used repeatedly in the inner loop.

The same mini-batch M is then used as the input of the discriminator 102 until the validation result no longer improves, that is, until the reward no longer increases.

In this way, the network according to the present embodiment can focus on the samples in a given dataset that are important for anomaly detection.

This can be regarded as a process of reinforcing the network weights through important samples.

FIG. 2 is a diagram illustrating an algorithm according to the present embodiment, and FIG. 3 is a diagram illustrating a detailed configuration of the network according to the present embodiment.

As shown in FIGS. 2 and 3, the network according to the present embodiment may be a network that combines a GAN with a long short-term memory (LSTM).

As described above, the GAN according to the present embodiment includes the generator 100 and the discriminator 102, and through training it maximizes/minimizes the binary cross-entropy loss as follows.

Here, x is a training sample, D is the discriminator, G is the generator, and z is the latent vector.
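The objective itself is not reproduced in this text. Assuming the embodiment follows the standard GAN minimax formulation (an assumption, since only the symbols x, D, G, and z are defined here), it would take the following form, where the distributions p_data and p_z are notation introduced for this sketch:

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Under this reading, the discriminator D maximizes V by pushing D(x) toward 1 and D(G(z)) toward 0, while the generator G minimizes V by pushing D(G(z)) toward 1.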

Referring to FIG. 2, the generator receives a latent vector z and generates a virtual image G(z), and the discriminator receives a mini-batch of the training dataset together with the virtual image and outputs a probability value.

The initial validation loss is set as the initial best validation loss.

According to the present embodiment, whether the reward increases is determined by comparing the validation loss, given by the loss function of the generator 100 and the loss function of the discriminator 102, with the initial best validation loss, and, when the reward is determined to increase, the mini-batch is fed to the discriminator 102 again to perform retraining.

Here, the output of the generator 100 is expressed as G(z), and the loss function of the generator 100 (g_loss) is expressed as follows.

The loss function of the discriminator 102 includes a real loss function (d_loss_real) and a fake loss function (d_loss_fake), and the real loss function is expressed as follows.

The fake loss function is expressed as follows.
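The three loss expressions are not reproduced in this text. The following is a minimal sketch, assuming each one is a standard binary cross-entropy term against the targets described earlier (1 for real, 0 for fake). The helper names `bce` and `gan_losses`, the clamp `EPS`, and the non-saturating form of the generator loss are assumptions made for illustration and are not taken from the patent.

```python
import math

EPS = 1e-7  # assumed numerical guard so log() never receives exactly 0 or 1


def bce(probs, target):
    """Mean binary cross-entropy of probabilities against one constant target (0 or 1)."""
    total = 0.0
    for p in probs:
        p = min(max(p, EPS), 1.0 - EPS)
        total += -(target * math.log(p) + (1 - target) * math.log(1.0 - p))
    return total / len(probs)


def gan_losses(d_real, d_fake):
    """d_real holds D(x) for real samples; d_fake holds D(G(z)) for generated samples."""
    d_loss_real = bce(d_real, 1)  # discriminator should output 1 for real images
    d_loss_fake = bce(d_fake, 0)  # discriminator should output 0 for virtual images
    g_loss = bce(d_fake, 1)       # generator wants its images scored as real (assumed form)
    return g_loss, d_loss_real, d_loss_fake
```

When the discriminator is confident and correct (for example D(x) = 0.9 and D(G(z)) = 0.1), both discriminator terms are small and the generator term is large, which matches the adversarial roles described above.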

As shown in FIG. 2, when the comparison between the initial best validation loss and the sum of the generator's loss function, the real loss function, and the fake loss function (g_loss+d_loss_real+d_loss_fake) yields a value greater than a preset reward constant, it is decided to feed the mini-batch to the discriminator 102 again for retraining.

On retraining, the initial best validation loss is replaced by the newly calculated sum of the loss functions of the generator 100 and the discriminator 102.
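The control flow of the reward-driven retraining described above can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of FIG. 2 itself: `train_step` is a hypothetical callback that updates the networks on one mini-batch and returns the three losses, `max_repeats` is a safety cap added for the sketch, and the reward test is read as "the best loss minus the new total exceeds the reward constant".

```python
def reward_driven_retraining(mini_batches, train_step, reward_constant,
                             best_loss, max_repeats=50):
    """Run one epoch and return (best_loss, number of uses of each mini-batch).

    train_step(batch) is a hypothetical callback returning
    (g_loss, d_loss_real, d_loss_fake) after one update on `batch`.
    """
    usage = []
    for batch in mini_batches:              # outer loop: next mini-batch
        repeats = 0
        while repeats < max_repeats:        # inner loop: reuse the same mini-batch
            g_loss, d_loss_real, d_loss_fake = train_step(batch)
            total = g_loss + d_loss_real + d_loss_fake
            repeats += 1
            # Assumed reading of the reward test: the improvement over the
            # best validation loss must exceed the preset reward constant.
            if best_loss - total > reward_constant:
                best_loss = total           # best validation loss is replaced
            else:
                break                       # reward no longer increases
        usage.append(repeats)
    return best_loss, usage
```

Because the inner loop exits as soon as the reward stops increasing, different mini-batches end up being used a different number of times within the same epoch.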

As shown in FIG. 4, while the outer loop iterates, the inner loop may repeat a given mini-batch ten or more times, which means that each mini-batch can be used a different number of times.

In addition, as shown in FIG. 5, the total loss decreases as the outer loop iterates, and within a single outer-loop iteration the sum of the loss functions of the generator and the discriminator gradually decreases as the inner loop iterates.

Here, the same mini-batch is used for retraining only while the sum of the loss functions of the generator and the discriminator remains equal to or greater than the preset reward constant.

As shown in FIG. 3, the discriminator 102 according to the present embodiment may include a plurality of convolutional layers, batch normalization layers, and Leaky Rectified Linear Units (LeakyReLU).

The discriminator 102 receives, at time t, a mini-batch that combines the output of the generator 100 with a real image taken from the training samples, and each convolutional layer extracts features of dimension m×m.

The LSTM 104 captures the temporal dependencies of the features output by the discriminator 102, which gives better performance in distinguishing a sequence of video frames.

In the present embodiment, the output of the generator 100 is not used directly to calculate the anomaly score. After the network according to the present embodiment has been trained for a sufficient number of epochs, the discriminator 102 is used as a direct classifier for anomaly detection.

The anomaly score of the network according to the present embodiment is defined as the negative of the probability output of the discriminator 102.

As shown in FIG. 2, after the loss functions of the generator and the discriminator have been optimized through retraining, a test sample is received and its anomaly score is calculated:

$S(x) = -D(x)$

Here, S is the anomaly score and x is the input sample.
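The scoring step can be illustrated with a short sketch. It assumes only what the text states, namely that the score is the negative of the discriminator's probability output. The function names and the decision threshold are hypothetical and are introduced here for illustration.

```python
def anomaly_score(d_prob):
    """Anomaly score S(x) = -D(x), the negated discriminator probability."""
    return -d_prob


def flag_anomalies(d_probs, threshold):
    """Mark samples whose score exceeds `threshold` (a hypothetical cut-off)."""
    return [anomaly_score(p) > threshold for p in d_probs]
```

With a threshold of -0.5, a sample that the discriminator scores at 0.9 (likely normal) is not flagged, while a sample scored at 0.2 is flagged, because a lower probability gives a higher anomaly score.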

Anomaly detection according to the present embodiment may be performed in an apparatus including a processor and a memory.

The processor may include a central processing unit (CPU) capable of executing a computer program, or another such device, for example a virtual machine.

The memory may include a non-volatile storage device such as a fixed hard drive or a removable storage device. The removable storage device may include a compact flash unit, a USB memory stick, and the like. The memory may also include volatile memory such as various kinds of random access memory.

The memory stores program instructions that perform the processes described above, such as generating the virtual image, outputting the probability value, and determining whether the reward increases.

FIG. 6 shows the anomaly detection efficiency according to the present embodiment.

As shown in FIG. 6, the anomaly detection according to the present embodiment achieves higher performance on various datasets than other conventional studies.

The embodiments of the present invention described above are disclosed for purposes of illustration. Those of ordinary skill in the art will be able to make various modifications, changes, and additions within the spirit and scope of the present invention, and such modifications, changes, and additions should be regarded as falling within the scope of the following claims.

Claims

1. An anomaly detection apparatus using reward-based retraining of a generative adversarial network, comprising:
a processor; and
a memory coupled to the processor,
wherein the memory stores program instructions executable by the processor such that:
a generator receives a latent vector as input and generates a virtual image;
a discriminator receives a mini-batch of a training dataset and the virtual image and outputs a probability value;
whether the reward increases is determined by comparing the validation loss, given by the loss function of the generator and the loss function of the discriminator, with an initial best validation loss; and
when the reward is determined to increase, the mini-batch is fed to the discriminator again to perform retraining.

2. The apparatus of claim 1,
wherein the loss function of the discriminator includes a real loss function and a fake loss function, and
wherein the program instructions determine that retraining is to be performed by feeding the mini-batch to the discriminator again when the comparison between the initial best validation loss and the sum of the loss function of the generator, the real loss function, and the fake loss function yields a value greater than a preset reward constant.

3. The apparatus of claim 2,
wherein, in the retraining with the mini-batch, the initial best validation loss is replaced by the sum of the loss function of the generator, the real loss function, and the fake loss function.

4. The apparatus of claim 1,
wherein the program instructions repeat the retraining with the mini-batch until the reward no longer increases.

5. The apparatus of claim 1,
wherein the discriminator is a network in which a plurality of convolutional layers and an LSTM are combined.

6. The apparatus of claim 1,
wherein, after the loss functions of the generator and the discriminator have been optimized through the retraining, the program instructions receive a test sample and calculate an anomaly score of the test sample.

7. An anomaly detection apparatus using reward-based retraining of a generative adversarial network, comprising:
a generator that receives a latent vector and generates a virtual image;
a discriminator that receives a mini-batch of a training dataset and the virtual image and outputs a probability value; and
a reward increase determination unit that determines whether the reward increases by comparing the validation loss, given by the loss function of the generator and the loss function of the discriminator, with an initial best validation loss,
wherein the discriminator and the reward increase determination unit are defined as an inner loop for anomaly detection, and, when the reward increase determination unit determines that the reward increases, the mini-batch is fed to the discriminator again to perform retraining.

8. A method for detecting anomalies through reward-based retraining of a generative adversarial network in an apparatus comprising a processor and a memory, the method comprising:
generating, by a generator, a virtual image from a latent vector received as input;
receiving, by a discriminator, a mini-batch of a training dataset and the virtual image, and outputting a probability value;
determining whether the reward increases by comparing the validation loss, given by the loss function of the generator and the loss function of the discriminator, with an initial best validation loss; and
performing retraining by feeding the mini-batch to the discriminator again when the reward is determined to increase.

9. The method of claim 8,
wherein the loss function of the discriminator includes a real loss function and a fake loss function, and
wherein the determining of whether the reward increases comprises determining that retraining is to be performed by feeding the mini-batch to the discriminator again when the comparison between the initial best validation loss and the sum of the loss function of the generator, the real loss function, and the fake loss function yields a value greater than a preset reward constant.

10. A computer-readable recording medium storing a program for performing the method according to claim 8.