KR102546822B1

KR102546822B1 - Unsupervised outlier detection apparatus and method

Info

Publication number: KR102546822B1
Application number: KR1020220168919A
Authority: KR
Inventors: 김준모; 안재성
Original assignee: 국방과학연구소; 한국과학기술원
Priority date: 2022-12-06
Filing date: 2022-12-06
Publication date: 2023-06-22

Abstract

An unsupervised outlier detection apparatus includes: an embedding module that generates input information from input data; a masking module that generates a mask for limiting the input information and performs masking, which limits the input information using the mask and noise; a restoration module that outputs restored data, reconstructing the original data from the masked input information; and a refinement module that receives the data, the input information, the mask, and the noise and outputs a refinement result that refines the parts the restoration module cannot restore. The invention therefore uses no additional information about normal and abnormal data and is robust to hyperparameters.

Description

Unsupervised outlier detection device and method {UNSUPERVISED OUTLIER DETECTION APPARATUS AND METHOD}

The present invention relates to an unsupervised outlier detection apparatus and method, and more particularly, to an unsupervised outlier detection apparatus and method in which a model trained on a training dataset consisting only of normal data acquires the ability to distinguish normal data from abnormal data.

Methods used for outlier detection include generative model-based methods, representation learning-based methods, and additional information-based methods.

Generative model-based methods assume that generative models such as autoencoders learn important characteristics of the distribution of normal data while being trained to generate normal data. Models trained in this way are expected to generate normal data well and abnormal data poorly, so that abnormal data incurs a relatively large restoration loss. These methods detect outliers through the resulting difference in restoration loss. However, generative models generalize better than expected and restore abnormal data well too, which degrades outlier detection. For this reason, many methods that limit the capacity of the generative model using an adversarial loss have recently been proposed.

Representation learning-based methods find the smallest hypersphere that encloses only normal data and detect outliers according to whether new test data falls inside the hypersphere. Because the hypersphere must contain normal data only and exclude abnormal data, finding it necessarily requires tuning a hyperparameter that determines the extent of the hypersphere.

Additional information-based methods exploit prior knowledge about the differences between normal data and abnormal data.

Both generative model-based methods that use an adversarial loss and representation learning-based methods that seek a hypersphere enclosing only normal data rely on loss functions with conflicting objectives. Training with conflicting loss functions inevitably produces results that vary greatly with the value of the hyperparameter that balances them. Yet, by the nature of unsupervised outlier detection, abnormal data cannot be used during training, so there is no way to know which hyperparameter values make the model detect outliers well. In other words, generative model-based and representation learning-based methods are sensitive to hyperparameters.

Unsupervised outlier detection based on additional information is inevitably biased toward the information used, and its performance drops sharply when that information does not help distinguish normal data from abnormal data. Moreover, in unsupervised outlier detection, additional information about the difference between normal and abnormal data is more often than not unavailable during training, so it is more appropriate not to use additional information at all.

Therefore, an unsupervised outlier detection model must be robust to hyperparameters and must not use additional information about normal and abnormal data.

The technical problem to be solved by the present invention is to provide an unsupervised outlier detection apparatus and method that do not use additional information about normal and abnormal data and are robust to hyperparameters.

An unsupervised outlier detection apparatus according to an embodiment of the present invention includes: an embedding module that generates input information from input data; a masking module that generates a mask for limiting the input information and performs masking that limits the input information using the mask and noise; a restoration module that outputs restored data, reconstructing the original data from the input information masked through the masking; and a refinement module that receives the data, the input information, the mask, and the noise and outputs a refinement result that refines the part the restoration module cannot restore.

The masking module may receive the data, output an output value of a masking network, and generate the mask by applying a sigmoid function to the sum of that output value and a bias.

The masking module may find the bias by the bisection method so that the mean of the mask equals the information restriction degree.

L information restriction degrees may be used, spaced at intervals of 1/(L-1), where L is a natural number of 2 or more.

The masking may be performed through the Hadamard product using all of the masks for the L information restriction degrees, so that a total of L pieces of masked input information are generated.

The restoration module may output L pieces of restored data corresponding to the masked input information.

The refinement module may output L refinement results corresponding to the L pieces of restored data, calculate a refinement loss as the mean squared error between the data and the sum of the refinement result and the restored data, and be trained using the refinement loss.

The restoration module may calculate a restoration loss as the mean squared error between the restored data and the data, and the embedding module, the masking module, and the restoration module may be trained using the restoration loss.

An unsupervised outlier detection method according to another embodiment of the present invention, performed by an unsupervised outlier detection apparatus including an embedding network, a masking network, a restoration network, and a refinement network, includes: generating input information from input data; generating a mask for limiting the input information and performing masking that limits the input information using the mask and noise; outputting restored data, reconstructing the original data from the input information masked through the masking; and receiving the data, the input information, the mask, and the noise and outputting a refinement result that refines the part the restoration module cannot restore.

The mask may be generated by applying a sigmoid function to the sum of a bias and the output value that the masking network produces from the data.

The bias may be found by the bisection method so that the mean of the mask equals the information restriction degree.

L pieces of restored data may be output corresponding to the masked input information.

L refinement results may be output corresponding to the L pieces of restored data, a refinement loss may be calculated as the mean squared error between the data and the sum of the refinement result and the restored data, and the refinement network may be trained using the refinement loss.

A restoration loss may be calculated as the mean squared error between the restored data and the data, and the embedding network, the masking network, and the restoration network may be trained using the restoration loss.

The unsupervised outlier detection apparatus and method according to embodiments of the present invention do not use additional information about normal and abnormal data and are robust to hyperparameters.

FIG. 1 is a block diagram illustrating an unsupervised outlier detection apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating an unsupervised outlier detection method according to an embodiment of the present invention.
FIG. 3 shows benchmark datasets used in unsupervised outlier detection.
FIG. 4 shows the average AUROC (Area Under ROC) performance on the MNIST, FMNIST, and CIFAR datasets of FIG. 3.
FIG. 5 shows the average AUROC performance on the MVTecAD dataset of FIG. 3.
FIG. 6 shows AUROC performance on the CIFAR dataset according to the number (L) of information restriction degrees used.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that a person of ordinary skill in the art to which the invention pertains can readily carry them out. The present invention may be embodied in many different forms and is not limited to the embodiments described here.

To describe the present invention clearly, parts unrelated to the description are omitted, and identical or similar components are given the same reference numerals throughout the specification.

In addition, throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

FIG. 1 is a block diagram illustrating an unsupervised outlier detection apparatus according to an embodiment of the present invention.

Referring to FIG. 1, the unsupervised outlier detection apparatus 100 according to an embodiment of the present invention uses data consisting only of normal data so that the trained model acquires the ability to distinguish normal data from abnormal data. The unsupervised outlier detection apparatus 100 may detect outliers through the difference in the degree of restoration that arises when the original data is restored from limited information. To this end, the unsupervised outlier detection apparatus 100 may include an embedding module 110, a masking module 120, a restoration module 130, and a refinement module 140.

The embedding module 110 may generate input information from input data. The embedding module 110 may consist of an information generation network, which receives the data and generates the input information such that the restoration module 130 can restore the original data as fully as possible even after masking by the masking module 120. The embedding module 110 may be referred to as an embedding network.

The masking module 120 may consist of a masking network that generates a mask for limiting the input information generated by the embedding module 110 and performs masking on the input information. The masking module 120 may receive the data and generate the mask to be used for masking. Masking refers to the process of limiting the input information using the mask and noise.

The restoration module 130 may restore the original data from the input information limited through masking. The restoration module 130 may consist of a restoration network, which receives the limited input information and outputs an estimate of the original data.

All networks used for limiting and restoring the input information, namely the embedding network, the masking network, and the restoration network, may be trained with the restoration loss alone. Restoring data from limited input information succeeds for normal data and fails for abnormal data, which makes outlier detection possible.

The refinement module 140 addresses the unfairness caused by a masking scheme that limits information to a fixed degree without regard to the characteristics of the data, and may consist of a refinement network. The refinement module 140 receives the information produced in the preceding steps (the data, the input information, the mask, and the noise), predicts the output of the restoration module 130 (its estimate of the original data), and predicts what value must be added to that output to obtain the original data. By adding the predicted value to the output of the restoration module 130, the refinement module 140 can fill in the part that could not be restored for lack of information. The refinement network may be trained with the refinement loss.

Hereinafter, the unsupervised outlier detection method performed by the unsupervised outlier detection apparatus 100 is described in more detail with reference to FIG. 2.

FIG. 2 is a block diagram illustrating an unsupervised outlier detection method according to an embodiment of the present invention.

Referring to FIG. 2, a multi-level masking process is performed (S110). The multi-level masking process may include an input information generation step performed by the embedding module 110, and a mask generation step and a masking step performed by the masking module 120.

In the input information generation step, the embedding module 110 may receive the data and generate the input information.

In the mask generation step, the masking module 120 may receive the data and output the output value of the masking network, that is, the value of the masking network function evaluated on the data. As shown in Equations 1 and 2, the masking module 120 may generate the mask using the output value of the masking network and the given limit on the input information (the information restriction degree).

That is, the mask may be generated by applying the sigmoid function to the sum of the output value of the masking network and a bias. The masking module 120 may find the bias by the bisection method so that the mean of the mask equals the information restriction degree (masking level). In other words, the bias sets the mean of the mask to the prescribed information restriction degree. Because the sigmoid function is monotonically increasing, the mean of the mask increases monotonically with the bias, so the bisection method can be applied.
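The bias search described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the bias is taken to be a single scalar shared by all mask elements, and the names `find_bias`, `make_mask`, the search bracket `[-50, 50]`, and the sample `logits` are all hypothetical.

```python
import math


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def mask_mean(logits, bias):
    """Mean of sigmoid(logit + bias) over all mask elements."""
    return sum(sigmoid(v + bias) for v in logits) / len(logits)


def find_bias(logits, level, lo=-50.0, hi=50.0, iters=60):
    """Bisection on the bias: the mask mean grows monotonically with the
    bias, so halving the bracket converges to the bias whose mask mean
    equals `level` (assumed bracket [-50, 50], an illustrative choice)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mask_mean(logits, mid) < level:
            lo = mid  # mean too small: the bias must be larger
        else:
            hi = mid  # mean too large (or equal): the bias must be smaller
    return 0.5 * (lo + hi)


def make_mask(logits, level):
    """Mask = sigmoid(masking-network output + bias found by bisection)."""
    bias = find_bias(logits, level)
    return [sigmoid(v + bias) for v in logits]


# Hypothetical masking-network outputs for one piece of data.
logits = [2.0, -1.0, 0.5, 3.0, -2.5, 0.0]
mask = make_mask(logits, 0.4)
```

For the extreme levels 0 and 1 no finite bias reaches the target exactly; in this sketch the search simply runs toward the edge of the bracket, which yields a mask that is numerically all zeros or all ones.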

L information restriction degrees (masking levels) may be used, spaced at intervals of 1/(L-1): [0, 1/(L-1), ..., 1], where L is a natural number of 2 or more. That is, a total of L masks may be used to perform outlier detection on a single piece of data. FIG. 2 shows the masks for an information restriction degree of 0, of 1, and of an intermediate level.
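As a small illustration of the level grid [0, 1/(L-1), ..., 1], the following sketch enumerates the L information restriction degrees. The function name `masking_levels` is hypothetical and chosen only for this example.

```python
def masking_levels(num_levels):
    """Return L evenly spaced information restriction degrees from 0 to 1."""
    if num_levels < 2:
        raise ValueError("L must be a natural number of 2 or more")
    step_count = num_levels - 1
    return [k / step_count for k in range(num_levels)]


levels = masking_levels(5)
print(levels)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```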

Because the unsupervised outlier detection apparatus 100 according to an embodiment of the present invention generates masks with a prescribed information restriction degree (masking level), it avoids the hyperparameter sensitivity that conventional approaches incur by relying on an adversarial loss.

In the masking step, the masking module 120 may mask the input information generated by the embedding module 110, as in Equation 4, using the mask with the given information restriction degree and noise sampled uniformly as in Equation 3.

Here, Equation 4 yields the masked (limited) input information, and the product it uses is the Hadamard product. That is, masking is performed through the Hadamard product using all of the masks for the L information restriction degrees, so that a total of L pieces of masked input information are generated.
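The masking step can be sketched as follows. The exact form of Equations 3 and 4 is not recoverable from the text, so this example assumes one common construction: the masked information is `mask * info + (1 - mask) * noise` element by element, with noise drawn uniformly from [0, 1). Both the formula and the names `apply_mask` and `multilevel_masking` are assumptions made for illustration.

```python
import random


def apply_mask(info, mask, noise):
    """Element-wise (Hadamard) masking under the assumed form
    mask * info + (1 - mask) * noise."""
    return [m * z + (1.0 - m) * e for z, m, e in zip(info, mask, noise)]


def multilevel_masking(info, masks, rng):
    """Produce one masked copy of `info` per mask, with fresh uniform
    noise for each level (assumed range [0, 1))."""
    masked = []
    for mask in masks:
        noise = [rng.random() for _ in info]
        masked.append(apply_mask(info, mask, noise))
    return masked


rng = random.Random(0)
info = [0.2, -1.0, 3.5]                      # hypothetical input information
masks = [[0.0] * 3, [0.5] * 3, [1.0] * 3]    # L = 3 masks with means 0, 0.5, 1
masked = multilevel_masking(info, masks, rng)
```

Under this assumed form, a mask of all ones passes the input information through unchanged, and a mask of all zeros replaces it entirely with noise, which matches the reading that a larger mask mean retains more information.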

Because the input information is limited through the Hadamard product with the mask, the mean of the mask can be regarded as the mask's information restriction degree. Therefore, as Equation 2 shows, the restriction degree of the mask can be controlled by finding the bias value that brings the mean of the mask to the information restriction degree and adding it. The bias can be found by the bisection method. Because the bisection method is not differentiable, backpropagation to the output of the masking network may be applied as in Equation 5, under the assumption that the bisection has found an appropriate bias.
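One way to see how gradients can pass through a bias found by a non-differentiable search is implicit differentiation of the mean constraint. The following is a sketch in assumed notation and may not coincide with the original Equation 5: $s = (s_1, \dots, s_n)$ stands for the masking network output, $b$ for a scalar bias shared by all elements, $\sigma$ for the sigmoid function, $m_i$ for the mask elements, and $\alpha$ for the information restriction degree. Assuming the bisection has found $b$ so that the constraint holds exactly:

```latex
\begin{aligned}
m_i &= \sigma(s_i + b), \qquad \frac{1}{n}\sum_{i=1}^{n} m_i = \alpha, \\
0 &= \frac{\partial}{\partial s_k}\sum_{i=1}^{n}\sigma(s_i + b)
   = \sigma'(s_k + b) + \frac{\partial b}{\partial s_k}\sum_{j=1}^{n}\sigma'(s_j + b), \\
\frac{\partial b}{\partial s_k} &= -\frac{\sigma'(s_k + b)}{\sum_{j=1}^{n}\sigma'(s_j + b)}, \\
\frac{\partial m_i}{\partial s_k} &= \sigma'(s_i + b)\left(\delta_{ik} - \frac{\sigma'(s_k + b)}{\sum_{j=1}^{n}\sigma'(s_j + b)}\right),
\qquad \sigma'(u) = \sigma(u)\bigl(1 - \sigma(u)\bigr).
\end{aligned}
```

Summing the last expression over $i$ gives zero, as expected when the mean of the mask is held fixed at $\alpha$.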

Next, the restoration process is performed (S120). In the restoration process, the restoration module 130 may receive the input information masked by the multi-level masking and output restored data by applying the restoration network function. L pieces of restored data may be generated, one for each piece of masked input information. The restored data is meant to match the original data. However, as FIG. 2 illustrates, restoration is poor at a very low information restriction degree even for very simple data. A larger information restriction degree means that more information about the data is retained, so restoration is expected to be poor at low degrees and good at high degrees.

Next, the refinement process is performed (S130). Because the mask is generated with a prescribed information restriction degree, the unsupervised outlier detection method according to an embodiment of the present invention is robust to hyperparameters, unlike conventional techniques based on an adversarial loss. Generating masks with a fixed information restriction degree can nevertheless be unfair in one respect: when the input information is limited by masks with the same information restriction degree, simple data that needs relatively little information for restoration will have a smaller restoration loss than relatively complex data. Because of this difference in the inherent complexity of the data, masking and restoration alone may not yield fair outlier detection, and the refinement module 140 performs the refinement process to address this.

The refinement module 140 may receive the data, the input information, the mask, and the noise and output a refinement result by applying the refinement network function. The refinement result refines the part that the restoration module 130 unavoidably fails to restore because too little information remains, thereby resolving the unfairness caused by differences in the inherent complexity of the data. With this unfairness resolved, outlier detection can rely solely on the intended restoration loss, which arises when masking and restoration restore abnormal data as if it were normal data. The refinement module 140 may output L refinement results corresponding to the L pieces of restored data.

The training process of the unsupervised outlier detection apparatus 100 may be performed as follows. First, the embedding module (embedding network) 110, the masking module (masking network) 120, and the restoration module (restoration network) 130 may be trained using the restoration loss of Equation 6.

Here, Equation 6 defines the restoration loss: the restoration module 130 may calculate it as the mean squared error (MSE) between the output of the restoration network (the restored data) and the data.
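As a hedged illustration of what such a loss may look like, written in assumed notation ($x \in \mathbb{R}^n$ for the data and $\hat{x}_\alpha$ for the restored data at information restriction degree $\alpha$; neither symbol is taken from the original), the mean squared error between the restored data and the data is:

```latex
\mathcal{L}_{\mathrm{res}}(x, \alpha)
  = \frac{1}{n}\sum_{i=1}^{n}\bigl(\hat{x}_{\alpha,i} - x_i\bigr)^2
```

How the original equation averages over training samples or restriction degrees is not stated in the text, so that part is left out of this sketch.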

After training of the embedding module (embedding network) 110, the masking module (masking network) 120, and the restoration module (restoration network) 130 is complete, all of their parameters are fixed, and the refinement module (refinement network) 140 may then be trained using the refinement loss of Equation 7.

Here, Equation 7 defines the refinement loss: the refinement module 140 may calculate it as the mean squared error (MSE) between the data and the sum of the output of the refinement network (the refinement result) and the output of the restoration network (the restored data).
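The refinement loss described here can be sketched directly from its verbal definition: add the refinement result to the restored data, then take the mean squared error against the data. The function names and the sample vectors below are hypothetical, and plain Python lists stand in for tensors.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def refinement_loss(data, restored, refinement):
    """MSE between the data and (restored data + refinement result)."""
    corrected = [r + d for r, d in zip(restored, refinement)]
    return mse(corrected, data)


data = [1.0, 2.0, 3.0]
restored = [0.5, 2.0, 2.0]          # hypothetical restoration network output
perfect = [0.5, 0.0, 1.0]           # refinement that fills the gap exactly
nothing = [0.0, 0.0, 0.0]           # refinement that contributes nothing

loss_perfect = refinement_loss(data, restored, perfect)
loss_nothing = refinement_loss(data, restored, nothing)
```

With a zero refinement result the refinement loss reduces to the plain restoration error, which is why the refinement network only has to learn the residual that the restoration network leaves behind.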

Throughout training, the pre-specified mask mean that controls the information restriction degree may be sampled according to Equation 8.

The outlier detection process (or test process) of the unsupervised outlier detection apparatus 100 may be performed on a piece of data, as in Equation 9, by summing the refinement loss of Equation 7 over all L information restriction degrees distributed from 0 to 1 at intervals of 1/(L-1). A larger total refinement loss indicates that the data is more likely to be abnormal.
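The scoring rule can be sketched as a sum of per-level refinement losses. In this illustration the trained networks are replaced by two hypothetical callables, `restore_at(data, level)` and `refine_at(data, level)`, which return the restored data and the refinement result for a given information restriction degree; they are toy stand-ins and not part of the original disclosure.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def anomaly_score(data, levels, restore_at, refine_at):
    """Sum over all levels of MSE(data, restored + refinement).
    A larger score suggests abnormal data."""
    total = 0.0
    for level in levels:
        restored = restore_at(data, level)
        refinement = refine_at(data, level)
        corrected = [r + d for r, d in zip(restored, refinement)]
        total += mse(corrected, data)
    return total


# Toy stand-ins: restoration recovers a fraction `level` of the data,
# and the refinement contributes nothing.
def toy_restore(data, level):
    return [level * v for v in data]


def toy_refine(data, level):
    return [0.0 for _ in data]


levels = [0.0, 0.5, 1.0]  # L = 3 levels at intervals of 1/(L-1)
score = anomaly_score([1.0, 2.0], levels, toy_restore, toy_refine)
print(score)  # 3.125
```

In practice a threshold on this score, or a ranking of scores across a test set, would separate normal from abnormal data; how the threshold is chosen is outside what the text specifies.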

Hereinafter, the results of experiments on the performance of the unsupervised outlier detection apparatus 100 according to an embodiment of the present invention are described with reference to FIGS. 3 to 6.

FIG. 3 shows benchmark datasets used in unsupervised outlier detection. FIG. 4 shows the average AUROC (Area Under ROC) performance on the MNIST, FMNIST, and CIFAR datasets of FIG. 3. FIG. 5 shows the average AUROC performance on the MVTecAD dataset of FIG. 3. FIG. 6 shows AUROC performance on the CIFAR dataset according to the number (L) of information restriction degrees used.

Referring to FIGS. 3 to 6, experiments on benchmark datasets used in unsupervised outlier detection show that the AUROC performance of the unsupervised outlier detection apparatus 100 according to an embodiment of the present invention increases as the number of information restriction degrees increases. The number L of information restriction degrees is a hyperparameter that could critically affect the performance of the apparatus 100; since performance increases as L increases, the apparatus 100 can be seen to be robust to hyperparameters.
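For readers unfamiliar with the metric, AUROC can be computed from anomaly scores as the probability that a randomly chosen abnormal sample scores higher than a randomly chosen normal sample, counting ties as one half. The sketch below uses this pairwise definition with made-up scores and labels; it is an illustration of the metric only and does not reproduce any result from the figures.

```python
def auroc(scores, labels):
    """Pairwise AUROC: fraction of (abnormal, normal) pairs in which the
    abnormal sample has the higher score; ties count as 0.5.
    Labels: 1 = abnormal, 0 = normal."""
    abnormal = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    if not abnormal or not normal:
        raise ValueError("both classes must be present")
    wins = 0.0
    for a in abnormal:
        for n in normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal) * len(normal))


# Made-up anomaly scores for two normal and two abnormal samples.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(auroc(scores, labels))  # 0.75
```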

The drawings referred to and the detailed description given so far are merely illustrative of the present invention. They are used only to explain the invention and not to limit its meaning or the scope of the invention set forth in the claims. A person of ordinary skill in the art will therefore understand that various modifications and other equivalent embodiments are possible. Accordingly, the true technical scope of protection of the present invention should be determined by the technical spirit of the appended claims.

100: unsupervised outlier detection device
110: embedding module
120: masking module
130: restoration module
140: refinement module

Claims

An unsupervised outlier detection device comprising:
an embedding module generating input information for input data;
a masking module generating a mask for limiting the input information and performing masking for limiting the input information using the mask and noise;
a restoration module outputting restored data that restores the input data from the input information masked through the masking; and
a refinement module receiving the input data, the input information, the mask, and the noise, and outputting a refinement result for refining a part that the restoration module cannot restore.

According to claim 1,
wherein the masking module receives the input data, outputs an output value of a masking network, and generates the mask by applying a sigmoid function to the result of adding a bias to the output value of the masking network.

According to claim 2,
wherein the masking module finds the bias by a bisection method so that the average value of the mask matches an information restriction level.

According to claim 3,
wherein L information restriction levels (L is a natural number equal to or greater than 2) are used at intervals of 1/(L-1).

According to claim 4,
wherein the masking is performed through Hadamard multiplication using all of the masks for the L information restriction levels, so that a total of L pieces of masked input information are generated.

According to claim 5,
wherein the restoration module outputs L pieces of restored data corresponding to the masked input information.

According to claim 6,
wherein the refinement module outputs L refinement results corresponding to the L pieces of restored data, calculates a refinement loss through a mean square error between the input data and the sum of the refinement result and the restored data, and learns using the refinement loss.

According to claim 6,
wherein the restoration module calculates a restoration loss through a mean square error between the restored data and the input data, and
the embedding module, the masking module, and the restoration module learn using the restoration loss.

An unsupervised outlier detection method using an unsupervised outlier detection device including an embedding network, a masking network, a restoration network, and a refinement network, the method comprising:
generating input information for input data;
generating a mask for limiting the input information and performing masking for limiting the input information using the mask and noise;
outputting restored data that restores the input data from the input information masked through the masking; and
receiving the input data, the input information, the mask, and the noise, and outputting a refinement result for refining a part that cannot be restored by the restoration network.

According to claim 9,
wherein the mask is generated by applying a sigmoid function to a value obtained by adding a bias to an output value of the masking network, which receives the input data as input.

According to claim 10,
wherein the bias is found by a bisection method so that the average value of the mask matches an information restriction level.

According to claim 11,
wherein L information restriction levels (L is a natural number equal to or greater than 2) are used at intervals of 1/(L-1).

According to claim 12,
wherein the masking is performed through Hadamard multiplication using all of the masks for the L information restriction levels, so that a total of L pieces of masked input information are generated.

According to claim 13,
wherein L pieces of restored data corresponding to the masked input information are output.

According to claim 14,
wherein L refinement results corresponding to the L pieces of restored data are output, a refinement loss is calculated through a mean square error between the input data and the sum of the refinement result and the restored data, and the refinement network learns using the refinement loss.

According to claim 14,
wherein a restoration loss is calculated through a mean square error between the restored data and the input data, and
the embedding network, the masking network, and the restoration network learn using the restoration loss.
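
For illustration only, the operations recited in the claims above (a sigmoid mask with a bias found by bisection, Hadamard masking, and the two mean square error losses) might be sketched in Python as follows. This is a sketch under stated assumptions, not the claimed implementation: all function names, the bias search interval of -30 to 30, the iteration count, and in particular the rule that masked-out parts are filled with noise as `m * z + (1 - m) * noise` are hypothetical choices, since the claims only state that the mask and noise are used together.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mask_for_level(logits, level, lo=-30.0, hi=30.0, iters=60):
    """Find a bias by bisection so that mean(sigmoid(logits + bias))
    is close to the given information restriction level.

    ASSUMPTION: the search interval [lo, hi] and iteration count are
    illustrative. The mean is monotonically increasing in the bias,
    which is what makes bisection applicable.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        mean = sum(sigmoid(v + mid) for v in logits) / len(logits)
        if mean < level:
            lo = mid  # mask passes too little: raise the bias
        else:
            hi = mid  # mask passes too much: lower the bias
    bias = (lo + hi) / 2.0
    return [sigmoid(v + bias) for v in logits], bias


def apply_mask(info, mask, noise):
    """Hadamard (element-wise) masking of the input information.

    ASSUMPTION: the suppressed portion is replaced by noise.
    """
    return [m * z + (1.0 - m) * e for z, m, e in zip(info, mask, noise)]


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def restoration_loss(restored, x):
    """Mean square error between the restored data and the input data."""
    return mse(restored, x)


def refinement_loss(restored, refinement, x):
    """Mean square error between (restored data + refinement result)
    and the input data."""
    return mse([r + d for r, d in zip(restored, refinement)], x)


if __name__ == "__main__":
    logits = [-2.0, -0.5, 0.0, 1.5, 3.0]  # stand-in masking network outputs
    mask, bias = mask_for_level(logits, level=0.5)
    print(sum(mask) / len(mask), bias)
```

With L levels, `mask_for_level` would be called once per level to obtain L masks and therefore L pieces of masked input information. Levels of exactly 0 or 1 can only be approached, not reached, by a sigmoid mask, which is why the sketch bounds the bias search.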